Plaintiffs' Argument that Meta's Large Language Model is an Infringing Derivative Work Makes Judge's Head Explode 🤯

Brian Murphy

Partner

Today I report on another opinion out of the Northern District of California that addresses whether generative AI tools infringe upon the rights of copyright owners. (For our coverage of other recent AI cases, see this post and this post.)

The plaintiffs (Richard Kadrey, Sarah Silverman, and Christopher Golden) are authors of best-selling books. Earlier this year, the trio sued Meta, claiming that its large language models (known as LLaMA) infringed upon their copyrights. Specifically, the plaintiffs alleged (among other claims) that (1) Meta was guilty of direct copyright infringement for using their books to train LLaMA, (2) the LLaMA models are themselves infringing derivative works, and (3) all outputs that LLaMA generates based on user prompts also are infringing derivative works. Meta moved to dismiss all counts except for the central (and juiciest) one, i.e., the plaintiffs' claim that Meta's use of their books in training constituted infringement. Resolution of that issue - and Meta's defense that the use of copyrighted content for training purposes constitutes a transformative fair use - must await another day. (See this post and this post.)

Today, I want to focus on the plaintiffs' claims that the LLaMA models themselves, and their output, constitute infringing derivative works based on the plaintiffs' books.

The right to prepare a derivative work is one of the exclusive rights granted to copyright owners. (17 U.S.C. § 106.) The Copyright Act defines a derivative work as:

"A work based upon one or more preexisting works, such as a translation, musical arrangement, dramatization, fictionalization, motion picture version, sound recording, art reproduction, abridgment, condensation, or any other form in which a work may be recast, transformed, or adapted. A work consisting of editorial revisions, annotations, elaborations, or other modifications which, as a whole, represent an original work of authorship, is a 'derivative work.'" (17 U.S.C. § 101; emphasis supplied.)

Does it make sense to say that the LLaMA models, and the output of those models, are derivative works because they are “based upon” or “recast, transform or adapt” the plaintiffs' copyrighted books (and the millions of other copyrighted works that were used in training)? District Judge Vince Chhabria certainly didn't think so, and he dismissed these claims in a terse, four-page opinion.

First, the court rejected as "nonsensical" the plaintiffs’ argument that the LLaMA models were derivative works simply because LLaMA could not “function without the expressive information extracted” from the plaintiffs’ books. As the court put it, “[t]here is no way to understand the LLaMA models themselves as a recasting or adaptation of any of the plaintiffs’ books.” At oral argument, Judge Chhabria was even blunter, noting that the plaintiffs' theory “would have to mean that if you put the LLaMA language model next to Sarah Silverman's book, you would say they're similar.… That makes my head explode when I try to understand that.” (Mine too - 🤯!)

Second, the plaintiffs did not allege in their complaint that any specific output constituted a derivative work based on their books. Instead, they argued that every output of the LLaMA models was a derivative work since the output was “based on expressive information extracted from Plaintiffs’ Infringed Works.” That the books had been used in training was not, in and of itself, sufficient to make out a viable claim that all output qualified as infringing derivative works:

"The plaintiffs are wrong to say that, because their books were duplicated in full as part of the LLaMA training process, they do not need to allege any similarity between LLaMA outputs and their books to maintain a claim based on derivative infringement. To prevail on a theory that LLaMA’s outputs constitute derivative infringement, the plaintiffs would indeed need to allege and ultimately prove that the outputs incorporate in some form a portion of the plaintiffs’ books." (Cleaned up.)

The court continued:

"To the extent that [the plaintiffs] are not contending LLaMa spits out actual copies of their protected works, they would need to prove that the outputs (or portions of the outputs) are similar enough to the plaintiffs’ books to be infringing derivative works. And because the plaintiffs would ultimately need to prove this, they must adequately allege it at the pleading stage."

Judge Chhabria's statements from the bench during oral argument captured what he saw as one of the flaws in the plaintiffs' theory: “When I make a query of LLaMA, I'm not asking for a copy of Sarah Silverman's book – I'm not even asking for an excerpt.” And, just as important, users are not likely to get output that includes a material quantity of protected expression because of the way large language models work. Moreover, many large language models, including ChatGPT, have implemented mechanisms designed to prevent output that includes verbatim copyrighted text. (Here's an experiment: ask ChatGPT to provide you with the first chapter of a book that is still under copyright. The response may surprise you.)

At the end of the day, Judge Chhabria is on the same page as his colleague Judge William H. Orrick, who is presiding over Andersen v. Stability AI Ltd., a case brought by a group of visual artists against Stability AI (the developer of Stable Diffusion, a text-to-image model), DeviantArt, and Midjourney. In an opinion issued at the end of October (see this post), Judge Orrick dismissed the plaintiffs' claims alleging that the defendants' generative AI models created output images that were infringing derivative works based on the training images. While Judge Orrick (like Judge Chhabria in this case) gave leave to amend (despite the “strong case” for dismissing with prejudice), he made it clear that he thought the plaintiffs have a steep hill to climb:

"The other problem for plaintiffs is that it is simply not plausible that every Training Image used to train Stable Diffusion was copyrighted (as opposed to copyrightable), or that all DeviantArt users' Output Images rely upon (theoretically) copyrighted Training Images, and therefore all Output images are derivative images. Even if that clarity is provided and even if plaintiffs narrow their allegations to limit them to Output Images that draw upon Training Images based upon copyrighted images, I am not convinced that copyright claims based a derivative theory can survive absent “substantial similarity” type allegations."

Bottom line: arguments that generative AI tools and their output constitute infringing derivative works based on training content are getting no traction with the courts. To survive a motion to dismiss, a plaintiff will need to allege that specific output of the challenged model is based upon, and (substantially) similar to, specific copyrighted works used in training. Having facts to back up that allegation could pose a real challenge and make the plaintiffs 🤯 and/or 😭.

Kadrey v. Meta Platforms, Inc., No. 23-cv-03417-VC (N.D. Cal. Nov. 20, 2023)