IP & Media Law Updates

JUDGE RAKOFF’S DECISION IN INTERCEPT MEDIA V. OPENAI COPYRIGHT CASE CREATES SDNY SPLIT

In a surprising split with recent Southern District of New York (SDNY) precedent, the court in Intercept Media, Inc. v. OpenAI, Inc. has denied OpenAI’s motion to dismiss a lawsuit alleging that it improperly removed copyright management information (CMI) from news articles that were used to train ChatGPT.  This decision represents a departure from Judge Colleen McMahon’s November 2024 ruling in Raw Story Media, Inc. v. OpenAI, Inc., which dismissed nearly identical claims for insufficient evidence of harm. While the court’s decision to allow the case to move forward in Intercept Media could serve as a blueprint for other content creators in their ongoing attempts to curtail the unauthorized use of their copyrighted works in the training of generative AI tools, proving such CMI claims may ultimately be challenging from an evidentiary standpoint. 

What is CMI?

In the wake of the explosion of generative AI in recent years, dozens of claims by content creators have been filed across the country against AI companies alleging improper and unauthorized use of their copyrighted materials to train AI tools.  While many of these lawsuits allege direct or indirect copyright infringement, other copyright holders have begun testing alternative theories of liability.  One such theory of liability is based upon the removal of CMI. 

CMI refers to data that accompanies a copyrighted work, typically identifying the author, copyright holder, and details about the terms of use or rights associated with the work. This information can be embedded directly in digital works or appear in accompanying materials (such as metadata, watermarks, or copyright notices). The Digital Millennium Copyright Act (DMCA) provides that “[n]o person shall . . . intentionally remove or alter any [CMI],” or “distribute . . . works . . . , knowing that [CMI] has been removed or altered,” if they know or have “reasonable grounds to know, that it will induce, enable, facilitate, or conceal an infringement.”  Accordingly, the DMCA provides both civil and criminal penalties for the act of knowingly removing or altering CMI to conceal the identity of the copyright holder or the terms of the work’s use, particularly if it leads to further infringement by others.

Case Background

In February of 2024, the Intercept, a nonprofit news and investigative journalism organization, sued OpenAI, the developer of ChatGPT, alleging that OpenAI used its copyrighted articles without authorization to train ChatGPT and intentionally removed CMI from those articles in the training process in violation of the DMCA.  The Intercept claims that as a result of OpenAI’s improper and intentional removal of its CMI, outputs generated by ChatGPT “regurgitate verbatim or nearly verbatim” its copyright-protected works without providing information regarding the author, title, copyright, or terms of use information contained in those works.

In April 2024, OpenAI moved to dismiss the Intercept’s complaint, arguing that it failed to sufficiently allege Article III standing and that it failed to state a claim under Section 1202(b) of the DMCA.  The Intercept was subsequently granted leave to amend its complaint, which it did in June 2024, and OpenAI submitted a supplemental memorandum of law in July 2024. In November 2024, SDNY Judge Jed Rakoff denied, in part, OpenAI’s motion to dismiss, allowing the Intercept’s CMI claim to move forward.  On February 20, 2025, Judge Rakoff issued his full opinion explaining the Court’s decision. 

The Decision 

Article III Standing

OpenAI argued that the Intercept failed to demonstrate a concrete injury required in order to have Article III standing.  OpenAI claimed that the Intercept failed to identify an actual or imminent injury because it did not allege that any of its copyrighted works have ever actually been displayed by ChatGPT to anyone other than The Intercept itself.  According to OpenAI, the Intercept’s evidence of the injury it has allegedly suffered was merely a result of its own manipulation of the inputs that it fed into ChatGPT.

Rejecting OpenAI’s arguments, the district court held that the Intercept sufficiently pleaded injury under the DMCA because it was similar to the property-based harm traditionally actionable in copyright, namely in that it concerns the protection of authorship and ownership information.  Despite the fact that the specific right at issue in this case, the inclusion of CMI, was not explicitly included in or contemplated by Section 106 of the Copyright Act as it was originally enacted, Judge Rakoff noted that authors’ interests in their works have “evolve[d] over time” to address technological advancements, and that Intercept’s alleged injury “implicates the same incentives to create that justify traditional copyright.”  According to Judge Rakoff, “[t]he increased possibility of infringement makes it more likely that The Intercept (or some other publication) will no longer find it worthwhile to create new articles.”  Thus, even though the specific right created by the DMCA is “comparatively new,” the alleged injury experienced by the Intercept has a close relationship with the property-based harms traditionally associated with copyright law and stems from the same kind of harm “long recognized in copyright suits.” In addition, the district court held that a copyright injury “does not require publication to a third party,” finding unpersuasive OpenAI’s argument that the Intercept failed to demonstrate a concrete injury because it had not conclusively established at the motion to dismiss phase that other ChatGPT users had experienced similar interactions with ChatGPT. 

Failure to State a Claim

Having found that the Intercept met its burden to establish Article III standing, the district court moved on to address OpenAI’s arguments that the Intercept failed to state a claim.  On the Intercept’s claim of unlawful distribution of its copyrighted articles, Judge Rakoff found that the complaint included no factual support for its allegation that OpenAI distributed the articles.  Although the complaint alleges that OpenAI trained ChatGPT using copies of the Intercept’s works, and that OpenAI has shared its training-set data with third parties, the Intercept did not provide any evidence that OpenAI knowingly distributed copies of its articles after allegedly removing the DMCA-protected information.  The district court therefore granted OpenAI’s motion to dismiss the Intercept’s distribution claim. 

However, the district court denied OpenAI’s motion to dismiss the CMI claim, finding that the Intercept plausibly alleged that OpenAI intentionally removed CMI from its articles. As described in the district court’s decision, CMI claims impose a “double scienter requirement,” under which a party must prove that (1) CMI was intentionally removed from a copyrighted work, and (2) that the alleged infringer knew or had reasonable grounds to know that the removal of CMI would “induce, enable, facilitate, or conceal” copyright infringement.

The Intercept supported its CMI claim by identifying the specific training sets that it alleges OpenAI uses to train ChatGPT, including specific URLs from its website. Further, the Intercept also presented evidence that the algorithm OpenAI uses to build its training sets can only capture an article’s main text, which excludes CMI.  The more difficult question, according to Judge Rakoff, is whether OpenAI knew or had reasonable grounds to know that the removal of CMI would induce, facilitate, or conceal copyright infringement.  On this issue, the district court was persuaded by the Intercept’s argument that OpenAI knew when it allegedly removed the CMI that doing so could result in “downstream infringement” by ChatGPT users because (1) OpenAI knew that ChatGPT “‘regurgitate[s] verbatim ... copyright-protected works of journalism’ without CMI, ‘[a]t least some of the time,’” and (2) ChatGPT is promoted as a tool by which users can generate content for further use.  Accordingly, the Intercept was able to satisfy the “double scienter” requirement for its CMI claims. 

TAKEAWAYS 

Perhaps the most notable aspect of Judge Rakoff’s decision is its failure to address, or even mention, the prior SDNY precedent recently established in Raw Story Media, Inc. v. OpenAI, Inc.  In Raw Story, Judge Colleen McMahon reached the opposite conclusion on the issue of Article III standing, finding that the plaintiffs failed to provide sufficient support to demonstrate that their content was used to train ChatGPT, or that such use would cause harm even if OpenAI did use the copyrighted works to train ChatGPT, because “the likelihood that ChatGPT would output plagiarized content from one of Plaintiffs’ articles seems remote” based on “the quantity of information contained in the repository.” Accordingly, Judge Rakoff’s February 2025 decision creates a split within SDNY that could complicate the resolution of future claims against AI platforms asserting the removal of CMI. 

Having survived OpenAI’s motion to dismiss, the Intercept will continue to pursue its CMI claims.  If the Intercept’s CMI claims ultimately prevail, the court’s decision in this case could impact the way AI models are trained by changing how attribution to original sources needs to be provided. We will continue to monitor this case as it develops. 

Tags

ai, copyright, cmi, artificial intelligence, dmca