Writers vs. AI: The Beginning of Copyright Lawsuits

US comedian and author Sarah Silverman, along with authors Christopher Golden and Richard Kadrey, is suing OpenAI and Meta for copyright infringement. They claim that their works were used without permission to train the AI models developed by the companies, namely ChatGPT and LLaMA.

Marija Stojadinović
12/07/2023

Illustration: Lenka T.

TABLE OF CONTENTS

About the lawsuit

Why are writers angry at OpenAI and Meta?

What do Meta and OpenAI say about the lawsuit?

About the lawsuit

Last week, the Joseph Saveri Law Firm filed two class-action lawsuits in US federal court, one against OpenAI and one against Meta, on behalf of authors including Sarah Silverman, Christopher Golden, and Richard Kadrey.

The lawsuits accuse the companies of illegally using copyrighted material to train the AI language models behind ChatGPT and LLaMA. Beyond copyright infringement, they allege violations of the Digital Millennium Copyright Act, unfair competition laws, and negligence.

The Joseph Saveri Law Firm has previously filed similar lawsuits related to generative AI, including suits over GitHub Copilot and against AI image generator companies.

The firm argues that AI models like ChatGPT and LLaMA are "industrial-strength plagiarists" that infringe upon the rights of book authors. The lawsuits demand permanent injunctive relief and jury trials, with the aim of forcing Meta and OpenAI to make changes to their AI tools.

The complaints allege that OpenAI and Meta used datasets containing copyrighted material without consent. For instance, they claim that ChatGPT was trained on books downloaded from “shadow library” websites such as Library Genesis, Z-Library, Sci-Hub, and Bibliotik. Meta’s LLaMA is said to have been trained on a dataset called The Pile, which allegedly includes copyrighted books from Bibliotik.

This case builds on a broader conversation about ownership and the meaning of the term “author” in the context of artificial intelligence. The field remains underregulated, a so-called “legal limbo” that big tech companies often exploit.

Why are writers angry at OpenAI and Meta?

The lawsuit against OpenAI alleges that the authors’ copyrighted books were used as training material for ChatGPT without their consent. Similarly, the suit against Meta claims that the authors’ works appear in the dataset used to train LLaMA.

The exhibits in the OpenAI suit show that ChatGPT summarized three books when prompted: The Bedwetter by Silverman, Ararat by Golden, and Sandman Slim by Kadrey. The Meta suit references multiple works by Kadrey and Golden, as well as The Bedwetter, and highlights a Meta paper indicating the use of material from shadow libraries in LLaMA’s training datasets.

Joseph Saveri and Matthew Butterick, the lawyers representing the authors, say they have heard from writers and publishers concerned about ChatGPT’s ability to generate text similar to copyrighted material.

The lawsuits also ask whether ChatGPT and LLaMA are themselves infringing derivative works based on copyrighted materials. The authors argue that the models' ability to accurately summarize copyrighted books shows that those specific works were in their training data.

The removal of copyright-management information (CMI) is another point of contention. The authors allege that OpenAI intentionally removed CMI, allowing the models to produce summaries without citing copyright holders. This allegedly enables OpenAI to profit unfairly from unattributed reproductions of copyrighted works.

The lawsuits raise various legal questions and seek restitution for alleged lost profits. The authors express concern about companies profiting from their copyrighted materials without consent, and they aim to protect their rights and ensure proper credit and compensation for their work.

What do Meta and OpenAI say about the lawsuit?

Neither Meta nor OpenAI immediately commented on the lawsuits. The Joseph Saveri Law Firm emphasized that the suits represent a broader fight to preserve ownership rights for all artists and creators, arguing that if such behavior continues, AI models could replace the very authors whose works power these AI products.

Legal action against OpenAI is not limited to copyright. It also extends to false answers produced by AI models, known as “hallucinations”: in a separate case, a radio host from Georgia is suing OpenAI for defamation over a false statement made by the AI.

Official statements by OpenAI and Meta on the matter, as well as proposals for a solution that would satisfy both sides, are expected in the coming days.