Not a joke: Comedian and author Sarah Silverman is one of the lead plaintiffs in a pair of lawsuits against Meta and OpenAI accusing the tech companies of illegally using copyrighted works to train their artificial-intelligence systems.
The books cited in the lawsuits include Silverman’s 2010 bestselling memoir “The Bedwetter: Stories of Courage, Redemption and Pee.” The federal lawsuits, filed Friday, July 7, allege that OpenAI’s ChatGPT and Meta’s LLaMA both ingested text from “The Bedwetter” and other works to train their large language models (LLMs) without the consent of (or compensation to) authors such as Silverman.
Meta declined to comment. OpenAI did not respond to PvNew’s request for comment.
Silverman is one of three authors named as plaintiffs, alongside novelist Christopher Golden (whose books include “Ararat”) and Richard Kadrey, author of the Sandman Slim supernatural noir series. The suits, filed in the U.S. District Court for the Northern District of California, San Francisco Division, seek class-action status and unspecified monetary damages. The lawyers representing the three authors, Joseph Saveri and Matthew Butterick, last month filed a similar lawsuit against OpenAI on behalf of authors Paul Tremblay and Mona Awad.
OpenAI introduced ChatGPT in November 2022. San Francisco-based OpenAI is a private research lab that develops AI technologies, founded in 2015 as a nonprofit organization by Elon Musk (who is no longer on the board of OpenAI) and CEO Sam Altman.
OpenAI has not specified what is included in its datasets for ChatGPT. According to the lawsuit against the company, however, the only “internet-based books corpora” that have ever included the volume of material believed to be used by OpenAI are “flagrantly illegal shadow libraries” (which purportedly contain the plaintiffs’ copyrighted work). The complaint alleged that when ChatGPT “was prompted to summarize books written by each of the Plaintiffs, it generated very accurate summaries… which means that ChatGPT retains knowledge of particular works in the training dataset and is able to output similar textual content. At no point did ChatGPT reproduce any of the copyright management information Plaintiffs included with their published works.”
In their lawsuit against Meta, the plaintiffs’ attorneys alleged that to train the LLaMA (Large Language Model Meta AI) language models, the company copied a massive books dataset that includes the works of the three named authors.
Silverman is a two-time Emmy-winning comedian, actor, writer and producer. In the spring of 2022, her off-Broadway musical adaptation of “The Bedwetter” had a sold-out run with the Atlantic Theatre Co. She currently hosts “The Sarah Silverman Podcast” and will host TBS’s upcoming “Stupid Pet Tricks,” an offshoot of the famous David Letterman late-night segment.