Encyclopaedia Britannica Sues OpenAI for Using Copyrighted Content

Encyclopaedia Britannica has filed a lawsuit against OpenAI, alleging copyright infringement. The company claims that OpenAI used Britannica's copyrighted content to train its AI models, which then generated responses substantially similar to the original content. The alleged outputs include full or partial verbatim reproductions as well as made-up content falsely attributed to Britannica.

Encyclopaedia Britannica and Merriam-Webster have jointly filed a lawsuit against OpenAI in Manhattan federal court, alleging massive copyright infringement. The publishers claim OpenAI scraped nearly 100,000 Britannica articles to train its large language models without permission, resulting in ChatGPT generating responses that contain full or partial verbatim reproductions of copyrighted content. The suit also alleges trademark violations under the Lanham Act, citing instances where ChatGPT produced hallucinated content and falsely attributed it to Britannica.

The lawsuit adds to a growing wave of legal challenges facing OpenAI from publishers including The New York Times, Ziff Davis, and dozens of newspapers across North America. Britannica argues that ChatGPT "starves web publishers of revenue by generating responses that substitute, and directly compete with" original content. OpenAI responded that its models are "trained on publicly available data and grounded in fair use," signaling another protracted legal battle over AI training data rights.

For investors, the case could set important precedents for the AI industry's content licensing costs. If courts side with publishers, companies like OpenAI and its major backer Microsoft (MSFT) may face billions in retroactive licensing fees, potentially reshaping the economics of large language model development. The outcome could also affect Google (GOOG), which faces similar scrutiny over its Gemini AI training practices.