Major publishing groups have asked a U.S. federal court for permission to join an existing lawsuit against Google, escalating legal pressure over how artificial intelligence systems are trained on copyrighted material. On January 15, publishers including Hachette Book Group and Cengage Learning filed a request in California seeking to intervene in a proposed class action that accuses Google of using protected works without authorization to develop its AI models. The move could significantly expand the scope and potential financial exposure of the case.
In their filing, the publishers alleged that Google copied large volumes of books and educational content to train its AI systems, calling the practice one of the most extensive instances of copyright infringement to date. They said works from both trade publishing and academic catalogs were used without consent or compensation. The case initially focused on claims brought by visual artists over the training of an AI image generation tool, but the publishers argue that text-based works raise distinct legal and evidentiary issues that warrant their direct participation.
The publishers’ trade group, the Association of American Publishers, said its members are uniquely positioned to address how large language models ingest, store, and reproduce written content. If allowed to intervene, the publishers plan to seek monetary damages on behalf of themselves and a broader class of authors and rights holders. Google did not immediately comment on the request but has previously argued that its AI training practices qualify as fair use.
The lawsuit is part of a wider wave of legal challenges confronting AI developers as generative tools move rapidly into mainstream use. Authors, artists, and media companies have increasingly questioned whether existing copyright law permits the large-scale ingestion of protected works for model training. Other AI firms have faced similar claims, with some opting to settle rather than test the issue in court, underscoring the high stakes surrounding data sourcing for AI systems.
A federal judge will now decide whether to allow the publishers to formally join the case. The outcome could shape how courts view copyright protections in the age of generative AI and influence how technology companies source data going forward. As AI becomes more deeply embedded across industries, the dispute highlights growing tension between content creators and platforms building models that rely on vast amounts of human-generated material.



