Nearly 400 Newspapers Just Sued OpenAI and Microsoft
The fight over who owns the data behind AI just got much bigger. Nearly 400 newspapers filed a joint lawsuit on Wednesday against OpenAI and Microsoft, accusing them of scraping published articles to train AI models without permission or payment. It is one of the largest copyright actions the AI industry has faced.
The complaint goes to the heart of how these models are built. Large language models learn from enormous amounts of text pulled from the internet, including news archives, and the publishers argue that using their journalism this way amounts to taking valuable work for free. The companies have generally leaned on fair use, the legal doctrine that allows some copying without permission, while publishers say wholesale training crosses the line.
The scale is what makes this one matter. Earlier suits came from single outlets or small groups, but nearly 400 papers acting together turns a series of skirmishes into a unified front, with far more legal weight and bargaining power. A loss or a costly settlement here would set a benchmark for every other content owner weighing the same fight. Numbers change the leverage.
The money question sits underneath it all. If courts decide AI firms must license the content they train on, the cost of building models rises and a new revenue stream opens for news publishers and, eventually, other creators. Some companies have already signed licensing deals rather than fight, a sign the industry senses the free-data era may be ending. Training data could become a line item.
The honest uncertainty is that the law is unsettled. Courts have not clearly ruled on whether training on copyrighted text is fair use, cases are moving slowly, and outcomes have been mixed so far. A suit being filed is not a verdict, and these fights can take years and end in quiet settlements. What is clear is the direction: more pressure on AI firms to pay for their inputs.
So the data that powers AI is becoming a legal and financial battleground, and the publishers just escalated. Almost 400 papers, two of the biggest AI names, and the fair-use question at the center. Training data may not stay free much longer. Watch for early rulings and whether more licensing deals follow.