Jun 25, 2025

First major ruling on AI and fair use goes against the copyright industries, though with a silver lining relating to pirated training content

A judge has ruled in a legal battle between a group of authors and Anthropic, concluding that AI training is fair use under US copyright law. It’s a big win for Anthropic and AI firms generally, although the judge criticised Anthropic’s use of pirated content, which is likely copyright infringement.

The first US court decision on the crucial question of whether training a generative AI model with existing content constitutes fair use under American copyright law is in, and Californian judge William Alsup says that it does. Which means AI companies - in this case Anthropic - can make use of existing content when training models without getting permission from the relevant creators or copyright owners.

Anthropic was sued for copyright infringement by a group of authors over its copying of millions of books in order to train its AI model Claude. More than seven million books were taken from piracy sites online, although the AI company later decided to buy and digitise a million books for its training dataset.

Given that numerous US lawsuits between copyright owners and AI companies - including those filed by record labels and music publishers - hinge on this fair use question, this is a big win for the AI sector.

Although it will almost certainly be appealed, especially if judges considering other similar cases in other US courts reach a different conclusion. And another aspect of Alsup’s ruling favours rightsholders and could force AI companies to the negotiating table. As far as the judge is concerned, Anthropic initially training Claude on millions of pirated books was not fair use - meaning that activity was likely copyright infringement.

Nevertheless, Anthropic sees the ruling as a big win, homing in on Alsup’s conclusion that its use of existing books was “spectacularly transformative”, which is what justifies the fair use defence.

In a statement to Law360, the AI company said it was pleased that the court recognised that training AI models “was transformative - spectacularly so”. It then cited another line from Alsup’s ruling, which said it had made use of existing books “not to race ahead and replicate or supplant them” but instead “to turn a hard corner and create something different”.

However, because of Alsup’s other conclusion, that “Anthropic had no entitlement to use pirated copies” for its training data, the AI company still has a copyright infringement legal battle to fight. 

Given US law allows copyright owners to claim up to $150,000 per infringed work - and with millions of pirated books copied - that’s potentially mega-damages. Which might make some sort of settlement and licensing deal with the authors and their publishers seem like a better option. 

Although AI companies are often vague about where they source the content used to train their models, it’s assumed many have scraped content off the internet from illegitimate sources. Certainly it’s thought music AI start-ups Suno and Udio took that approach. For those companies, Alsup’s conclusion on fair use and pirated training content is not good news.

The copyright obligations of AI companies have been the subject of a big debate all over the world. The copyright industries - including the music industry - are adamant that AI companies making use of existing content to train their models must first get permission from the relevant rightsholders.

But many AI companies argue that they do not need permission, because AI training is covered by specific copyright exceptions in some countries, or the concept of fair use in the US.

That fair use defence is being tested in numerous cases working their way through the US courts, including the lawsuits filed against Suno and Udio by the major record companies, and another lawsuit involving Anthropic filed by a group of music publishers. 

Things previously seemed to be skewing towards the copyright owners in this domain, with the first judgement on copyright and AI - but not generative AI - concluding that AI training was not fair use. Then the US Copyright Office published a report saying that AI training might be fair use in some circumstances, but probably isn’t in others. 

That conclusion by the Copyright Office might mean that a decision on fair use in one AI case doesn’t necessarily set a wide-ranging precedent in all cases - and instead the slightly complicated framework for assessing the fair use defence will have to be applied on a case-by-case basis.

Nevertheless, the music industry - and wider copyright industries - would prefer strong judgements against fair use and will be hoping this ruling is successfully appealed. And that the imminent judgement in a similar but different case involving Meta swings the other way. 

Though, in the meantime, they will also take heart that Alsup’s ruling on the pirated training content issue will strengthen their position as they seek to force the AI companies to the negotiating table.
