A group of ten law professors have submitted a so-called amicus brief as part of the AI copyright legal battle between several authors and Meta, setting out the arguments for why training AI models on existing content does not qualify as ‘fair use’ under American copyright law. It follows the recent submission of another amicus brief by four different professors who argued the opposite.
The new brief insists that Meta’s use of the authors’ work when training its Llama AI model is “a commercial use that takes the expression in those works”, is “not transformative” and does not qualify as “intermediate copying”. These factors “taken together”, they insist, “weigh conclusively against a finding of fair use”.
The four academics who make the opposite argument mainly do so by insisting that AI training does, in fact, qualify as “intermediate copying”, which they define as “internal copying designed to produce a non-infringing output”. Rejecting “existing precedent about the fairness of intermediate uses”, they write, would, “be harmful to the copyright system as a whole, which is designed to encourage the production of new knowledge and new works”.
Urging the judge considering the case to favour their view, the professors who argue that AI training is not fair use claim that Meta’s bid for a judgement endorsing its fair use defence is “a breathtaking request for greater legal privileges than courts have ever granted human authors”.
Meanwhile, the other group of academics argue that rightsholders have always tried to restrict the fair use principle when new technologies have emerged, from “photocopying to home VCRs to the internet”, even though those technologies have “routinely created new markets” for the same copyright owners.
Comedian Sarah Silverman and novelist Richard Kadrey are among the authors who sued Meta, accusing the tech company of copyright infringement for making copies of their books without permission when collating a training dataset for Llama.
Meta, like many AI companies, argues that AI training constitutes ‘fair use’ under American law, which means it didn't need permission to make copies of the authors’ books.
This is the core argument at the heart of numerous lawsuits filed in the US courts by copyright owners against AI companies, including the record industry’s lawsuits against Suno and Udio, and the litigation filed by various music publishers against Anthropic.
The Meta case is further along than many of the other cases, with the judge now considering two motions for summary judgement. The authors want summary judgement that Meta directly infringed their copyrights when it collated its dataset. Meta wants summary judgement that its use of the books without permission was legal because it constitutes fair use.
Given that so many AI copyright cases hinge on the fair use defence, both the copyright industries and the tech sector are watching this case closely.
Although a US court did recently reject an AI company’s fair use defence in a copyright legal battle with Thomson Reuters, that dispute related to an AI-powered search engine, not a generative AI model.
Therefore, the Meta case is much more relevant to all the other cases against generative AI models, like those developed by Suno, Udio and Anthropic.
The fair use doctrine
Fair use is a complicated principle at the best of times. As this Stanford University guide explains, there are four main factors judges must consider when trying to decide if the use of a copyright protected work is fair use: “the purpose and character of the use; the nature of the copyrighted work; the amount and substantiality of the portion taken; and the effect of the use upon the potential market”.
Precedents set in numerous previous fair use cases then inform how judges apply those four factors. If a use is commercial and would negatively impact the value of the original work, it is less likely to be considered fair. If a use is transformative, meaning the material copied from the original work is transformed in some way, it is more likely to be considered fair.
Then there is “non-expressive use”, which the Authors Alliance defines as uses that “involve copying, but don’t communicate the expressive aspects of the work to be read or otherwise enjoyed”, and which may also be fair use. And “intermediate copying”, where copying is a step to creating a work that is transformative and non-infringing, and which is often deemed fair use as well.
The amicus briefs
All those things are considered in the new amicus brief that concludes AI training is not fair use. “Meta’s claim that its unauthorised copying of plaintiffs’ works to train its large language models is fair use is a breathtaking request for greater legal privileges than courts have ever granted human authors”, the academics write.
AI training is not a transformative use, they go on, because “using works for that purpose is not relevantly different from using them to educate human authors, which is a principal original purpose of all of plaintiffs’ works”. Plus, the aim is to train an AI model which will “enable the creation of works that compete with the copied works in the same markets – a purpose that, when pursued by a for-profit company like Meta, makes the use undeniably commercial”.
“Use of works to train large language models is also not a ‘non-expressive use’”, they go on, because it “incorporates the expressive choices of the authors of those works into the models”. And while they concede that “intermediate copying” to “enable the discovery of functional software interfaces has justifiably been held to be a fair use”, Meta’s copying here “has no such purpose”.
Of course, none of these academics are directly involved in the Meta case, and the judge isn't obliged to consider their amicus briefs. However, if he does, the focus will likely be on the “intermediate copying” arguments, which is where the professors strongly disagree.
The earlier brief is adamant that the relevant case law, including previous precedent that is binding on the California court where this lawsuit has been filed, “holds that internal copying, made in the course of creating new knowledge, is a transformative use that is heavily favoured by fair use doctrine”. And that applies here, the four professors behind the earlier brief reckon.
They then urge the judge to ignore the doom and gloom predictions being made by the copyright industries regarding what is at stake in this dispute.
“Copyright owners have often predicted that new technologies, from photocopying to home VCRs to the internet, would create disasters for copyright owners and that fair use needed to be shrunk to protect them”, they write. But instead, they argue, “new technologies have routinely created new markets”.
“This history”, the academics conclude, should “caution against rejecting the many precedents supporting intermediate copying for the purpose of creating new and useful tools that millions of people use. Whatever the risks of AI - and there may be many - condemning the act of creating large-scale training datasets as copyright infringement is not the answer”.