No copyright exception for AI training in European law, says new report

MEP Axel Voss yesterday told the European Parliament’s legal affairs committee why he believes the European Commission should “urgently” assess whether or not copyright law in the European Union “adequately addresses the legal uncertainty and competitive effects” of AI companies using existing content to train their generative AI models.

The MEP was speaking at a meeting that followed the recent publication of a report commissioned by the committee which concludes that the much talked about ‘text and data mining’ (or TDM) exception in EU copyright law - which the UK government is considering replicating - was not, in fact, designed “to accommodate the expressive and synthetic nature of generative AI training”.

Many AI companies argue that the EU TDM exception - which originates in article four of the 2019 EU Copyright Directive - means they can make use of copyright protected works to train their models without getting permission, except where rightsholders have formally opted out.

But the report, written by Spanish law professor Nicola Lucchi, states that the TDM exception was intended to enable data mining that would extract “patterns, trends or factual correlations” from large datasets, mainly in the context of scientific research and data analysis. Generative AI systems, Lucchi writes, “diverge fundamentally from this intended use”.

Generative AI models do not just “extract semantic content” or “identify correlations among facts”, she explains, they “absorb, encode and recombine stylistic, structural, and expressive features of the works on which they are trained”. She then argues that context provided elsewhere in the directive confirms that article four was not intended to apply to generative AI.

Voss has written his own paper on copyright and AI which he discussed at yesterday’s committee meeting. He says he is keen to find an approach that balances the interests of copyright owners and companies developing AI models, but that something bespoke is needed rather than relying on article four of the 2019 directive, which wasn’t written with AI in mind. That might involve revisiting article four or devising a new AI-specific copyright exception, he adds.

Obviously creators and copyright owners, including those in the music industry, would prefer it if there was no exception for generative AI training, forcing AI companies to seek permission from - and pay licensing fees to - all the relevant rightsholders.

Lucchi’s conclusion that the existing TDM exception in EU law doesn’t apply to AI will be welcomed by the music community. Together with the US Copyright Office’s conclusion earlier this year that the similar principle under American law - ‘fair use’ - will not always apply to AI training either, it strengthens the music industry’s position as it seeks to force AI companies into licensing talks.

Lucchi’s report also includes a useful rebuttal of a common line used by AI companies seeking to defend their use of unlicensed content: that an AI model mining large datasets to learn how to generate text, images, videos, audio and music is no different than human beings consuming existing content and art in order to inform their own creative output.

That comparison “does not hold” under EU law, the professor writes. “Unlike human authors, who understand ideas and express them in new ways, AI systems do not ‘understand’ what they process”, she adds. Citing philosopher Luciano Floridi, she goes on, “AI acts without understanding - it follows statistical patterns rather than engaging with meaning”.

This difference matters legally, she insists. “A person can learn from a work and restate the ideas in their own words without infringing copyright. But an AI system must copy and recombine parts of protected works to function”.

“Even if the final output looks different from the training data”, she says, “the act of ingesting and using protected content is still legally considered reproduction - and may require permission from rightsholders unless a clear legal exception applies”.
