Two American journalists have filed the latest lawsuit accusing OpenAI of copyright infringement for training its ChatGPT model on copyright-protected works without permission. Their litigation follows the filing of a similar lawsuit by the publisher of the New York Times over the Christmas break.
Nicholas A Basbanes and Nicholas Gage, the latter of whom previously wrote for the New York Times, say that OpenAI and its backer Microsoft "threaten the very existence of writers because, without permission or payment, defendants copied plaintiffs’ work to build a massive commercial enterprise that is now valued at billions of dollars".
Their lawsuit - which seeks class action status - adds that, given the investment raised by, and current valuations of, companies like OpenAI, it is "absurd" to suggest that getting licences from copyright owners for training AI models would be "cost prohibitive" and therefore "preclude the development of this nascent industry".
The AI company could have explored profit-sharing licensing deals to reduce its upfront costs, they add, "but instead defendants just decided to steal. They’re no different than any other thief".
Copyright industries, including the music industry, are adamant that AI companies must seek permission from copyright owners before using existing content for training their generative AI models. Most AI companies, however, argue that such use of content constitutes 'fair use' under American law, meaning permission is not required.
There are now numerous lawsuits working their way through the US courts that will put that argument to the test. Only one of those is directly music related, with a group of music publishers suing Anthropic. However, given the potential impact of the AI companies’ fair use defence, all the cases are important to all the copyright industries.
In its lawsuit against OpenAI, filed late last month, the New York Times stated: "Independent journalism is vital to our democracy. Since our nation’s founding, strong copyright protection has empowered those who gather and report news to secure the fruits of their labour and investment".
However, it went on, "defendants have refused to recognise this protection", adding: "Because the outputs of defendants’ GenAI models compete with and closely mimic the inputs used to train them, copying Times works for that purpose is not fair use".
In a recent blog post, renowned intellectual property lawyer Kate Downing says that the NYT case against OpenAI seems particularly strong. "The complaint includes multiple extremely clear-cut examples of [OpenAI's models] spitting out The Times' content nearly verbatim", she notes.
"Other plaintiffs also weren’t able to argue that their specific content, out of the trillions of pieces of training data in the datasets, was particularly important for creating quality AIs", she goes on. "Here, The Times convincingly argues that its content was extremely valuable for training the AIs, both because of the quantity involved as well as the fact that the training process involved instructing the AI to prioritise The Times’s content".
She also adds: "The other really interesting thing about this complaint is the extent to which it describes the business of The Times – how much work the journalists put in to create the articles, the physical risks they take during reporting, the value of good journalism in general, and The Times’s struggle with adjusting to an online world".