Sony Group’s R&D division has published a research paper which explores how technology might be used to identify which songs and recordings have influenced a piece of AI-generated music. As AI-generated music moves from ‘gimmick’ to ‘mainstream’, that’s something that is potentially valuable for music companies who want to allocate and distribute income from AI licensing deals.
According to Japan’s financial newspaper Nikkei - which also owns the FT - Sony’s new wizardry “analyses which musicians’ songs were used in learning and generating music” and can then “quantify the contribution of each original work, such as ‘30% of the music used by The Beatles and 10% by Queen’”.
That framing has been widely picked up by music and tech publications, which have largely repeated Nikkei’s claims at face value. In reality, as with all emerging technology, things are a little more nuanced and a little less certain.
The proposed new technology works in two ways. If - and that’s potentially a big ‘if’ - the developer of an AI model is on board, the Sony tech can use data from the AI company’s platform to identify the component parts. But, “when cooperation is not attainable, the technology estimates the original work by comparing AI-generated music with existing music”, so effectively an AI-powered musicologist.
That all sounds pretty impressive. If the technology works as described, and makes the leap from academic experiment to real-world product, then it’s a potentially significant development. Product applications could include identifying unlicensed music in AI training datasets as well as building allocation systems that could underpin royalty streams generated by AI licensing deals. But the gap between the Nikkei headline and what the underlying research paper actually demonstrates is fairly big.
So what’s actually going on under the bonnet? The underlying research paper - available on arXiv as ‘Large-scale Training Data Attribution For Music Generative Models Via Unlearning’ - was presented at the NeurIPS 2025 conference, as part of the event’s ‘Creative AI’ track. The paper outlines a method based on a technique called “machine unlearning”, which is basically what it sounds like: you take something an AI model has learned during training and then selectively make it forget, or ‘unlearn’, it.
The way it works is actually somewhat counter-intuitive though. The obvious thing to ‘unlearn’ would be tracks in the model’s training data - so if a Beatles track or a Queen track is present, you’d expect the ‘unlearning’ to be about forgetting those tracks. The problem is you’d need to repeat that ‘unlearning’ process for every output, selectively forgetting each track in the training dataset - in this case 115,000 of them - which, say the researchers, is “computationally unfeasible”.
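To put a rough number on “computationally unfeasible”, here’s a purely illustrative back-of-the-envelope calculation - the one-GPU-hour cost per unlearning run is an assumption made for the sake of the example, not a figure from the paper:

```python
# Purely hypothetical arithmetic to illustrate the scaling problem with the
# 'obvious' approach; the per-run cost below is an assumption, not a paper figure.
training_tracks = 115_000
gpu_hours_per_unlearning_run = 1           # assumed: even just one GPU-hour per run

gpu_hours_per_output = training_tracks * gpu_hours_per_unlearning_run
print(gpu_hours_per_output)                # 115,000 GPU-hours for a single output
print(round(gpu_hours_per_output / (24 * 365), 1))  # roughly 13 years on one GPU
```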
And that’s with Sony’s relatively modest research dataset. Apply the same approach to something like Suno, which was trained on “essentially all the music files of reasonable quality that are accessible on the open internet” - so tens of millions of recordings - and the task moves from unfeasible to impossible.
So, instead, the researchers do it in reverse. And this is where it’s all a bit confusing. Instead of forgetting things in the training data, the ‘unlearning’ forgets the output that has been generated by the model, and then measures how that act of forgetting affects the model’s knowledge of its training data. When an AI model is trained, it builds up a set of internal parameters that collectively encode everything it has learned. Those parameters are what the model draws on when it generates new music. For each track in its dataset you can test how well the model “knows” that track - essentially, how accurately it can reconstruct it. If you run that test, each track gets a score. Those scores are a direct reflection of the model’s parameters: change the parameters and the scores change too.
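To make that scoring step a little more concrete, here’s a minimal sketch of what it might look like - with the caveat that `model.reconstruction_loss` is a hypothetical stand-in for whatever objective the researchers actually use, so this is an illustration of the idea rather than the paper’s method:

```python
# Illustrative sketch only: score how well a model 'knows' each training track.
# `model.reconstruction_loss` is a hypothetical stand-in for the real objective.

def knowledge_scores(model, training_tracks):
    """Return a per-track score: higher means the model 'knows' the track better."""
    scores = {}
    for track_id, audio in training_tracks.items():
        # How accurately can the current parameters reconstruct this track?
        scores[track_id] = -model.reconstruction_loss(audio)
    return scores
```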
You can think of the parameters as the ‘internal wiring’ that underpins the generation - and they are shared and entangled across the dataset. They are like a web or a mesh - they have specific relationships with each other, and altering one part of the web can have a big impact in some areas while leaving others untouched. In other words, what encodes the model’s knowledge of one training track may also underpin its knowledge of the others.
To perform the ‘unlearning’ and make the model forget the output it has generated, you effectively change the parameters of the model - you’re not ‘deleting’ the generated track as such, but instead you’re altering that web of internal wiring so that it would no longer be able to produce it. If you ran the same prompt again, you’d get a different result.
But because the web of parameters is interconnected, pulling on one part of it inevitably moves other parts too. After the unlearning the researchers then re-score every track in the training data to find out which ones have been affected - and that shows you the tracks that were most influential in generating the output.
It’s a little bit like pulling a book off a tightly packed shelf: most of the other books will stay put, but a few will fall over, or fall off the shelf. The ones that fall over were leaning most heavily on the book that has been removed - those are the ‘most influential’ ones.
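Putting those pieces together, the overall process looks something like the sketch below. Again, this is a simplified illustration rather than the paper’s actual implementation: `unlearn_output` stands in for whatever parameter update makes the model forget the generated track, and `knowledge_scores` is the hypothetical scoring function sketched above.

```python
import copy

def attribute(model, training_tracks, generated_track, top_k=10):
    """Illustrative attribution-by-unlearning loop, not the paper's actual code."""
    # 1. Score every training track against the original model.
    before = knowledge_scores(model, training_tracks)

    # 2. 'Unlearn' the generated output: alter a copy of the model's parameters
    #    so that it would no longer produce this particular track.
    forgetful_model = unlearn_output(copy.deepcopy(model), generated_track)

    # 3. Re-score every training track against the altered model.
    after = knowledge_scores(forgetful_model, training_tracks)

    # 4. The tracks whose scores drop the most were leaning hardest on the same
    #    'internal wiring' as the generated output - the most influential ones.
    drops = {tid: before[tid] - after[tid] for tid in training_tracks}
    return sorted(drops.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
```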
In short, it’s very complex - and the sort of thing that makes most sense to scientists, not record labels. It’s also worth noting that the method produces per-track attribution scores, not per-artist percentages. The leap from “this generated track was most influenced by training tracks 47, 2031 and 89744” to “30% Beatles, 10% Queen” involves a whole load of other hard science.
So much for the science. But what about the practical application? Well, where AI companies train their models on unlicensed music and refuse to disclose what was in their datasets, record labels and music publishers want to identify which recordings and songs might have been used, so that they can enforce their rights by threatening to sue the AI company for copyright infringement.
But there’s a second, arguably more pressing challenge where this new technology can possibly help. As AI companies start to agree licensing deals with music companies, the labels, publishers, distributors and other aggregators who have done those deals need a system to allocate the money to individual recordings and songs, in order to divvy up the money with their artists and songwriters.
A rash of companies are developing technical solutions to try and help with that process, though there remains much debate as to exactly how data and royalties from the AI licensing deals will be handled.
Even where the monitoring technology has access to an AI company’s own data, there’s a certain amount of discussion about how different elements of the resulting AI-created work would be weighted and accounted for.
And where the tech platform doesn’t have access to that data, there’s considerable debate about how good any technology that claims to identify and disambiguate the elements that go into generative AI outputs can actually be - or whether it’s all just smoke and mirrors with a liberal garnishing of snake oil.
On that point, the Sony researchers’ paper is actually quite illuminating - if somewhat inadvertently. It compares its ‘white box’ method - where the tech has full access to the AI model’s internals - to ‘black box’ methods where this is not the case. In plain terms, the two different approaches produce very different answers about which tracks are influential for a particular output.
There’s also the question of computational overhead. In the researchers’ experiment, attributing a single track - working out which of the 115,000 training tracks contributed to it - took around five hours running on eight of NVIDIA’s top-tier H100 GPUs - which cost in the region of $25k a pop.
Even renting that hardware in the cloud rather than buying it, that’s about $250 to attribute a single track, based on an eight-GPU H100 cluster from somewhere like CoreWeave. Scale that up to a commercial AI music service generating thousands of tracks a day against a training set of millions of recordings and, once again, the maths becomes eye-watering.
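For anyone who wants to check the arithmetic, the sums look roughly like this - the per-GPU-hour rate is an assumption in the right ballpark for cloud H100 rental, and the daily output figure is hypothetical:

```python
# Back-of-the-envelope based on the figures quoted above; the cloud rate and the
# daily output volume are assumptions for illustration, not reported numbers.
gpus = 8
hours_per_track = 5
usd_per_gpu_hour = 6.25                    # assumed cloud H100 rate

cost_per_track = gpus * hours_per_track * usd_per_gpu_hour
print(cost_per_track)                      # 250.0 USD to attribute one track

tracks_per_day = 10_000                    # hypothetical commercial service output
print(cost_per_track * tracks_per_day)     # 2,500,000 USD per day - and that's still
                                           # against 115,000 tracks, not millions
```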
That aside, Sony’s solution has been developed by Sony AI, part of the Sony Group’s R&D division, which was re-launched in 2023 to “undertake unprecedented disruptive research” and to “support the transformation of Sony into an AI and data-driven company”. So there’s probably quite a lot riding on getting things right.
Sony Music will doubtless be eagerly monitoring developments and exploring how it might use the technology - though Sony AI will presumably seek wider adoption - and it’s never a given that the Sony record label and music publisher will always prioritise technology developed elsewhere within the Sony Group.
There is also a third debate around all this: as competing companies develop technical solutions to help identify and analyse AI music, and allocate and distribute AI music income, does one system come to dominate, or will we need another set of industry standards that everyone can work to?
The practical question for the music industry is not whether training data attribution is theoretically possible - this paper and various other research publications confirm that it is - but whether any of the competing approaches can be made to work reliably, affordably, and on real-world datasets of millions of tracks, rather than research datasets of a few tens of thousands.
Whether Sony’s research gets us any meaningful distance in that direction remains to be seen, and anyone describing the work of an R&D team operating at the bleeding edge as a working system for tracking and compensating music used in AI is getting rather ahead of themselves.
The distance between an exciting conference paper and the launch of a royalty allocation engine for AI generated music is about the same as the distance between humming a tune in the shower and writing a hit single.