Deezer builds robot that eats swear words (or at least identifies them some of the time)

Published on Friday 1 May 2020


Deezer has revealed the results of research it undertook looking into building an automated system to detect explicit language in songs, using artificial intelligence.

Ahead of the publication of a full academic paper as part of the International Conference On Acoustics, Speech, And Signal Processing next month, Deezer's Manuel Moussallam gave a summary of the project in a blog post. First question though: What’s wrong with the current system for dealing with sweary lyrics in top pop songs?

Marking music as ‘explicit’ has been going on since the 1980s, of course. Starting with the slapping of stickers on CD cases, in the streaming age it has evolved into the tagging of individual tracks. With varying degrees of control across differing services, tracks with rude words can then be filtered out by anyone who doesn’t want filth being piped into their homes.

Currently, the task of tagging each track is carried out by humans. Somewhere in the release process at a record label, it will be someone’s job to decide whether or not to apply the explicit tag to a track. At larger companies, there will likely be a set of internal guidelines on what does or does not count as explicit. Elsewhere, it may be done on gut feeling.

That all sounds fine then. Why get robots involved? Well, says Moussallam, that system is fine, except when it isn’t. “When no tag is provided, it can mean that the song is suitable for all audiences”, he says. But it might also mean that the label releasing the record just didn’t consider the track’s explicitness. “There is a substantially large part of our catalogue that falls under this category”, he adds.

So, a lack of an explicit tag doesn’t necessarily mean that a track is not explicit. Also, deciding whether or not a track needs tagging is not quite as simple as just listening out for whether or not someone says ‘fuck’. “Having to decide which track should be tagged as explicit and which shouldn’t is a complex task”, he goes on. “It requires a high-level understanding of cultural expectations and involves a lot of subjectivity”.

If humans are often failing in this, could AI do any better? That’s what Deezer wanted to find out. The answer: No, not really. But the process of reaching that conclusion is interesting nonetheless.

Deezer considered two different types of AI. The first was simply fed a list of explicit words and told to look for them. The second was trained to look for explicit words, in the hope that it would learn for itself how to identify explicitness with more accuracy over time. Once it had been taught not to just tag all hip hop tracks as explicit, the latter proved the more effective of the two. But it still did not reach the accuracy of a human moderator.
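For the curious, the first of those approaches, the word list, can be sketched in a few lines of Python. This is only an illustration and not Deezer's system: it assumes the lyrics are already available as text, and the word list and the function name `is_explicit` are made up for the example.

```python
import re

# Hypothetical word list; a real one would be much longer and curated by humans.
EXPLICIT_WORDS = frozenset({"fuck", "shit"})


def is_explicit(lyrics, lexicon=EXPLICIT_WORDS):
    """Return True if any whole word in the lyrics appears in the lexicon."""
    # Lower-case the text and split it into runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", lyrics.lower())
    return any(word in lexicon for word in words)


print(is_explicit("What the fuck is going on"))
print(is_explicit("La la la, shiitake mushrooms"))
```

A lookup like this only matches whole words on its list, so it knows nothing about context, which is exactly the limitation the research ran into.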

While computers are alright at identifying rude words, the issues of cultural expectations and subjectivity are difficult ones to programme. After all, a song can be offensive without using any words that might feature on a swear list.

So, no super-tastic robot-led decency protector has come out of this project. What a waste of time! Or not. While Deezer’s new AI is not going to take over from human filth hawks any time soon, it could assist them in their job, and maybe reduce the frequency with which the ‘explicit’ box is not ticked when it should have been.

“With our approach, we can not only detect the presence of explicit keywords but also know where they occur in the song”, concludes Moussallam. “We could, therefore, highlight some parts of the audio to an annotator to facilitate his task”.
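As a rough illustration of what pointing an annotator at the right part of a track could look like, here is a minimal Python sketch. It is not Deezer's method: it assumes some upstream system has already produced a list of words with start and end times in seconds, sorted by start time, and the word list, the `padding` parameter and the function name `explicit_spans` are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical word list, for illustration only.
EXPLICIT_WORDS = frozenset({"fuck", "shit"})


def explicit_spans(
    timed_words: List[Tuple[str, float, float]],
    padding: float = 1.0,
) -> List[Tuple[float, float]]:
    """Return merged (start, end) windows, in seconds, around explicit words.

    Assumes timed_words is sorted by start time.
    """
    spans: List[Tuple[float, float]] = []
    for word, start, end in timed_words:
        if word.lower().strip(".,!?\"'") not in EXPLICIT_WORDS:
            continue
        # Pad each hit so the annotator hears a little context either side.
        window = (max(0.0, start - padding), end + padding)
        if spans and window[0] <= spans[-1][1]:
            # Overlaps the previous window, so extend that one instead.
            spans[-1] = (spans[-1][0], max(spans[-1][1], window[1]))
        else:
            spans.append(window)
    return spans


timed = [
    ("oh", 0.0, 0.5),
    ("fuck", 10.0, 10.5),
    ("shit", 11.0, 11.5),
    ("fuck", 60.0, 60.5),
]
print(explicit_spans(timed))
```

With the sample data, the two hits around the ten second mark collapse into one window, so the annotator would have two short passages to check rather than the whole song.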

Mmm, delicious facilitation. Read the full blog post here.