Spotify Patents ‘Spoken Words Analyzer’ Technology to Analyze Song Lyrics With AI

Spotify has officially patented a “spoken words analyzer,” which “generates tags and explicitness indicators for a set of tracks” – or, more concisely, an AI-powered function designed to analyze and classify songs based upon their lyrics and technical characteristics.

Spotify filed an application for the patent on August 26th of this year, according to the U.S. Patent and Trademark Office database. The application was approved today, December 17th. This “spoken words analyzer” patent arrives just a few months after the Stockholm-based streaming service caught up to Apple Music by rolling out a searchable-lyrics feature.

Spotify’s newest lyrics-centered patent is classified on its USPTO profile as “related” to the “lyrics analyzer” patent (10770044) associated with the search option. The “spoken words analyzer” patent description, which runs to more than 20,000 words, revolves around a digital function that would evaluate a “plurality” of tracks and then pinpoint “topics summarizing the lyrics” of the works (that is, tags) based upon a collection of predetermined keywords and phrases.
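To make the keyword-to-tag idea concrete, here is a minimal Python sketch. It is not taken from the patent: the tag vocabulary, the `tag_lyrics` function, and the `min_hits` threshold are all hypothetical, chosen only to show how predetermined keywords could be mapped to topic tags.

```python
import re
from collections import Counter

# Hypothetical tag vocabulary (an assumption, not from the patent):
# each tag maps to keywords that suggest it.
TAG_KEYWORDS = {
    "love": {"love", "heart", "kiss"},
    "party": {"dance", "club", "party"},
    "travel": {"road", "highway", "train"},
}


def tag_lyrics(lyrics, min_hits=2):
    """Return the tags whose keywords occur at least `min_hits` times."""
    # Count every lowercase word in the lyrics.
    words = Counter(re.findall(r"[a-z']+", lyrics.lower()))
    tags = []
    for tag, keywords in TAG_KEYWORDS.items():
        hits = sum(words[k] for k in keywords)  # missing words count as 0
        if hits >= min_hits:
            tags.append(tag)
    return sorted(tags)
```

A threshold like `min_hits` is one simple way to keep a single stray keyword from producing a tag; a production system would presumably weigh keywords and phrases far more carefully.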

On this front, the patent description later relays: “Tags, however, are generally added manually and are frequently inconsistent between songs, as tags may rely on the music producers to supply tags. Each producer may have inconsistent views on what constitutes a given tag or even over-tag songs in an effort to encourage more play.”

While “advancements have been made” to automatic tagging in recent years, the filing states, “technical challenges still remain.” The description specifically highlights that these challenges are especially significant for Spotify’s playlists, which the patent’s tags would also affect. For context on playlists and on the general difficulty of manually reviewing tags, approximately 300,000 new tracks make their way onto Spotify each week.

Beyond its potentially far-reaching implications for helping users find songs that suit their mood and preferences, the patent emphasizes multiple filters designed to identify explicit lyrics. The process would begin with “training a classifier for determining whether a track is explicit,” per the document, before utilizing a series of “explicitness indicators” to identify just how explicit a song is.

“Currently, music providers rely on the determination of the music producers to label certain tracks as explicit,” continues the nearly 50-page-long breakdown. “There exists a need for a flexible, automatic method for training a system to classify music as explicit or not, based on a sample set.”
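The passages quoted here do not say which kind of classifier the patent has in mind. As a rough illustration of training “based on a sample set,” the Python sketch below swaps in a small naive Bayes model; the function names, the placeholder word “badword,” and the use of the resulting probability as a stand-in for an explicitness indicator are all assumptions.

```python
import math
from collections import Counter


def train(samples):
    """Fit word counts from (lyrics, is_explicit) pairs.

    Assumes the sample set contains at least one explicit and one
    non-explicit track.
    """
    word_counts = {True: Counter(), False: Counter()}
    label_counts = Counter()
    for lyrics, label in samples:
        label_counts[label] += 1
        word_counts[label].update(lyrics.lower().split())
    vocab = set(word_counts[True]) | set(word_counts[False])
    return {"words": word_counts, "labels": label_counts, "vocab": vocab}


def explicit_score(model, lyrics):
    """Return the model's probability (0.0 to 1.0) that the lyrics are explicit."""
    total = sum(model["labels"].values())
    vocab_size = len(model["vocab"])
    log_p = {}
    for label in (True, False):
        n_words = sum(model["words"][label].values())
        lp = math.log(model["labels"][label] / total)  # class prior
        for word in lyrics.lower().split():
            if word not in model["vocab"]:
                continue  # ignore words never seen in training
            # Laplace smoothing keeps unseen word/label pairs from zeroing out.
            lp += math.log((model["words"][label][word] + 1) / (n_words + vocab_size))
        log_p[label] = lp
    # Convert the two log scores into a normalized probability.
    top = max(log_p.values())
    e_true = math.exp(log_p[True] - top)
    e_false = math.exp(log_p[False] - top)
    return e_true / (e_true + e_false)
```

Trained on a handful of labeled tracks, `explicit_score` returns a graded value rather than a yes/no answer, which loosely mirrors the article's point about gauging “just how explicit a song is.”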

Plus, to bolster the efficiency of the classification system, the technology outlined in the patent could preprocess lyrics by removing punctuation and repeated words, changing uppercase text to lowercase, and more.
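Those cleanup steps are straightforward to sketch in Python. The version below is only an illustration: the `normalize_lyrics` name is hypothetical, and it interprets “removing repeated words” as keeping the first occurrence of each word, which is an assumption about what the patent intends.

```python
import string


def normalize_lyrics(text):
    """Lowercase the text, strip ASCII punctuation, and drop repeated words."""
    lowered = text.lower()
    # Delete every ASCII punctuation character.
    no_punct = lowered.translate(str.maketrans("", "", string.punctuation))
    seen = set()
    unique_words = []
    for word in no_punct.split():
        if word not in seen:  # keep only the first occurrence, in order
            seen.add(word)
            unique_words.append(word)
    return unique_words
```

Shrinking lyrics to a list of distinct lowercase words means a downstream classifier has far fewer tokens to look at, which is the efficiency gain the article alludes to.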

Finally, the patent description details an “acoustic vector database” that would create classifications “based on the non-lyrics audio features within the tracks.” These vectors – which “may be generated in a variety of ways now known or future developed and the details are not provided herein” – encompass descriptors such as “danceability,” a scale (from zero to 1.0) for how well-suited a song is for dancing.

“Speechiness” (the presence of speech, spoken or sung), energy (“a perceptual measure of intensity and powerful activity”), “instrumentalness,” valence (“the musical positiveness conveyed by a track”), “acousticness,” and liveness (“the presence of an audience in the recording”) are also specified.
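One way to picture an entry in such a database is as a fixed-length record of these descriptors. The Python sketch below is hypothetical: the `AcousticVector` class and `distance` helper are invented for illustration, and it assumes every descriptor, not just danceability, is scored from 0.0 to 1.0.

```python
import math
from dataclasses import astuple, dataclass, fields


@dataclass(frozen=True)
class AcousticVector:
    """One track's non-lyrics audio descriptors (assumed range: 0.0 to 1.0)."""
    danceability: float
    speechiness: float
    energy: float
    instrumentalness: float
    valence: float
    acousticness: float
    liveness: float

    def __post_init__(self):
        # Reject values outside the assumed 0.0 to 1.0 scale.
        for field in fields(self):
            value = getattr(self, field.name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{field.name} must be in [0.0, 1.0], got {value}")


def distance(a, b):
    """Euclidean distance between two tracks' acoustic vectors."""
    return math.dist(astuple(a), astuple(b))
```

With tracks stored this way, a small `distance` between two vectors would suggest that the songs sound alike on these measures, independent of their lyrics.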
