Remember that time an AI deepfaked Drake and The Weeknd, releasing a song that became a viral hit before being erased from streaming services? Or that other time Spotify pulled over 1,000 AI-generated songs from its library, only to reinstate them a week later? Like it or not, AI is here to stay, and for players across the music industry that means adapt or die. Surveys have revealed that a significant percentage of producers fear AI music generators could replace them, while a growing number of musicians are embracing AI to create their music. Pop star Grimes has been one of AI’s most vocal supporters, releasing a programme that allows users to generate her voice over anything, and even offering to split royalty fees with producers who make tracks using her likeness.
Amidst this flurry of AI advancements, Meta has unveiled its latest creation, MusicGen, an AI generator that turns text into melodies. The tool has a user-friendly interface: you type in a prompt describing the style or type of music you want, and it generates a clip to match. MusicGen can also modify existing tracks, transforming familiar melodies and compositions into new creations. We gave it a try. Here are five of our takeaways:
The generated samples are super short
Right now, MusicGen can only spit out samples between 10 and 15 seconds long. Perhaps just enough for a snappy reel or TikTok. But given the significant GPU power the interface uses to generate them, it’s a lot of work for very little payoff.
Quality is not really a thing here
The samples MusicGen produces stick to fairly simple harmonies and scales. There isn’t a lot of complexity going on, even if your text prompts are very detailed or specific. At best, the music sounds like the sort of royalty-free stock music you might find online, or hear soundtracking a reality show on Netflix.
The interface is very user friendly
The interface is fairly simple to navigate. MusicGen lets you input description prompts as text, and can also take a melody as a condition, either via an audio file upload or recorded through a microphone. The music itself comes from a deep learning language model that Meta has released as open source and trained on a vast library spanning over 20,000 hours of music, comprising 10,000 licensed tracks and an impressive 390,000 instrument-only tracks.
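Because the model is open source, you don’t have to go through the web demo at all. The sketch below shows roughly how a text prompt becomes an audio clip using Meta’s audiocraft Python package; the checkpoint name, clip length and output filename are our own illustrative choices, not settings taken from the hosted demo.

```python
# Rough sketch: generating a short clip from a text prompt with Meta's
# open-source audiocraft package. Checkpoint name, duration and output
# path are illustrative assumptions, not settings from the hosted demo.
from audiocraft.models import MusicGen
from audiocraft.data.audio import audio_write

# Load a pretrained checkpoint (smaller checkpoints trade quality for speed).
model = MusicGen.get_pretrained('facebook/musicgen-small')

# The demo's clips top out around 10-15 seconds, so ask for something similar.
model.set_generation_params(duration=15)

descriptions = [
    "90's drum and bass beat with acid house melody and Burial style ambience at 130bpm",
]

# One waveform is generated per text description.
wavs = model.generate(descriptions)

# Write each result to disk as a loudness-normalised audio file.
for idx, wav in enumerate(wavs):
    audio_write(f"musicgen_sample_{idx}", wav.cpu(), model.sample_rate, strategy="loudness")
```

Melody conditioning works along the same lines: the melody-capable checkpoint accepts a reference waveform alongside the text descriptions, much like the demo’s audio upload option.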
Expect long waiting times and less than ideal accuracy
We waited about twenty minutes for MusicGen to generate a 15-second clip from our prompts. The wait can vary depending on how many other users are in the queue, but it’s still a long time. As for the accuracy of the generated sound clip, see MusicGen’s response to our prompt “90’s drum and bass beat with acid house melody and Burial style ambience at 130bpm” below.
Do we even need a tool like this?
Honestly? No. It’s difficult to think of any purpose MusicGen might fulfil that doesn’t take away from actual musicians and producers in a market that’s already spread thin. It might prove useful as a creative tool for producers looking to brainstorm or find their way out of a rut, and it shows some promise as a way for content creators to generate their own royalty-free music for reels and TikTok content. Beyond that, it’s hard to imagine how a tool that lacks human intuition and creativity could generate music you’d actually want to listen to, or buy.