I’ve been using this free AI song maker to create tracks — and the quality is surprisingly good

The AI music generation arena has been one of the few stable parts of the AI revolution over the past two years. The two dominant companies, Suno and Udio, have both established a well-deserved reputation and fanbase in the niche.

However, this cosy status quo may be about to change dramatically. A new music generation platform called YuE has just dropped, and it’s free, open source and produces surprisingly good music tracks.

YuE, which means ‘music’ and ‘happiness’ in Chinese, is actually a group of models which work together to deliver full songs.

The models cover lyrical production, instruments and genre. As with many of these new Chinese AI models, the open nature of YuE has encouraged a lot of homebrew development, mostly to reduce the computing requirement so more people can take advantage of the tool.

The original project required a minimum of 24GB of video RAM, and the official recommendation for creating full songs remains 80GB. This is clearly way out of reach for normal home users, and is aimed at professionals, businesses and academia.

The good news is that a lot of work has gone into creating smaller packages for the masses, including work done by the popular Pinokio platform, which lets anyone quickly and easily run open-source AI projects on Windows.

The trade-off

(Image credit: NPowell/YuE)

The trade-off with these small VRAM versions is that the audio quality is definitely degraded, and generation times can be glacially slow.

Even using Pinokio, the baseline VRAM requirement of 12GB was out of reach of all but the most powerful computers. However, an enterprising user recently introduced a new super-low-memory version, which opened the door for me to jump in and have a play around using my paltry 8GB RTX GPU system.

Here’s what I made:

The first impression is of a very competent Gradio user interface. On the left side of the screen is the prompt box; below it is a lyrics box for you to enter your words, followed by a setting for the number of tracks you want to generate. It’s also possible to set the amount of RAM you want to use, which is tied to the length of the song and the number of verses.

Press the generate button and sit back and wait while the platform generates the track.

The developers claim that with a 16GB VRAM GPU, a one-minute track will only take four minutes to create. Unfortunately that doesn’t seem to scale downwards, because with my 8GB card it took 2 to 2.5 hours each to produce two tracks of 40 and 50 seconds.
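To put those numbers side by side, here is a rough back-of-the-envelope sketch in Python. It uses only the figures quoted above; the function name `realtime_factor` is purely illustrative, and since it isn't stated which track took how long, the 8GB result is bracketed as a best-to-worst range rather than treated as a precise measurement.

```python
def realtime_factor(generation_seconds: float, audio_seconds: float) -> float:
    """Seconds of compute needed per second of finished audio."""
    return generation_seconds / audio_seconds


# Developers' claim for a 16GB GPU: a 60-second track in four minutes.
claimed_16gb = realtime_factor(4 * 60, 60)

# 8GB results: 2 to 2.5 hours per track, for tracks of 40 and 50 seconds.
# Which time belongs to which track is an assumption, so bracket both extremes.
best_8gb = realtime_factor(2 * 3600, 50)     # shortest wait, longest track
worst_8gb = realtime_factor(2.5 * 3600, 40)  # longest wait, shortest track

print(f"16GB (claimed): {claimed_16gb:.0f}x real time")
print(f"8GB (observed): {best_8gb:.0f}x to {worst_8gb:.0f}x real time")
print(
    "Slowdown vs. claim: "
    f"{best_8gb / claimed_16gb:.0f}x to {worst_8gb / claimed_16gb:.0f}x"
)
```

On those assumptions, the 8GB run works out to somewhere between 144 and 225 seconds of waiting per second of music, against a claimed 4 seconds on the 16GB card.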

But what amazing tracks they are. They may be short, and the audio quality may not be premium level, but the musicality is incredible.

The last time I tested AI music generation on my computer it sounded like a dirty arcade console from the ’90s. This is real music, with accurate prompt adherence, great vocals and the kind of instrumentation you’d expect from a commercial AI service.

You can hear more results embedded here on SoundCloud:

Final Thoughts

So this project is still extremely rough and ready, and the computing resources you need are ridiculous.

Even if you have a decent computer, you’ll spend a lot of time waiting for tracks to appear. But — and it is a big “but” — despite all of these drawbacks, this is an incredible first attempt to produce an open product in this sector.

If this is the type of quality that open source AI music generation can produce now, it’s not going to be long before the commercial services like Udio and Suno start to feel real heat from the DIY community.
