How Meta's new AI tool creates music and audio from text prompts
3 Aug, 2023 / 12:54 pm / Meta

Source: http://www.masahble.com

Meta has launched AudioCraft, an open-source AI tool that lets musicians and everyday users create audio and music from simple text prompts, Reuters reports. The tool consists of three models: MusicGen, AudioGen, and EnCodec.

According to Business Today, MusicGen was trained on Meta-owned and specifically licensed music and generates music from text prompts, while AudioGen was trained on public sound effects and generates audio from text prompts. Notably, the EnCodec decoder has been updated to deliver higher-quality music generation with fewer unwanted artefacts.

Meta is providing pre-trained AudioGen models, too, enabling users to create environmental sounds and sound effects, such as dogs barking, cars honking, or footsteps on different surfaces.

Today we’re sharing details about AudioCraft, a family of generative AI models that lets you easily generate high-quality audio and music from text. https://t.co/04XAq4rlap pic.twitter.com/JreMIBGbTF

— Meta Newsroom (@MetaNewsroom) August 2, 2023

Moreover, Meta is publicly sharing the model weights and code for AudioCraft, supporting applications that include music composition, sound-effect generation, audio compression, and audio generation.

so i'm wondering if this is a good setup: a tracker programming language like the Music Macro Language generates chiptune tracks, and this combined with a text prompt is used to condition MusicGen to produce something with instruments https://t.co/SG8j0vOo4k

— mxddxx  (@0xmaddie_) August 2, 2023

By open-sourcing these models, Meta aims to encourage researchers and practitioners to train their own models on their own datasets. While generative AI has revolutionised images, video, and text, progress in audio has lagged considerably.

Today we're sharing details on AudioCraft, a new family of generative AI models built for generating high-quality, realistic audio & music from text. AudioCraft is a single code base that works for music, sound, compression & generation — all in the same place.

— Meta AI (@MetaAI) August 2, 2023

the “llama moment” has come to audio research today! i can’t even imagine what we’ll see out of AudioCraft.

whatever you work on in music/audio, do consider using it, as much as you can. if you don’t know what to do, think what you can do with it and get a head start. https://t.co/xMBeKJkx9Q

— Keunwoo Choi (@keunwoochoi) August 3, 2023

AudioCraft addresses this gap in audio generation, offering a more accessible and user-friendly platform for producing high-quality audio.

Meta highlights how hard it is to generate realistic, high-fidelity audio, which requires modelling complex signals and patterns at multiple scales. Music is particularly challenging because it combines local and long-range patterns.

Another notable feature of the technology is its ability to produce high-quality audio over extended durations. By simplifying the design of generative models for audio, Meta makes it easier for users to experiment with the existing models.