Nvidia unveils AI model for audio modification and generation

The Fugatto AI model can generate music and modify audio, targeting media producers.

November 26, 2024

Nvidia stated that it does not have immediate plans to publicly release Fugatto. Credit: Below the Sky / Shutterstock.

Nvidia has unveiled Fugatto, an AI model designed to modify voices and generate new sounds, aimed at music, film, and video game producers, Reuters reported.

The model, whose name stands for Foundational Generative Audio Transformer Opus 1, can create sound effects and music from text descriptions.

Based in California, US, Nvidia has stated that it does not have immediate plans to publicly release Fugatto.

The technology joins similar advancements from startups such as Runway and larger companies such as Meta Platforms, which generate audio or video from text prompts.

The ability to modify existing audio is said to set Fugatto apart. It can transform a piano line into a human voice or change the accent and mood of spoken words, the news publication stated.

This capability is said to distinguish the new model from other AI technologies currently available.

Nvidia’s model is said to be trained on open-source data, and the company is still considering how to release it publicly.

Nvidia applied deep learning research vice-president Bryan Catanzaro said: “If we think about synthetic audio over the past 50 years, music sounds different now because of computers, because of synthesisers.

“I think that generative AI is going to bring new capabilities to music, to video games and to ordinary folks that want to create things.”

Developers of generative AI face challenges in preventing misuse, such as the generation of misinformation or copyright infringement.

OpenAI and Meta have also not announced public release dates for their audio or video-generating models.

Catanzaro added: “Any generative technology always carries some risks because people might use that to generate things that we would prefer they don’t. We need to be careful about that, which is why we don’t have immediate plans to release this.”

Last week, Nvidia collaborated with protein sequencing technology provider Quantum-Si to develop its proteomics platform, Proteus, with AI and accelerated computing.
