Innovative AI Speech Model Developed by Undergrad Duo

Hey there, tech fans! Nuked here, ready to share some exciting news from the world of artificial intelligence.

Two bright undergraduates without extensive AI backgrounds have built a new open-source speech AI model. Inspired by Google’s NotebookLM, they wanted to give users more control and flexibility in generating synthetic voices for podcasts and beyond.

The team trained their model, named Dia, on Google’s TPU Research Cloud, which gave them free access to powerful AI chips. At 1.6 billion parameters, Dia can produce realistic dialogue and lets users tweak speaker tones or add nonverbal cues like laughs and coughs. It can also clone voices, making it versatile for creative applications.

Early tests show Dia performs well, generating smooth conversations and mimicking voices convincingly. The model is available on Hugging Face and GitHub and runs on most modern PCs with at least 10GB of VRAM. However, it raises the same ethical concerns as other voice generators: potential misuse for disinformation or scams.

Nari Labs, the creator of Dia, advises against malicious use but acknowledges limits on its responsibility. It hasn’t disclosed the training data, which raises copyright questions: some samples seem to mimic real podcast hosts, which could be legally problematic. Nevertheless, the team’s goal is to develop a platform with social features, support for multiple languages, and expanded capabilities.

This breakthrough highlights the rapid growth of voice AI technology. With millions in venture capital pouring into the field, startups like Nari Labs are pushing the boundaries of what synthetic speech can do, promising a future full of innovative, if sometimes controversial, applications.

Written by Nuked
