N A B I L . O R G
AI - August 15, 2025

NVIDIA Unveils Open-Source Tools for Building Multilingual AI in 25 European Languages, Boosting Digital Inclusion

In a bid to bridge the language gap, tech giant NVIDIA has unveiled a set of open-source tools that let developers build high-quality speech AI for 25 European languages. The aim is to extend AI technology to the many people it has left out by focusing on a small fraction of the world’s roughly 7,000 languages.

The new tools cover major languages such as French, German, and Italian, as well as less widely spoken ones like Croatian, Estonian, and Maltese, which big tech often overlooks. The objective is to let developers build the kind of voice-powered tools many of us use routinely, from multilingual chatbots and fast customer service bots to instant translation services.

At the heart of this initiative is Granary, a vast library containing approximately one million hours of human speech recordings. This resource serves as a teaching aid for AI, helping it grasp the intricacies of speech recognition and translation.

To make the most of this speech data, NVIDIA has also introduced two new AI models tailored for language tasks. These models are now available on Hugging Face for developers eager to dive into their projects.

Creating this dataset was no small feat. Training AI typically requires massive amounts of labeled data, which is often laboriously produced through human annotation. To get around this hurdle, NVIDIA’s speech AI team collaborated with researchers from Carnegie Mellon University and Fondazione Bruno Kessler to develop an automated pipeline. Using NVIDIA’s NeMo toolkit, they converted raw, unlabeled audio into high-quality, structured data that can be used to train AI.

This release is not only a technical achievement but also a significant step toward digital inclusivity. Developers in cities like Riga and Zagreb can now build voice-powered AI tools that accurately understand their local languages. The data is also efficient to train on: preliminary research suggests that a given target accuracy can be reached with about half as much Granary data as with other popular datasets.

The two new models showcase this potential. Canary offers translation and transcription quality comparable to models three times its size while running up to ten times faster. Parakeet, on the other hand, can process a 24-minute meeting recording in one go, automatically identifying the spoken language. Both models handle punctuation and capitalization and provide word-level timestamps, all of which are essential for building professional-grade applications.
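To see why word-level timestamps matter for applications such as captioning, here is a minimal Python sketch that groups timestamped words into SRT subtitle cues. It does not call either model. The `Word` record, the `words_to_srt` helper, the grouping thresholds, and the sample Croatian words with their timings are all hypothetical, chosen only for illustration; the real models' output format may differ.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Word:
    """One recognized word with its timing (hypothetical record shape)."""
    text: str
    start: float  # seconds from the start of the audio
    end: float    # seconds from the start of the audio


def srt_time(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words: list[Word], max_words: int = 7, max_gap: float = 0.6) -> str:
    """Group words into subtitle cues.

    A new cue starts when the current one already holds max_words words
    or when the pause before the next word is longer than max_gap seconds.
    """
    cues: list[list[Word]] = []
    for word in words:
        current = cues[-1] if cues else None
        if (current is None
                or len(current) >= max_words
                or word.start - current[-1].end > max_gap):
            cues.append([word])
        else:
            current.append(word)

    blocks = []
    for index, cue in enumerate(cues, start=1):
        text = " ".join(w.text for w in cue)
        blocks.append(
            f"{index}\n{srt_time(cue[0].start)} --> {srt_time(cue[-1].end)}\n{text}"
        )
    return "\n\n".join(blocks) + "\n" if blocks else ""


# Made-up sample input: two short Croatian phrases separated by a pause.
sample = [
    Word("Dobar", 0.00, 0.32),
    Word("dan,", 0.36, 0.60),
    Word("Zagreb.", 0.68, 1.10),
    Word("Kako", 2.00, 2.25),
    Word("ste?", 2.30, 2.62),
]

print(words_to_srt(sample))
```

Because each word carries its own start and end time, the caption boundaries can follow natural pauses rather than fixed time windows, which is harder to do with sentence-level output alone.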

By making these powerful tools accessible to the global developer community, NVIDIA is not just launching a product; it’s sparking a new wave of innovation. The ultimate goal is to create a world where AI can communicate effectively, regardless of one’s geographical location.