Meta’s open source speech AI recognizes more than 4,000 spoken languages
Meta has developed an AI language model that is not a ChatGPT clone. The company’s Massively Multilingual Speech (MMS) project can recognize more than 4,000 spoken languages and produce speech in more than 1,100. As with most of its publicly announced AI projects, Meta is open-sourcing MMS to help preserve language diversity and to encourage researchers to build on its foundation. “Today we’re publicly sharing our code and models so that other researchers can build on our work,” the company said. “Through this work, our hope is to contribute a little to the preservation of the amazing language diversity around the globe.”
Models for speech recognition and text-to-speech typically require thousands of hours of audio recordings with transcription labels. Labels are essential to machine learning: they allow the algorithms to correctly categorize and “understand” the data. But for languages not widely spoken in industrialized countries, many of which may disappear in the next few decades, Meta says “this data does not exist.”
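To make the idea of transcription labels concrete, here is a minimal sketch of how a labeled speech corpus is typically structured: each audio recording is paired with its transcript (the label) and a language tag. The filenames, transcripts, and language codes below are hypothetical illustrations, not Meta’s actual data format.

```python
from collections import defaultdict

# Each training example pairs an audio file with its transcription label.
# A model learns the audio-to-text mapping from many such pairs.
# All values here are made-up examples for illustration.
labeled_data = [
    {"audio": "clip_001.wav", "transcript": "hello world", "language": "eng"},
    {"audio": "clip_002.wav", "transcript": "bonjour le monde", "language": "fra"},
    {"audio": "clip_003.wav", "transcript": "good morning", "language": "eng"},
]

# A multilingual corpus is often organized per language, which is where
# low-resource languages fall short: many have few or no labeled pairs.
by_language = defaultdict(list)
for example in labeled_data:
    by_language[example["language"]].append(example)

for lang in sorted(by_language):
    print(lang, len(by_language[lang]))  # eng 2 / fra 1
```

The point of the sketch is the pairing itself: without the `transcript` field, the audio alone cannot teach a supervised model what was said, which is why unlabeled recordings in rare languages are not enough on their own.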
Meta took a novel approach to collecting audio data: it tapped into audio recordings of religious texts that had been translated. The company explained that it used texts, such as the Bible, that have been translated into many languages and whose translations are well studied for text-based translation research. Crucially, these translations also have audio recordings available of people reading the texts in different languages.
Source:
https://www.engadget.com/metas-open-source-speech-ai-recognizes-over-4000-spoken-languages-161508200.html