Meta researchers got around this problem by retraining an existing AI model developed by the company in 2020 that is able to learn speech patterns from audio without requiring large amounts of labeled data, such as transcripts.
They trained it on two new data sets: one that contains audio recordings of the New Testament Bible and its corresponding text taken from the internet in 1,107 languages, and another containing unlabeled New Testament audio recordings in 3,809 languages. The team processed the speech audio and the text data to improve its quality before running an algorithm designed to align audio recordings with accompanying text. They then repeated this process with a second algorithm trained on the newly aligned data. With this method, the researchers were able to teach the algorithm to learn a new language more easily, even without the accompanying text.
“We can use what that model learned to then quickly build speech systems with very, very little data,” says Michael Auli, a research scientist at Meta who worked on the project.
“For English, we have lots and lots of good data sets, and we have that for a few more languages, but we just don’t have that for languages that are spoken by, say, 1,000 people.”
The researchers say their models can converse in over 1,000 languages but recognize more than 4,000.
They compared the models with those from rival companies, including OpenAI Whisper, and claim theirs had half the error rate, despite covering 11 times more languages.
However, the team warns the model is still at risk of mistranscribing certain words or phrases, which could result in inaccurate or potentially offensive labels. They also acknowledge that their speech recognition models yielded more biased words than other models, albeit only 0.7% more.
While the scope of the research is impressive, the use of religious texts to train AI models can be controversial, says Chris Emezue, a researcher at Masakhane, an organization working on natural-language processing for African languages, who was not involved in the project.
“The Bible has a lot of bias and misrepresentations,” he says.