
    Using AI to Protect India’s Rare Languages: Microsoft’s Project ELLORA

    Ajinkya Nair
    Ajinkya is a writer by trade, tech geek by nature. He's got a thing for sleek gadgets, loud engines, and the quiet tick of mechanical watches. When not crafting words, he's either laying down beats in his home studio or conquering gaming worlds. Travel is his reset button - nothing beats discovering hole-in-the-wall eateries or stumbling upon breathtaking views. He collects experiences like some folks collect stamps, turning each adventure into a story worth telling. Whether it's dissecting the latest tech trends or debating the merits of manual transmissions, he's always up for a good chat.

    The Microsoft Research (MSR) lab in India is working on creating digital ecosystems for Indian languages that have a limited online presence. These efforts are part of Project ELLORA, launched in 2015, which aims to bring rare Indian languages to the digital world and preserve them for future generations.

    The team is building language datasets by mapping out existing resources, including printed literature, that can be used to train AI models. It is also collaborating with the language communities to ensure the datasets are culturally relevant and accurate. Microsoft is currently working with the Mundari community, whose members worry about the survival of the Mundari language because it is rarely taught in schools.

    English has been the dominant language of the internet since its inception, and only 8 of the world's nearly 6,000 languages are preferred online. Some 88% of the world’s languages don’t have enough of an online presence, leaving 1.2 billion people unable to use their own language to navigate the digital world.

    Microsoft’s research team is working on a Hindi-to-Mundari text translation model and a speech recognition model to give the community access to more content in its own language. The team has also developed a technology called Interactive Neural Machine Translation (INMT) to speed up the translation process. Apart from Mundari, Microsoft is also working with the Gondi and Idu Mishmi communities.

    Meta, the parent company of Facebook, is working on a similar project. It has developed an AI translation tool that can convert primarily spoken languages without a standard written form, such as Hokkien, to spoken English. Like Project ELLORA, this effort aims to bring underrepresented languages into the digital world.
