IIT Guwahati develops speech technologies for North Eastern languages

Hummingbird News Desk

GUWAHATI, 29 MAR: North-East India hosts a myriad of languages that belong to three different language families and have salient phonological inventories. Officially, of the 99 non-scheduled languages, 60 are spoken in North-East India. Among these 60, 34 languages have more than 50,000 speakers. However, these languages do not have enough linguistic resources for language technology development. Many of them do not even have detailed linguistic descriptions that would help in understanding the challenges that lie ahead in building language technology tools for them.

Indian Institute of Technology (IIT) Guwahati is working on a project, ‘Speech Technologies for North Eastern Languages’, to develop speech technology tools for healthcare information dissemination. The tools will enable retrieval of healthcare-related information through spoken keyword spotting (KWS) in seven North East Indian languages.

As part of the project, a database of health-related information in seven languages spoken in North East India will also be created. The project is expected to make healthcare-related information accessible to people in the far-flung areas of North East India in their own native languages.

The Centre for Linguistic Science and Technology (CLST) at IIT Guwahati has received funding for this project from the Ministry of Electronics and Information Technology, Government of India, under its ‘National Language Translation Mission (NLTM): BHASHINI’ initiative.

Highlighting the unique aspects of this project, Prof. T.G. Sitharam, Director, IIT Guwahati, said, “This work embodies IIT Guwahati’s commitment to work for the local languages and ethnicities of North East India. The interdisciplinary nature of the project and the focus on local languages reflect the spirit envisaged in the National Education Policy, 2020.”

This project involves building speech technology tools for healthcare information dissemination in Hindi, English, Assamese, Bangla, Bodo, Manipuri, Khasi, Mizo, Nagamese, and Nepali.

Elaborating on this project, Prof. Rohit Sinha, Principal Investigator of the project, and Head, Department of Electronics and Electrical Engineering, IIT Guwahati, said, “The Institute is committed to developing tools that will facilitate last-mile connectivity and information dissemination to the various communities living in the NE area, in their own languages. This project will be a step towards achieving that aim.”

Prof. Sinha also mentioned that the Centre for Linguistic Science and Technology (CLST) is a unique and truly interdisciplinary centre devoted to the analysis of, and technology development for, the languages of North East India, through research projects and its PhD programme.

The spoken keyword spotting (KWS) systems developed in the project will be able to detect a list of predefined words in a given speech signal in any one of the project's target languages. The work will involve modelling speech with state-of-the-art techniques based on deep neural networks.

The interdisciplinary team comprises Prof. Rohit Sinha, Prof. Priyankoo Sarmah, Sanasam Ranbir Singh and Ashish Anand from CLST, IIT Guwahati. This project is part of a larger consortium project titled Speech Technologies for Indian Languages, led by IIT Madras as the consortium leader.

For the North East specific project, the IIT Guwahati team will work together with research teams from CDAC-Kolkata, IIIT Sri City, and NIT Manipur.

Tags: #IITGuwahati #SpeechTechnologies #NorthEasternLanguages #CLST #KWS #CDACKolkata
