25/11/2024
**arXiv paper alert**
We are excited to release our survey paper, "Direct Speech-to-Speech Neural Machine Translation: A Survey." The paper examines the challenges of speech-to-speech translation (also known as audio dubbing). The day is not far off when AI will dub videos on the fly and make content available in any language, transforming education, entertainment, and more.
Abstract: Speech-to-Speech Translation (S2ST) models transform speech from one language to another target language with the same linguistic information. S2ST is important for bridging the communication gap among communities and has diverse applications. In recent years, researchers have introduced direct S2ST models, which have the potential to translate speech without relying on intermediate text generation, have better decoding latency, and the ability to preserve paralinguistic and non-linguistic features. However, direct S2ST has yet to achieve quality performance for seamless communication and still lags behind the cascade models in terms of performance, especially in real-world translation. To the best of our knowledge, no comprehensive survey is available on the direct S2ST system, which beginners and advanced researchers can look upon for a quick survey. The present work provides a comprehensive review of direct S2ST models, data and application issues, and performance metrics. We critically analyze the models' performance over the benchmark datasets and provide research challenges and future directions.
Link: http://arxiv.org/abs/2411.14453
15/11/2024
📣 **Dataset Release Alert**
Following our LREC-COLING work in 2023, we are excited to announce the release of a larger Indic-S2T dataset that can be used to build English-to-Indic translation models.
**Highlights:**
1. Covers 15 Indic languages
2. 6800 hours of speech data
3. 900+ GB
**Useful for:**
1. Subtitling
2. Dubbing
3. ASR
4. MT
5. S2T
Stay tuned for more updates.
Contributors:
Nivedita Sethiya, Saanvi Nair, Puneet Walia
29/10/2024
The advertisement for admission to the Ph.D. program in the Department of CSE at IIT Indore is out. Please check the link to apply:
academic.iiti.ac.in
11/09/2024
"In a researcher's life, nothing could be a better day than seeing a paper-accepted email." The following paper has been accepted at SPECOM 2024, a premier conference in the speech technology domain, to be held at Crowne Plaza, Belgrade, Serbia.
Title: Cross-Lingual Summarization of Speech-to-Speech Translation: A Baseline
Why it matters: We don't have time to watch long videos, news, or lectures. Can you summarize it in my local language? Yes! Soon, we will have models doing it for you.
Till then, stay tuned.
Thanks to our students Balaram Sarkar and Pranav Karande.
12/08/2024
Looking for a project assistant to work on "Translation of Medical Reports & Support systems for Underprivileged". For more details, please see
www.iiti.ac.in
05/07/2024
**Posting on behalf of my collaborator**
We are hiring a passionate researcher for one JRF position on a project interfacing health and data sciences at Birla Institute of Technology and Science, Pilani.
The research team comprises Vinti Agarwal, Assistant Professor, CSIS, BITS-Pilani, Pilani Campus, Rajasthan, and Dr. Chandresh Kumar Maurya, Assistant Professor, IIT Indore, India.
The project aims to use language technologies and deep learning techniques to enhance health literacy, empowering individuals with their health data and ensuring informed decision-making and equitable access to healthcare.
Click to Apply: https://lnkd.in/dPM2cDzx
Please feel free to apply if the project description aligns with your interests and expertise or share the opportunity with anyone in your network who might be interested.
Read More Details: https://lnkd.in/gEqT8nwx
Perks and benefits: INR 37K per month + hostel accommodation + health facility
Deadline: Until position is filled.
02/05/2024
Our Indian patent on "A Scalable System And Method For Contact Tracing, Hotspot Detection, and Safe Route Recommendation" was granted this week. In it, we propose a scalable algorithm for contact tracing that runs in linear time, O(n), in the number of users. Hotspot detection along a Google Maps path and safe route recommendations came as side benefits. The related paper is https://arxiv.org/pdf/2105.15030
We worked on it during the first Covid lockdown.
Congratulations to all the authors and inventors.
Neetesh Mathur
15/04/2024
Admission to the Ph.D. program in the CSE department at IIT Indore for Autumn 2024 is open. People interested in doing fundamental and applied research in ML/DL/NLP/Speech can drop me an email. You will join a team of young researchers [AI group: https://lnkd.in/dBGNMWVM]
Link to admission form: https://academic.iiti.ac.in/phd_advertisement/2024/%5B2024_04_04%5D_PhD_Adv_for_CSE_Autumn_2024_2025.pdf
Last Date: 10th of May, 2024
17/03/2024
Open house symposium 2.0 concluded successfully at IIT Indore. Thanks to the student volunteers who worked tirelessly to make this event a grand success. Also, special thanks to our invited speakers from IITK, IITB, IITH, IITD and Supra.
02/03/2024
Look at the Google Translate result for Sanskrit-to-Hindi and tell me what you think about it.
Interestingly, the result of mBART on the same Sanskrit sentence is "मुझे भी छोटे कपड़े पसंद हैं।" ("I also like small clothes.").
Such outputs show that AI models still have a lot to learn and understand.
Beware of LLM experts selling you their products.
22/02/2024
Served as a panelist in a language documentation workshop at IITI, where the discussion revolved around how important it is to document Indian languages, especially endangered ones. (In India, around 200 languages are categorized as endangered.) This will save us from "Digital colonization", where we produce and consume data in a foreign language. This step will also enable us to develop technologies that are more accurate than current ones (cf. YT auto-captions).
21/02/2024
Happy International Mother Language Day!