Intellectual Property India Publishes Patent Application for 'Whisper AI and VITS-Driven Pipeline for Multimodal Speech Translation, Voice Cloning, and Temporal Alignment in Cross-Lingual Audio-Visual Synthesis' Filed by SRM Institute of Science and Technology, Ramapuram Campus; Easwari Engineering College

Posted On: 2026-05-29 Patentwipo

MUMBAI, India, May 29 -- Intellectual Property India has published a patent application (202641061654 A) filed by SRM Institute of Science and Technology, Ramapuram Campus; Easwari Engineering College, Chennai, Tamil Nadu, on May 15, for 'Whisper AI and VITS-Driven Pipeline for Multimodal Speech Translation, Voice Cloning, and Temporal Alignment in Cross-Lingual Audio-Visual Synthesis.'

Inventors include K Shreya, K S Chakradhar Danesh, and Dr. K. Sujatha.

The patent application was published on May 29 under issue no. 22/2026.

According to the abstract released by Intellectual Property India: "The present invention relates to an automated system and method for multilingual speech-to-speech video translation. The system is configured to extract audio from an input video, identify and separate multiple speakers, transcribe spoken content, translate the transcribed text into one or more target languages, and generate synthesized speech while preserving the original speaker's vocal characteristics. The invention further provides precise synchronization between the synthesized speech and the visual content of the video, including lip movements and scene timing. The disclosed system operates in an automated and scalable manner, enabling efficient multilingual video localization for applications including education, entertainment, accessibility, and public information dissemination."
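
The abstract does not disclose implementation details, so the stages it lists can only be illustrated loosely. The Python sketch below is one hypothetical way to arrange them and is not the applicants' method: the names `Segment`, `DubbedSegment`, `fit_tempo`, and `dub` are invented for illustration, the `translate` and `synth_duration` callables stand in for whatever translation and voice-cloning models a real system would use, and the tempo limits of 0.8 and 1.25 are assumed values. It shows only the timing step, in which each translated utterance is assigned a playback-rate factor so that it fits the time slot of the original speech.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Segment:
    speaker: str   # hypothetical diarization label, e.g. "spk0"
    start: float   # start time in the source video, in seconds
    end: float     # end time in the source video, in seconds
    text: str      # transcript of what was said in this slot


@dataclass
class DubbedSegment:
    speaker: str
    start: float
    end: float
    text: str      # translated text
    tempo: float   # playback-rate factor (>1 means speed up)


def fit_tempo(synth_duration: float, slot_duration: float,
              min_tempo: float = 0.8, max_tempo: float = 1.25) -> float:
    """Rate factor that fits synthesized audio into the original slot.

    The clamp limits are assumptions; they keep speech from being
    stretched or compressed so far that it sounds unnatural.
    """
    if synth_duration <= 0 or slot_duration <= 0:
        raise ValueError("durations must be positive")
    return max(min_tempo, min(max_tempo, synth_duration / slot_duration))


def dub(segments: List[Segment],
        translate: Callable[[str], str],
        synth_duration: Callable[[str, str], float]) -> List[DubbedSegment]:
    """Translate each segment and compute the tempo needed to keep its timing."""
    dubbed = []
    for seg in segments:
        translated = translate(seg.text)
        # synth_duration(text, speaker) is a stand-in for running a
        # speaker-conditioned synthesizer and measuring the audio length.
        duration = synth_duration(translated, seg.speaker)
        tempo = fit_tempo(duration, seg.end - seg.start)
        dubbed.append(DubbedSegment(seg.speaker, seg.start, seg.end,
                                    translated, tempo))
    return dubbed
```

In this sketch a segment whose synthesized audio runs longer than its slot gets a factor above 1, capped at the assumed maximum. A real system would also have to handle the lip-movement synchronization the abstract mentions, which is outside the scope of this illustration.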

Disclaimer: Curated by HT Syndication.
