MUMBAI, India, Feb. 6 -- Intellectual Property India has published a patent application (202541126602 A) filed by Vellore Institute of Technology, Vellore, Tamil Nadu, on Dec. 14, 2025, for 'real-time speech emotion recognition system with speaker diarization.'
Inventor(s) include Dr. Iyappan P; Mr. Santana Kumar I; Mr. Monesh Dl; and Mr. Keerthivasan S.
The patent application was published on Feb. 6 under issue no. 06/2026.
According to the abstract released by Intellectual Property India: "The present disclosure provides a real-time speech emotion recognition system comprising an audio input module configured to receive multi-speaker audio data, a speech-to-text transcription module configured to generate text transcripts from the audio data, a speaker diarization module configured to segment the audio data into speaker-specific temporal intervals, an emotion classification module configured to analyze the text transcripts and assign emotion labels to speech segments, and a processing pipeline configured to correlate speaker identifications with the emotion labels to generate speaker-specific emotional analysis. The system integrates OpenAI's Whisper model for speech-to-text transcription, PyAnnote pipeline for speaker diarization, and transformer-based models including RoBERTa and DistilBERT for emotion classification. The processing pipeline executes transcription, diarization, and emotion classification in parallel to achieve processing latency of approximately 1.5 seconds while maintaining emotion recognition accuracy of 94% and diarization error rate below 7.2%."
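The abstract does not disclose how speaker identifications are correlated with emotion labels. As an illustration only, one common approach is to assign each emotion-labelled transcript segment to the speaker turn it overlaps most in time. The Python sketch below assumes that approach; the names SpeakerTurn, EmotionSegment, assign_speakers and summarize are hypothetical, and the inputs stand in for whatever the transcription, diarization and classification modules would actually produce.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class SpeakerTurn:
    """One diarization interval (hypothetical structure)."""
    speaker: str
    start: float  # seconds
    end: float    # seconds


@dataclass
class EmotionSegment:
    """One transcript segment with an emotion label already attached (hypothetical)."""
    text: str
    start: float  # seconds
    end: float    # seconds
    emotion: str


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two time intervals (0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(
    segments: List[EmotionSegment], turns: List[SpeakerTurn]
) -> List[Tuple[str, str, str]]:
    """Label each segment with the speaker whose turn overlaps it the most.

    Segments that overlap no turn are attributed to "unknown".
    Returns (speaker, emotion, text) tuples in segment order.
    """
    results = []
    for seg in segments:
        best_speaker, best_overlap = "unknown", 0.0
        for turn in turns:
            shared = overlap(seg.start, seg.end, turn.start, turn.end)
            if shared > best_overlap:
                best_speaker, best_overlap = turn.speaker, shared
        results.append((best_speaker, seg.emotion, seg.text))
    return results


def summarize(results: List[Tuple[str, str, str]]) -> Dict[str, Counter]:
    """Count emotion labels per speaker."""
    summary: Dict[str, Counter] = defaultdict(Counter)
    for speaker, emotion, _text in results:
        summary[speaker][emotion] += 1
    return dict(summary)
```

Given two turns (speaker A from 0.0 to 2.5 seconds, speaker B from 2.5 to 5.0 seconds), a segment spanning 2.4 to 4.8 seconds would be attributed to speaker B, since most of it falls inside B's turn.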
Disclaimer: Curated by HT Syndication.