Intellectual Property India Publishes Patent Application for 'Emotion-Aware Speech Recognition Using Hybrid Deep Learning With Intelligent Response Generation' Filed by Seshadri Rao Gudlavalleru Engineering College; Gangula Srikanth; Shaik Sabeeha; Vemireddy Thriveni; Shaik Khaja Babu; and Tadigadapa Harsha Vardhan

Posted On: 2026-05-01

MUMBAI, India, May 1 -- Intellectual Property India has published a patent application (202641047846 A) filed by Seshadri Rao Gudlavalleru Engineering College; Gangula Srikanth; Shaik Sabeeha; Vemireddy Thriveni; Shaik Khaja Babu; and Tadigadapa Harsha Vardhan, Gudlavalleru, Andhra Pradesh, on April 15, for 'emotion-aware speech recognition using hybrid deep learning with intelligent response generation.'

Inventor(s) include Seshadri Rao Gudlavalleru Engineering College; Gangula Srikanth; Shaik Sabeeha; Vemireddy Thriveni; Shaik Khaja Babu; and Tadigadapa Harsha Vardhan.

The application for the patent was published on May 1, under issue no. 18/2026.

According to the abstract released by Intellectual Property India: "The present invention relates to an emotion-aware speech recognition system using hybrid deep learning with intelligent response generation, designed to enhance human-computer interaction by enabling machines to understand and respond to human emotions. The proposed system processes speech input through a preprocessing unit that performs noise reduction, normalization, and extraction of acoustic features including Mel Spectrograms and Mel-Frequency Cepstral Coefficients (MFCC) along with their derivatives. The system employs a hybrid deep learning architecture integrating a Convolutional Neural Network (CNN) and a Bidirectional Long Short-Term Memory (BiLSTM) network with attention mechanisms. The CNN component extracts spatial and spectral features from Mel spectrogram representations, while the BiLSTM component captures temporal dependencies and emotional variations from MFCC features. The outputs from both branches are combined through a feature fusion module and passed to a classification layer that predicts multiple emotion classes including happy, sad, angry, fear, neutral, disgust, and surprise with high accuracy. Further, the system incorporates an Emotion-Aware Response Generation (EARG) module based on Transformer-based natural language processing models, which generates contextually relevant and empathetic textual responses corresponding to the detected emotional state. The final output is presented through a user interface displaying both the predicted emotion and the generated response in real time. The proposed system provides an efficient and scalable solution for applications such as virtual assistants, customer support systems, healthcare monitoring, and emotion-driven human-computer interaction, offering improved accuracy, adaptability, and user engagement compared to conventional emotion recognition systems."
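The fusion and classification stages named in the abstract can be illustrated with a minimal, dependency-free Python sketch. It is not the applicants' implementation. The function names (`attention_pool`, `fuse_and_classify`), the dot-product form of attention, the weights, and the vector sizes are all hypothetical, and the CNN and BiLSTM branches are stood in for by plain lists of numbers. Only the seven emotion labels are taken from the abstract.

```python
import math

# Emotion classes listed in the abstract.
EMOTIONS = ["happy", "sad", "angry", "fear", "neutral", "disgust", "surprise"]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(frames, query):
    """Collapse a sequence of per-frame vectors into a single vector.

    Assumption: each frame is scored by its dot product with `query`
    (a hypothetical learned vector); the softmax of those scores gives
    the weights of a weighted average over time.
    """
    scores = [sum(f * q for f, q in zip(frame, query)) for frame in frames]
    weights = softmax(scores)
    dim = len(frames[0])
    return [sum(w * frame[d] for w, frame in zip(weights, frames)) for d in range(dim)]


def fuse_and_classify(cnn_embedding, temporal_embedding, weight_rows, biases):
    """Concatenate both branch outputs, then apply a linear + softmax layer.

    Assumption: fusion is plain concatenation; the abstract does not say
    how its feature fusion module combines the two branches.
    """
    fused = cnn_embedding + temporal_embedding  # list concatenation
    logits = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weight_rows, biases)
    ]
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return EMOTIONS[best], probs
```

In use, the output of `attention_pool` over per-frame temporal features would be passed as `temporal_embedding`, alongside a spectrogram-derived `cnn_embedding`, with one weight row and one bias per emotion class. In a trained system those weights would be learned rather than supplied by hand.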

Disclaimer: Curated by HT Syndication.
