MUMBAI, India, June 22 -- Intellectual Property India has published a patent application (202641069994 A) filed by Bangalore Technological Institute on June 04, 2026, for A System And Method For Mobile-Initiated Cloud-Offloaded Convolutional Neural Network Based Speech Emotion Classification.

Inventors include Shylaja D N; Divith K Rathod; Gagan Gowda S; and Girish S K.

The application for the patent was published on June 12, 2026, under issue no. 24/2026.

Abstract: The invention discloses a cloud-based Speech Emotion Recognition (SER) system that employs a Sequential Convolutional Neural Network to classify human speech into eight emotional states: Happy, Sad, Angry, Fear, Disgust, Neutral, Surprise, and Calm. The system overcomes the on-device computational bottleneck of mobile deep learning by deploying the trained CNN model on the Heroku cloud Platform-as-a-Service via a Flask API server, exposing a prediction endpoint through the Heroku API Gateway. A lightweight Android mobile application records or selects a speech audio file, encodes it to Base64, and transmits it to the cloud via a REST POST request using the Volley and OkHttp libraries. The Flask server decodes the audio, extracts a 162-dimensional acoustic feature vector (comprising Mel-Frequency Cepstral Coefficients, Chroma Short-Time Fourier Transform features, Root Mean Square energy, and Mel Spectrogram features), and passes it through the CNN inference engine. The Softmax output layer returns the predicted emotion as a JSON response to the Android client, delivering real-time classification results in under one minute on a standard Android smartphone. The CNN model is trained on a merged, augmented dataset of 12,161 audio files drawn from the RAVDESS, TESS, SAVEE, and CREMA-D corpora, achieving a training accuracy of approximately 97% and a validation accuracy of approximately 68%, with 78 out of 100 arbitrary real-world audio samples correctly classified.
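The request and response flow described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the application: the JSON field names (`audio`, `emotion`, `confidence`), the function names, and the order of the class indices are assumptions made for illustration, and the feature extraction and CNN steps are left out entirely. It only shows Base64 transport of the audio bytes and how a Softmax output over eight classes might be mapped to an emotion label.

```python
import base64
import json
import math

# The eight emotions named in the abstract; the index order is an assumption.
EMOTIONS = ["Happy", "Sad", "Angry", "Fear", "Disgust", "Neutral", "Surprise", "Calm"]


def build_request_body(audio_bytes: bytes) -> str:
    """Client side: wrap Base64-encoded audio in a JSON body.

    The field name "audio" is hypothetical.
    """
    encoded = base64.b64encode(audio_bytes).decode("ascii")
    return json.dumps({"audio": encoded})


def decode_request_body(body: str) -> bytes:
    """Server side: recover the raw audio bytes from the JSON body."""
    return base64.b64decode(json.loads(body)["audio"])


def softmax(logits):
    """Numerically stable softmax over a list of raw class scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def build_response(logits) -> str:
    """Pick the most probable class and return it as a JSON string."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return json.dumps({"emotion": EMOTIONS[best], "confidence": probs[best]})


if __name__ == "__main__":
    body = build_request_body(b"placeholder audio bytes")
    audio = decode_request_body(body)  # a real server would extract features here
    # Made-up scores standing in for the CNN output on this audio.
    print(build_response([0.1, 0.2, 3.0, 0.0, 0.0, 0.0, 0.0, 0.0]))
```

In a deployment like the one described, the Android client would send the request body in a REST POST and the server would run feature extraction and CNN inference between the decode and response steps.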

Disclaimer: Curated by HT Syndication.