MUMBAI, India, May 1 -- Intellectual Property India has published a patent application (202641047834 A) filed by Dr. Shiladitya Munshi; Mr. Utpal Madhu; Sujoy Kumar Goswami; Milton Samadder; Dr. Subhajit Roy; and Abhijit Mitra, Agartala, Tripura, on April 15, for 'a system and method for detecting synthetic speech using acoustic-physiological feature analysis.'
Inventors named in the application include Dr. Shiladitya Munshi; Mr. Utpal Madhu; Sujoy Kumar Goswami; Milton Samadder; Dr. Subhajit Roy; and Abhijit Mitra.
The application for the patent was published on May 1, under issue no. 18/2026.
According to the abstract released by Intellectual Property India: "The present invention discloses a computer-implemented system and method for detecting synthetic or AI-generated speech using acoustic-physiological feature analysis derived from human speech production mechanisms. An input audio signal is received and subjected to preprocessing, including decoding, normalization, and temporal segmentation. The system extracts a plurality of acoustic-physiological features comprising pitch contour variability, cycle-to-cycle frequency and amplitude perturbations, harmonic-to-noise ratio, breathing-related noise energy, temporal micro-pause statistics, spectral envelope characteristics, spectral flatness, and phase irregularities. The extracted features are aggregated into a fixed-length feature vector and processed by a trained machine learning classification model to determine whether the audio corresponds to human-generated or synthetic speech. The system further generates a confidence score and an explainable output indicating dominant contributing features. The proposed approach is independent of lexical or textual content and is operable across multiple languages, speaking styles, and recording conditions. By leveraging involuntary physiological variability inherent in natural speech production, the invention provides an interpretable, computationally efficient, and robust detection mechanism suitable for real-time and near real-time deployment via network-accessible interfaces. The invention is applicable to digital identity verification, fraud detection, media authentication, and forensic speech analysis, and provides a measurable technical effect in improving the reliability and transparency of synthetic speech detection."
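For readers unfamiliar with the terms in the abstract, the following is a minimal illustrative sketch of three of the listed quantities: cycle-to-cycle frequency perturbation (commonly called jitter), amplitude perturbation (shimmer), and spectral flatness, collected into a fixed-length vector. It is not the applicants' implementation, which the published abstract does not disclose. The function names are hypothetical, the "local" perturbation formula is a commonly used textbook definition assumed here, and the inputs (per-cycle periods, per-cycle peak amplitudes, and a power spectrum) are assumed to have been extracted already. No classifier is shown.

```python
import math
from statistics import fmean


def local_perturbation(values):
    """Mean absolute difference between consecutive cycles, relative to the mean.

    Applied to glottal cycle periods this is a common definition of local
    jitter; applied to per-cycle peak amplitudes, of local shimmer.
    (Assumed definition; the patent abstract does not give a formula.)
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return fmean(diffs) / fmean(values)


def spectral_flatness(power):
    """Geometric mean divided by arithmetic mean of a power spectrum.

    Close to 1.0 for a flat, noise-like spectrum; close to 0.0 when the
    energy is concentrated in a few bins.
    """
    eps = 1e-12  # guards against log(0) and division by zero
    log_mean = fmean([math.log(p + eps) for p in power])
    return math.exp(log_mean) / (fmean(power) + eps)


def feature_vector(periods, amplitudes, power):
    """Aggregate the three illustrative features into a fixed-length list."""
    return [
        local_perturbation(periods),     # jitter (frequency perturbation)
        local_perturbation(amplitudes),  # shimmer (amplitude perturbation)
        spectral_flatness(power),
    ]


# Hypothetical measurements: a perfectly regular voice versus one with
# small cycle-to-cycle variation. Values are made up for illustration.
regular = feature_vector([0.0100] * 4, [1.0] * 4, [100.0, 0.01, 0.01, 0.01])
varied = feature_vector(
    [0.0100, 0.0102, 0.0099, 0.0101],
    [1.00, 0.97, 1.02, 0.99],
    [100.0, 0.01, 0.01, 0.01],
)
print(regular)
print(varied)
```

In this sketch the perfectly regular input yields zero jitter and shimmer, while the varied input yields small positive values. The abstract's premise is that such involuntary variability is characteristic of natural speech production, though how the claimed system weighs each feature is not stated.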
Disclaimer: Curated by HT Syndication.