MUMBAI, India, April 17 -- Intellectual Property India has published a patent application (202641042412 A) filed by iamlokesh434; Lingeshkumar D P; and Manojkanna S, Chennai, Tamil Nadu, on April 2, for 'sign language recognition using machine learning - real-time gesture-to-text and speech conversion system.'

Inventor(s) include iamlokesh434; Lingeshkumar D P; and Manojkanna S.

The application for the patent was published on April 17, under issue no. 16/2026.

According to the abstract released by Intellectual Property India: "Described herein is a Sign Language Recognition Real-Time Gesture-to-Text and Speech Conversion System (SLR-RTGS), a machine learning and embedded hardware-based autonomous system designed to recognize sign language hand gestures and convert them into simultaneous text and speech output in real time, enabling seamless communication between hearing- or speech-impaired individuals and the general population. The system integrates an ESP32 microcontroller as the central controller, a camera module for continuous hand gesture video capture, a Python-based machine learning software stack comprising OpenCV, MediaPipe, and TensorFlow for real-time hand landmark detection and gesture classification, a 16x2 I2C LCD display module for text output, and a DFPlayer Mini audio module connected to a speaker for synthesized speech output. Captured video frames are processed in real time through a multi-stage recognition pipeline comprising image preprocessing, MediaPipe-based hand landmark extraction of 21 key coordinates, TensorFlow KeyPointClassifier-based gesture classification, and simultaneous LCD text display and audio playback triggered by the ESP32 microcontroller. A detected gesture is classified against a defined vocabulary of sign language symbols and the corresponding text and audio output is generated without human operator intervention. The system achieves gesture recognition accuracy ranging from 90.0% to 96.6% across tested gestures and an end-to-end response time of approximately 540 milliseconds per recognition cycle, enabling reliable, real-time assistive communication in educational, healthcare, public service, and everyday social environments."
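
For readers curious how 21 hand key points might be handed to a gesture classifier, the following is a minimal illustrative sketch and not the applicants' implementation. The abstract does not describe how the coordinates are prepared, so the wrist-relative translation, the max-absolute-value scaling, and the function name landmarks_to_features are all assumptions made here for illustration.

```python
from typing import List, Sequence, Tuple

NUM_LANDMARKS = 21  # number of hand key points named in the abstract


def landmarks_to_features(points: Sequence[Tuple[float, float]]) -> List[float]:
    """Turn 21 (x, y) hand landmarks into a 42-value feature vector.

    Assumed preprocessing (not stated in the abstract): coordinates are made
    relative to the first landmark and scaled so the largest magnitude is 1.
    """
    if len(points) != NUM_LANDMARKS:
        raise ValueError(f"expected {NUM_LANDMARKS} landmarks, got {len(points)}")

    # Translate so the first landmark (assumed to be the wrist) is the origin.
    base_x, base_y = points[0]
    flat: List[float] = []
    for x, y in points:
        flat.extend((x - base_x, y - base_y))

    # Scale by the largest absolute value so hand size and camera distance
    # have less influence on the features.
    scale = max(abs(v) for v in flat)
    if scale == 0:
        return flat  # all landmarks coincide; nothing to scale
    return [v / scale for v in flat]
```

A vector produced this way could then be passed to whatever classifier a system uses; in this sketch, shifting the whole hand within the frame leaves the features essentially unchanged, which is the reason for the wrist-relative step.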

Disclaimer: Curated by HT Syndication.