MUMBAI, India, Sept. 12 -- Intellectual Property India has published a patent application (202421016169 A) filed by Tata Consultancy Services Limited, Maharashtra, on March 7, 2024, for 'proximity based cluster algorithm for varied character recognition.'
Inventor(s) include Vasudevan, Bagya Lakshmi; Abraham, Kuruvilla; and Som, Suvodip.
The application for the patent was published on Sept. 12, under issue no. 37/2025.
According to the abstract released by Intellectual Property India: "Retail flyer information extraction is important as it can benefit both retail competitors and consumers. However, existing optical character recognition (OCR) engines often struggle when it comes to accurately detecting unaligned characters. Present disclosure provides method and system for recognizing unaligned characters present in image. The system first apply connected component analysis technique on input image for isolating individual segments. Then, system extract a segment height of each individual segment which are then used to create a segment array. Thereafter, system uses a proximity based clustering algorithm to group one or more segment heights present in the segment height array based on their proximity. Further, system applies OCR technique on individual segment images present corresponding to the grouped segment heights to identify one or more characters present in grouped segment heights. Finally, system concatenate identified characters based on grouped segment heights to obtain original text present in input image."
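The pipeline the abstract describes can be illustrated with a minimal Python sketch. This is not the applicant's implementation: the abstract does not specify the clustering rule, so the gap threshold `max_gap`, the helper names (`connected_components`, `cluster_by_proximity`, `recognize`), the left-to-right ordering within a group and the stand-in `ocr` callback are all assumptions made for illustration only.

```python
from collections import deque

def connected_components(img):
    """Return bounding boxes (top, left, bottom, right) of 4-connected foreground regions."""
    rows, cols = len(img), len(img[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if img[r][c] and not seen[r][c]:
                seen[r][c] = True
                queue = deque([(r, c)])
                top, left, bottom, right = r, c, r, c
                while queue:  # breadth-first flood fill of one segment
                    y, x = queue.popleft()
                    top, bottom = min(top, y), max(bottom, y)
                    left, right = min(left, x), max(right, x)
                    for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                        if 0 <= ny < rows and 0 <= nx < cols and img[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                boxes.append((top, left, bottom, right))
    return boxes

def cluster_by_proximity(values, max_gap):
    """Group indices whose sorted values lie within max_gap of the previous value (assumed rule)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    groups = []
    for i in order:
        if groups and values[i] - values[groups[-1][-1]] <= max_gap:
            groups[-1].append(i)
        else:
            groups.append([i])
    return groups

def recognize(img, ocr, max_gap=1):
    """Segment, cluster by height, run the OCR callback per segment, concatenate per group."""
    boxes = connected_components(img)
    heights = [bottom - top + 1 for top, _, bottom, _ in boxes]  # the segment height array
    parts = []
    for group in cluster_by_proximity(heights, max_gap):
        group.sort(key=lambda i: boxes[i][1])  # assumed reading order: left to right
        parts.append("".join(ocr(boxes[i]) for i in group))
    return parts

# Toy binary image: two tall strokes (heights 5 and 4) and two short, vertically offset ones (height 2).
demo_img = [[int(ch) for ch in row] for row in
            ("1010100", "1010101", "1010001", "1010000", "1000000")]
# Hypothetical OCR stand-in that looks up a character by the segment's left column.
demo_ocr = lambda box: {0: "A", 2: "B", 4: "c", 6: "d"}[box[1]]
demo_result = recognize(demo_img, demo_ocr)  # one string per height group, shortest group first
```

In a real system the `ocr` callback would crop each bounding box from the image and pass it to an OCR engine; here it is a lookup table so the sketch stays self-contained. Grouping by height rather than by baseline is what lets the two short, vertically offset strokes land in the same group.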
Disclaimer: Curated by HT Syndication.