MUMBAI, India, Oct. 11 -- Intellectual Property India has published a patent application (202411019685 A) filed by Birla Institute of Technology and Science, Pilani, Rajasthan, on March 16, 2024, for 'A Versatile Contrastive Learning-Based OCR System and Method for Ultra-Low-Resource Scripts Through Auto-Glyph Feature Extraction.'

The inventors include Prawaal Sharma, Poonam Goyal, Vidisha Sharma and Navneet Goyal.

The patent application was published on Oct. 10 under issue no. 41/2025.

According to the abstract released by Intellectual Property India: "The versatile contrastive learning-based OCR system (100) for ultra-low-resource scripts through auto glyph feature extraction comprises a scanning device (102) to capture document images; a preprocessing unit (104) to remove noise, deskewing the document images to correct orientation, enhancing contrast, and binarizing to convert into binary format; a segmentation unit (106) to segment images into pages, lines, words, characters, and symbols; an extraction unit (108) to extract shape, size, texture, and spatial relationship features from segmented characters or words; an annotation processing unit (110) to annotate extracted symbols into one or more categories using glyph features; a re-enforcement unit (112) to reinforce a labelled dataset through augmentation techniques; and an identification unit (114) to identify symbols using contrastive learning model, leveraging multiple positive and negative samples, and map image data into a latent representation space using a deep learning transformer model thereby formatting recognized text with proper spacing, punctuation, and alignment."
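
For readers unfamiliar with the technique, the abstract's reference to contrastive learning with "multiple positive and negative samples" can be illustrated with a generic loss function. The Python sketch below is not taken from the patent application; the function names, the cosine similarity measure, the temperature value and the toy vectors are all assumptions made for illustration. It scores one anchor embedding against several embeddings of the same symbol (positives) and several embeddings of other symbols (negatives), and returns a lower loss when the anchor sits closer to its positives.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors (assumed similarity measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def multi_positive_contrastive_loss(anchor, positives, negatives, temperature=0.1):
    """Generic multi-positive contrastive loss for a single anchor.

    Hypothetical helper, not the patent's method: each positive is treated as
    the correct choice among all positives and negatives, and the negative
    log-probabilities are averaged over the positives.
    """
    pos_logits = [cosine(anchor, p) / temperature for p in positives]
    neg_logits = [cosine(anchor, n) / temperature for n in negatives]
    all_logits = pos_logits + neg_logits

    # Log-sum-exp with the maximum subtracted for numerical stability.
    peak = max(all_logits)
    log_denominator = peak + math.log(sum(math.exp(x - peak) for x in all_logits))

    return -sum(x - log_denominator for x in pos_logits) / len(pos_logits)


# Toy 2-D embeddings (made-up values): two glyph images of the same symbol
# as positives, two images of other symbols as negatives.
anchor = [1.0, 0.0]
same_symbol = [[0.9, 0.1], [1.0, 0.2]]
other_symbols = [[0.0, 1.0], [-1.0, 0.0]]

well_separated = multi_positive_contrastive_loss(anchor, same_symbol, other_symbols)
swapped = multi_positive_contrastive_loss(anchor, other_symbols, same_symbol)
print(well_separated < swapped)
```

In this toy setup the loss is smaller when the true positives are the vectors near the anchor than when the roles are swapped, which is the behaviour a training procedure would exploit to pull embeddings of the same symbol together in the latent space.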

Disclaimer: Curated by HT Syndication.