MUMBAI, India, June 26 -- Intellectual Property India has published a patent application (202621048857 A), filed by Beyondata Solutions Private Limited on April 16, 2026, titled "Method And System For Multimodal Embedding Fusion For Visual-Semantic Document Validation."

Inventors include Nishant Singh Tomar and Dipesh Prajapati.

The patent application was published on June 19, 2026, under issue no. 25/2026.

Abstract: The present disclosure relates to a system (114) and method for multimodal visual–semantic validation of textual content extracted from document images. The system (114) receives extracted textual tokens, associated bounding-box coordinates, and corresponding image patches generated by an optical character recognition (OCR) engine (202) from a rasterized representation of a document. The system (114) processes the received outputs to verify the correctness of the OCR-extracted text using joint visual and semantic analysis: it generates visual–semantic embeddings for each extracted textual token, evaluates the visual fidelity and semantic congruence of each token, and computes a combined confidence value representing the degree of alignment between the token and the corresponding visual content, based on the jointly evaluated visual fidelity and semantic congruence signals. It then generates a structured validation trace map that links each extracted textual token with its bounding-box coordinates, associated image patch, visual–semantic embeddings, computed confidence value, and a validation decision. FIG. 1
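
For readers who want a concrete picture of the flow the abstract describes, the following is a minimal illustrative sketch and not the applicant's method. The abstract does not say how the embeddings are produced, how the two signals are combined, or how a validation decision is reached, so everything specific here is an assumption: the names `TraceEntry`, `combined_confidence` and `build_trace_map`, the use of cosine similarity as a stand-in for visual fidelity, the equal-weight average, the 0.7 threshold, and the input field names are all hypothetical.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Dict, List, Sequence, Tuple


@dataclass
class TraceEntry:
    """One row of a hypothetical validation trace map."""
    token: str
    bbox: Tuple[int, int, int, int]   # bounding-box coordinates from OCR
    patch_id: str                     # reference to the associated image patch
    embedding: List[float]            # fused visual-semantic embedding (assumed form)
    confidence: float                 # combined confidence value
    decision: str                     # "accept" or "flag"


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def combined_confidence(visual_fidelity: float,
                        semantic_congruence: float,
                        w_visual: float = 0.5) -> float:
    """Assumed combination rule: a weighted average of the two signals."""
    return w_visual * visual_fidelity + (1.0 - w_visual) * semantic_congruence


def build_trace_map(ocr_items: List[Dict], threshold: float = 0.7) -> List[TraceEntry]:
    """Link each OCR token to its evidence and a validation decision."""
    trace = []
    for item in ocr_items:
        # Assumption: visual fidelity is approximated by the similarity between
        # a patch embedding and a text embedding, clipped at zero.
        visual_fidelity = max(0.0, cosine(item["visual_vec"], item["text_vec"]))
        confidence = combined_confidence(visual_fidelity, item["semantic_congruence"])
        trace.append(TraceEntry(
            token=item["token"],
            bbox=item["bbox"],
            patch_id=item["patch_id"],
            # Assumption: fusion by simple concatenation of the two vectors.
            embedding=list(item["visual_vec"]) + list(item["text_vec"]),
            confidence=confidence,
            decision="accept" if confidence >= threshold else "flag",
        ))
    return trace
```

In this sketch, a token whose image patch and recognized text agree and which fits its context is accepted, while a token with low agreement is flagged for review. Each entry keeps the evidence behind the decision, which is the role the abstract assigns to the trace map.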

Disclaimer: Curated by HT Syndication.