MUMBAI, India, Jan. 2 -- Intellectual Property India has published a patent application (202541122203 A) filed by Vellore Institute Of Technology, Vellore, Tamil Nadu, on Dec. 4, 2025, for 'system for automated web content extraction and interactive analysis using vector embeddings.'
Inventor(s) include Dr Sarwesh P; Samir Shah; Siddhartha Pathak; and Aniket Kumar Shah.
The application for the patent was published on Jan. 2, under issue no. 01/2026.
According to the abstract released by Intellectual Property India: "The present disclosure provides a computer-implemented method for automated web content extraction and interactive analysis using vector embeddings. The method extracts textual content from web sources using selectable HTML parsing or JavaScript rendering methods, processes content into chunks, generates summaries and vector embeddings using a large language model, stores embeddings in a vector database enabling semantic similarity searches, receives user queries through a conversational interface, retrieves semantically relevant embeddings based on queries, generates contextually aware responses using the language model and retrieved embeddings, and maintains conversation history for persistent context awareness. The system includes a web scraping module, text processing engine, language model interface, vector database (ChromaDB), and conversational interface enabling interactive content analysis."
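For readers unfamiliar with this kind of pipeline, the steps named in the abstract (extract, chunk, embed, store, retrieve, respond, remember) can be illustrated with a minimal Python sketch. It is not the applicants' implementation, which the abstract does not disclose. Every class and function name below is hypothetical, a word-count vector stands in for embeddings from a large language model, an in-memory list stands in for ChromaDB, and the reply simply echoes the best-matching chunk where a real system would call a language model.

```python
import math
import re
from collections import Counter
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> bodies."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.parts.append(data.strip())


def extract_text(html):
    """HTML-parsing path only; JavaScript rendering is out of scope here."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def chunk_text(text, size=8):
    """Split text into chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text):
    """Toy embedding (assumption): word counts instead of model vectors."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class VectorStore:
    """In-memory stand-in for a vector database."""

    def __init__(self):
        self.items = []

    def add(self, chunk):
        self.items.append((chunk, embed(chunk)))

    def query(self, text, k=1):
        q = embed(text)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


class Chat:
    """Conversational wrapper that keeps a history of question/answer pairs."""

    def __init__(self, store):
        self.store = store
        self.history = []

    def ask(self, question):
        context = self.store.query(question, k=1)
        # A real system would pass the context and history to a language model.
        answer = "Relevant passage: " + context[0] if context else "No content indexed."
        self.history.append((question, answer))
        return answer
```

In use, a page's HTML would be passed through extract_text and chunk_text, each chunk added to a VectorStore, and questions sent through Chat.ask, which records every exchange in its history list.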
Disclaimer: Curated by HT Syndication.