MUMBAI, India, Jan. 9 -- Intellectual Property India has published a patent application (202541133988 A) filed by Karpagam Academy Of Higher Education; Karpagam Institute Of Technology; V. Vadivu; and Karthieeswaran R, Coimbatore, Tamil Nadu, on Dec. 31, 2025, for 'text summarization and keyword extraction using nlp.'

Inventor(s) include V. Vadivu; and Karthieeswaran R.

The application for the patent was published on Jan. 9, under issue no. 02/2026.

According to the abstract released by Intellectual Property India: "Text summarization and keyword extraction are essential tasks in Natural Language Processing (NLP) that aim to efficiently condense large volumes of text and highlight the most relevant information. The proposed system integrates advanced NLP and deep learning techniques to automate the process of generating concise summaries and extracting meaningful keywords from unstructured textual data. The system employs a hybrid summarization model that combines both extractive and abstractive approaches. Extractive summarization identifies and selects key sentences directly from the input text using statistical and graph-based algorithms such as TF-IDF, TextRank, and LexRank, ensuring that the generated summary preserves the factual integrity of the original content. In contrast, the abstractive component utilizes transformer-based architectures like T5, BART, and PEGASUS to generate semantically coherent and paraphrased summaries that resemble human-written content. For keyword extraction, the system adopts a hybrid methodology incorporating both traditional statistical techniques (TF-IDF and RAKE) and deep learning-based embedding models (BERT, Sentence-BERT) to identify and rank the most contextually significant keywords. The preprocessing module performs data cleaning, tokenization, lemmatization, and normalization to ensure the input text is suitable for analysis. A fusion and ranking module further refines the results by evaluating the semantic similarity between the generated summary and the original document using evaluation metrics such as ROUGE, BLEU, and BERTScore. The system's modular and scalable architecture allows for real-time summarization and keyword extraction across diverse document types and domains. Additionally, a user-friendly interface supports document uploads, interactive visualization, and export of results in multiple formats.
By combining the strengths of statistical and neural network-based models, the proposed system offers a comprehensive, intelligent, and efficient solution for automated text summarization and keyword extraction, significantly improving information retrieval and content management processes."
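
For readers unfamiliar with the statistical side of the approach the abstract describes, the following is a minimal Python sketch of TF-IDF-based extractive summarization and keyword ranking. It is not the applicants' implementation, which the abstract does not disclose. The function names, the tiny stopword list, the regex-based sentence splitter, and the smoothed IDF formula are all assumptions made for illustration, and the neural components named in the abstract (T5, BART, PEGASUS, BERT) are not covered.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real systems use fuller lists).
STOPWORDS = {"a", "an", "the", "and", "or", "of", "to", "in", "is", "are",
             "it", "that", "this", "for", "on", "with", "as", "by", "was", "be"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase alphanumeric tokens with stopwords removed (no lemmatization)."""
    return [w for w in re.findall(r"[a-z0-9]+", sentence.lower())
            if w not in STOPWORDS]


def tfidf_per_sentence(sentences):
    """Treat each sentence as a 'document'; return one {term: tf-idf} dict each."""
    tokenized = [tokenize(s) for s in sentences]
    n = len(tokenized)
    # Document frequency: number of sentences containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty sentences
        # Smoothed IDF (an assumed variant): ln((1 + n) / (1 + df)) + 1
        weights.append({
            t: (c / total) * (math.log((1 + n) / (1 + df[t])) + 1)
            for t, c in tf.items()
        })
    return weights


def summarize(text, k=2):
    """Return the k highest-scoring sentences, kept in original order."""
    sentences = split_sentences(text)
    scores = [sum(w.values()) for w in tfidf_per_sentence(sentences)]
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


def extract_keywords(text, k=5):
    """Rank terms by their TF-IDF weight summed over all sentences."""
    totals = Counter()
    for w in tfidf_per_sentence(split_sentences(text)):
        totals.update(w)  # Counter.update adds the float weights per term
    return [term for term, _ in totals.most_common(k)]
```

Calling summarize(document, k=3) returns three sentences copied verbatim from the input, which is the property the abstract attributes to the extractive component; extract_keywords(document, k=5) returns the five highest-weighted terms. A graph-based method such as TextRank would replace the per-sentence score with a centrality score over a sentence-similarity graph.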

Disclaimer: Curated by HT Syndication.