MUMBAI, India, April 10 -- Intellectual Property India has published a patent application (202421075283 A) filed by Tata Consultancy Services Limited, Maharashtra, on Oct. 4, 2024, for 'Method and System for Deployment of Large Language Models (LLM) in Cloud Instances.'

Inventor(s) include Krishnan, Ashwin; Pasumarti, Venkatesh; Inamdar, Samarth Sudarshan; Mondal, Arghyajoy; Nambiar, Manoj Karunakaran; and Singhal, Rekha.

The patent application was published on April 10 under issue no. 15/2026.

According to the abstract released by Intellectual Property India: "Existing model deployment approaches have the disadvantage that they do not consider feasibility of cloud instances for hosting a given LLM model. Embodiments disclosed herein provide a method and system for deployment of LLMs in a plurality of cloud instances. The system checks feasibility of the plurality of cloud instances for hosting an LLM, based on size of the LLM and storage space in each of the cloud instances. Further, a latency value for a plurality of batch sizes is determined for a plurality of LLM-accelerator pairs, in each of the plurality of cloud instances identified as feasible based on the feasibility check, using a performance model. Furthermore, a recommendation of one of the plurality of cloud instances identified as feasible is generated, based on the determined latency, a measured cost of deployment, a user workload, an application type, a plurality of latency constraints, and an evaluated performance."
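
The three steps named in the abstract (feasibility check, per-batch-size latency, recommendation) can be illustrated with a minimal Python sketch. This is not the method claimed in the application: the `Instance` fields, the lookup table standing in for the performance model, the sample figures, and the cheapest-feasible selection rule are all assumptions made for illustration. The abstract's other inputs (user workload, application type, evaluated performance) are left out.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Instance:
    """Hypothetical description of one cloud instance (all fields are assumptions)."""
    name: str
    storage_gb: float                    # storage space available on the instance
    hourly_cost: float                   # assumed cost of deployment per hour
    latency_by_batch: Dict[int, float]   # batch size -> latency in seconds; a
                                         # stand-in for the performance model


def feasible(instances: List[Instance], model_size_gb: float) -> List[Instance]:
    """Step 1: keep only instances whose storage can hold the model."""
    return [i for i in instances if i.storage_gb >= model_size_gb]


def recommend(instances: List[Instance], model_size_gb: float,
              batch_size: int, max_latency_s: float) -> Optional[Instance]:
    """Steps 2 and 3: look up latency for the batch size on each feasible
    instance, drop those that break the latency constraint, then pick the
    cheapest (ties broken by lower latency). Returns None if nothing fits."""
    candidates = [
        i for i in feasible(instances, model_size_gb)
        if i.latency_by_batch.get(batch_size, float("inf")) <= max_latency_s
    ]
    if not candidates:
        return None
    return min(candidates,
               key=lambda i: (i.hourly_cost, i.latency_by_batch[batch_size]))


# Made-up sample data, not taken from the patent application.
instances = [
    Instance("small", 40, 1.0, {1: 0.9, 8: 2.5}),
    Instance("medium", 160, 3.0, {1: 0.4, 8: 1.1}),
    Instance("large", 320, 6.0, {1: 0.2, 8: 0.6}),
]

choice = recommend(instances, model_size_gb=140, batch_size=8, max_latency_s=1.5)
print(choice.name if choice else "no feasible instance")
```

With these sample figures, the "small" instance fails the storage check for a 140 GB model, and the cheaper of the two remaining instances that meets the 1.5-second limit is selected. Tightening the limit shifts the choice to the faster, costlier instance.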

Disclaimer: Curated by HT Syndication.