MUMBAI, India, June 13 -- Intellectual Property India has published a patent application (202517041465 A) filed by Google LLC, Mountain View, U.S.A., on April 29, for 'visual transformers with sparse application of video kernels.'
Inventor(s) include Piergiovanni, Anthony J.; Angelova, Anelia; and Kuo, Wei-Cheng.
The application for the patent was published on June 13, under issue no. 24/2025.
According to the abstract released by Intellectual Property India: "Provided are machine-learned models for performing video processing with improved efficiency. In particular, the machine-learned model can perform the sparse application of one or more video kernels to a set of video data to generate video tokens that can, for example, be provided as input to a visual transformer. Thus, example implementations of the present disclosure are directed to an approach which can turn a visual transformer (e.g., a ViT encoder) into an efficient video model. Furthermore, example implementations described herein can seamlessly work with both image and video inputs. Specifically, by sparsely sampling the inputs, the model is able to do training and inference from both inputs. The proposed model is easily scalable and can optionally be adapted to large-scale pre-trained visual transformers without requiring full finetuning."
The patent application was internationally filed on Nov. 22, 2023, under International Application No. PCT/US2023/080947.
Disclaimer: Curated by HT Syndication.