MUMBAI, India, Aug. 8 -- Intellectual Property India has published a patent application (202517067038 A) filed by Google LLC, Mountain View, U.S.A., on July 14, for 'semi-supervised text-to-speech by generating semantic and acoustic representations.'
Inventor(s) include Kharitonov, Evgeny; Vincent, Damien; Borsos, Zalán; Marinier, Raphaël; Pietquin, Olivier, Claude; Sharifi, Matthew; Tagliasacchi, Marco; and Zeghidour, Neil.
The application for the patent was published on Aug. 8, under issue no. 32/2025.
According to the abstract released by Intellectual Property India: "Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating an audio signal from input text. In one aspect, a method comprises receiving a request to convert input text into an audio signal, wherein the input text comprises multiple tokenized text inputs, generating, using a first generative neural network, a semantic representation of the tokenized text inputs comprising semantic tokens representing semantic content of the tokenized text inputs, each semantic token being selected from a vocabulary of semantic tokens, generating, using a second generative neural network and conditioned on at least the semantic representation, an acoustic representation of the semantic representation comprising one or more respective acoustic tokens representing acoustic properties of the audio signal, and processing the acoustic representation using a decoder neural network to generate the audio signal."
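The data flow named in the abstract (tokenized text, then semantic tokens, then acoustic tokens, then an audio signal) can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the three neural networks are replaced by simple deterministic stand-in functions, and every name, vocabulary size, and sample count below is a hypothetical choice made only for illustration.

```python
import math
from typing import List

# All constants are hypothetical; the abstract does not specify any sizes.
SEMANTIC_VOCAB_SIZE = 16   # size of the semantic-token vocabulary
ACOUSTIC_VOCAB_SIZE = 8    # size of the acoustic-token vocabulary
SAMPLES_PER_TOKEN = 4      # waveform samples emitted per acoustic token


def tokenize(text: str) -> List[str]:
    """Split input text into tokenized text inputs (whitespace split as a stand-in)."""
    return text.lower().split()


def text_to_semantic(tokens: List[str]) -> List[int]:
    """Stand-in for the first generative network: one semantic token id per text token."""
    return [sum(ord(c) for c in tok) % SEMANTIC_VOCAB_SIZE for tok in tokens]


def semantic_to_acoustic(semantic: List[int]) -> List[int]:
    """Stand-in for the second generative network, conditioned on the semantic tokens."""
    acoustic = []
    prev = 0
    for s in semantic:
        # Each acoustic token depends on the current semantic token and on history.
        prev = (s + prev) % ACOUSTIC_VOCAB_SIZE
        acoustic.append(prev)
    return acoustic


def decode(acoustic: List[int]) -> List[float]:
    """Stand-in for the decoder network: map each acoustic token to a short sine burst."""
    samples = []
    for a in acoustic:
        for n in range(SAMPLES_PER_TOKEN):
            samples.append(math.sin(2 * math.pi * (a + 1) * n / SAMPLES_PER_TOKEN))
    return samples


def synthesize(text: str) -> List[float]:
    """Chain the stages: text -> semantic tokens -> acoustic tokens -> audio samples."""
    tokens = tokenize(text)
    semantic = text_to_semantic(tokens)
    acoustic = semantic_to_acoustic(semantic)
    return decode(acoustic)
```

Calling `synthesize("hello world")` in this toy version returns a list of floating-point samples, one short burst per input token. The sketch shows only the staged structure the abstract describes, not how any stage is trained or how the networks are built.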
The patent application was internationally filed on Jan. 26, 2024, under International application No. PCT/US2024/013149.
Disclaimer: Curated by HT Syndication.