Meta unveils new artificial intelligence model Spirit LM to enhance voice experiences.


Meta has announced the release of “Spirit LM”, a new open-source artificial intelligence model that aims to make voice processing more advanced and more natural-sounding. The model tackles challenges common to multimodal models, concentrating on improving voice quality and delivering a more expressive, realistic audio experience.

“Spirit LM” is built on a pre-trained language model with 7 billion parameters, and it takes a different approach from traditional systems that rely on automatic speech recognition (ASR). According to Meta, those traditional techniques cannot accurately convey natural expressiveness in speech, which reduces the realism of voice interactions.

To overcome these challenges, “Spirit LM” relies on phonetic, pitch, and tone tokens, which allow it to generate natural-sounding speech. The model can also learn to perform a variety of tasks, such as speech recognition, text-to-speech conversion, and speech classification.

Meta unveiled the model in a detailed research paper that included audio samples demonstrating the capabilities of “Spirit LM” and its potential applications. The model is expected to be used in Meta-owned apps such as WhatsApp, Instagram, and Facebook in the future, allowing users to experience more complex and natural voice interactions.

“Spirit LM” is now available as an open-source project, allowing developers and researchers to use and build on its capabilities. The release represents a significant step towards improving artificial intelligence experiences in modern communication.
