April 23, 2024, 6:45 p.m. | Youness Mansar

Towards Data Science - Medium towardsdatascience.com

Augmenting LLM Apps with a Voice Modality


Many LLMs, particularly open-source ones, have been limited to processing text or, occasionally, text with images (Large Multimodal Models, or LMMs). But what if you want to talk to your LLM using your voice? Thanks to the powerful open-source speech-to-text technologies developed in recent years, this is now achievable.

We will walk through integrating Llama 3 with a speech-to-text model, all within …

javascript llama 3 llm nicegui speech recognition
