Open-source TTS models Kokoro, Orpheus, and Piper are tested on symbols, abbreviations, and prosody, with CER and MOS results.
What if you could transform hours of audio into precise, actionable text with just a few lines of code? In 2025, this is no longer a futuristic dream but a reality powered by innovative speech-to-text ...
Mistral launches Voxtral TTS, extending its model family into speech generation and enabling end-to-end voice workflows.
Google's AI Edge Eloquent app uses AI to edit out mid-sentence mistakes, giving you a polished transcription of your audio.
Google LLC’s DeepMind artificial intelligence unit today rolled out a new text-to-speech model called Gemini 3.1 Flash TTS.
Tech giant Microsoft has announced a trio of advanced AI models that it claims offer high-quality performance. These models include MAI-Transcribe-1, MAI-Voice-1, and MAI-Image-2, now accessible to ...
Google has launched a new AI-powered dictation app for iPhone users that works even without an internet connection.
Microsoft is expanding its roster of in-house AI models, releasing a new speech-to-text system and making two existing models broadly available to developers for the first time. The moves by Microsoft ...
Gnani.ai has launched Vachana STT, a speech-to-text model built for Indian languages, under the IndiaAI Mission. The startup said the model has been trained on more than 1 million hours of real-world voice ...
Gautam Jha is the co-founder and CTO of Kalpa Labs, an SF-based, YC-backed startup building large-scale foundational speech models. Voice is quickly becoming a primary interface for enterprise software, ...