AI voice & speech models · directory

AI Models

Explore the open AI voice and speech models we host and document. Try each one live in your browser, then dive into its dedicated guide. New models are added here as they launch.

Browse the models · Open VoxCPM

The collection

Available AI models

Each model has its own page with a live demo, a capabilities summary and an honest, hands-on guide.

Multilingual Text to Speech

OmniVoice

k2-fsa's open-source diffusion-LM TTS for 600+ languages, with zero-shot voice cloning and attribute-based voice design. Try the official Hugging Face demo embedded on Whisper Web.

600+ languages · Zero-shot cloning · Voice design · 24 kHz

Explore OmniVoice

Multilingual Text to Speech

VoxCPM

OpenBMB's tokenizer-free VoxCPM2 TTS model, with support for 30 languages, voice design, controllable voice cloning and 48 kHz speech. Try the official Hugging Face demo embedded on Whisper Web.

30 languages · Voice design · Voice cloning · 48 kHz

Explore VoxCPM

More models coming soon

We're adding more open AI voice and speech models to this directory. Check back soon, or start with VoxCPM today.

Also on Whisper Web

Looking for transcription?

These models pair well with our core browser-based speech-to-text workspace.

Miso One — Emotive TTS guide

Read our hands-on guide to the open MisoTTS model for lifelike, expressive speech and voice cloning.

Speech to Text

Upload or record audio and get accurate AI transcription, captions and notes right in your browser with Whisper Web.