Audio & Music Models
Generate music, speech, sound effects, and transcriptions with AI. Compare top audio models for text-to-music, text-to-speech, and speech-to-text.
meta / musicgen
A fast, controllable auto-regressive Transformer for high-fidelity music generation.
sound
LTX-2.3 Text-to-Audio
lightricks
Generate sound effects, ambience, and spoken-style audio from a text prompt, with duration you control down to the frame.
sound
MiniMax Speech-02 HD
minimax
Turn text into natural, high-fidelity speech in 30+ languages with 300+ voices plus emotion, speed, pitch, and volume control.
sound
ACE-Step
ace-studio
Generate full songs or instrumental music from genre tags and optional lyrics, with duration you control up to 4 minutes.
music
ElevenLabs Scribe v1
elevenlabs
Transcribe speech audio into accurate text with word-level timestamps, speaker labels, and audio-event tags across 99 languages.
speech-to-text
