
MiniMax Audio just leveled up with the new Speech-02 model! Get ultra-realistic AI voices (30+ langs, 99% similarity). Read Files/URLs & handle long text (200k chars). API available at: api@minimax.io.
What sets MiniMax apart:
Zero-shot speaker cloning using raw audio (no transcripts required)
Flow-VAE model: no spectrograms needed, enabling faster and more natural speech
Multilingual and cross-lingual synthesis (supports Thai, Vietnamese, Cantonese, etc.)
MiniMax Audio's Speech-02-HD is now ranked #1 globally on Artificial Analysis Arena!