
Grok TTS

xAI Voice

xAI's text-to-speech model with 169 expressive voices across 36 languages. Browse and filter the catalog by language, gender, tone, and use case via the voices API. Supports inline speech tags ([pause], [laugh], <whisper>) for fine-grained delivery control, plus the original Grok voices (eve, ara, rex, sal, leo).
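Browsing the catalog comes down to a GET request with filters. The sketch below builds such a URL for the voices endpoint listed under Parameters (GET /api/v1/models/grok-tts/voices). It assumes the endpoint is served from the same host as the Quick Start request and that the four filters are passed as query-string parameters under the names language, gender, tone, and use_case. Neither assumption is confirmed on this page, and the filter values in the usage note are purely illustrative.

```python
from urllib.parse import urlencode

# Assumption: the voices endpoint lives on the same host as the Quick Start request.
BASE_URL = "https://query.genx.sh"
VOICES_PATH = "/api/v1/models/grok-tts/voices"
# Filter names taken from the page; passing them as query parameters is an assumption.
FILTERS = ("language", "gender", "tone", "use_case")


def voices_url(**filters):
    """Build a voices-catalog URL, accepting only the documented filter names."""
    unknown = set(filters) - set(FILTERS)
    if unknown:
        raise ValueError(f"unsupported filter(s): {sorted(unknown)}")
    # Skip filters passed as None so they are not sent at all.
    query = urlencode({k: v for k, v in filters.items() if v is not None})
    return BASE_URL + VOICES_PATH + ("?" + query if query else "")
```

For example, voices_url(language="ja", gender="female") yields a URL ending in ?language=ja&gender=female. The value "female" is a guess at how gender is encoded, so check the values the catalog actually returns before relying on it.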

Specifications

Provider: xAI
Category: Voice

Pricing (Starter tier)

1,950.00 credits / 1M input characters

Higher tiers get volume discounts.
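As a rough guide to what a request costs on the Starter tier, the rate above can be applied per character. The sketch assumes billing is strictly linear in the number of input characters and that one Python string character counts as one billed character. The page does not say how characters are counted or how fractional credits are rounded, so treat the result as an estimate.

```python
from decimal import Decimal

# Starter-tier rate from the pricing line: 1,950 credits per 1M input characters.
STARTER_CREDITS_PER_MILLION_CHARS = Decimal("1950")


def estimate_credits(text: str) -> Decimal:
    """Estimate Starter-tier credits, assuming linear per-character billing."""
    # Assumption: one billed character per Python str character (code point).
    return Decimal(len(text)) * STARTER_CREDITS_PER_MILLION_CHARS / Decimal(1_000_000)
```

Under these assumptions, a request at the 15,000-character maximum works out to 15,000 × 1,950 / 1,000,000 = 29.25 credits.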

Quick Start

curl -X POST https://query.genx.sh/api/v1/generate \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"grok-tts","params":{"text":"Hello, welcome to GenX. How can I help you today?","voice_id":"eve","language":"en","codec":"mp3"}}'
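The same request can be assembled from Python with only the standard library. This is a minimal sketch that mirrors the curl call: the function name build_tts_request is made up for illustration, and the request is built but not sent, because this page does not describe the response format.

```python
import json
import urllib.request

API_URL = "https://query.genx.sh/api/v1/generate"


def build_tts_request(api_key, text, voice_id="eve", language="en", codec="mp3"):
    """Build (but do not send) the same POST request as the curl example."""
    payload = {
        "model": "grok-tts",
        "params": {
            "text": text,
            "voice_id": voice_id,
            "language": language,
            "codec": codec,
        },
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_tts_request(
        "YOUR_API_KEY", "Hello, welcome to GenX. How can I help you today?"
    )
    print(req.get_method(), req.full_url)
    # To send it: urllib.request.urlopen(req). The response format is not
    # documented here, so check its Content-Type before parsing or saving it.
```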

Parameters

Name | Type | Required | Default | Description
text | string | required | | The text to convert to speech (max 15,000 characters). Supports inline speech tags such as [pause], [laugh], and <whisper>text</whisper>.
voice_id | string | optional | eve | Voice ID. Call GET /api/v1/models/grok-tts/voices to browse 169 voices across 36 languages, filterable by language, gender, tone, and use_case. Legacy IDs eve, ara, rex, sal, leo continue to work.
language | string | required | en | BCP-47 language code (en, es, fr, de, ja, ko, hi, ar, ru, pt, vi, sv-se, and many more). Match the voice's native language for best results.
codec | string | optional | mp3 | Audio codec: mp3, wav, pcm, mulaw, alaw.
sample_rate | number | optional | 24000 | Sample rate in Hz: 8000, 16000, 22050, 24000, 44100, 48000.
bit_rate | number | optional | 128000 | Bit rate for MP3 only: 32000, 64000, 96000, 128000, 192000.
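The documented limits lend themselves to a client-side check before a request is sent. The sketch below encodes only the values listed in the table. The helper name validate_params is hypothetical, and two of its rules are assumptions rather than documented server behavior: rejecting empty text, and flagging bit_rate when the codec is not mp3.

```python
MAX_TEXT_CHARS = 15_000
CODECS = ("mp3", "wav", "pcm", "mulaw", "alaw")
SAMPLE_RATES = (8000, 16000, 22050, 24000, 44100, 48000)
MP3_BIT_RATES = (32000, 64000, 96000, 128000, 192000)


def validate_params(params):
    """Return a list of problems found in a grok-tts params dict (empty if none)."""
    problems = []
    text = params.get("text")
    # Assumption: an empty string is treated like a missing required value.
    if not isinstance(text, str) or not text:
        problems.append("text is required and must be a non-empty string")
    elif len(text) > MAX_TEXT_CHARS:
        problems.append(f"text has {len(text)} characters; the limit is {MAX_TEXT_CHARS}")
    # Omitted optional parameters fall back to the documented defaults.
    codec = params.get("codec", "mp3")
    if codec not in CODECS:
        problems.append(f"unsupported codec: {codec!r}")
    if params.get("sample_rate", 24000) not in SAMPLE_RATES:
        problems.append(f"unsupported sample_rate: {params['sample_rate']!r}")
    if "bit_rate" in params:
        # Assumption: the server may simply ignore bit_rate for other codecs.
        if codec != "mp3":
            problems.append("bit_rate applies to mp3 only")
        elif params["bit_rate"] not in MP3_BIT_RATES:
            problems.append(f"unsupported bit_rate: {params['bit_rate']!r}")
    return problems
```

Returning a list instead of raising on the first failure lets a caller report every problem at once, which is convenient when the parameters come from user input.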

Try Grok TTS on GenX Router

500 free credits on signup. No credit card required.