Voice API
Add speech capabilities to your REPLR with text-to-speech (TTS) and speech-to-text (STT). TTS is powered by our voice synthesis engine for natural, human-like voice output. STT is powered by our speech recognition system for fast, accurate transcription with language detection.
Base URL: https://api.replr.ai/v1
Text-to-Speech
Convert text into natural-sounding speech. The response is a raw binary audio stream — set your client to handle the audio/mpeg Content-Type returned by the API.
POST /v1/voice/tts (Auth Required)
Synthesize speech from text. Returns an audio/mpeg binary stream.
Request Body
| Name | Type | Required | Description |
|---|---|---|---|
| text | string | Required | The text to synthesize into speech. Maximum 5,000 characters per request. |
| voice_id | string | Optional | The voice to use for synthesis. Defaults to "alloy" (REPLR's default voice). See the Available Voices table for all options. |
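Text longer than the 5,000-character limit has to be sent as several requests. The sketch below shows one possible client-side way to split it; `split_text` and `MAX_CHARS` are hypothetical names, not part of the REPLR API, and it assumes the limit is counted the same way as Python's `len()` (Unicode code points), which the table does not specify.

```python
MAX_CHARS = 5000  # per-request limit from the Request Body table


def split_text(text: str, limit: int = MAX_CHARS) -> list[str]:
    """Split text into pieces of at most `limit` characters.

    Breaks at the last space or newline inside each window so words stay
    intact, and falls back to a hard cut when the window has no whitespace.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    chunks = []
    remaining = text.strip()
    while len(remaining) > limit:
        # Look one character past the limit so a break exactly at `limit` counts.
        window = remaining[: limit + 1]
        cut = max(window.rfind(" "), window.rfind("\n"))
        if cut <= 0:
            cut = limit  # no whitespace found: hard cut
        chunks.append(remaining[:cut].rstrip())
        remaining = remaining[cut:].lstrip()
    if remaining:
        chunks.append(remaining)
    return chunks
```

Each returned chunk can then be sent as the `text` field of its own request, and the resulting MP3 files played or stored in order.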
Examples
curl -X POST https://api.replr.ai/v1/voice/tts \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"text": "Hello from REPLR!", "voice_id": "alloy"}' \
  --output speech.mp3
Note: The response Content-Type is audio/mpeg. There is no JSON wrapper; the body is the raw MP3 binary. Stream it directly to a file or audio player.
Speech-to-Text
Transcribe an audio file into text. Upload the file as multipart/form-data with the field name audio.
POST /v1/voice/stt (Auth Required)
Transcribe an audio file to text. Accepts multipart/form-data.
Request Body
| Name | Type | Required | Description |
|---|---|---|---|
| audio | file | Required | The audio file to transcribe. Supported formats: mp3, wav, ogg, webm. Maximum file size: 25 MB. |
Response
{
"text": "Hello from REPLR!",
"confidence": 0.97,
"language": "en"
}
Examples
curl -X POST https://api.replr.ai/v1/voice/stt \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "audio=@recording.mp3"
Available Voices
Pass any of these values as the voice_id parameter in the TTS endpoint.
| Voice ID | Description |
|---|---|
| alloy | Neutral and balanced (REPLR's default voice) |
| echo | Warm and resonant |
| fable | Expressive and animated |
| nova | Bright and energetic |
| onyx | Deep and authoritative |
| shimmer | Soft and clear |
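A typo in `voice_id` is easier to catch before the request is sent. The following is a minimal sketch of a client-side check against the voice IDs in the table above; `build_tts_payload` and `VOICES` are hypothetical helper names, and how the API itself responds to an unknown voice is not documented here.

```python
# Voice IDs copied from the Available Voices table.
VOICES = {"alloy", "echo", "fable", "nova", "onyx", "shimmer"}


def build_tts_payload(text, voice_id=None):
    """Return the JSON body for a TTS request.

    When voice_id is None the field is omitted, so the documented
    server-side default ("alloy") applies.
    """
    if not text:
        raise ValueError("text must not be empty")
    if voice_id is None:
        return {"text": text}
    if voice_id not in VOICES:
        raise ValueError(
            f"unknown voice_id {voice_id!r}; expected one of {sorted(VOICES)}"
        )
    return {"text": text, "voice_id": voice_id}
```

The returned dictionary can be passed as the `json=` argument of `requests.post`, as in the streaming example.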
Supported Audio Formats
The STT endpoint accepts mp3, wav, ogg, and webm files, up to 25 MB each. TTS always returns MP3.
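The format and size limits documented for the `audio` field (mp3, wav, ogg, webm; 25 MB) can be checked locally before uploading, which avoids a wasted request. This is a rough sketch: `validate_audio` is a hypothetical helper, it looks only at the file extension rather than the actual audio content, and it assumes "25 MB" means 25 × 1024 × 1024 bytes, which the documentation does not state.

```python
from pathlib import Path

SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".ogg", ".webm"}
MAX_BYTES = 25 * 1024 * 1024  # assumption: 25 MB interpreted as 25 MiB


def validate_audio(filename: str, size_bytes: int) -> None:
    """Raise ValueError if the file would likely be rejected by the STT endpoint."""
    suffix = Path(filename).suffix.lower()
    if suffix not in SUPPORTED_EXTENSIONS:
        raise ValueError(
            f"unsupported format {suffix!r}; expected one of {sorted(SUPPORTED_EXTENSIONS)}"
        )
    if size_bytes > MAX_BYTES:
        raise ValueError(f"file is {size_bytes} bytes; the limit is {MAX_BYTES}")


def validate_audio_path(path: str) -> Path:
    """Convenience wrapper that reads the size from disk."""
    p = Path(path)
    validate_audio(p.name, p.stat().st_size)
    return p
```

Calling `validate_audio_path("recording.mp3")` before the upload raises early for an unsupported extension or an oversized file.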
Rate Limits
| Endpoint | Free Tier | Pro Tier |
|---|---|---|
| /v1/voice/tts | 100 requests / minute | 1,000 requests / minute |
| /v1/voice/stt | 60 requests / minute | 600 requests / minute |
Rate limit headers (X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset) are included in every response. If you exceed the limit you will receive a 429 Too Many Requests response.
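A client can use these headers to decide how long to wait after a 429. The sketch below is one possible approach, not an official client: the helper names are hypothetical, and it assumes `X-RateLimit-Reset` carries a Unix timestamp in seconds. The documentation does not say whether the value is a timestamp or a number of seconds remaining, so check a real response before relying on it.

```python
import time

FALLBACK_WAIT = 1.0  # seconds to wait when the reset header is missing or unparseable


def seconds_until_reset(headers, now=None):
    """Seconds to wait until the rate-limit window resets.

    Assumes X-RateLimit-Reset is Unix epoch seconds (unconfirmed).
    """
    # Header names are case-insensitive, so normalise the keys first.
    lowered = {k.lower(): v for k, v in headers.items()}
    raw = lowered.get("x-ratelimit-reset")
    if raw is None:
        return FALLBACK_WAIT
    try:
        reset_at = float(raw)
    except ValueError:
        return FALLBACK_WAIT
    current = time.time() if now is None else now
    return max(0.0, reset_at - current)


def wait_before_retry(status_code, headers, now=None):
    """Return a wait time in seconds for a 429 response, else None."""
    if status_code != 429:
        return None
    return seconds_until_reset(headers, now)
```

A caller would sleep for the returned number of seconds and then resend the request, ideally with a cap on the number of retries.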
Streaming TTS Example
For low-latency playback, stream the audio response directly instead of waiting for the full download.
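import requests

# stream=True makes requests read the body incrementally instead of buffering it all.
response = requests.post(
    "https://api.replr.ai/v1/voice/tts",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={"text": "Stream me in real time.", "voice_id": "shimmer"},
    stream=True,
)
response.raise_for_status()  # stop on 4xx/5xx (e.g. 429) rather than saving an error body as MP3

# Write each chunk to disk as it arrives.
with open("stream.mp3", "wb") as f:
    for chunk in response.iter_content(chunk_size=4096):
        f.write(chunk)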