Voice API

Add speech capabilities to your REPLR with text-to-speech (TTS) and speech-to-text (STT). TTS is powered by our voice synthesis engine for natural, human-like voice output. STT is powered by our speech recognition system for fast, accurate transcription with language detection.

Base URL

https://api.replr.ai/v1

Text-to-Speech

Convert text into natural-sounding speech. The response is a raw binary audio stream — set your client to handle the audio/mpeg Content-Type returned by the API.

POST /v1/voice/tts (Auth Required)

Synthesize speech from text. Returns an audio/mpeg binary stream.

Request Body

Name	Type	Required	Description
`text`	string	Required	The text to synthesize into speech. Maximum 5,000 characters per request.
`voice_id`	string	Optional	The voice to use for synthesis. Defaults to "alloy" (REPLR's default voice). See the voice table below for all available options.

Examples

curl -X POST https://api.replr.ai/v1/voice/tts \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"text": "Hello from REPLR!", "voice_id": "alloy"}' \
  --output speech.mp3

Note: The response Content-Type is audio/mpeg. There is no JSON wrapper — the body is the raw MP3 binary. Stream it directly to a file or audio player.
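
The request above can be sketched in Python using only the standard library. This is a minimal sketch, not an official REPLR client: the names `build_tts_request` and `synthesize_to_file` are illustrative, and counting the 5,000-character limit with `len()` is an assumption about how the API measures text length.

```python
import json
import shutil
import urllib.request

TTS_URL = "https://api.replr.ai/v1/voice/tts"
MAX_TTS_CHARS = 5000  # documented per-request limit


def build_tts_request(text: str, api_key: str, voice_id: str = "alloy") -> urllib.request.Request:
    """Build (but do not send) a POST request for the TTS endpoint."""
    # Assumption: the limit is counted in Python characters (code points).
    if len(text) > MAX_TTS_CHARS:
        raise ValueError(f"text is {len(text)} characters; the limit is {MAX_TTS_CHARS}")
    body = json.dumps({"text": text, "voice_id": voice_id}).encode("utf-8")
    return urllib.request.Request(
        TTS_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def synthesize_to_file(text: str, api_key: str, path: str, voice_id: str = "alloy") -> None:
    """Send the request and copy the raw MP3 body to `path`."""
    request = build_tts_request(text, api_key, voice_id)
    # urlopen raises HTTPError on 4xx/5xx, so an error body is never saved as audio.
    with urllib.request.urlopen(request, timeout=30) as response, open(path, "wb") as f:
        shutil.copyfileobj(response, f)
```

Calling `synthesize_to_file("Hello from REPLR!", "YOUR_API_KEY", "speech.mp3")` would mirror the curl command. Checking the length locally avoids spending a rate-limited request on text the API would reject.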

Speech-to-Text

Transcribe an audio file into text. Upload the file as multipart/form-data with the field name audio.

POST /v1/voice/stt (Auth Required)

Transcribe an audio file to text. Accepts multipart/form-data.

Request Body

Name	Type	Required	Description
`audio`	file	Required	The audio file to transcribe. Supported formats: mp3, wav, ogg, webm. Maximum file size: 25 MB.

Response

{
  "text": "Hello from REPLR!",
  "confidence": 0.97,
  "language": "en"
}
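
A response of this shape could be read into a small typed structure, as in the sketch below. The names `Transcription`, `parse_stt_response`, and `is_reliable` are hypothetical. The sketch assumes `confidence` is a score between 0 and 1 (as the sample value suggests), and the 0.8 threshold is an arbitrary illustration, not a documented recommendation.

```python
import json
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Transcription:
    text: str
    confidence: float
    language: str


def parse_stt_response(body: Union[str, bytes]) -> Transcription:
    """Parse the JSON body returned by the STT endpoint."""
    data = json.loads(body)
    return Transcription(
        text=data["text"],
        confidence=float(data["confidence"]),
        language=data["language"],
    )


def is_reliable(result: Transcription, min_confidence: float = 0.8) -> bool:
    """Hypothetical policy: accept transcripts at or above a chosen confidence."""
    return result.confidence >= min_confidence
```

A caller might route transcripts that fail `is_reliable` to manual review rather than discarding them.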

Examples

curl -X POST https://api.replr.ai/v1/voice/stt \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "audio=@recording.mp3"
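
To show what the `-F "audio=@recording.mp3"` flag produces on the wire, here is a rough standard-library sketch that encodes a single `audio` field as multipart/form-data. The helper names `encode_audio_upload` and `build_stt_request` are illustrative, and the MIME type is guessed from the filename rather than required by anything documented here.

```python
import mimetypes
import urllib.request
import uuid
from pathlib import Path

STT_URL = "https://api.replr.ai/v1/voice/stt"


def encode_audio_upload(filename: str, content: bytes) -> tuple[bytes, str]:
    """Return (body, content_type) for a multipart upload with one `audio` field."""
    boundary = uuid.uuid4().hex
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    safe_name = filename.replace('"', "%22")  # keep the quoted header value well-formed
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="audio"; filename="{safe_name}"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + content + tail, f"multipart/form-data; boundary={boundary}"


def build_stt_request(path: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the upload request for a local audio file."""
    file = Path(path)
    body, content_type = encode_audio_upload(file.name, file.read_bytes())
    return urllib.request.Request(
        STT_URL,
        data=body,
        method="POST",
        headers={"Authorization": f"Bearer {api_key}", "Content-Type": content_type},
    )
```

With the `requests` library, the same upload is usually written as `requests.post(url, headers=..., files={"audio": open(path, "rb")})`, which builds the multipart body for you.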

Available Voices

Pass any of these values as the voice_id parameter in a TTS request.

Voice ID	Description
`alloy`	Neutral and balanced — REPLR's default voice
`echo`	Warm and resonant
`fable`	Expressive and animated
`nova`	Bright and energetic
`onyx`	Deep and authoritative
`shimmer`	Soft and clear
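
One way to catch a mistyped voice_id before sending a request is a small lookup built from the table above. The names `AVAILABLE_VOICES` and `resolve_voice` are illustrative, and treating IDs as case-sensitive is an assumption. The set is copied from this page and would need updating if the voice list changes.

```python
from typing import Optional

# Voice IDs and descriptions as listed in the table above.
AVAILABLE_VOICES = {
    "alloy": "Neutral and balanced (default)",
    "echo": "Warm and resonant",
    "fable": "Expressive and animated",
    "nova": "Bright and energetic",
    "onyx": "Deep and authoritative",
    "shimmer": "Soft and clear",
}
DEFAULT_VOICE = "alloy"


def resolve_voice(voice_id: Optional[str] = None) -> str:
    """Return a valid voice ID, falling back to the default when none is given."""
    if voice_id is None:
        return DEFAULT_VOICE
    # Assumption: IDs are matched exactly, so "Alloy" is rejected.
    if voice_id not in AVAILABLE_VOICES:
        options = ", ".join(sorted(AVAILABLE_VOICES))
        raise ValueError(f"unknown voice_id {voice_id!r}; expected one of: {options}")
    return voice_id
```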

Supported Audio Formats

The STT endpoint accepts the following audio formats. TTS always returns MP3.

.mp3
.wav
.ogg
.webm
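
A client can check a file against these formats and the 25 MB upload limit before sending it. The sketch below is one way to do that. `validate_audio` is a hypothetical helper, it judges format by file extension only, and it assumes "25 MB" means 25 × 1024 × 1024 bytes, which this page does not specify.

```python
from pathlib import PurePath

SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".ogg", ".webm"}
# Assumption: "25 MB" is interpreted as 25 MiB; the API may count 25,000,000 bytes instead.
MAX_AUDIO_BYTES = 25 * 1024 * 1024


def validate_audio(filename: str, size_bytes: int) -> str:
    """Return the normalized extension, or raise ValueError if the file would be rejected."""
    extension = PurePath(filename).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        allowed = ", ".join(sorted(SUPPORTED_EXTENSIONS))
        raise ValueError(f"unsupported format {extension!r}; expected one of: {allowed}")
    if size_bytes > MAX_AUDIO_BYTES:
        raise ValueError(f"file is {size_bytes} bytes; the limit is {MAX_AUDIO_BYTES}")
    return extension
```

For a file on disk, the size can be taken from `os.path.getsize(path)` before calling the helper.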

Rate Limits

Endpoint	Free Tier	Pro Tier
`/v1/voice/tts`	100 requests / minute	1,000 requests / minute
`/v1/voice/stt`	60 requests / minute	600 requests / minute

Rate limit headers (X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset) are included in every response. If you exceed the limit, the API returns a 429 Too Many Requests response.
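
A client can use these headers to decide how long to pause after a 429. The sketch below assumes that X-RateLimit-Reset holds a Unix timestamp in seconds. This page does not state the header's format, so verify it against a real response before relying on it. `retry_delay` and the 1-second fallback are illustrative choices.

```python
import time
from typing import Mapping, Optional

FALLBACK_DELAY = 1.0  # hypothetical wait when the reset header is missing


def retry_delay(
    status: int, headers: Mapping[str, str], now: Optional[float] = None
) -> Optional[float]:
    """Return seconds to wait before retrying, or None if the response was not a 429."""
    if status != 429:
        return None
    # HTTP header names are case-insensitive, so normalize before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    reset = lowered.get("x-ratelimit-reset")
    if reset is None:
        return FALLBACK_DELAY
    current = time.time() if now is None else now
    # Assumption: the header is an absolute Unix timestamp in seconds.
    return max(0.0, float(reset) - current)
```

A caller would sleep for the returned number of seconds and then resend the request, ideally with a cap on the number of retries.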

Streaming TTS Example

For low-latency playback, stream the audio response directly instead of waiting for the full download.

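import requests

# stream=True keeps the body out of memory until it is iterated.
with requests.post(
    "https://api.replr.ai/v1/voice/tts",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={"text": "Stream me in real time.", "voice_id": "shimmer"},
    stream=True,
    timeout=30,
) as response:
    # Fail on 4xx/5xx instead of saving an error body as audio.
    response.raise_for_status()
    # Write the MP3 to disk in 4 KB chunks as they arrive.
    with open("stream.mp3", "wb") as f:
        for chunk in response.iter_content(chunk_size=4096):
            f.write(chunk)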
