Voices & models
Two choices shape how your agent sounds and thinks: its voice and its model. Both are set per agent on the agent form and can be changed at any time.
Voices
Televox uses Deepgram Aura-2 — natural, low-latency English voices. Eleven options:
| Voice | Gender | Notes |
|---|---|---|
| Thalia | Female | Recommended — warm, clear, neutral. A safe default for any business. |
| Andromeda | Female | The form’s default; friendly. |
| Luna | Female | Soft, calm. |
| Athena | Female | Confident, professional. |
| Hera | Female | Mature, measured. |
| Helena | Female | Bright, upbeat. |
| Aurora | Female | Gentle. |
| Orion | Male | Steady, professional. |
| Arcas | Male | Friendly. |
| Apollo | Male | Energetic. |
| Zeus | Male | Deep, authoritative. |
Choosing a voice:
- Match the brand and audience — a spa and a law firm want different energy.
- Listen before you commit. Use the agent’s share link and have a short call; tone reads differently spoken than on paper.
- All Aura-2 voices are fast, so voice choice doesn’t change latency — pick purely on feel.
Models
The model is the agent’s reasoning engine. Faster models reply quicker; more capable ones reason better.
| Model | Speed | Strengths | Use it when |
|---|---|---|---|
| GPT-4.1 Mini | Fast | Reliable instruction-following + action-calling (booking, etc.) | Default for almost everyone. |
| GPT-4.1 Nano | Fastest | Lowest latency/cost | Simple Q&A agents; test booking/actions first — it’s less reliable at them. |
| GPT-4o Mini | Fast | Solid all-rounder | A good alternative to 4.1 Mini. |
| GPT-4o | Moderate | Most capable reasoning | Complex, multi-step conversations where quality matters more than speed. |
Start with GPT-4.1 Mini. It’s the best balance of fast turns and dependable actions, and it’s tuned for prompt-cached, low-latency responses. Move to GPT-4o only if you genuinely need deeper reasoning, or to GPT-4.1 Nano only if the agent is simple and you want maximum speed.
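The selection rule above can be sketched as a small helper. This is only an illustration of the decision, not a Televox API: the function name `pick_model`, its parameters, and the returned strings (the model names as written in the table) are assumptions made for the example. The sketch is also slightly more conservative than the guidance, keeping any agent that uses actions off Nano rather than testing them first.

```python
def pick_model(needs_deep_reasoning: bool = False,
               simple_agent: bool = False,
               uses_actions: bool = True) -> str:
    """Return a model name following the guidance in this section.

    Hypothetical helper: the name, parameters, and return values are
    illustrative and not part of any Televox API.
    """
    if needs_deep_reasoning:
        # Complex, multi-step conversations: quality over speed.
        return "GPT-4o"
    if simple_agent and not uses_actions:
        # Simple Q&A with no booking or other actions: maximum speed.
        # Nano is described as less reliable at actions, so this sketch
        # conservatively avoids it whenever actions are involved.
        return "GPT-4.1 Nano"
    # Default for almost everyone.
    return "GPT-4.1 Mini"
```

For example, `pick_model()` falls through to the default, while `pick_model(simple_agent=True, uses_actions=False)` selects Nano.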
Latency — what actually makes a call feel fast
A conversation feels natural when each turn takes well under a second. The biggest lever is the model’s time-to-first-token, which is why the default is a fast model and why prompts are structured for caching. You can see the real breakdown for any call in Observability (the latency waterfall).
If turns feel slow:
- Make sure you’re on GPT-4.1 Mini (or Nano for simple agents).
- Keep the instructions short and let Knowledge carry the facts.
- Check the latency waterfall to see whether it’s the model or an action (e.g. a slow calendar) that’s costing time.
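The last check can be sketched in code. The example assumes a waterfall can be read as per-stage durations in milliseconds for a single turn. The stage names, the `slowest_stage` helper, and the sample numbers are all invented for illustration; they do not reflect Televox's actual Observability format or any measured call.

```python
def slowest_stage(stages_ms: dict[str, float]) -> tuple[str, float]:
    """Return (stage_name, share_of_turn) for the stage costing the most time.

    Hypothetical helper: assumes one turn is a mapping of stage name to
    duration in milliseconds.
    """
    total = sum(stages_ms.values())
    if not stages_ms or total <= 0:
        raise ValueError("need at least one stage with a positive duration")
    name = max(stages_ms, key=stages_ms.get)
    return name, stages_ms[name] / total


# Made-up numbers for one turn, not real measurements.
turn = {
    "speech_to_text": 150.0,
    "model_first_token": 400.0,
    "action_calendar": 1200.0,
    "text_to_speech": 250.0,
}

stage, share = slowest_stage(turn)
print(f"{stage} takes {share:.0%} of the turn")
# prints "action_calendar takes 60% of the turn"
```

In this invented turn the calendar action, not the model, dominates, so switching to a faster model would help little; the fix would be on the action side.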