Voices & models

Two choices shape how your agent sounds and thinks: its voice and its model. Both are set per agent on the agent form and can be changed at any time.


Voices

Televox uses Deepgram Aura-2 — natural, low-latency English voices. Eleven options:

| Voice | Gender | Notes |
| --- | --- | --- |
| Thalia | Female | Recommended: warm, clear, neutral. A safe default for any business. |
| Andromeda | Female | The form’s default; friendly. |
| Luna | Female | Soft, calm. |
| Athena | Female | Confident, professional. |
| Hera | Female | Mature, measured. |
| Helena | Female | Bright, upbeat. |
| Aurora | Female | Gentle. |
| Orion | Male | Steady, professional. |
| Arcas | Male | Friendly. |
| Apollo | Male | Energetic. |
| Zeus | Male | Deep, authoritative. |

Choosing a voice:

  • Match the brand and audience — a spa and a law firm want different energy.
  • Listen before you commit. Use the agent’s share link and have a short call; tone reads differently spoken than on paper.
  • All Aura-2 voices are fast, so voice choice doesn’t change latency — pick purely on feel.

Models

The model is the agent’s reasoning engine. Faster models reply quicker; more capable ones reason better.

| Model | Speed | Strengths | Use it when |
| --- | --- | --- | --- |
| GPT-4.1 Mini | Fast | Reliable instruction-following + action-calling (booking, etc.) | Default for almost everyone. |
| GPT-4.1 Nano | Fastest | Lowest latency/cost | Simple Q&A agents. Test booking/actions first, since it’s less reliable at them. |
| GPT-4o Mini | Fast | Solid all-rounder | A good alternative to 4.1 Mini. |
| GPT-4o | Moderate | Most capable reasoning | Complex, multi-step conversations where quality matters more than speed. |

Start with GPT-4.1 Mini. It’s the best balance of fast turns and dependable actions, and it’s tuned for prompt-cached, low-latency responses. Only move to GPT-4o if you genuinely need deeper reasoning, or to Nano if the agent is simple and you want maximum speed.
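
The guidance above can be read as a small decision rule. The sketch below is only an illustration: `pick_model` and its parameters are hypothetical names, not part of Televox, and the returned strings are simply the model names from the table. It also assumes the cautious reading that an agent relying on actions should stay on GPT-4.1 Mini until Nano has been tested with those actions.

```python
def pick_model(needs_deep_reasoning: bool = False,
               simple_agent: bool = False,
               uses_actions: bool = False) -> str:
    """Hypothetical helper mirroring the guidance above; not a Televox API."""
    if needs_deep_reasoning:
        # Complex, multi-step conversations: quality matters more than speed.
        return "GPT-4o"
    if simple_agent and not uses_actions:
        # Simple Q&A with no booking or other actions: go for maximum speed.
        # (Assumption: agents with actions stay on Mini until Nano is tested.)
        return "GPT-4.1 Nano"
    # The default for almost everyone.
    return "GPT-4.1 Mini"
```

For example, `pick_model(simple_agent=True, uses_actions=True)` still returns `"GPT-4.1 Mini"`, reflecting the advice to test actions before moving to Nano.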

Latency — what actually makes a call feel fast

A natural conversation feels instant when each turn is well under a second. The biggest lever is the model’s time-to-first-token — which is exactly why the default is a fast model and why prompts are structured for caching. You can see the real breakdown for any call in Observability (the latency waterfall).
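
To make time-to-first-token concrete, here is a minimal sketch of how it could be measured for any streamed reply. It does not call Televox or any model provider: `measure_ttft` and `fake_model_stream` are hypothetical names, and the delays in the fake stream are made up for illustration.

```python
import time
from typing import Iterable, Iterator, Tuple


def measure_ttft(stream: Iterable[str]) -> Tuple[float, float, str]:
    """Return (time_to_first_token, total_time, text), times in seconds."""
    start = time.perf_counter()
    first_token_at = None
    parts = []
    for token in stream:
        if first_token_at is None:
            # The caller can start speaking as soon as this moment arrives.
            first_token_at = time.perf_counter()
        parts.append(token)
    end = time.perf_counter()
    # An empty stream has no first token; fall back to the total time.
    ttft = (first_token_at - start) if first_token_at is not None else (end - start)
    return ttft, end - start, "".join(parts)


def fake_model_stream() -> Iterator[str]:
    """Stand-in for a streaming model reply; the delays are invented."""
    time.sleep(0.05)  # simulated wait before the first token
    yield "Hello"
    for word in (" there", ", how", " can", " I", " help?"):
        time.sleep(0.01)  # simulated gap between later tokens
        yield word


if __name__ == "__main__":
    ttft, total, text = measure_ttft(fake_model_stream())
    print(f"first token after {ttft * 1000:.0f} ms, full reply after {total * 1000:.0f} ms")
```

The gap between the two numbers is the point: a streamed reply can begin well before the full text is ready, so the wait for the first token is what the caller perceives as delay.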

If turns feel slow:

  1. Make sure you’re on GPT-4.1 Mini (or Nano for simple agents).
  2. Keep the instructions short and let Knowledge carry the facts.
  3. Check the latency waterfall to see whether it’s the model or an action (e.g. a slow calendar) that’s costing time.
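
Step 3 amounts to finding which stage dominates a turn. The sketch below assumes the waterfall can be written down as stage names with durations in milliseconds; the stage names and numbers are invented for illustration, and the actual layout of the Observability view is not described here.

```python
from typing import Dict, Tuple


def slowest_stage(waterfall_ms: Dict[str, float]) -> Tuple[str, float]:
    """Return (stage, share_of_turn) for the stage that costs the most time."""
    total = sum(waterfall_ms.values())
    if total <= 0:
        raise ValueError("waterfall has no measured time")
    stage = max(waterfall_ms, key=waterfall_ms.get)
    return stage, waterfall_ms[stage] / total


# Made-up numbers for one slow turn; stage names are illustrative only.
example_turn = {
    "speech_to_text": 120.0,
    "model_first_token": 350.0,
    "action:calendar_lookup": 900.0,
    "text_to_speech": 130.0,
}

stage, share = slowest_stage(example_turn)
print(f"{stage} took {share:.0%} of the turn")
# prints "action:calendar_lookup took 60% of the turn"
```

In this invented example the action dominates, so switching models would not help much. If the model stage were the largest instead, steps 1 and 2 would be the ones to try.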