AI Call Assistant

Overview

The AI Call Assistant is a voice agent that answers WhatsApp business phone calls, holds a real spoken conversation with the caller, can look things up or take actions mid-call (its skills), and records every call in Call Logs. You build and manage assistants under More → AI Call Assistant.

You use this page to set up the assistant and review its calls. The live conversation itself — turning the caller's voice into text, sending it to the AI, turning the AI's reply into speech, and playing it back — runs automatically in the background once a call comes in. The assistant simply uses the settings you save here.

Each assistant belongs to one workspace and is built with a five-step wizard: Identity, AI Brain, Skills, Voice, and Routing. Every assistant has a status (Live, Draft, or Paused) and an on/off switch; it only answers calls when it is switched on and set to Live.
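
The answering rule above can be read as a single condition. The sketch below is illustrative only; the `Assistant` class and its field names are hypothetical, not the product's data model.

```python
from dataclasses import dataclass


@dataclass
class Assistant:
    # Hypothetical shape, for illustration only.
    name: str
    status: str = "Draft"   # one of "Live", "Draft", "Paused"
    enabled: bool = True    # the on/off switch


def answers_calls(assistant: Assistant) -> bool:
    """True only when the switch is on AND the status is Live."""
    return assistant.enabled and assistant.status == "Live"
```

A Draft or Paused assistant never answers, whatever the switch says, and a Live assistant with the switch off does not answer either.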

Prerequisite — the admin sets the provider keys. The AI provider keys (OpenAI, Anthropic Claude, Google Gemini) and voice keys (ElevenLabs for the spoken voice, Deepgram for transcription) are set up once by the platform admin under Admin → AI & API Keys. You do not need to paste a key — every assistant uses the admin's keys automatically (see Your own keys vs admin keys). Add a per-assistant key only if you want this one assistant billed to your own account.
Plan-gated: three features. Building assistants needs the AI Voice Agent plan feature. The calls those assistants answer, plus the Call Logs viewer, need the WhatsApp Calling feature, the same one that unlocks WhatsApp Calling itself. Playing back call recordings needs a third feature, Call Recording. If your plan doesn't include these, the menu items are hidden.

How a live call works

Knowing what happens during a call makes every wizard field easier to understand. Each call runs through these steps:

  1. WhatsApp delivers the call. An incoming call arrives for your number, tagged with the caller's details and which assistant should answer.
  2. The assistant loads your settings. It pulls in the assistant's configuration and the voice keys for your workspace. If either the transcription key (Deepgram) or the speech key (ElevenLabs) is missing, the call is declined up front so the voicemail fallback handles it instead of leaving the caller in silence.
  3. The call connects. The audio link to the caller is set up and answered.
  4. Greeting. As soon as the line is live, the agent speaks a greeting — picked at random from your greeting variations — so the caller never hears dead air.
  5. Listen. The caller's speech is transcribed to text in real time (using Deepgram). Each finished sentence becomes one transcript turn.
  6. Think. The agent takes the recent conversation (roughly the last 12 turns) plus your behaviour brief and asks the AI model you chose for a reply.
  7. Speak. The reply text is turned into a natural voice (using ElevenLabs) and played back to the caller.
  8. Loop / exit. Listen, Think, and Speak repeat. If the caller says a hangup keyword, the agent speaks the goodbye line and ends the call. When the call ends, recordings are saved and the final transcript is stored.

Transcript turns are saved as the call happens, which is why a call in progress already shows a growing transcript in Call Logs.
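
To make the greeting, listen, think, speak, and exit steps concrete, here is a minimal sketch of the loop. It is not the platform's implementation: `run_call`, `think`, `speak`, and `is_hangup` are hypothetical stand-ins for the real transcription, AI model, and text-to-speech services, and the 12-turn window is taken from the "roughly the last 12 turns" description above.

```python
import random
from typing import Callable, Iterable

HISTORY_WINDOW = 12  # "roughly the last 12 turns"; the exact cut-off is an assumption


def run_call(
    caller_sentences: Iterable[str],      # stand-in for live transcription (step 5)
    greetings: list[str],
    is_hangup: Callable[[str], bool],     # hypothetical hangup-keyword check
    goodbye_line: str,
    think: Callable[[list[dict]], str],   # stand-in for the AI model call (step 6)
    speak: Callable[[str], None],         # stand-in for text-to-speech (step 7)
) -> list[dict]:
    transcript: list[dict] = []

    def say(text: str) -> None:
        speak(text)
        transcript.append({"role": "agent", "text": text})

    say(random.choice(greetings))                     # step 4: greeting
    for sentence in caller_sentences:                 # step 5: one finished sentence per turn
        transcript.append({"role": "caller", "text": sentence})
        if is_hangup(sentence):                       # step 8: exit
            say(goodbye_line)
            break
        say(think(transcript[-HISTORY_WINDOW:]))      # steps 6 and 7
    return transcript
```

In a real call the `think` step would also receive your behaviour brief; here it only receives the recent turns, which keeps the sketch short.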

Creating an assistant

From the AI Call Assistant list, click New assistant. The wizard always saves the full assistant state at once, so you can jump between steps freely and click Save at any time. Saving a brand-new assistant returns you to its edit URL so you can keep refining it.

  1. Open the wizard (or open an existing assistant to edit it).
  2. Work through the five steps below. The side panel shows a live, text-only preview of how a call would read, plus the chosen AI provider and voice.
  3. Set the status. Leave it as Draft while building; flip the Live switch in Step 1 when you are ready for it to answer real calls.
  4. Click Save. The assistant is saved to your workspace, and all skills are saved exactly as shown in the form.
  5. Test from a real phone before relying on it — the in-app preview is text-only and doesn't test the voice or the actual phone call.

A brand-new assistant starts with the defaults shown in the reference tables below; you can change any of them.

Step 1 · Identity & persona

Defines who the agent is and how it opens a call.

Field | Required | Default | What it does
Agent name | Yes (max 120) | | Shown in call logs and transcripts, e.g. "Riley · Acme Support".
Languages the agent speaks | No | English | One or more languages. The first one is used to transcribe the caller's speech for that call.
Starting persona | No | Support | A preset that sets the tone (see the persona table below). You can override the wording in Step 2.
Greeting variations | No (max 5) | One default line | Several opening lines; one is picked at random per call so it doesn't sound robotic.
Live switch | No | On | Together with the status, controls whether the agent answers calls.

Persona presets

Persona | Character
Support | Patient, helpful, resolves issues calmly.
Sales | Friendly, persuasive, qualifies before pitching.
Scheduler | Crisp, calendar-aware, books slots end-to-end.
Concierge | Warm, contextual, remembers prior callers.

Step 2 · AI Brain

Chooses which AI model powers the agent and how it behaves. You can use any of three providers — pick one, then set the exact model name, an optional key of your own, the behaviour brief, an optional knowledge-page link, and three personality sliders.

Field | Required | Default | Notes
Provider | Yes | Gemini | Gemini (Google), OpenAI, or Anthropic.
Model | Yes (max 80) | gemini-2.5-flash-lite | The exact model name, typed as free text, so you can switch to a newer model without waiting for an app update.
Your own key | No (max 500) | blank | Optional; leave blank to use the admin's key. See Your own keys vs admin keys.
Behaviour brief | No (max 6000) | blank | Instructions that shape every reply: who the agent is and how it should act.
Knowledge source URL | No (max 500, must be a web link) | blank | A web page the agent can draw on for your own content.
Warmth / Formality / Pace | No | 60 / 50 / 50 | Sliders from 0–100. Pace controls how concise the agent is.

Model reference

These are the model names the wizard suggests for each provider. The default for a new voice agent is gemini-2.5-flash-lite — chosen because Gemini responds fastest on voice calls.

Provider | Suggested models | Use it for
Gemini · Google | gemini-2.5-flash-lite (default), gemini-2.5-flash, gemini-2.5-pro | Fastest on voice; flash-lite for high call volume, pro for the hardest reasoning.
GPT · OpenAI | gpt-4o-mini, gpt-4o, gpt-4.1 | Best reasoning and use of skills.
Claude · Anthropic | claude-haiku-4-5-20251001, claude-sonnet-4-6, claude-opus-4-7 | Steadiest tone; haiku for fast cheap calls, opus for complex policy.

Behaviour brief & personality

  • Behaviour brief. State who the agent is, what to ask for, which skills to use, and when to hand off to a human. If you leave it blank, the agent uses a generic instruction: "you are a helpful voice assistant on a phone call — reply with one short sentence at a time so the caller can interject."
  • Knowledge source (web link). An optional page the agent can reference so it answers from your own content.
  • Warmth / Formality / Pace. Fine-tune the delivery beyond a single tone setting. Pace also keeps the agent concise so it doesn't over-talk.
  • Your own key. Leave blank to use the admin's key. If a key is already saved, the field shows "saved · leave blank to keep" and an empty box never erases it.
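
As a rough illustration of how the brief, the knowledge link, and the three sliders could combine into one instruction for the AI model, consider the sketch below. How the platform actually translates slider values into model behaviour is not documented here, so `build_instructions` and its output format are assumptions; only the defaults (60 / 50 / 50) and the blank-brief fallback come from the text above.

```python
# Wording adapted from the generic instruction quoted above.
DEFAULT_BRIEF = (
    "You are a helpful voice assistant on a phone call. Reply with one short "
    "sentence at a time so the caller can interject."
)


def clamp(value: int) -> int:
    """Keep a slider value inside the documented 0-100 range."""
    return max(0, min(100, value))


def build_instructions(
    brief: str = "",
    knowledge_url: str = "",
    warmth: int = 60,
    formality: int = 50,
    pace: int = 50,
) -> str:
    # A blank brief falls back to the generic instruction.
    parts = [brief.strip() or DEFAULT_BRIEF]
    if knowledge_url.strip():
        parts.append(f"Knowledge source: {knowledge_url.strip()}")
    # Hypothetical encoding of the sliders as a plain style line.
    parts.append(
        f"Style (0-100): warmth={clamp(warmth)}, "
        f"formality={clamp(formality)}, pace={clamp(pace)}."
    )
    return "\n".join(parts)
```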

Step 3 · Skills (tools)

A skill lets the agent call out to one of your web services in the middle of a conversation — to look up an order, book a slot, or create a ticket. The agent is told "call this if the caller asks about X"; when it decides to, it sends a request to your web address with the details it picked up from the conversation, then uses the response in its reply. Each assistant can have up to 25 skills.

Skill field | Required | What it does
Function name | Yes (max 80) | How the agent refers to the skill, e.g. track_order.
Trigger keywords | No | Phrases the agent listens for; when heard, it pulls the details from the caller's words and runs the skill.
Request method | Yes | One of GET, POST, PUT, PATCH, or DELETE (matching how your web service expects to be called).
URL | Yes (max 600, must be a web link) | The web address to call.
Headers | No | Optional sign-in headers, as name/value pairs.
Parameters | No | The data to send with the request.

Skills can chain: the agent calls one, reads the response, then calls another in the same turn. Skill values can contain @placeholder tokens, which are auto-filled from the caller's history. Every skill the agent uses is recorded in the tool-call timeline with its details and how far into the call it ran.
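
The sketch below shows one way a skill definition could be turned into a web request, including @placeholder substitution. It only builds a description of the request and sends nothing. The token syntax (`@` followed by a word), the dictionary keys, the example URL, and the choice to send parameters as a query string for GET and as JSON otherwise are all assumptions made for illustration.

```python
import json
import re
from urllib.parse import urlencode

TOKEN = re.compile(r"@(\w+)")  # assumed token syntax: "@" followed by a word


def fill(value: str, context: dict[str, str]) -> str:
    """Replace @name tokens with context values; unknown tokens are left as-is."""
    return TOKEN.sub(lambda m: context.get(m.group(1), m.group(0)), value)


def build_request(skill: dict, context: dict[str, str]) -> dict:
    params = {k: fill(v, context) for k, v in skill.get("params", {}).items()}
    headers = {k: fill(v, context) for k, v in skill.get("headers", {}).items()}
    url = fill(skill["url"], context)
    method = skill["method"].upper()
    body = None
    if method == "GET":
        if params:
            # GET: parameters travel in the query string.
            url += ("&" if "?" in url else "?") + urlencode(params)
    else:
        # Other methods: parameters travel as a JSON body (an assumption).
        body = json.dumps(params)
        headers.setdefault("Content-Type", "application/json")
    return {"method": method, "url": url, "headers": headers, "body": body}


# Hypothetical skill, mirroring the fields in the table above.
track_order = {
    "name": "track_order",
    "method": "GET",
    "url": "https://api.example.com/orders",
    "headers": {"Authorization": "Bearer YOUR_TOKEN"},
    "params": {"order_id": "@order_id", "phone": "@caller_phone"},
}
```

Calling `build_request(track_order, {"order_id": "A123", "caller_phone": "+15550100"})` yields a GET request whose query string carries the filled-in values.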

Save behaviour. Saving stores the skills exactly as shown in the form — so removing a skill in the form removes it on save, and there are never leftover entries.
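
The limits in the skill table and the replace-on-save behaviour can be sketched as follows. The function names and dictionary keys are hypothetical, and "must be a web link" is interpreted here as an http or https URL with a host, which is an assumption.

```python
from urllib.parse import urlparse

MAX_SKILLS = 25
ALLOWED_METHODS = {"GET", "POST", "PUT", "PATCH", "DELETE"}


def validate_skill(skill: dict) -> list[str]:
    """Return a list of problems; an empty list means the skill passes."""
    errors = []
    name = skill.get("name", "")
    if not name or len(name) > 80:
        errors.append("name is required (max 80 characters)")
    if skill.get("method", "").upper() not in ALLOWED_METHODS:
        errors.append("method must be GET, POST, PUT, PATCH, or DELETE")
    url = skill.get("url", "")
    parsed = urlparse(url)
    if len(url) > 600 or parsed.scheme not in ("http", "https") or not parsed.netloc:
        errors.append("url must be a web link (max 600 characters)")
    return errors


def save_skills(stored: list[dict], submitted: list[dict]) -> list[dict]:
    """Replace-all save: the result is exactly the submitted list.

    `stored` is deliberately ignored, so a skill removed from the form
    is gone after saving and nothing is left over.
    """
    if len(submitted) > MAX_SKILLS:
        raise ValueError(f"an assistant can have at most {MAX_SKILLS} skills")
    for skill in submitted:
        problems = validate_skill(skill)
        if problems:
            raise ValueError(f"{skill.get('name') or '(unnamed)'}: {'; '.join(problems)}")
    return [dict(skill) for skill in submitted]
```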

Step 4 · Voice & listening

Controls how the agent sounds (text-to-speech) and how it hears the caller (speech-to-text).

Text-to-speech engine (the voice)

Engine | Character
ElevenLabs (default) | Premium and natural-sounding: the most lifelike voices, and the one used today.
OpenAI | Fast, six built-in voices.
Deepgram Aura | Ultra-fast responses.

Speech-to-text engine (the listening)

Engine | Character
ElevenLabs · Deep Analysis (default) | Deep-analysis transcription.
Deepgram Nova-2 | Fastest; the engine used for the live call loop.
OpenAI Whisper | Most accurate.
Google Speech | Google's speech recognition.

Voice ID, your own key & noise suppression

  • Voice ID (max 80) — optional specific voice; leave blank for the provider's default (ElevenLabs falls back to its "Rachel" voice).
  • Your own voice key (max 500) — optional; same "blank keeps the saved key" behaviour as the AI key.
  • Background noise suppression (default on) — strips traffic, fan, and keyboard sounds before transcription, improving accuracy on noisy lines.
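
The "blank keeps the saved key" rule, which applies to both the AI key and the voice key, amounts to the small decision below. The function names are hypothetical, and treating a whitespace-only entry as blank is an assumption.

```python
from typing import Optional


def key_to_store(saved: Optional[str], submitted: str) -> Optional[str]:
    """Keep the saved key unless the user typed a replacement."""
    typed = submitted.strip()
    return typed if typed else saved


def field_hint(saved: Optional[str]) -> str:
    """The key itself is never shown back, only a hint that one exists."""
    return "saved · leave blank to keep" if saved else ""
```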

Step 5 · Routing & recording

Decides what gets recorded, what happens on voicemail, and when to hand off to a person.

Field | Default | What it does
Record agent audio | On | Saves the AI voice side of the call.
Record caller audio | On | Saves the caller's side. Also needs the Call Recording plan feature.
Auto transcript | On | Saves the turn-by-turn transcript.
Voicemail / no-answer behaviour | Leave message | See the table below.
Hangup keywords | bye, goodbye | Words that end the call; the agent speaks the goodbye line, then hangs up after about 1.5 seconds.
Hand off to team | blank | When the AI is unsure, send the conversation to a Team Inbox queue. Blank means no handoff.
Goodbye line | "Thank you for calling. Goodbye!" | The closing message the agent speaks before hanging up (max 500).
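
How hangup keywords are matched against the caller's speech is not specified here. One plausible approach, shown as a sketch only, is whole-word matching, so that "bye" ends the call but "by" does not. The helper names are hypothetical.

```python
import re


def normalise(text: str) -> str:
    """Lowercase the text and reduce it to space-separated words."""
    return " ".join(re.findall(r"[\w']+", text.lower()))


def heard_hangup(sentence: str, keywords: list[str]) -> bool:
    """True if any keyword or phrase appears as whole words in the sentence."""
    padded = f" {normalise(sentence)} "
    phrases = [normalise(keyword) for keyword in keywords]
    # Empty keywords are skipped so they can never match.
    return any(f" {phrase} " in padded for phrase in phrases if phrase)
```

With the default keywords, "Okay, bye!" matches, while "I was passing by yesterday" does not.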

Voicemail / no-answer behaviour

Option | What happens
Leave message | Speaks the greeting and hangs up.
Retry in 1 hour | Schedules a callback automatically.
Silent log | Hangs up; only logs the attempt.

Recordings and transcripts appear in Call Logs exactly as you set them here — the three recording switches are independent, so a call can have a transcript but no audio, or one side of the audio but not the other.
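
Because the switches are independent, the set of saved items for a call is simply the union of whatever each switch allows. The sketch below illustrates that; the function and artifact names are hypothetical, and gating caller audio on the Call Recording plan feature follows the note in the table above.

```python
def saved_artifacts(
    record_agent: bool = True,
    record_caller: bool = True,
    auto_transcript: bool = True,
    plan_has_call_recording: bool = True,
) -> set[str]:
    """Return which items a finished call would keep."""
    artifacts = set()
    if record_agent:
        artifacts.add("agent_audio")
    if record_caller and plan_has_call_recording:
        artifacts.add("caller_audio")
    if auto_transcript:
        artifacts.add("transcript")
    return artifacts
```

For example, switching off both audio options while leaving Auto transcript on produces a call with a transcript but no audio.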

Your own keys vs admin keys

Both the AI key and the voice key are optional and stored securely. The platform decides which key to use at call time in this order:

  1. Your workspace's own key. If your plan allows your own keys and your workspace has an active key for that provider, your key is used and usage bills to you.
  2. Admin's key. Otherwise the admin's key for that provider (from Admin → AI & API Keys) is used, and usage counts toward your workspace's monthly AI usage limit.
  3. Neither. If no key is available, you'll see a clear error — and on a live call, the call is declined so voicemail handles it rather than the caller hearing silence.

In practice: leave both key fields blank and the assistant uses the admin keys automatically. Add a key only when this specific assistant needs a different voice or its own quota.
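
The three-step order above can be sketched as a small lookup. The names (`resolve_key`, `NoKeyAvailable`, the dictionaries of keys) are hypothetical and only illustrate the precedence, not the platform's code.

```python
class NoKeyAvailable(Exception):
    """Raised when neither the workspace nor the admin has a key."""


def resolve_key(
    provider: str,
    workspace_keys: dict[str, str],
    admin_keys: dict[str, str],
    plan_allows_own_keys: bool,
) -> tuple[str, str]:
    """Return (key, source), where source says whose account is billed."""
    own = workspace_keys.get(provider)
    if plan_allows_own_keys and own:
        return own, "workspace"      # 1. your own key; usage bills to you
    admin = admin_keys.get(provider)
    if admin:
        return admin, "admin"        # 2. admin's key; counts toward the monthly limit
    # 3. neither: on a live call this is where the call would be declined
    raise NoKeyAvailable(f"no key configured for {provider}")
```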

Managing assistants

From the list you can:

  • Pause / Go live — switch an assistant between Live and Paused without opening the wizard.
  • Duplicate — copies the full configuration and all skills into a new draft named "(copy)", handy for spinning up a variant.
  • Delete — removes the assistant but keeps its call logs readable. Calls already in progress continue; new calls go unanswered.
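
As an illustration of what Duplicate does, the sketch below copies a configuration and its skills into a new draft. The dictionary shape, the `id` field, and the exact naming pattern (original name followed by "(copy)") are assumptions.

```python
import copy


def duplicate(assistant: dict) -> dict:
    """Return an independent draft copy of an assistant and its skills."""
    clone = copy.deepcopy(assistant)   # skills are copied, not shared
    clone.pop("id", None)              # the copy is a new, unsaved assistant
    clone["name"] = f"{assistant['name']} (copy)"
    clone["status"] = "Draft"          # a duplicate never starts out Live
    return clone
```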

The list shows per-status counts (all, live, draft, paused) and, for each assistant, its model, voice provider, skill count, and how many calls it handled in the last 24 hours. A "Call logs" shortcut sits next to "New assistant".

Cost cautions

A live voice agent bills three AI meters per call — the AI model (tokens in and out), speech-to-text (seconds of audio transcribed), and text-to-speech (characters spoken) — on top of your WhatsApp/WABA calling minutes. Premium voices and larger models cost noticeably more per minute.
  • Keep assistants in Draft until tested. A badly written brief that makes the agent ramble inflates both the AI and voice spend on every call.
  • Prefer the lite / mini / haiku models and a concise Pace for high call volume.
  • Watch per-call cost on the call detail page, which breaks out the three metered parts: AI usage, transcription, and the spoken voice.
  • When you use admin keys, AI usage counts toward your plan's monthly usage limit; hitting the limit blocks further AI calls until you upgrade or switch to your own key.
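
A back-of-the-envelope estimate of the three AI meters can be computed as below. Every rate is a placeholder you would replace with your providers' current prices; the `Rates` class and `estimate_call_cost` are hypothetical, and WhatsApp calling minutes are not included.

```python
from dataclasses import dataclass


@dataclass
class Rates:
    # Placeholder prices in your billing currency; not real provider rates.
    per_1k_input_tokens: float
    per_1k_output_tokens: float
    per_stt_second: float
    per_1k_tts_chars: float


def estimate_call_cost(
    rates: Rates,
    input_tokens: int,
    output_tokens: int,
    stt_seconds: float,
    tts_chars: int,
) -> dict[str, float]:
    """Split a call's AI spend into the three metered parts plus a total."""
    ai = (input_tokens / 1000) * rates.per_1k_input_tokens \
        + (output_tokens / 1000) * rates.per_1k_output_tokens
    transcription = stt_seconds * rates.per_stt_second
    voice = (tts_chars / 1000) * rates.per_1k_tts_chars
    return {
        "ai": ai,
        "transcription": transcription,
        "voice": voice,
        "total": ai + transcription + voice,
    }
```

Because the voice meter scales with characters spoken, a brief that makes the agent ramble raises both the AI and the voice parts at once.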

Troubleshooting

Symptom | Likely cause & fix
The assistant never answers calls | It must be switched on and set to Live. Check the Step 1 Live switch and the status badge in the list. Also confirm your plan includes WhatsApp Calling and a number is routed to this assistant.
Caller hears a click then silence, or the call is declined | The call is declined when the transcription (Deepgram) or voice (ElevenLabs) key is missing for your workspace. Have the admin set both under AI & API Keys; the voicemail fallback handles the caller meanwhile.
The agent talks but cannot hear the caller | The transcription key is missing or the line is very noisy. Check the Deepgram key and turn on Background noise suppression in Step 4.
A skill never fires | Tighten the trigger keywords and confirm the web address can be reached. Check the tool-call timeline in Call Logs to see whether the agent tried the call.
The menu item is missing entirely | Your plan lacks the AI Voice Agent feature. Upgrade or move to a plan that includes it.
A saved key seems to disappear | Keys are stored securely and never shown back; the field reads "saved · leave blank to keep". An empty box keeps the saved key; type a new value only to replace it.
WaDesk Documentation