Case Study · Georgia AI SDR · 2026

A voice agent, ten orchestrated scenarios, one outbound pipeline.

Built by
Jacob Oliker
Role
AI GTM / Customer Solutions
Stack
Retell AI · Make.com · Twilio · Google Workspace · Anthropic · OpenAI
Status
● Live in production
10 · Make.com scenarios orchestrated
11 · Conversation nodes in the agent
4+7 · Calls + emails per prospect
2 · LLMs working in concert
§01 · The system

Four platforms, one nervous system.

Each platform owns one capability and stays in its lane. The intelligence lives in how they hand off: Retell carries the voice, Make does the orchestration, Twilio handles telephony and SMS, Google Workspace is the system of record. Replace any one of them and the rest still works.

§ 01·a — Voice

Retell AI

Hosts the agent. An 11-node conversation graph runs on GPT-4.1 Mini for live dialogue, with Claude Sonnet 4.6 doing structured extraction the moment the call ends.

GPT-4.1 Mini · Claude Sonnet 4.6 · custom analysis schema
§ 01·b — Orchestration

Make.com

Ten scenarios chained end-to-end. One paces the dialer. One handles every post-call event. Eight more govern the email cadence — triggered, branched, and idempotent.

10 scenarios · 4 routers · webhook + scheduled triggers
§ 01·c — Telephony

Twilio

Outbound calls go over a SIP trunk to Retell. Confirmation SMS flows back through a Messaging Service. Brand registered, A2P 10DLC compliant, CNAM verified end-to-end.

SIP trunk · A2P 10DLC · CNAM · Messaging Service
§ 01·d — System of record

Google Workspace

Sheets are the database — Call Queue plus an append-only log. Docs auto-generate per-call qualified-lead reports from a template. Drive holds collateral the emails attach.

Sheets API · Docs templates · Drive folders
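The report step is a token-replace pass over a Docs template. A minimal sketch of the idea — the `{{token}}` syntax and field names are illustrative, not the production template's; in the live system this is a Make.com Google Docs module, not custom code:

```python
# Minimal sketch of per-call report generation from a token-replaced
# template. Token syntax and field names are illustrative placeholders.
def fill_report_template(template: str, fields: dict) -> str:
    """Replace every {{token}} in the template with its field value."""
    for token, value in fields.items():
        template = template.replace("{{" + token + "}}", str(value))
    return template
```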
§02 · Architecture

One lead, traced through every system.

From the moment a row is added to the lead sheet to the morning the team lead gets a text saying a meeting was just booked — every hop, every API call, every state mutation. The accent path is the happy case.

[Architecture diagram. Swimlanes: source of truth · orchestration · voice layer · external services · output.]

Lead Sheet, Call Queue tab (apollo · CSV). A new row triggers:
1. Scenario 01 · Enrichment: Serper Google search, then OpenAI GPT-4o-mini (JSON) writes summary + keywords back to the row.
2. Scenario 02 · Cold Email E1.
3. Scenario 03 · Outbound Dialer (calls 1–4, 4-route router): POSTs to Retell "Georgia" (GPT-4.1 Mini, 11 nodes, live conversation) over the Twilio SIP trunk; Anthropic Sonnet 4.6 runs extraction at call end.
4. The call_analyzed event fires Scenario 04 · Post-Call Webhook (instant, 4 outcome routes): appends + updates the sheet, sends the Twilio SMS confirmation + alert to the prospect/lead, and generates the Google Docs call report.
5. Scenarios 05–10 · Email Cascade (E2–E7): state-driven, no orchestrator; Claude writes a per-lead phrase; Gmail send/reply.

The accent paths trace the qualified-lead flow. The agent never has to know about the email cascade or the report generator — every system reads from the sheet, mutates one column, and exits. Composability falls out naturally.

§03 · The agent

Eleven nodes. Two mandatory outcomes.

Georgia never ends a call without either (a) a meeting on the calendar or (b) a specific callback day and time. The conversation graph enforces this with three global nodes that any state can route into — objection handling, callback, not-interested — and four terminal nodes that close the call cleanly.

[Conversation graph. Happy path in accent, global routes dashed, terminals below.]

Happy path: opener (confirm identity) → opener_pitch (27-sec frame) → qualifying (3 questions) → close (day · time · phone).
Branch A: prospect saw the email → skip qualifying.
Global nodes (routable from anywhere): objection_handling (1 push max) · callback (day + time) · not_interested (1 soft ask).
Terminals (end call, set call_outcome): end_call_qualified → meeting booked (happy path) · end_call_callback → try back later (callback queue) · end_call_not_interested → suppress (do-not-contact) · wrong_person → apologize and end (filtered out).
Rules: always lead with the prospect's problem · never push back more than once · disclose the AI.
§04 · The pipeline

Ten scenarios, two weeks of contact.

Each scenario owns one stage of the lead's lifecycle. They share state through the lead sheet, so any one of them can be paused, edited, or replaced without touching the others. This is the entire outbound cadence as it actually runs in production.

01
DAY 0 · ON ARRIVAL
Company Enrichment
Each new lead is hit with a Google search via Serper, then summarized by GPT-4o-mini into a five-to-fifteen-word "your X" phrase plus a plural industry keyword. Both write back to the row.
Sheets · Serper · GPT-4o-mini · JSON output
02
DAY 0 · 30 MIN AFTER
Cold Email · E1
First touch. Subject line is a single lowercase word. Body weaves the enriched phrase into a four-line note that ends with a soft ask. Captures the Gmail thread ID for every reply that follows.
Gmail · thread capture
03
DAY 2–12 · CALLS 1–4
Outbound Dialer
One scenario covering all four call attempts via a 4-route router. The first module evaluates "is this row eligible for call N?" across N=1..4, and the matching route stamps the right call_id, status, and date columns. Runs every 15 minutes during outbound hours.
Retell API · Twilio SIP · router · 4 paths · ~20 calls/day
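The "is this row eligible for call N?" filter can be sketched as a pure function over the row's state. Column names, status values, and the day gaps here are illustrative assumptions, not the production sheet's schema:

```python
from datetime import date, timedelta

# Days that must pass since the previous attempt before call N fires.
# Illustrative gaps, not the production cadence.
CALL_GAP_DAYS = {2: 4, 3: 2, 4: 4}

def due_call_number(row, today):
    """Return which call attempt (1-4) is due for this row, or None."""
    if row["status"] in ("qualified", "not_interested", "do_not_contact"):
        return None                       # terminal states: never dial
    attempts = row["calls_made"]
    if attempts >= 4:
        return None                       # cadence exhausted
    if attempts == 0:
        return 1                          # first attempt: eligible on arrival
    gap = today - row["last_call_date"]
    if gap >= timedelta(days=CALL_GAP_DAYS[attempts + 1]):
        return attempts + 1
    return None                           # not this 15-minute cycle
```

In Make.com this lives as a 4-block OR filter feeding a 4-route router; the sketch just makes the eligibility logic explicit in one place.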
04
EVENT · CALL_ANALYZED
Post-Call Webhook
Fires the moment Retell finishes its post-call extraction. Branches on the agent's call_outcome field: qualified routes to SMS confirmation + Doc generation + an alert text to the internal team lead; callback writes a separate report; not-interested and email-requested log silently. Four routes, no fallback.
webhook · instant · router · 4 outcomes · Twilio SMS · Docs from template
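The four-route branch can be sketched as a function of the extracted call_outcome. Handler names are placeholders, not the actual Make module names:

```python
# Sketch of Scenario 04's branch on the call_outcome field that
# Retell's post-call extraction returns. Handler names are illustrative.
def route_post_call(analysis):
    outcome = analysis["call_outcome"]
    if outcome == "qualified":
        return ["send_confirmation_sms", "generate_call_report", "alert_team_lead"]
    if outcome == "callback":
        return ["generate_callback_report"]
    if outcome in ("not_interested", "email_requested"):
        return ["log_outcome"]            # silent: append to the log only
    raise ValueError(f"unroutable outcome: {outcome}")  # four routes, no fallback
```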
05
CALL 1 → VM · DAY 4
Post-Call 1 Email · E2
If Call 1 hit voicemail or IVR, drop a three-sentence reply on the original Email 1 thread. A second filter on the AI SDR Leads tab confirms the call's outcome before the email actually fires.
Gmail · reply · cross-tab filter
06
CALL 2 → VM · DAY 8
Post-Call 2 Email · E3
Second voicemail escalation. Replies on the same thread, but this time attaches the case study PDF — fetched fresh from Drive at send time so the asset can be swapped without redeploying.
Drive · fetch · attachment
07
CALL 3 · DAY 10
1st Non-VM Email · E4
Pivots to a fresh thread. Calls Anthropic Claude Sonnet 4 to generate a per-prospect personalization phrase, then sends the cold email and captures the new thread ID. From here the cadence chains off this thread, not Email 1.
Anthropic Sonnet 4 · new thread · thread capture
08
CALL 4 · DAY 14
2nd Non-VM Email · E5
A short referral ask, replied on the Email 4 thread. Anthropic generates a one-sentence bridge between the apology opener and a case-study reference; the body asks "is there someone else on the team?" and lets it breathe.
Anthropic Sonnet 4 · reply
09
DAY 18 · NEW THREAD
New Thread · E6
Last attempt to surface in the inbox after prior threads went quiet. Fresh subject, fresh thread, thumbs-up-or-thumbs-down ask. Captures one more thread ID for the break-up reply.
Gmail · new thread
10
DAY 22 · FINAL
Break-up Email · E7
Replies on the Email 6 thread with the case study attached. Closes with a 30-day re-engagement promise. After this the lead exits the active cadence and only re-enters on manual reset.
Drive · fetch · attachment · terminal
§05 · LLMs in concert

Two models, three jobs.

Each model is picked for the job it does best, not for the resume. Retell runs GPT-4.1 Mini for live conversation because latency is the constraint that matters. Sonnet 4.6 handles structured extraction post-call where accuracy on edge cases is what matters. A separate Sonnet 4 instance writes per-prospect copy in the email cadence. The cheapest option, GPT-4o-mini, handles the bulk enrichment job where simple JSON is the whole product.

Layer 01

Live conversation & extraction

GPT-4.1 Mini in-call · Claude Sonnet 4.6 post-call

Georgia speaks on GPT-4.1 Mini — fast, cheap, fluent enough to hold a live cold call without dead air. The post-call extraction is a different problem: pull thirteen structured fields out of a free-form transcript, including nuanced ones like user_sentiment and objections_raised. Sonnet 4.6 takes that pass.

Both run inside Retell. The only thing my code has to do is write the agent prompt and define the extraction schema.

Layer 02

Bulk enrichment · pre-call

OpenAI GPT-4o-mini · JSON-mode · 200 tokens

Every new lead gets searched on Serper, then handed to GPT-4o-mini with strict instructions: return JSON with two fields, a "your X" distinctive phrase and a plural industry keyword. The same fields get used by the cold email, the dialer's prompt to Georgia, and three of the email-cadence scenarios.

FIELD 1 — distinctive_phrase
Start with "your" followed by a possessive descriptor.
5 to 15 words only. No period.
Example: "your community-focused approach to craft brewing"
If you cannot find enough information, write: "your work in this space"
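A sketch of guarding that spec before the write-back. The JSON field names are assumptions (the source doesn't name them), but the fallback strings are the ones the prompt itself mandates:

```python
import json

# Fallbacks mandated by the enrichment prompt itself.
FALLBACK = {"distinctive_phrase": "your work in this space",
            "industry_keyword": "local businesses"}

def parse_enrichment(raw):
    """Validate GPT-4o-mini's JSON against the prompt spec; fall back
    to the prompt's own safe strings on any violation."""
    try:
        data = json.loads(raw)
        phrase = data["distinctive_phrase"].strip().rstrip(".")
        if not phrase.startswith("your ") or not (5 <= len(phrase.split()) <= 15):
            raise ValueError("phrase out of spec")
        return {"distinctive_phrase": phrase,
                "industry_keyword": data["industry_keyword"].strip()}
    except (ValueError, KeyError, TypeError):
        return dict(FALLBACK)
```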
Layer 03

Per-prospect personalization · in-cadence

Anthropic Claude Sonnet 4 · 80–120 tokens

Two of the seven email scenarios call Sonnet 4 inline, mid-pipeline, to generate a single sentence sized exactly to the email. The 1st Non-VM email gets a "your X" phrase that bridges into the body; the 2nd gets a 15–30 word sentence that connects a voicemail apology to a case study reference.

Strict prompt rules: no hype words, no geography, no questions, no quotes. Output drops in clean.

Layer 04

The fail-safe

Static fallbacks · enforced by prompt

Each prompt has an escape hatch. If GPT-4o-mini can't find enough about a company, it writes "your work in this space" and "local businesses". If Sonnet returns junk, the email body still uses the static enriched phrase from the row, not the fresh generation — the AI is layered on, not load-bearing.

The system survives a model failure. That was the point of the design.
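The "layered on, not load-bearing" rule for the inline Sonnet copy can be sketched as a guard. The rule checks mirror the prompt constraints above; the column name and word limit are illustrative assumptions:

```python
# Sketch of the copy fail-safe: if the inline generation breaks a
# prompt rule, fall back to the static enriched phrase already on
# the row. Checks and column name are illustrative.
def bridge_sentence(generated, row):
    text = (generated or "").strip()
    broken = (not text
              or len(text.split()) > 30     # oversized for the email slot
              or "?" in text                # no questions
              or '"' in text)               # no quotes
    return row["distinctive_phrase"] if broken else text
```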

§06 · The stack

Picked, not collected.

Twelve services. Each one earns its place. No "we already had it" in this list — it's all decisions.

Voice agent · Retell AI
Best-in-class outbound voice with a node-based conversation editor and webhooks where they need to be.

In-call LLM · GPT-4.1 Mini
Cheap and fast enough to keep up with live human latency. The conversation does not need a frontier model.

Post-call LLM · Claude Sonnet 4.6
Most reliable on structured extraction with edge-case handling. Pulls 13 fields from messy transcripts.

Orchestration · Make.com
Faster to build than a backend, easier to hand off than custom code, with router primitives that map to outbound logic 1:1.

Telephony · Twilio
SIP trunk to Retell, Messaging Service for SMS. A2P 10DLC and CNAM registered end-to-end.

Email · Gmail API
Real inbox, real thread IDs, real deliverability. Reply-on-thread is a first-class operation.

Database · Google Sheets
The lead sheet is the database. Every scenario reads and mutates the same row. Anyone can audit it.

Reports · Google Docs
Per-call qualified-lead reports auto-generate from a token-replaced template, dropped into Drive folders by outcome.

Bulk enrichment · GPT-4o-mini
Pennies per lead. JSON-mode means the output drops directly into the sheet without parsing.

Personalization · Claude Sonnet 4
Better-than-generic copy in two of seven email scenarios. Strict prompt rules keep the voice consistent.

Web search · Serper.dev
$0.001 per query. Three results is enough. The structured response means no scraping.

Storage · Google Drive
Hosts the case study attachment and the report folder tree. Fetched fresh per send so assets are swappable.
§07 · The build

Six choices I can defend.

The kinds of decisions that don't show up in a feature list, but separate "I wired some tools together" from "I built a system."

i.

One dialer scenario for four call attempts

The first instinct is four scenarios — Call 1, Call 2, Call 3, Call 4. I built one with a 4-block OR filter and a 4-route router. The filter evaluates the row's state to find which attempt is due; the router stamps the right column. Three fewer scenarios to maintain, one place to change cadence.

ii.

State in the sheet, not the orchestrator

Every scenario reads the row, decides if it should run, mutates one column, and exits. No central state machine, no queue. Adding a new touchpoint means writing a filter, not editing a workflow. The whole pipeline can be re-built one scenario at a time.
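The read-decide-mutate-exit contract can be sketched as a single function. Sheet access is abstracted away; in production each of these is a Make.com scenario with a filter, not Python:

```python
# Sketch of the contract every scenario follows: read the row, decide
# whether it should run, mutate exactly one column, exit. The is_due
# and mutate callables stand in for a scenario's filter and action.
def run_touchpoint(row, is_due, mutate):
    if not is_due(row):
        return row                        # not our turn; no side effects
    column, value = mutate(row)
    updated = dict(row)
    updated[column] = value               # single-column mutation
    return updated
```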

iii.

Webhook on call_analyzed, never call_ended

call_ended fires before Retell finishes the LLM extraction; had the webhook listened there, half the fields would still be empty when it arrived. call_analyzed waits for the post-call pass to complete, so the router has every field it needs on first contact.
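The event gate is one comparison. The payload shape below is approximated from memory of Retell's webhook events, not copied from their docs, so treat the nesting as an assumption:

```python
# Sketch of the event gate: only call_analyzed carries the completed
# extraction. Payload nesting is an approximation, not verified
# against Retell's current webhook schema.
def handle_retell_event(event):
    if event.get("event") != "call_analyzed":
        return None                       # ignore call_started / call_ended
    analysis = event["call"]["call_analysis"]
    return analysis["call_outcome"]       # routing field, filled by now
```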

iv.

The agent discloses she's an AI, every time

Two reasons. Legally, it's the safe path. Practically, it's the right answer to the most common objection ("Are you a robot?"), so the script handles it warmly and converts it into the pitch. Hiding it would create the very objection the rest of the call has to absorb.

v.

The dialer tells the agent what day it is

The dialer passes call_day and call_date as dynamic variables on every call, so Georgia can say "today's Tuesday" without the agent prompt knowing what today is. Same agent prompt, different conversation, every day.
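A sketch of building that per-call payload. The variable names call_day and call_date come from the source; the surrounding field names follow Retell's create-call pattern but are not checked against the live API:

```python
from datetime import date

# Sketch of the dialer stamping day-of-week into each call's dynamic
# variables. Payload field names are illustrative approximations.
def dial_payload(to_number, today=None):
    today = today or date.today()
    return {
        "to_number": to_number,
        "retell_llm_dynamic_variables": {
            "call_day": today.strftime("%A"),   # e.g. "Tuesday"
            "call_date": today.isoformat(),
        },
    }
```

Because the date is injected per call, the agent prompt itself stays static: same prompt, different conversation, every day.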

vi.

Two LLMs in concert, not in competition

GPT for the live call. Sonnet for the structured extraction. Sonnet again for personalized copy. GPT-4o-mini for the cheap bulk job. None of the choices were "we already use OpenAI" — each one was the cheapest model that met the bar for that job.

§08 · End of case study

Same shape, different domain.

Georgia is outbound for a B2B services firm. I built the same shape for individual athletes — different stack, different audience, same problem of running personalized outreach at scale.
