CITY OF SF · TECH TALK MAY '26

Leverage, not magic:
AI in government.

Keith Kurson
AI RESIDENT · PROPEL
keith.is
keith@keithkurson.net
BACKGROUND

A short bio,
in five entries.

2004 → PRESENT
Subeta

Virtual pet site I started as a teenager. I was emancipated, and that's where I learned how to learn.

2013
Code for America

Year in NYC. Exposed to civic problems at every scale — from a single block to the whole city.

2014 → 2017
Nava PBC

Lead engineer on a rewrite of healthcare.gov, and helped stand up Nava's integrated-benefits practice. I know what it's like to transform a million forms.

2017 → 2024
Glitch → Fastly

Wanted to make a place where anyone could build the internet. Saw the first wave of AI chatbots there. After Fastly's acquisition, started to see what agents browsing the internet look like to a CDN.

2025 → NOW
Propel

AI residency. Responding to HR1 at the state level — SNAP and Medicaid.

THE RESIDENCY · WHAT I DO

A year on one question.

How should states answer HR1 for SNAP and Medicaid — and where does AI actually help the people running those programs?

THE ROLE
Resident, not staff.
A researcher embedded inside Propel — free to poke around, publish, and follow the question.
THE FOCUS
HR1 → state programs.
What changes for SNAP and Medicaid when the policy actually lands at the state level.
THE RESOURCE
Propel, on tap.
The largest benefits app in the US — real users, real data, real ground truth when I need it.
THESIS

AI's value in government is removing friction between people and services.

— but only if you keep your own discipline while building.

OCTOBER 2025 · A RAPID RESPONSE

When the
shutdown hit.

  • Government shutdown.
  • SNAP benefits in limbo.
  • People needed food now.
  • No central, current, machine-readable directory of where to go.
OCTOBER
2025
DAY 1 · SNAP IN LIMBO
> user query, Oct 14:
> "where do i go to get food this week"
RESPONSE

What we built
between Friday & Friday.

STEP ONE
Source
directories
Thousands of orgs, scattered across hundreds of stale sites.
STEP TWO
Multi-layer
AI pipeline
Claude searches, Google Places locates, Jina parses — cross-validated before going live.
STEP THREE
National
food bank DB
Live, validated, machine-readable. In production days later.
Days,
not quarters.
A year ago this would have been an RFP.
Today it's a long week.
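The cross-validation step in the middle of that pipeline can be sketched like this. The record shapes and the agreement rule are my assumptions for illustration, not the production schema:

```python
# Sketch of the cross-validation gate before a record goes live: an org
# is accepted only when independent sources agree on who and where it is.
# Record shapes and the agreement rule here are illustrative assumptions.

def normalize(s: str) -> str:
    """Lowercase and collapse whitespace so near-identical strings match."""
    return " ".join(s.lower().split())

def cross_validate(search_hit: dict, places_hit: dict) -> bool:
    """Keep a food-bank record only if the web-search result and the
    place-lookup result agree on name and street address."""
    return (normalize(search_hit["name"]) == normalize(places_hit["name"])
            and normalize(search_hit["address"]) == normalize(places_hit["address"]))

pairs = [
    ({"name": "St. Anthony's Foundation", "address": "150 Golden Gate Ave"},
     {"name": "st. anthony's  foundation", "address": "150 golden gate ave"}),
    ({"name": "Glide Memorial", "address": "330 Ellis St"},
     {"name": "Glide Memorial", "address": "330 Eddy St"}),  # address mismatch
]
validated = [hit for hit, place in pairs if cross_validate(hit, place)]
```

Disagreement between sources is the signal to hold a record back, not to pick a winner.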
METHOD

The web already
has the structure.

  • Jina.ai: URL in, clean markdown out.
  • Follow the natural link graph between directories.
  • Stop thinking scraper. Start thinking reader.
Mindset shift: the web isn't N sites to parse. It's one giant linked document.
RAW HTML · before
<div class="content-wrap"><div class="row"><div class="col-md-8">
<h2 class="heading-primary">St. Anthony's Foundation</h2>
<p class="lead"><span style="font-weight:bold">Address:</span>
150 Golden Gate Ave, SF, CA 94102<br/><span>Hours:</span> M-F
11:30-12:30</p><p>Serves: hot meals, no ID required</p>...
CLEAN MARKDOWN · after
## St. Anthony's Foundation
**Address:** 150 Golden Gate Ave, SF
**Hours:** M–F 11:30–12:30
**Serves:** hot meals, no ID required
→ STRUCTURED, VALIDATED, IN THE DB
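The before/after above comes from one call pattern: Jina's public reader endpoint, which takes a page URL prefixed with `r.jina.ai` and returns markdown instead of HTML. A minimal sketch (the fetch helper is defined but not run here):

```python
import urllib.request

READER = "https://r.jina.ai/"  # Jina's public reader endpoint

def reader_url(page_url: str) -> str:
    """URL in: prefix any page with the reader endpoint."""
    return READER + page_url

def read_as_markdown(page_url: str) -> str:
    """Clean markdown out. (Live network call; shown here, not executed.)"""
    with urllib.request.urlopen(reader_url(page_url)) as resp:
        return resp.read().decode("utf-8")
```

No parser per site, which is the whole point: the reader does the HTML-to-markdown work once, for every directory.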
SHUTDOWN, PART TWO · STATE NOTICES

Mail wasn't fast enough.

01 · SCRAPE
Every state's
SNAP dept.
Sites, Facebook, Twitter — wherever notices actually showed up.
02 · DELIVER
Straight into
the app.
In each state's own words. We didn't rewrite a thing.
03 · REUSE
Same pipeline,
new policy era.
Now tracking post-shutdown program changes, state by state.
Speed. Not editorial.

Propel acting as editor of state government would be a worse problem than any state's tone.

USER RESEARCH

People would rather talk
than type to a service.

Sounds obvious out loud. Not obvious from how the entire benefits-administration internet is currently built.
HOW WE TESTED · HR1 WORK REQUIREMENTS

Thousands of conversations,
in a couple of weeks.

01 · RECRUIT
Push notif
to the segment
via the Propel app
02 · INTERVIEW
AI voice agent
runs the call
thousands / day
03 · FLAG
Transcripts
scored & sorted
edge cases surface
04 · HUMAN RESEARCHERS
Deep follow-up on flagged cases.
Where the empathy work actually happens. The AI sorts; people listen.
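The flag-and-sort step can be sketched as a simple triage queue. The keyword rubric below is a placeholder assumption; the real scorer isn't shown in this deck:

```python
# Triage sketch for step 03: transcripts get an edge-case score, and the
# top slice goes to human researchers for deep follow-up.
# The keyword rubric is a stand-in, not the production scorer.

EDGE_SIGNALS = ("don't understand", "cut my hours", "lost my job")

def edge_score(transcript: str) -> int:
    """Crude stand-in scorer: count signals suggesting a hard case."""
    t = transcript.lower()
    return sum(signal in t for signal in EDGE_SIGNALS)

def flag_for_humans(transcripts: list, top_n: int = 1) -> list:
    """Score, sort, and surface the top edge cases."""
    return sorted(transcripts, key=edge_score, reverse=True)[:top_n]

queue = flag_for_humans([
    "everything is fine, just checking my balance",
    "they cut my hours and i don't understand the new rules",
])
```

The AI only decides the order of the queue; the listening stays human.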
AN UNEXPECTED FINDING

People felt more heard
by the voice bot.

Even knowing it was AI.

More willing to share what was wrong. No feeling of trauma-dumping on another person.

SCOPE OF THE FINDING

What we know vs. what we suspect.

WE KNOW
  • Low-income SNAP recipients
  • English-first
  • Sample meaningful, not huge
  • Strong preference for voice
WE SUSPECT
  • It generalizes — but untested for:
  • Spanish, Cantonese, Vietnamese
  • Older users
  • Users with disabilities
→ IF ANYONE WANTS TO RUN THE PILOTS, I'LL HELP.
CONSEQUENCES · SERVICE DESIGN

The redesign question
changes.

FROM
FORM
household income *
hours worked / week *
other earned income *

"how do we make the form less painful?"

TO
CONVERSATION
so — tell me how this month has been going.
honestly? rough. they cut my hours again.
okay. let's figure out what that means for benefits.

"what does the conversation we'd want to have look like?"

Voice as a default alongside text, never instead of it.
ANTICIPATED CONCERNS

Three questions
you're already asking.

$
Cost.
Per-conversation economics. What does this look like at scale?
!?
Error handling.
Wrong answers about benefits hurt people.
§§
PII & retention.
Recording, access, how long we keep it.
These are the right questions. The next few slides are how we answer them.
HOW WE ANSWER THEM

All three concerns
are really one question.

Can we trust the system?
ANSWER COMES IN THREE LAYERS
LAYER 1
Organizational
The Gateway. PII strip, prompt-injection defense, audit logs.
LAYER 2
Ecosystem
Open protocols. MCPs & Skills work across providers.
LAYER 3
Personal
Workflow discipline. Plan before the agent moves.
THE EMPLOYEE TOUR

The fear is legitimate.

I trained the whole company on Claude this year. I spent the most time with our government team and customer support.

People whose entire job is the currency of trust.

WHAT TRUST-WORK LOOKS LIKE
Correct answers. Defensible answers. Answers in language people can use.
WHAT AI INTRODUCES
A system they can't validate. Asked to trust on faith.

The right answer is not to reassure people. It's to give them structural reasons to trust.

LAYER 1 · ORGANIZATIONAL

The Propel Gateway,
and a paper trail.

USER / APP
THE GATEWAY
PII strip
+ prompt-injection defense
LLM / SOURCES
Workspace · Amplitude · internal data
LOGGED & ATTRIBUTED
AUDIT LOG
every action —
attributable, reviewable, defensible.

That's the language of audit. That's the language government already speaks.
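The gateway pattern fits in a few lines. The redaction patterns and log shape below are illustrative assumptions, not Propel's internals:

```python
import re
import time

# Minimal sketch of the gateway: redact obvious PII before the model sees
# anything, and write an attributable audit record for every call.
# Patterns and log shape are illustrative, not Propel's implementation.

SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b")

AUDIT_LOG: list = []

def strip_pii(text: str) -> str:
    return EMAIL.sub("[EMAIL]", SSN.sub("[SSN]", text))

def gateway_call(user_id: str, prompt: str, llm=lambda p: "(model reply)") -> str:
    """Every action logged: who asked, when, and what the model actually saw."""
    clean = strip_pii(prompt)
    reply = llm(clean)
    AUDIT_LOG.append({"who": user_id, "when": time.time(),
                      "prompt": clean, "reply": reply})
    return reply

gateway_call("staff-42", "Check case for 123-45-6789, jane@example.com")
```

The log records the redacted prompt, so the audit trail itself never becomes a PII store.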

LAYER 2 · ECOSYSTEM

The tooling is portable.

  • MCPs and Skills are open protocols.
  • They work across every frontier LLM.
  • You can switch providers — and keep your investment.
The opposite of the vendor-lock pattern that has burned gov IT for 20 years.
YOUR MCP / SKILL
Claude
GPT
Gemini
…the next one
SAME TOOLING. ANY PROVIDER.
ON TIMING

Yes, it's the
wild west.

The practices are still settling. Granted.

The private sector is already locked into last year's stack & last year's mistakes.

The public sector can skip a generation of bad patterns — but only by engaging now.

LAYER 3 · PERSONAL DISCIPLINE

Plan before
the agent moves.

github.com/obra/superpowers
STEP 01
Design doc
What we're building & why.
✓ HUMAN REVIEW
STEP 02
Impl doc
How, in concrete steps.
✓ HUMAN REVIEW
STEP 03
Agent executes
Against something I can point at & steer.

"The AI did it" is not an answer when you're accountable to the public.

LAYER 3 · IN PRACTICE

Two short docs.
Ten minutes each.

DESIGN DOC
voice-opt-out/design.md
# Voice opt-out
## Why
Some recipients don't want AI in their interactions. Need a fast, obvious opt-out.
## How
One-tap "switch to text" on every voice screen. Honor it system-wide.
## Open questions
Default opt-in or opt-out? Partner integrations?
## Out of scope
The voice agent itself. New languages.
What. Why. Edges. Reviewed before any agent moves.
IMPL DOC
voice-opt-out/impl.md
# Voice opt-out — impl
## Approach
Boolean on users.voice_opt_out. Check before any voice path runs.
## Steps
1. migration · add column
2. UI · toggle + inline switch
3. server · requireVoiceConsent()
4. sweep voice helpers · fail closed
5. tests · 0 voice paths for opt-outs
## Verify
Manual run-through. Test: opt-out → text everywhere.
How. In order. Reviewed before the agent executes.

Plain markdown. No platform to buy. The discipline lives in the habit of writing them, not in the templates.
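Step 4 of the impl doc says "fail closed." A sketch of what that means for the consent check (the name mirrors the doc's `requireVoiceConsent()`; the `User` shape is an assumption):

```python
# Fail-closed consent check: voice runs only when we positively know the
# user has NOT opted out. Unknown user or unknown flag means no voice.
# The User shape is assumed for illustration.

class User:
    def __init__(self, voice_opt_out):
        self.voice_opt_out = voice_opt_out  # True / False / None (unknown)

def require_voice_consent(user) -> bool:
    """Gate every voice path. Any uncertainty resolves to text, not voice."""
    if user is None or user.voice_opt_out is None:
        return False  # fail closed
    return not user.voice_opt_out
```

The sweep in step 4 is then mechanical: every voice helper calls this first, and a missing or unreadable flag degrades to text instead of erroring into voice.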

LAYER 3 · CONTINUED

Five minutes,
after every task.

github.com/DrCatHicks/learning-opportunities
  • Interrupts the fluency illusion.
  • Pauses after meaningful work.
  • Asks you to sketch the answer first.
  • Refuses to explain until you've tried.
A direct counter to the slow atrophy of senior judgment that comes from over-relying on the tools.
$ learning-opportunities run
▸ Would you like to do a quick learning
exercise on the voice opt-out migration?
▸ Explain this component as if you were
onboarding a new developer.
$ _
WHAT'S NEXT · 01

Evals for contested ground.

  • Most AI evals score against a single right answer.
  • Benefits determinations don't always have one.
  • Same case, three defensible readings: legal aid, eligibility worker, program director.
  • Surface the disagreement. Don't collapse it.
THE WHOLE POINT
When a model says "95% sure," that should mean something.
In benefits work, it mostly doesn't.
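Concretely, an eval for contested ground scores against a set of defensible readings and reports the split. The three reading labels are from the slide; the report shape is my assumption:

```python
# Eval sketch for contested ground: compare a model's determination to
# several defensible readings and surface the disagreement instead of
# collapsing it into one pass/fail. Report shape is illustrative.

def eval_case(model_answer: str, readings: dict) -> dict:
    agreement = {source: answer == model_answer
                 for source, answer in readings.items()}
    return {
        "agreement": agreement,
        "contested": len(set(readings.values())) > 1,  # references disagree
    }

report = eval_case(
    model_answer="eligible",
    readings={"legal_aid": "eligible",
              "eligibility_worker": "eligible",
              "program_director": "ineligible"},
)
```

A "contested" flag is the output that matters: it tells you where a confident model score would be misleading.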
WHAT'S NEXT · 02

When the user is the hard part.

  • AI plays a real applicant — guarded, confused, withholding.
  • The chatbot under test has to figure out what's actually going on.
  • Standard eval: does it know the income limits? Easy.
  • Real failures: the ones where the user is the hard part.
THE WHOLE POINT
Before you put AI in front of someone applying for benefits — you better know what it does when the user is confused, guarded, or wrong.
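A minimal version of that test: script a guarded user who only reveals the real problem to an open-ended follow-up, then check whether the bot under test ever gets there. The persona script and pass rule are illustrative assumptions, not a production harness:

```python
# Persona-driven eval sketch: a guarded simulated user withholds the real
# issue unless the bot asks an open-ended follow-up question.
# Persona script and pass rule are illustrative assumptions.

OPENING = "i guess my case got messed up or something"
REVEAL = "my hours got cut and i never reported it"

def guarded_user(bot_message: str) -> str:
    """Open-ended follow-ups get the truth; everything else gets a shrug."""
    open_ended = "?" in bot_message and any(
        w in bot_message.lower() for w in ("what", "how", "tell me"))
    return REVEAL if open_ended else "i don't know, it's just wrong"

def run_eval(bot) -> bool:
    """Pass if the bot surfaces the withheld fact within two turns."""
    reply = guarded_user(bot(OPENING))
    if "hours" in reply:
        return True
    return "hours" in guarded_user(bot(reply))

probing_bot = lambda msg: "okay. tell me what changed for you recently?"
stonewall_bot = lambda msg: "Your case status is pending."
```

A bot that recites income limits can still fail this eval, which is exactly the gap standard evals miss.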
WHAT'S NEXT · 03

Conversation,
not bureaucratic UI.

  • Proof-of-concept layer between AI agents and state benefits portals.
  • Agent web-drives the portal on the user's behalf.
  • One application experience. Any state. From any tool.
> "Claude, open my SNAP renewal for California — tell me what we need to prepare."
CLOSEST TO WHAT SF WOULD ACTUALLY BUILD
PULLING IT TOGETHER

AI is leverage,
not magic.

Use it to reduce friction for the people you serve.
Use it to increase your own discipline — not replace it.

IF YOU START TOMORROW

Four moves,
in this order.

01
Pick one high-friction service interaction.
02
Prototype a voice version of it.
03
Build the audit trail from day one — not as a retrofit.
04
Call me. Genuinely — I'm happy to help.
keith@keithkurson.net
OPEN FLOOR · 22 MIN

Questions?

That was a lot. I left twenty minutes on purpose.

Need a prompt? Here are a few I expect.
IF THE ROOM IS SHY
  • Q1 Cost at scale?
  • Q2 Handling hallucination on benefits guidance?
  • Q3 Where SF starts — procurement & vendor evaluation?
  • Q4 Residents who don't want a bot?
  • Q5 Caseworkers & call center staff?
  • Q6 Accessibility — Deaf/HoH, language access?
  • Q7 Data retention & subpoena risk on voice recordings?
KEITH KURSON
keith@keithkurson.net
keith.is