How We Built a Clinic Chatbot That Handles 70% of Front-Desk Calls

<i>A multi-provider clinic in Baldwin Park was drowning in routine phone calls. We built a knowledge-base chatbot that answers most questions on its own and knows exactly when to hand off to a human. Here’s how it works — and what it saved.</i>

(Client details are anonymized and some specifics composited at the client’s request.)

I got a call from a clinic administrator in Baldwin Park last year. She was polite, but I could hear the exhaustion in her voice. “We have three providers, two front-desk staff, and about 60 missed calls a day,” she said. “Patients leave voicemails, we return them the next day, and by then half the people have already gone somewhere else.”

That’s the kind of problem that doesn’t show up on a balance sheet until you actually look at the leak. Missed calls mean lost appointments. Lost appointments mean lower revenue and frustrated patients. The clinic was losing an estimated $4,500 a month in bookings that never happened because the phone rang and nobody picked up.

They’d tried an off-the-shelf chatbot before. It was a generic widget that could only say “please call us” to every question. Patients hated it. The front desk hated it. It lasted two weeks.

So when they reached out to me, they were skeptical. “We don’t want another robot that can’t answer anything,” the administrator said. “We need something that actually helps.”

That’s the brief I took. Build a knowledge-base chatbot that can handle the bulk of routine questions — hours, insurance, directions, prescription refills — but also knows when to stop guessing and pass the conversation to a real person. No friction. No frustration. Just a clean handoff.

The Situation: What Was Breaking

The clinic had three providers — two family medicine doctors and a nurse practitioner — seeing about 80 patients a day combined. The front desk had two people working staggered shifts. Between checking patients in, answering phones, handling paperwork, and managing the schedule, they were drowning by 10 a.m. every day.

We did a two-week audit of the incoming calls. Out of roughly 200 calls a week, here’s what we found:

40% were asking about office hours, location, or directions
25% were about insurance acceptance or billing questions
15% were requesting prescription refills
10% were appointment scheduling or rescheduling
10% were clinical questions that required a nurse or provider

That first 80% (hours, insurance, and refills) covers things a well-built chatbot can handle. The last 10% (clinical questions) absolutely should go to a human. The middle 10% (scheduling) is trickier: a bot can show available times, but some patients want to talk to a person to pick the right slot.
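
To make the audit shares concrete, here is a small Python sketch that turns the percentages above into approximate weekly call counts. The 200-calls-a-week total and the shares come from the audit; the category labels and the `BOT_SUITABLE` grouping are my own shorthand for illustration.

```python
# Weekly call audit: total and shares are taken from the two-week audit above.
WEEKLY_CALLS = 200

call_shares = {
    "hours_location_directions": 0.40,
    "insurance_billing": 0.25,
    "prescription_refills": 0.15,
    "scheduling": 0.10,
    "clinical": 0.10,
}

# Illustrative grouping: categories a knowledge-base bot can answer outright.
BOT_SUITABLE = {"hours_location_directions", "insurance_billing", "prescription_refills"}


def weekly_counts(total, shares):
    """Convert category shares into approximate weekly call counts."""
    return {name: round(total * share) for name, share in shares.items()}


counts = weekly_counts(WEEKLY_CALLS, call_shares)
routine_calls = sum(counts[name] for name in BOT_SUITABLE)

print(counts)
print(f"Routine calls a bot could answer: {routine_calls} of {WEEKLY_CALLS}")
```

Scheduling and clinical calls are left out of the routine total on purpose, since both can end in a handoff.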

Here’s the thing: the clinic didn’t need to replace the front desk. The front desk was spending 70% of their time on questions that could be answered by a well-organized knowledge base. They were burning out, and patients were getting worse service because the people who could actually help were stuck on the phone repeating the same answers all day.

What They’d Tried Before — And Why It Didn’t Work

The previous chatbot was a “rule-based” system built by a web development agency. It had a decision tree with about 15 branches. You’d type “hours,” and it’d show the hours. You’d type “insurance,” and it’d show a list of accepted plans. But the second a patient asked something slightly different — like “do you take Blue Cross?” instead of “insurance” — the bot would say “I don’t understand” and offer to email the clinic.

Patients hated it. They’d type the same question three different ways, get three “I don’t understand” responses, and then call the front desk anyway — but now they were annoyed. The clinic pulled the plug after two weeks because the bot was actually making things worse.

The lesson’s straightforward: a chatbot that can’t handle natural language is worse than no chatbot at all. It raises expectations and then fails to meet them. So we took a different approach.

The AI Work: What We Actually Built

We built a knowledge-base chatbot using retrieval-augmented generation (RAG). If you’re not familiar with the term, here’s the plain-English version: instead of programming every possible question and answer, we gave the bot access to the clinic’s actual documents — their FAQ, insurance list, provider bios, prescription refill policy, and appointment scheduling rules. When a patient asks a question, the bot searches those documents for the most relevant information and then writes a natural-language answer based on what it finds.

Here’s the technical stack we used:

Vector embeddings: We converted all the clinic’s documents into numerical representations (vectors) stored in a vector database. When a question comes in, we convert it to a vector and find the most similar document chunks.
Large language model (LLM): We used a GPT-4-class model to read the retrieved chunks and generate a conversational answer. We specifically chose one that’s good at following instructions about when to hand off.
Whisper transcription: For voice input (we added a “talk” button on the chat widget), we used OpenAI’s Whisper to transcribe speech to text so the bot could handle spoken questions too.
Handoff logic: The key piece. We wrote a set of criteria that trigger a handoff to a human. The bot checks each answer against these criteria before sending it to the patient.
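
For readers who want to see the retrieval step in miniature, here is a Python sketch under loud assumptions. The `embed` function is a toy bag-of-words stand-in for a real embedding model, a plain list stands in for the vector database, and the chunk texts are placeholders rather than the clinic’s actual policies. The prompt wording is also illustrative, not the production prompt.

```python
import math
import re
from collections import Counter

# Placeholder chunks for illustration only; not the clinic's real documents.
CHUNKS = [
    "Office hours are Monday through Friday, 8 a.m. to 5 p.m.",
    "We accept most major insurance plans. Call to confirm your specific plan.",
    "Prescription refill requests are processed within two business days.",
]


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, chunks, top_k=2):
    """Return the top_k (score, chunk) pairs most similar to the question."""
    q_vec = embed(question)
    scored = [(cosine(q_vec, embed(chunk)), chunk) for chunk in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


def build_prompt(question, retrieved):
    """Assemble the context a language model would be asked to answer from."""
    context = "\n".join(f"- {chunk}" for _, chunk in retrieved)
    return (
        "Answer using only the context below. "
        "If the context does not cover the question, say so.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )


question = "What are your office hours?"
results = retrieve(question, CHUNKS)
prompt = build_prompt(question, results)
print(prompt)
```

The top score returned by `retrieve` is the number a confidence threshold would be checked against before the model is allowed to answer.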

The handoff criteria are simple but effective:

Clinical questions: If the bot detects keywords like “symptom,” “pain,” “diagnosis,” “medication side effect,” or any phrase that suggests a medical decision, it immediately says, “That’s a question for our clinical team. Let me connect you.”
Confidence threshold: The retrieval system returns a similarity score for each document chunk. If the top chunk’s score is below 0.7 (on a scale of 0 to 1), the bot says, “I’m not entirely sure about that. Let me get someone who knows for sure.”
Patient frustration: If the patient types something like “this isn’t helping” or “I need to talk to a person,” the bot hands off immediately. No loops, no “let me try again.”
Scheduling complexity: For appointment scheduling, the bot can show available slots and even book a simple appointment. But if the patient asks for a specific provider, a specific time, or has multiple constraints, the bot offers to hand off to a human scheduler.
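
The four criteria above can be sketched as a single decision function. This is a minimal Python illustration, not the production code: the 0.7 threshold and the example keywords come from the list above, while the function name, the `SchedulingRequest` shape, and the reason labels are hypothetical. Plain substring matching is used here for brevity; it would misfire on words like "paint", so a real system would need something sturdier.

```python
from dataclasses import dataclass
from typing import Optional

CONFIDENCE_THRESHOLD = 0.7  # top-chunk similarity on a 0-to-1 scale

# Illustrative lists built from the examples named in the article.
CLINICAL_KEYWORDS = ("symptom", "pain", "diagnosis", "medication side effect")
FRUSTRATION_PHRASES = ("this isn't helping", "talk to a person")


@dataclass
class SchedulingRequest:
    """Hypothetical summary of what the patient asked for when booking."""
    wants_specific_provider: bool = False
    wants_specific_time: bool = False
    constraint_count: int = 0


def handoff_reason(message, top_score, scheduling: Optional[SchedulingRequest] = None):
    """Return a handoff reason, or None if the bot may answer on its own."""
    # Normalize case and curly apostrophes before matching.
    text = message.lower().replace("\u2019", "'")
    # Clinical questions are checked first, so a high score never overrides them.
    if any(keyword in text for keyword in CLINICAL_KEYWORDS):
        return "clinical"
    if any(phrase in text for phrase in FRUSTRATION_PHRASES):
        return "patient_frustration"
    if scheduling is not None and (
        scheduling.wants_specific_provider
        or scheduling.wants_specific_time
        or scheduling.constraint_count > 1
    ):
        return "scheduling_complexity"
    if top_score < CONFIDENCE_THRESHOLD:
        return "low_confidence"
    return None


print(handoff_reason("I have chest pain after my new prescription", 0.95))
print(handoff_reason("What are your hours?", 0.55))
print(handoff_reason("What are your hours?", 0.91))
```

Checking the clinical rule before the confidence score is the design choice that matters: a well-matched document should never be enough to let the bot answer a medical question.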

We also added a “human in the loop” feature for the first month. Every time the bot was about to answer a question, a real person (a front-desk staff member) could review the answer before it went out. That sounds slow, but it was only for the first 50 or so questions a day. The staff member could approve or edit the answer, and we used those edits to improve the bot’s responses. After a month, the bot was accurate enough that we turned off the review requirement for most question types.
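
One way to picture that review step is a small queue where each draft is either approved or edited, and the edits are collected afterward. The Python below is a sketch with hypothetical names (`DraftAnswer`, `review`, `collect_corrections`) and placeholder text; it does not show how the edits were actually fed back into the bot.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DraftAnswer:
    question: str
    bot_answer: str
    final_answer: Optional[str] = None  # set once a staff member reviews it


def review(draft, edited_answer=None):
    """Approve the draft as written, or replace it with the staff member's edit."""
    draft.final_answer = edited_answer if edited_answer is not None else draft.bot_answer
    return draft


def collect_corrections(drafts):
    """Return (question, bot_answer, final_answer) for reviewed drafts that were edited."""
    return [
        (d.question, d.bot_answer, d.final_answer)
        for d in drafts
        if d.final_answer is not None and d.final_answer != d.bot_answer
    ]


drafts = [
    DraftAnswer("placeholder question 1", "placeholder bot answer 1"),
    DraftAnswer("placeholder question 2", "placeholder bot answer 2"),
]
review(drafts[0])                                  # approved as written
review(drafts[1], "placeholder staff correction")  # edited before sending
corrections = collect_corrections(drafts)
print(corrections)
```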

Where We Kept a Human in the Loop — And Why

There are two places where we deliberately kept a human involved, even after the bot was running smoothly.

First, clinical questions always go to a human. The bot is trained to recognize medical questions and hand them off. We don’t want the bot guessing about symptoms or treatments, even if it has a high confidence score. That’s a liability risk, and it’s just bad medicine. The clinic’s nurses and providers are the only ones who should answer those questions.

Second, the front desk can override the bot at any time. If a patient types something that the bot handles correctly but the staff member thinks needs a personal touch — say, a long-time patient with a complicated history — they can jump into the conversation. The bot is designed to be a helper, not a gatekeeper.

We also set up an escalation dashboard that shows every conversation where the bot handed off to a human. The clinic administrator reviews this dashboard weekly to see if there are patterns — questions the bot should’ve been able to answer but didn’t, or handoffs that could’ve been automated. That feedback loop is how we keep improving the bot without needing constant developer attention.
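
The kind of pattern-spotting the dashboard supports can be sketched in a few lines of Python. The log entries below are invented for illustration, and the idea that low-confidence and frustration handoffs are the ones worth a second look is my assumption about how a reviewer might triage them.

```python
from collections import Counter

# Hypothetical handoff log: (conversation id, handoff reason). Not real clinic data.
handoff_log = [
    ("conv-001", "clinical"),
    ("conv-002", "low_confidence"),
    ("conv-003", "scheduling_complexity"),
    ("conv-004", "low_confidence"),
    ("conv-005", "patient_frustration"),
]

# Assumption: these reasons may point to a knowledge-base gap,
# while clinical handoffs are always expected.
REVIEWABLE_REASONS = {"low_confidence", "patient_frustration"}


def summarize_handoffs(log):
    """Count handoffs per reason, most frequent first."""
    return Counter(reason for _, reason in log).most_common()


def conversations_to_review(log):
    """Return ids of conversations whose handoff may have been avoidable."""
    return [conv_id for conv_id, reason in log if reason in REVIEWABLE_REASONS]


summary = summarize_handoffs(handoff_log)
to_review = conversations_to_review(handoff_log)
print(summary)
print(to_review)
```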

The Measured Results

We launched the chatbot in late November 2024. By the end of January 2025, the clinic had run it for two full months, and we had solid numbers.

Call volume dropped 70%. The front desk went from about 200 calls a week to about 60. The bot was handling the rest.
Missed calls went from 60 per day to under 10. Most of those were after-hours calls that the bot handled anyway (patients could leave a message for a callback the next day, but the bot answered their question right away).
Front-desk staff reported saving 12 hours per week. That’s the equivalent of one part-time employee. They used that time to focus on patient check-in, follow-up calls, and administrative tasks that actually needed a human.
Patient satisfaction scores improved. The clinic started sending a quick post-chat survey after every bot interaction. On a scale of 1 to 5, the average rating was 4.3. Patients specifically commented that they liked getting answers instantly, especially for simple things like hours and directions.
Appointment bookings increased by about 15%. Because the front desk had more time to actually help patients schedule, and because the bot could book simple appointments directly, the clinic was filling slots that used to go empty.

One thing that surprised me: the bot handled about 200 conversations in the first week, and only 12 of those resulted in a handoff to a human. That’s a 94% containment rate. I expected more handoffs, especially in the beginning. But patients seemed to trust the bot’s answers, and the ones who needed a human found the handoff smooth: the bot would say “Let me connect you with someone who can help,” and within 30 seconds a front-desk staff member would pick up the chat.
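
For anyone who wants to reproduce the headline figures, the arithmetic is simple. The Python below is a sketch using the numbers reported above; the function names are my own.

```python
def containment_rate(total_conversations, handoffs):
    """Share of conversations the bot finished without a human handoff."""
    if total_conversations <= 0:
        raise ValueError("total_conversations must be positive")
    return 1 - handoffs / total_conversations


def percent_drop(before, after):
    """Relative decrease from a starting value to an ending value."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before


first_week_containment = containment_rate(200, 12)  # 200 conversations, 12 handoffs
call_volume_drop = percent_drop(200, 60)             # weekly calls: about 200 down to about 60

print(f"Containment rate: {first_week_containment:.0%}")
print(f"Call volume drop: {call_volume_drop:.0%}")
```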

“We were skeptical at first, but after the first week, the front desk was actually asking us to add more features to the bot. They realized it wasn’t taking their jobs — it was taking the boring part of their jobs.” — Clinic Administrator

What We’d Do Differently

No project’s perfect. Here are a few things we’d change if we were doing it again.

We underestimated the document cleanup. The clinic had a lot of PDFs, Word docs, and even handwritten notes about policies. We spent about 20 hours just organizing and deduplicating the content before we could vectorize it. In hindsight, we should’ve done a formal AI readiness assessment first to identify which documents were actually useful and which were outdated. That would’ve saved us a week of work.

The voice input feature was less popular than we expected. We added a “talk” button because we thought patients would want to speak their questions, especially while driving. But only about 15% of users ever clicked it. The chat interface was fast enough that typing felt easier. If I were building this again, I’d skip the voice input and put that effort into better handoff logic instead.

We should’ve trained the front desk on the escalation dashboard sooner. The dashboard is powerful, but it took a few weeks for the staff to get in the habit of reviewing it. Now they check it every morning and flag any conversations where the bot gave a wrong answer. That feedback loop is the key to keeping the bot accurate over time.

Lessons for Other Small Businesses

This project worked because the clinic had a clear problem — too many routine calls — and a clear willingness to change how they worked. Honestly, not every business is ready for that. If you’re thinking about a similar project, here are a few things to consider.

Start with a knowledge base audit. Before you build a chatbot, you need to know what questions people are asking and whether you have good answers for them. I’ve seen businesses spend thousands on a chatbot that fails because their FAQ is incomplete or contradictory. A readiness assessment can help you figure out if your data is ready.

Plan for the handoff from day one. The most common mistake I see is companies building a chatbot that tries to answer everything, then scrambling to add a human handoff when the bot fails. Design the handoff into the system from the start. Decide what the bot will never answer, and make sure the patient knows how to reach a human.

Measure the right things. Don’t just count how many conversations the bot handled. Measure call volume, missed calls, staff time, and patient satisfaction. Those are the numbers that matter to your bottom line.

Expect to iterate. The first version of this bot was good, but not great. We made about 30 tweaks in the first month — adding new documents, adjusting the handoff criteria, fixing edge cases. A chatbot isn’t a set-it-and-forget-it tool. It needs ongoing attention, at least at first.

What’s Next for the Clinic

The clinic is now looking at expanding the bot to handle voice calls directly: instead of just answering chats, the bot would pick up the phone and handle the same routine questions there. That’s a bigger technical challenge, but the foundation is already there. The knowledge base is built, the handoff logic is tested, and the staff is comfortable working with the bot.

They’re also considering using the same approach for internal knowledge management — letting providers and staff ask the bot questions about clinic policies, billing codes, and referral procedures. That’s a natural next step once the patient-facing bot is running smoothly.

Look, if you’re a small business owner in Central Florida and you’re dealing with the same kind of problem — too many routine questions, not enough time to answer them — I’d love to talk. The technology is ready, and it doesn’t have to be complicated. You can reach out anytime.

Frequently asked questions

How long did it take to build the chatbot?

The initial build took about three weeks: one week for document cleanup and knowledge base creation, one week for development and testing, and one week for the human-in-the-loop training period. Ongoing tweaks continue as we learn from real conversations.

Did the chatbot replace any front-desk staff?

No. The goal was never to replace staff — it was to reduce their workload so they could focus on higher-value tasks. The front desk team reported saving 12 hours per week, which they used for patient check-in, follow-up calls, and administrative work.

What happens if the chatbot gives a wrong answer?

The clinic reviews the escalation dashboard every morning to catch any mistakes. If a wrong answer slips through, the staff can correct it and update the knowledge base so the bot doesn't repeat the error. We also have a confidence threshold that forces a handoff if the bot isn't sure.

Can this work for a business that isn't a clinic?

Absolutely. The same approach — knowledge base + RAG + handoff logic — works for any business that gets a high volume of routine questions. We've done similar projects for law firms, real estate agencies, and service companies. The key is having organized, accurate documentation to start with.

How much does a project like this cost?

Costs vary depending on the complexity of the knowledge base, the number of handoff criteria, and whether you need voice integration. A basic chatbot like this one typically ranges from $5,000 to $15,000 to build, plus ongoing hosting and maintenance fees. Contact us for a specific quote.

Do patients mind talking to a chatbot?

In our surveys, patients rated the bot an average of 4.3 out of 5. Most appreciated getting instant answers, especially for simple questions. The handoff to a human is smooth and fast, so patients who need a person can get one quickly. The key is setting expectations — the bot clearly states it's an AI assistant and offers to connect to a human at any time.

Ready to talk it through?

Send a one-line description of what you are trying to do. I will reply within one business day with a plain-English next step. Email or use the form.