AI in Medicine 4 min read

The Lemonade Machine Problem

Automation in medicine should not be judged against an ideal clinician. It should be judged against real clinicians, with human variance, machine variance, supervision, and the cost of failure all taken into account.

Whimsical lemonade stand scene comparing a child-made lemonade stand with an automated lemonade machine

Source note: This post was prompted by a LinkedIn post from Kevin Maloy, MD, an innovation and clinical AI physician in the Washington DC-Baltimore area, affiliated with Georgetown University School of Medicine and clinicians.build.

Johnny’s lemonade stand has a quality control problem.

That is not an insult.

Johnny is seven.

He has a folding table, a pitcher, a hand-lettered sign, three plastic cups, and a dog who keeps walking through the supply chain.

Then someone wheels in the machine.

“Theirs is better and cheaper,” Johnny says.

Of course it is.

The machine has scale.

Johnny has a bedtime.

The machine has throughput.

Johnny has sticky fingers, uneven sleep, and the emotional bruise of the last customer who said, “Too sour.”

The machine has optimized margins.

Johnny has mood, fatigue, distraction, and a sugar scoop that becomes more theoretical as the afternoon goes on.

Then everyone panics because the machine gives you water 5 percent of the time and caffeinated coffee 1 percent of the time.

That panic is not irrational.

It is just incomplete.

Humans are not variance-free either.

The Variance Trade

This is the uncomfortable truth underneath our current AI conversation in medicine.

We often compare artificial intelligence to an idealized human clinician.

Not the real clinician.

Not the tired one at 5:37 p.m. after fourteen consults.

Not the physician interrupted six times while writing the note.

Not the resident covering three services overnight.

Not the specialist trying to remember whether the last patient needed a Doppler, a growth scan, or a delivery recommendation before the next call comes in.

We compare the machine’s flaws to the human’s best day.

That is not serious analysis.

The real question is not whether AI will make mistakes.

It will.

The real question is whether we understand the pattern of those mistakes, the frequency of those mistakes, the cost of those mistakes, and whether the system has enough human oversight to catch them before they reach the patient.
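One way to make that question concrete is a back-of-the-envelope sketch. The failure rates below are the lemonade machine's (water 5 percent of the time, coffee 1 percent). Everything else is an assumption made up for illustration: the `FailureMode` class, the harm scores, and the catch rates are hypothetical, not measurements from any real system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    name: str
    rate: float        # probability of this failure per serving
    cost: float        # harm if it reaches the customer (arbitrary units, assumed)
    catch_rate: float  # fraction caught by a human before delivery (assumed)


def expected_harm(modes, servings=1000):
    """Expected harm that actually reaches customers over `servings`."""
    return servings * sum(m.rate * (1 - m.catch_rate) * m.cost for m in modes)


# Rates come from the post; costs and catch rates are hypothetical.
machine_unsupervised = [
    FailureMode("water", rate=0.05, cost=1.0, catch_rate=0.0),
    FailureMode("coffee", rate=0.01, cost=20.0, catch_rate=0.0),
]
machine_supervised = [
    FailureMode("water", rate=0.05, cost=1.0, catch_rate=0.5),
    FailureMode("coffee", rate=0.01, cost=20.0, catch_rate=0.9),
]

if __name__ == "__main__":
    print(expected_harm(machine_unsupervised))  # about 250 harm units
    print(expected_harm(machine_supervised))    # about 45 harm units
```

Under these made-up numbers, the rarer failure dominates: coffee happens one fifth as often as water but contributes four times the harm. Supervision aimed at the costly failure does most of the work. Change the assumptions and the answer changes, which is the point of asking about pattern, frequency, cost, and oversight together.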

Johnny’s lemonade varies with the hour, the weather, the sugar scoop, his energy level, and whether he remembered to wash the pitcher.

The machine’s lemonade varies differently.

It may be consistent most of the time.

But when it fails, it may fail strangely.

Confidently.

At scale.

With a pleasant interface.

That is the variance trade.

Human systems fail through fatigue, distraction, memory limits, time pressure, handoff gaps, and communication failures.

AI systems fail through brittle assumptions, incomplete context, hallucination, automation bias, hidden data problems, and the dangerous appearance of certainty.

Neither failure mode is theoretical.

Neither failure mode is acceptable just because the other one exists.

Johnny Still Matters

So the serious work is not asking, “Can AI replace Johnny?”

The serious work is asking better questions.

What should Johnny keep doing?

What should the machine do?

What should never be automated?

Where does supervision belong?

Who is accountable when the lemonade is actually coffee?

In medicine, the goal should not be artificial perfection.

The goal should be safer systems.

Systems where clinicians are less overloaded.

Patients are less likely to fall through the cracks.

Documentation is less burdensome.

Decision support is more timely.

Errors are easier to detect.

Human judgment remains close to the point of care.

Because sometimes the value is not just the lemonade.

Sometimes the value is Johnny.

And sometimes the “perfect” lemonade machine keeps you up at 11 p.m. with a cup of accidental espresso.

Doctors who code need to think beyond efficiency.

We need to design for the variance trade.

Chukwuma Onyeije, MD, FACOG

Maternal-Fetal Medicine Specialist

MFM specialist at Atlanta Perinatal Associates. Founder of CodeCraftMD and OpenMFM.org. I write about building physician-owned AI tools, clinical software, and the case for doctors who code.