Engineered Confidence
David J. Cox PhD MSB BCBA-D, Ryan L. O'Donnell MS BCBA
Note: All names used in Chiron are fictitious. Additionally, this is the second of eight episodes in which we build a story arc using the same characters. At the end, you will find a character cheat sheet to help keep everyone straight from episode to episode.

The following Monday felt strangely normal. Conference season ended. Flights landed. Expense reports submitted. The excitement that always followed a major gathering of behavior analysts was beginning to fade into the background noise of everyday clinical work.
At least that was how it appeared on the surface.
Inside the organization, clinicians moved through their routines. Treatment plans were being updated. Parent meetings were being scheduled. Authorization requests were being submitted. Supervision appointments again filled calendars.
The software remained unchanged.
Mira sat at her desk balancing a coffee in one hand and scrolling through documentation with the other. The AI-assisted platform generated another session summary.
- Targets.
- Prompting hierarchy data.
- Caregiver participation.
- Treatment integrity notes.
- Recommended supervision topics.
The entire report appeared in less than three seconds. She skimmed it. Made a few minor wording changes. Clicked approve. The process repeated itself dozens of times throughout the morning. By lunch, she had completed documentation that would have consumed most of her afternoon only a year earlier.
Across the hallway, Rowan watched younger clinicians comparing workflow tips.
“You still edit all of yours?” one asked.
Another laughed.
“Not unless something looks weird.”
“Same.”
“The platform gets me most of the way there.”
“Honestly, it’s usually right.”
One of the most important things behavior analysts should understand about AI is that modern AI systems are not truth machines. They are prediction machines. Large language models, recommendation systems, and many of the AI tools entering healthcare environments operate by identifying statistical relationships within large datasets and generating outputs based on those probabilistic relations. The systems cannot determine whether a probabilistic relation is objectively true. They generate what appears probable given patterns that exist in the training dataset.
This distinction sounds technical, but it has enormous practical implications.
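To make the difference between prediction and truth concrete, here is a minimal sketch of a toy next-word predictor. It is far simpler than a real large language model, and the four-sentence `corpus` and the function names are hypothetical, chosen only for illustration. The point it shows is that the output is whatever continuation was most frequent in the training text, with no step that checks whether the statement is true for any particular client.

```python
from collections import Counter, defaultdict

# Hypothetical toy "training data": the only text this model ever sees.
corpus = [
    "the client mastered the target",
    "the client mastered the target",
    "the client mastered the program",
    "the client refused the target",
]

# Count how often each word follows each other word.
follow_counts = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for current, nxt in zip(words, words[1:]):
        follow_counts[current][nxt] += 1


def next_word_probs(word):
    """Estimate P(next word | word) from the corpus counts."""
    counts = follow_counts[word]
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def most_probable_next(word):
    """Return the most frequent continuation, or None for an unseen word.

    Nothing here checks whether the continuation is true.
    """
    probs = next_word_probs(word)
    if not probs:
        return None
    return max(probs, key=probs.get)


probs = next_word_probs("client")
prediction = most_probable_next("client")
print(probs)       # "mastered" was seen 3 times out of 4, "refused" once
print(prediction)  # the majority pattern wins, whatever actually happened
```

In this sketch the model completes "client" with "mastered" because that pattern dominated the text it was given, even if the client in front of you did not master anything.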
Human expertise develops through repeated contact with contingencies. Clinicians encounter thousands of treatment decisions, supervision interactions, assessment outcomes, implementation challenges, and environmental variables throughout their careers. Over time, patterns emerge. Certain interventions tend to work under certain conditions. Certain environmental arrangements produce predictable outcomes. Certain forms of feedback consistently improve performance. Professional judgment gradually develops through contact with these regularities.
Mathematical models attempt to describe similar patterns. They formalize exactly how inputs and outputs relate, generate predictions that can be compared to what actually happens, and update their parameters to bring those predictions as close to reality as possible. In that sense, both people and models are sensitive to patterns. The difference is that one system directly contacts environmental contingencies, and it remains unclear exactly how its learning physically occurs. The other is a mathematical representation of relationships (i.e., a system of equations) derived from the available data captured around those contingencies, and its “learning mechanisms” (i.e., how parameters update in a model) are completely known. For AI systems whose output is text, this means learning the most probable sequence of words given the underlying training dataset, where the “known output” the system of equations updates toward is what users prefer and what will increase their engagement with the system. As text-based generative AI systems improve, their outputs become increasingly fluent, persuasive, and professional. The danger is not that the systems become obviously wrong. The danger is that they become correct often enough that humans stop distinguishing between confidence and accuracy.
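The predict, compare, and update cycle described above can be sketched in a few lines. This is a minimal illustration using gradient descent on a one-parameter model, not a description of how any particular AI product is trained. The `data` values, the `learning_rate`, and the number of steps are all assumptions made for the example.

```python
# Hypothetical observations: (input, observed outcome), roughly y = 2 * x.
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8)]


def mean_squared_error(weight):
    """Average squared gap between the model's predictions and the outcomes."""
    return sum((weight * x - y) ** 2 for x, y in data) / len(data)


weight = 0.0          # the model's single parameter, starting from a guess
learning_rate = 0.01  # assumed step size for each update

initial_error = mean_squared_error(weight)

for step in range(500):
    # Compare predictions to outcomes: gradient of the error w.r.t. the weight.
    gradient = sum(2 * (weight * x - y) * x for x, y in data) / len(data)
    # Update the parameter in the direction that shrinks the error.
    weight -= learning_rate * gradient

final_error = mean_squared_error(weight)
print(round(weight, 3), final_error < initial_error)
```

The loop moves the parameter toward whatever target it is given. If the target values were user preference or engagement rather than measured outcomes, the same arithmetic would fit those just as readily.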
After nearly a decade of reinforcement learning from human feedback, most hallucinations people encounter today are not as bizarre as they used to be. Many are not as easy to detect. All are statistically plausible outputs that have been shaped to sound like something a competent professional might write. And that is where the real challenge begins.
Later that week, Juniper attended a quality assurance review meeting, and the numbers looked excellent.
Chiron: The AI Literacy Series for ABA Professionals
A weekly newsletter exploring how ABA professionals can develop essential AI literacy skills to ensure ethical and effective practice in a rapidly changing field.