Feedback Is the Feature, Not a Bug

When an AI system in a GP surgery does something wrong, the instinct is to treat it as evidence that the technology is flawed. When it does something right, it is quietly taken for granted. This asymmetry shapes the entire public conversation about AI in primary care, and it misses something fundamental about how these systems actually work.

The negative feedback that AI voice systems in GP surgeries have received is not a problem to be managed. It is the most valuable data those systems have ever generated. The question is not whether the feedback exists. It is whether anyone is listening to it, and what happens when they do.

How AI Systems Actually Improve

There is a common misconception that AI products are finished when they are released. In reality, the most significant improvements in any deployed AI system happen after release, not before it. The reason is straightforward. No amount of testing in a controlled environment replicates the full complexity of real-world use. Real patients, with real accents, real anxieties, real names that do not appear in any training dataset, and real situations that no developer anticipated, are the only source of the data that makes a system genuinely better.

This is not a justification for releasing systems that are not ready. It is an explanation of why real-world deployment, done responsibly and with robust feedback mechanisms, is an essential part of the development process rather than the end of it. A system that has only ever been tested in a lab has not been tested at all, not really. A system that has been running in dozens of real NHS GP surgeries, handling hundreds of thousands of real patient calls, and feeding what it learns back into its own development, is a fundamentally different proposition.

QuantumLoopAI has been operating EMMA in real NHS environments for over two years. The phonetic alphabet requirement that early patients found frustrating was identified through real-world feedback and removed. Accent recognition has improved through exposure to real patient voices from across England. Call handling times have fallen. Each of these improvements came from the same source: listening to what the system's actual performance revealed, and acting on it. That is not a development process that ends at launch. It is one that never ends, because the standard being aimed for is not a fixed specification but the continuously evolving needs of real patients.

What Good Feedback Loops Look Like

Not all feedback mechanisms are equal. Collecting complaints is not the same as having a feedback loop. A feedback loop requires several things to work: a consistent way of capturing what is going wrong, the analytical capability to identify patterns rather than individual incidents, the organisational commitment to act on what the analysis reveals, and the technical infrastructure to translate those actions into product improvements quickly enough to matter.
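The middle requirement, identifying patterns rather than individual incidents, is the easiest to make concrete. A minimal sketch of that step, assuming a hypothetical incident log whose categories and threshold are invented for illustration and do not reflect any real system's schema:

```python
from collections import Counter

# Hypothetical incident log; categories and the threshold are assumptions
# made for illustration, not a real taxonomy.
incidents = [
    {"call_id": 101, "category": "name_not_understood"},
    {"call_id": 102, "category": "asked_for_human"},
    {"call_id": 103, "category": "name_not_understood"},
    {"call_id": 104, "category": "booking_failed"},
    {"call_id": 105, "category": "name_not_understood"},
]

def recurring_patterns(log, min_count=3):
    """Return categories that recur often enough to count as a pattern,
    filtering out one-off incidents."""
    counts = Counter(record["category"] for record in log)
    return {category: n for category, n in counts.items() if n >= min_count}

print(recurring_patterns(incidents))  # {'name_not_understood': 3}
```

The point of the threshold is the distinction the paragraph draws: a single complaint is an incident to resolve; three of the same kind is a signal to act on.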

In healthcare, this is harder than in most sectors because the stakes are higher and the regulatory requirements are more demanding. Every change to a clinical system requires validation. Every new language added to EMMA, for example, goes through clinical sign-off before it is released, because a healthcare tool that has not been properly validated is not a tool. It is a liability. That process takes longer than it would in a consumer app. It should. The alternative, moving fast and breaking things in a system that patients depend on for healthcare access, is not acceptable.

But rigour and speed are not opposites. A well-designed feedback loop in a healthcare AI system can identify a problem, validate a solution and deploy an improvement in a timeframe that would have been unthinkable in the pre-digital era of healthcare improvement. The NHS 10 Year Health Plan, published in July 2025, commits the government to investing £10 billion in NHS technology and digital transformation by 2028 and identifies AI as one of five transformative technologies for the health service. That investment will only deliver value if the systems it funds are built with the kind of feedback architecture that allows them to keep getting better.

The Difference Between Complaints and Intelligence

There is an important distinction between treating patient feedback as a complaints management exercise and treating it as product intelligence. The first approach asks: how do we resolve this individual complaint and stop it generating negative attention? The second asks: what does this complaint, alongside thousands of others, tell us about where our system is falling short and what we should do about it?

The NHS has historically been better at the first than the second. Written complaints to the NHS reached a record high of 241,922 in 2023 to 2024, with communication representing the single largest category at 17.1 per cent. That is an enormous amount of patient intelligence about where the system is failing to communicate effectively. The question is whether it is being systematically analysed and acted upon or whether it is being processed, responded to individually and filed.

AI systems offer something the traditional complaints process does not: the ability to analyse patterns across large volumes of interactions at a speed and scale that human review cannot match. Every call that EMMA handles generates data: calls where patients had to repeat themselves, calls where the system failed to understand a name, calls that ended without a successful booking, calls where the patient asked to speak to a human. That data, aggregated and analysed, reveals failure patterns that no individual complaint would surface on its own. It is, in the right hands, one of the most powerful quality improvement tools in primary care.
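That kind of aggregation can be sketched in a few lines. This is illustrative only: the per-call field names below are assumptions invented for the example, not a real call-log schema.

```python
# Hypothetical per-call log; field names are assumptions for this sketch.
calls = [
    {"repeats": 2, "name_failed": True,  "booked": False, "asked_human": False},
    {"repeats": 0, "name_failed": False, "booked": True,  "asked_human": False},
    {"repeats": 1, "name_failed": False, "booked": True,  "asked_human": True},
    {"repeats": 0, "name_failed": False, "booked": True,  "asked_human": False},
]

def failure_rates(log):
    """Share of calls showing each failure signal: the aggregate view
    that no individual complaint would reveal on its own."""
    n = len(log)
    return {
        "repeated_themselves": sum(c["repeats"] > 0 for c in log) / n,
        "name_not_understood": sum(c["name_failed"] for c in log) / n,
        "no_booking": sum(not c["booked"] for c in log) / n,
        "asked_for_human": sum(c["asked_human"] for c in log) / n,
    }

print(failure_rates(calls))
```

Run over hundreds of thousands of calls rather than four, the same computation turns individual frustrations into a ranked list of where the system most needs to improve.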

What Patients and the NHS Should Expect

The willingness to be held accountable to feedback is one of the clearest signals of whether an AI company in healthcare should be trusted. A company that treats criticism as a threat to be managed is not going to improve. A company that treats it as information to be acted on will.

Patients should be able to expect that when they have a bad experience with an AI system in their GP surgery, something changes as a result. Not necessarily immediately, not always visibly, but systematically. The feedback they provide should flow into an improvement process that makes the next patient's experience better. That is not a luxury or an aspiration. In a healthcare context, it is a minimum standard.

The NHS 10 Year Health Plan is the most explicit commitment to AI-enabled healthcare transformation in the health service's history. It will create enormous opportunities for companies that have built the feedback and improvement infrastructure to grow with it. It will also create significant accountability for those that have not. The difference between the two will not be visible at launch. It will be visible two years later, in whether the systems deployed at scale are genuinely better than they were at the start, or whether they have simply got larger.

The technology that earns its place in the NHS of the next decade will be the technology that listens.


Sources: