I never met Brian Thompson. And I never will, since he is dead now. Only a few days ago, the then-CEO of UnitedHealthcare was murdered in Midtown Manhattan while on his way to the company’s annual investor conference. This happened after Anthem, another US healthcare provider, had to reverse a policy that would have limited anesthesia coverage in some states if a surgery or procedure exceeded a set time limit.
Murder is always wrong. But do we have a name for what happens when one’s livelihood, career, and future are stripped away not in one swift, accidental motion but slowly and gradually milled down into nothingness by the relentless grindstones of faceless middle management in finance and insurance companies? Over the last decade, algorithmic decision-making has changed the human face of healthcare decision-making, and, by the looks of it, not for the better. Therefore, now that we are starting to build the next generation of agentic AI systems, we need to make sure that we have proper governance mechanisms in place.
Full disclosure: I have built AI systems in the regulated finance industry for many years and continue to do so. From 2012 to 2020, I implemented an AI for credit decision-making that makes about 200,000 decisions per year; the AI can automatically decide the positive case and, if it is uncertain, hands the application over to a human. In some countries, the auto-approval rate, i.e., the share of applications where the AI alone approves (it never declines), reached 80%. As of December 2024, Mercedes-Benz Finance has never been fined, faced regulatory action, or received any compliance findings in any of the markets I was responsible for. In lending, handing out the money is the easy part. Getting it back is hard, and it also sucks as a customer experience. When I designed my AI systems, I ensured that we understood our customers’ true financial conditions and adjusted the financial burden accordingly.
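To make that decision flow concrete, here is a minimal sketch of the idea: the model may only auto-approve, and everything it is uncertain about is routed to a human underwriter. The threshold and function names are my own assumptions for illustration, not the actual implementation.

```python
# Minimal sketch of an "approve or refer" credit decision flow.
# The threshold and names are illustrative assumptions, not a real system.

AUTO_APPROVE_THRESHOLD = 0.90  # assumed confidence required for auto-approval

def route_application(approval_probability: float) -> str:
    """Route a credit application based on the model's confidence."""
    if approval_probability >= AUTO_APPROVE_THRESHOLD:
        return "auto_approve"      # the only outcome the AI decides on its own
    return "refer_to_human"        # the AI never declines by itself

print(route_application(0.95))  # auto_approve
print(route_application(0.60))  # refer_to_human
```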
However, not all companies in that space were equally thoughtful about their implementations of cost-cutting programs thinly veiled as digital transformation. This post is about corporate responsibility, AI safety, and failing at both, as shown by brief case studies of BMW Finance, Volkswagen Finance, and UnitedHealthcare (UHC).
In 2016, BMW Finance Australia had to compensate 15,000 customers harmed by its compliance failures, to a total of AUD 77 million, consisting of:
$14.6 million in remediation payments;
$7.6 million in interest rate reductions on current contracts; and
$50 million in loan write-offs.
They had implemented an approval process in which applicants quickly got loans without being properly checked. This led to unemployed people, or people on welfare, getting access to luxury cars that they should never have been approved for.
In October 2024, Volkswagen Finance was fined £5.4 million and agreed to pay £21.5 million in redress to around 110,000 customers for aggressively repossessing cars without any consideration of other options. This caused significant hardship to customers who relied on their vehicles to travel to work or, even worse, needed them to actually perform their work. If you read the wording of the Financial Conduct Authority’s notice, you will realize that this brutal collections practice happened after taking the human out of the loop through templated and automated communications.
Both cases also highlight how failure to automate correctly can exacerbate hardship for individuals who are already facing serious challenges. And while these scenarios put people into financial hardship, it is easy to see how much more devastating the consequences can be for patients denied the healthcare they desperately need. Maybe that’s why Brian Thompson is dead. While the police are hunting the suspect, this should still make us stop and think about whether we, driven by the desire to build revenue, develop innovative cultures, and - yes - cut costs, are building systems that ultimately harm society.
The Meteoric Rise of UnitedHealthcare
UnitedHealthcare (UHC) is a subsidiary of UnitedHealth Group, which has experienced such unprecedented growth that it puts even mighty Apple to shame. Over the last 40 years, its stock price returned an astonishing 435,000%, although most of that gain came after 2010.
Since then, revenues in healthcare have exploded. In fact, the revenues of the six largest US insurers -- Anthem, Centene, Cigna, CVS/Aetna, Humana, and UnitedHealth -- have quadrupled since 2010 to USD 1.1 trillion. The combined revenues of only the three largest -- UHC, CVS/Aetna, and Cigna -- have even quintupled. It is apparent that this “insane” growth story of the last decade has come at the expense of the patients they set out to serve.
So it is no surprise that reports of massive claim denial rates paint a picture of a healthcare system designed to protect profits rather than provide care. If you look at the chart below, UHC alone denied one out of every three claims, double the industry average, claims that its customers had relied on in an already extremely expensive and inefficient healthcare system.
This raises uncomfortable questions about the ethical responsibility of such organizations and the algorithms they deploy.
nH Predict
The nH Predict algorithm was originally developed by NaviHealth (a subsidiary of UnitedHealth) and is designed to estimate the duration of post-acute care a patient might need after a hospital stay, such as time spent in skilled nursing facilities or in-home care. How it was actually used, though, was to deny claims algorithmically. And that’s the problem.
The nH Predict algorithm was trained on data from more than a million patients and predicts how much post-acute care a patient “should” need based on the patient’s diagnosis, living situation, age, gender, physical function, admission date, and other information. There is also a tool built around the core algorithm that generates reports, which are then used by physicians and company managers for evidence-based decision-making when deciding on a rehabilitation plan for the patient.
Evidently, the prediction algorithm is a core operational decision-making tool and contributes significantly to the final claim decision. It is estimated that about 26,000 Americans die annually from lack of insurance coverage.
This is a snippet from the lawsuit “Estate of Gene B. Lokken et al. v. UnitedHealth Group, Inc. et al.”:
“The outcome report provided by nH Predict provides a sort of profile of each patient that includes a score for a few of the patient’s functions also based on the data of similar patients analyzed in the past. The profile includes scores on the patient’s basic mobility, such as wheelchair skills or ability to take the stairs, cognitive abilities, such as memory and communication, and daily activity (for example, dressing and bathing). The profile report produced by nH Predict includes a total average score for the patient that is based on a combination of single scores.” source
The nH Predict algorithm is proprietary, but as a practitioner in the field, I find it hard to believe that the technology used is anything other than a scorecard with policy rules or a decision tree, with the respective weights being adjusted based on the patient data.
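To illustrate what such a scorecard could look like, here is a purely hypothetical sketch that combines the functional sub-scores described in the excerpt into a total score and a toy length-of-stay rule. The weights, threshold, and field names are my own assumptions; this is emphatically not nH Predict’s actual implementation.

```python
# Purely hypothetical sketch of a scorecard-style model as described in the
# lawsuit excerpt above. NOT nH Predict's actual implementation: the field
# names, weights, and baseline are invented for illustration only.
from dataclasses import dataclass

@dataclass
class PatientProfile:
    mobility: float        # e.g. wheelchair skills, ability to take stairs (0-100)
    cognition: float       # e.g. memory, communication (0-100)
    daily_activity: float  # e.g. dressing, bathing (0-100)

# Weights a scorecard might assign to each functional domain (assumed values).
WEIGHTS = {"mobility": 0.4, "cognition": 0.3, "daily_activity": 0.3}

def total_score(p: PatientProfile) -> float:
    """Combine the single scores into one total weighted score."""
    return (WEIGHTS["mobility"] * p.mobility
            + WEIGHTS["cognition"] * p.cognition
            + WEIGHTS["daily_activity"] * p.daily_activity)

def predicted_length_of_stay(p: PatientProfile, baseline_days: int = 20) -> int:
    """Toy policy rule: higher functional scores lead to a shorter predicted stay."""
    return max(1, round(baseline_days * (1 - total_score(p) / 100)))

patient = PatientProfile(mobility=55, cognition=70, daily_activity=60)
print(total_score(patient))                # 61.0
print(predicted_length_of_stay(patient))   # 8 predicted days of post-acute care
```

A model like this is trivially explainable, which is exactly why the governance around it, not the mathematics, is where the harm originates.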
Of course, UnitedHealth denies using the algorithm to harm its customers. I mean, who would admit to that? But let’s be frank: unintended consequences are also consequences. Objectively, the results of this algorithm can be seen in the way denials are handled within the healthcare system. A class action lawsuit filed by two families on UHC Medicare plans alleges that UHC even pressured medical employees to follow nH Predict’s forecasted length of stay, thus keeping patients’ rehab stays within the algorithm’s prediction window. But the algorithm can be wrong. This example shows that in California, up to 80% of all health insurance denials were reversed by independent medical reviewers.
And that’s the point, isn’t it? As more copilots, agents, agent swarms, and crews seep into our daily lives with the promise of making everything nice and easy, we will increasingly face situations where an AI has taken a decision on behalf of a human, and we will be in no position to understand or even challenge it.
We should never be in a situation where an algorithm becomes a gatekeeper of our success.
Governance Frameworks
Most AI systems are expected to be deterministic. That means that, given a set of input data A, you expect output B every single time you run the algorithm. But there is more. I would never build systems that auto-decline or auto-deny anything. Even if people are known fraudsters or blacklisted, the application should always be double-checked by a human to catch cases of mistaken identity or spelling mistakes in identity card numbers. There should always be a mechanism that allows a human to override the machine’s decision. But, for obvious reasons, such an override can’t be made for just any reason: the way I implemented it in the past was with a template through which an analyst can add or deduct points as a manual intervention. For all of these cases, proper documentation is required, as the sketch below illustrates.
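Here is a minimal sketch of what such an override mechanism can look like: the analyst may add or deduct points, but only with a reason code and a written justification that is captured for the audit trail. The class, field names, and threshold are invented for illustration; this is not the actual system I built.

```python
# Minimal sketch of a human-override mechanism with mandatory documentation.
# Illustration of the principle described above; names and values are invented.
from dataclasses import dataclass, field
from datetime import datetime, timezone

APPROVAL_THRESHOLD = 600  # assumed cut-off score for auto-approval

@dataclass
class Decision:
    application_id: str
    model_score: int
    manual_adjustments: list = field(default_factory=list)

    @property
    def final_score(self) -> int:
        return self.model_score + sum(a["points"] for a in self.manual_adjustments)

    def override(self, analyst: str, points: int, reason_code: str, justification: str):
        """Analysts may add or deduct points, but never without documentation."""
        if not justification.strip():
            raise ValueError("An override requires a written justification.")
        self.manual_adjustments.append({
            "analyst": analyst,
            "points": points,
            "reason_code": reason_code,       # e.g. a code from a fixed template
            "justification": justification,
            "timestamp": datetime.now(timezone.utc).isoformat(),  # audit trail
        })

    def outcome(self) -> str:
        # Note: there is no automatic "declined" -- uncertain cases go to a human.
        return "approved" if self.final_score >= APPROVAL_THRESHOLD else "refer_to_human"

# Usage: an analyst corrects for income the model could not verify automatically.
decision = Decision(application_id="A-1234", model_score=575)
decision.override(analyst="jdoe", points=40, reason_code="VERIFIED_INCOME",
                  justification="Payslips confirm stable income above model estimate.")
print(decision.final_score, decision.outcome())  # 615 approved
```

The important design choice is that the override is structured data, not a free-form note: every adjustment carries who, how much, why, and when, so it can be audited externally later.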
The Human Cost of Algorithmic Failures
In 2024, more than 200 million Americans relied on private health insurance. Yet once they are churned through the harsh realities of a brutal system, they are tossed aside, handed crippling medical bills, and left unable to access treatment. UHC’s failure is emblematic of a broader failure to balance financial ambitions with humanity’s basic needs.
This is not about failing with good intentions.
In my opinion, nH Predict is a hellspawn of misaligned incentives. Not unlike the causes of the 2008 financial crisis, the problem is not the algorithm per se. It’s the policies, cultures, and practices that we build around it. As exposed through recent lawsuits, troubling ethical breaches that might implicate the use of nH Predict to systematically prioritize profits over people are remarkably easy to see.
That is a problem, especially if you are providing a service that is essential to people’s well-being. At the heart of these failures lies a tragic reality: for every claim denied or customer pushed into financial hardship, there is a person left to face the consequences of this algorithmic apathy. Addressing these governance issues requires that we build systems that augment our human needs rather than reduce them. Our systems should help us reach a “yes” faster and be able to explain clearly why a “no” meant “no”. In my opinion, any system that can influence human decision-making by providing suggestions or informing the public needs to be explainable and easily auditable by external parties.
For the systems we build and the models we train, we need, for all intents and purposes, to think deeply about how they might negatively affect 1, 15,000, 26,000, 100,000, or millions of people, and how we can govern the processes around them.
I hope the three cases I presented provide enough incentive to do so.