The Score You Never See: AI Bias on the Psychiatric Ward

There is a particular kind of powerlessness that belongs only to the acute psychiatric ward. You may have arrived in the back of a police car. You may not be entirely sure where you are, or why, or for how long. The door is locked from a side that is not yours. The people who decide whether you eat, sleep, leave, or are held down and injected are strangers in lanyards, and the version of events that ends up in your file is theirs, not yours. Now imagine that somewhere in that file, beneath the clinical notes and the medication chart, a statistical model has quietly run the numbers on you and produced a score. The score says you are likely to become violent. You will never see it. You will probably never be told it exists. But it may shape, in ways no one will ever fully reconstruct, whether the next few days of your life involve a conversation or a set of restraints.
This is no longer a thought experiment. In March 2026, a team led by researchers at the Centre for Addiction and Mental Health (CAMH) in Toronto published a study in the journal npj Mental Health Research that did something the field had largely avoided doing: it took a machine learning model of the kind increasingly proposed for psychiatric wards, trained it on real hospital data, and then asked not whether it worked, but who it worked against. The answer, reported in April 2026 by outlets including News-Medical and MSN, was uncomfortable in a way that should travel far beyond Toronto. The model systematically overestimated the risk of aggression for Black, Middle Eastern, and Indigenous patients relative to white patients with comparable clinical pictures. It was, in the most literal sense, more suspicious of some people than of others, and its suspicion fell along the oldest fault lines in medicine.
The Machine That Watches the Ward
To understand why this matters, you first have to understand what these systems are and what they are being asked to do. Predicting aggression in acute psychiatric care is one of the oldest and most fraught tasks in the speciality. Clinicians have always had to make a guess, often within minutes of meeting someone, about whether a patient poses a danger to themselves or others. Get it wrong in one direction and someone is hurt. Get it wrong in the other and you have subjected a frightened, unwell person to force they did not need. For decades that guess relied on structured checklists and clinical instinct. The promise of machine learning is that a model trained on tens of thousands of past cases might do better, spotting patterns a human under pressure would miss.
The CAMH study made the mechanics concrete. The researchers, with Yifan Wang as lead author alongside senior scientists including Laura Sikstrom and Marta Maslej, trained a model on structured electronic health records from 17,703 unique patients across ten inpatient units at the hospital, covering 42,719 observation days between January 2016 and May 2022. The model itself was a random forest, an ensemble of decision trees, and on conventional measures it performed respectably, returning an area under the receiver operating characteristic curve of around 0.81. By the usual yardstick of predictive accuracy, in other words, it was the sort of result that gets a tool greenlit for a pilot.
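For readers unfamiliar with the metric, the area under the receiver operating characteristic curve can be read as the probability that a model gives a randomly chosen case that ended in aggression a higher score than a randomly chosen case that did not. What follows is a minimal sketch of that pairwise definition. The scores are invented for illustration and the function name is hypothetical; none of it comes from the CAMH data or the team's code.

```python
def pairwise_auc(scores_positive, scores_negative):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a win, which is the usual convention.
    """
    wins = 0.0
    for p in scores_positive:
        for n in scores_negative:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_positive) * len(scores_negative))


# Invented risk scores: three cases that ended in aggression, four that did not.
aggressive_scores = [0.9, 0.7, 0.4]
calm_scores = [0.8, 0.3, 0.2, 0.1]

auc = pairwise_auc(aggressive_scores, calm_scores)
print(round(auc, 3))  # 0.833 (10 of 12 pairs ranked correctly)
```

The figure is a single ranking summary pooled over every patient, so on its own it cannot show which patients the misranked pairs belong to.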
That is precisely the problem. A model can hit its accuracy target overall while distributing its errors with grotesque unevenness. The CAMH team did not stop at the headline number. They broke the model's mistakes down by race and ethnicity, by gender, by housing status, by whether the patient had been brought in by police, and by the intersections between those categories. What they were measuring, in the language of algorithmic fairness, was the false positive rate: the proportion of people who were flagged as likely to become aggressive but who did not. A false positive is not an abstraction here. It is a person marked as a threat who was never going to be one.
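The definition above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the CAMH team's code: the record layout (group, flagged, became aggressive), the function name, and the toy counts are all invented for the example.

```python
from collections import defaultdict


def false_positive_rate_by_group(records):
    """Compute the false positive rate separately for each group.

    records: iterable of (group, flagged, became_aggressive) tuples.
    Only cases that did NOT become aggressive enter the denominator.
    """
    false_positives = defaultdict(int)
    negatives = defaultdict(int)
    for group, flagged, became_aggressive in records:
        if not became_aggressive:
            negatives[group] += 1
            if flagged:
                false_positives[group] += 1
    return {g: false_positives[g] / negatives[g] for g in negatives}


# Invented toy data: two groups of identical size and identical true aggression.
toy_records = (
    [("A", True, False)] * 4 + [("A", False, False)] * 96 + [("A", True, True)] * 10
    + [("B", True, False)] * 8 + [("B", False, False)] * 92 + [("B", True, True)] * 10
)

rates = false_positive_rate_by_group(toy_records)
print(rates)  # {'A': 0.04, 'B': 0.08}
```

In the toy data both groups contain the same number of genuinely aggressive cases, and the model catches all of them in each group. The only thing that differs is how many calm people are flagged, which is exactly the quantity an overall accuracy figure can hide.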
The disparities were stark and they were patterned. The model's false positive rate sat at roughly 0.040 for white patients and 0.032 for Asian patients. For Indigenous patients it rose to 0.055, for Black patients to 0.069, and for Middle Eastern patients to 0.080, the highest of any racial or ethnic group. Read those numbers slowly. A Middle Eastern patient was being wrongly flagged as a future aggressor at roughly twice the rate of a white patient with no greater propensity for violence. Layer gender on top and it sharpened further: Middle Eastern men carried a false positive rate of around 0.093. The single largest driver the researchers found was not skin colour in isolation but admission mode. Patients brought to hospital by police had a false positive rate of about 0.094, the highest of any figure cited here, and unstable or absent housing pushed the figure to roughly 0.083. The model had, in effect, learned to treat contact with the criminal justice system and poverty as proxies for danger, and those proxies map onto race because the society that generated the data made them map that way.
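To make the scale of those gaps concrete, the short sketch below takes the racial and ethnic false positive rates quoted above and expresses each as a ratio to the white rate and as extra wrongful flags per 1,000 cases that never became aggressive. The rates are the rounded figures given in this article; the choice of the white rate as the reference point and the per-1,000 framing are assumptions made for illustration, not calculations from the paper.

```python
# Rounded false positive rates as quoted in the text above.
fpr = {
    "White": 0.040,
    "Asian": 0.032,
    "Indigenous": 0.055,
    "Black": 0.069,
    "Middle Eastern": 0.080,
}

reference = fpr["White"]  # assumption: white patients as the comparison group

# Ratio of each group's false positive rate to the reference rate.
ratios = {group: rate / reference for group, rate in fpr.items()}

for group, rate in sorted(fpr.items(), key=lambda kv: kv[1]):
    # Extra wrongful flags per 1,000 non-aggressive cases, relative to reference.
    extra_per_1000 = (rate - reference) * 1000
    print(
        f"{group:<15} FPR {rate:.3f}  "
        f"ratio {ratios[group]:.2f}  "
        f"extra flags per 1,000: {extra_per_1000:+.0f}"
    )
```

Because the inputs are rounded to three decimal places, the ratios should be read as approximate.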
Garbage In, Prejudice Out
The instinct of the technically minded is to reach for a fix. If the model is biased, debias the model. Reweight the data, add fairness constraints, strip out the offending variables. But the CAMH findings, and a companion paper published months earlier, point at something the engineering instinct struggles to grasp: the bias is not a bug in the algorithm. It is a faithful transcription of the world.
Consider where the training data comes from. An aggression label in a psychiatric record is not a measurement in the way a blood pressure reading is. It is a human judgement, recorded by a clinician, about whether a patient was threatening, agitated, or violent. That judgement is made by people working in a system with a long and documented history of perceiving danger differently depending on who is in front of them. When the model learns to predict aggression, it is not learning to predict an objective event in the world. It is learning to predict who a hospital's staff, over six years, decided to write up as aggressive. If those decisions were skewed, the model inherits the skew and launders it through the authority of mathematics.
That history is not subtle, and it is not ancient. In 1851 the American physician Samuel Cartwright coined the term drapetomania, a supposed mental illness whose symptom was the desire of enslaved people to escape captivity. It was pseudoscience in service of subjugation, and it established a template that has proven remarkably durable: the pathologising of Black resistance as madness. A century later, during the civil rights era, the diagnosis of schizophrenia in American psychiatry shifted in its public face from an affliction associated with white middle class women to a condition projected onto angry, protesting Black men, a phenomenon the psychiatrist Jonathan Metzl traced in his book The Protest Psychosis. The legacy persists in the present tense. Black patients in the United States are diagnosed with schizophrenia at well over twice the rate of white patients, a gap that studies have repeatedly failed to explain by any difference in actual illness.
If clinicians have historically been quicker to see Black and Indigenous patients as dangerous, disordered, or threatening, then the records they generate encode that quickness. A model trained on those records does not see the centuries of context. It sees a correlation, and it optimises for it. This is the deeper meaning of the companion research. In a paper published in Scientific Reports on 1 December 2025, a team including several of the same CAMH researchers, led by Vejandla with colleagues including Sikstrom, Ratto, Zaheer, and Maslej, examined how biased AI recommendations actually influenced human decision making during simulated mental health emergencies. They found that systems trained on health records overestimated violence risk for marginalised groups. In the study's biased conditions, the AI recommended police intervention for between 50 and 90 per cent of vignettes depicting at-risk groups, including Black patients, men, unhoused patients, and those with severe mental illness, compared with about 20 per cent of vignettes depicting no-risk groups. Secondary analyses found the disparity statistically significant only for vignettes depicting Black as opposed to white individuals. The disparity, in other words, was not a quirk of one model at one hospital. It was a property of what happens when you train statistical systems on data produced by an unequal system and then put their outputs in front of human beings.
When the Pop-Up Becomes the Patient's Reality
The most troubling finding of the December 2025 study was not the existence of the bias but the failure of the obvious remedies. The researchers tested cognitive forcing interventions, techniques designed to slow clinicians down and make them think independently before deferring to the machine. They tried delaying the AI's recommendation, asking participants to commit to an initial judgement first, and making the AI optional rather than automatic. In other domains, such nudges have helped people resist automated advice. Here, they largely did not. People exposed to a biased recommendation tended to absorb the bias regardless of the procedural speed bump in their way.
One variable did seem to offer some protection, and it is a quietly damning one. Participants who scored high on a psychological measure called need for cognition, essentially a disposition to enjoy and engage in effortful thinking, were more resistant to the discriminatory pull of the AI. The implication is that the safeguard against an unjust algorithm was not the system design at all but the individual intellectual temperament of whoever happened to be reading the screen. That is not a safeguard a hospital can rely on at three in the morning on an understaffed ward.
This is where the abstraction of false positive rates collides with the body. An algorithmic flag does not stay on a dashboard. In an acute setting, a prediction of imminent violence is an invitation to act pre-emptively, and the tools of pre-emption are physical. A patient deemed high risk is more likely to be watched more closely, escalated more quickly, and ultimately subjected to the interventions the ward keeps for danger: physical restraint, seclusion, and chemical sedation. These are not neutral acts of caution. They are among the most harmful things a hospital can lawfully do to a person.
The Hands That Hold You Down
It is worth being unsparing about what these interventions involve, because the language of clinical guidelines tends to sand off their reality. Physical restraint means several staff members holding a person down, often face down, sometimes binding their limbs to a bed. Seclusion means locking a distressed person alone in a bare room. Chemical restraint, sometimes euphemised as rapid tranquillisation, means the forced injection of sedating drugs into someone who has not consented and may be physically resisting. None of these are rare or marginal practices. They are the standard repertoire of the acute ward, applied many thousands of times a year across the world's psychiatric systems, and they are precisely the actions that an algorithmic risk flag is designed to make more probable.
The harms are documented and they are severe. A systematic review of physical harm and death in the context of coercive psychiatric measures found that death was the single most frequently reported adverse outcome, with mechanisms including cardiac arrest from chest compression during prone restraint, asphyxiation, and pulmonary embolism. The wider literature catalogues aspiration, rhabdomyolysis, blood clots, musculoskeletal injury, falls, and post-traumatic stress. Research on high dose sedation for acute behavioural disturbance has found that loading patients with more medication does not produce faster or better sedation but does produce more adverse effects, including cardiac problems and dangerous drops in blood oxygen. Beyond the physical, forced medication is associated with worse mental health outcomes and a lasting erosion of trust in treatment, with patients reporting stronger disapproval of their care months later. In 2023 the World Health Organization and the United Nations human rights office issued joint guidance calling for an end to coercive practices in mental health services altogether, language that frames restraint and seclusion not as regrettable necessities but as human rights violations.
Now lay the algorithm's bias over this landscape, and the stakes become clear. The interventions that an over-predicting model makes more likely are interventions that already fall unequally. A 2022 study in Psychiatric Services by Colin Smith and colleagues, examining 12,977 emergency psychiatric encounters, found that Black patients had significantly higher odds of being restrained even after adjusting for clinical factors: an adjusted odds ratio of 1.35 for physical restraint and 1.33 for chemical restraint, meaning odds roughly a third higher in each case. The research on restraint-related deaths repeatedly notes the disproportionate presence of Black patients among the dead. An algorithm that overestimates the dangerousness of Black, Indigenous, and Middle Eastern patients does not introduce a new disparity into a fair system. It pours accelerant on a fire that has been burning for a very long time, and it does so while wearing the lab coat of objectivity.
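A note on reading the odds ratios quoted above: an odds ratio is not quite the same thing as "times as likely". How closely the two agree depends on how common restraint is in the comparison group. The sketch below uses the standard conversion from an odds ratio to a relative risk. The baseline restraint rates are hypothetical values chosen to show the effect and are not taken from Smith and colleagues, and applying the formula to an adjusted odds ratio is itself only an approximation.

```python
def odds_ratio_to_relative_risk(odds_ratio, baseline_risk):
    """Convert an odds ratio to a relative risk for a given baseline risk.

    baseline_risk is the probability of the outcome in the comparison group.
    Exact for a crude odds ratio; approximate for an adjusted one.
    """
    return odds_ratio / (1 - baseline_risk + baseline_risk * odds_ratio)


# Odds ratios as quoted in the text; baseline rates are illustrative assumptions.
quoted_odds_ratios = {"physical restraint": 1.35, "chemical restraint": 1.33}
hypothetical_baselines = (0.02, 0.10, 0.30)

for baseline in hypothetical_baselines:
    for outcome, odds_ratio in quoted_odds_ratios.items():
        rr = odds_ratio_to_relative_risk(odds_ratio, baseline)
        print(f"baseline {baseline:.0%}  {outcome:<18} OR {odds_ratio:.2f} -> RR {rr:.2f}")
```

When the outcome is uncommon, the relative risk sits close to the odds ratio. As the baseline rate rises, the same odds ratio corresponds to a smaller relative increase, though one that applies to many more people.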
The Vanishingly Small Print of Consent
What makes the psychiatric context distinct from almost every other arena where algorithmic bias has been studied is the near-total collapse of the patient's capacity to push back. When a biased model denies someone a loan or filters them out of a job applicant pool, the harm is real and the recourse is limited, but the person typically remains a free agent in the world, able in principle to ask questions, seek another lender, or hire a lawyer. The acutely unwell psychiatric inpatient has none of that. They may be detained involuntarily. They may be experiencing psychosis, which the surrounding system will treat as a reason to discount their account of events. They are frequently without an advocate, a family member, or anyone whose word carries weight against the clinical consensus. And the clinical authority arrayed against them is as close to absolute as exists anywhere in modern healthcare.
In that environment, the ordinary mechanisms of algorithmic accountability simply do not function. The idea that a patient might request an explanation of the model's logic, or contest its score, presumes a patient who knows the model exists, has the legal standing to challenge it, and possesses the cognitive and practical wherewithal to do so. Strip away those assumptions, as acute admission does, and you are left with a system that exercises power over people precisely in proportion to their inability to resist it. The patients most likely to be wrongly flagged, those brought in by police, those without stable housing, those who are Black or Indigenous or Middle Eastern, are very often the same patients least equipped to contest the flag. The bias and the powerlessness are not two separate problems. They compound.
What the Hospital Owes You
So what is actually owed, and by whom? Start with the hospitals, because they are where the abstraction becomes a person on a bed. A hospital that deploys a predictive model is making a clinical decision on behalf of every patient who passes through it, and the ordinary duties of medicine do not evaporate because a computer is involved. The first obligation is the most basic and the most frequently dodged: do not deploy a tool you have not tested for disparate harm. The striking thing about the CAMH work is how rare it remains. The researchers themselves framed their study as first-of-its-kind, which is an indictment of a field that has spent years celebrating predictive accuracy without routinely asking whose errors it is built on. A hospital that cannot say, in numbers, how its model's false positive rate varies by race has not done the work, and deploying anyway is not innovation but negligence.
The second obligation is to keep a human meaningfully in the loop, and to mean it. The December 2025 findings are a warning here, because they show that simply having a clinician read the output is not enough; the clinician absorbs the bias. Meaningful human oversight cannot be a rubber stamp on a screen. It has to be structured so that the model's suggestion can be genuinely overridden, so that staff are trained to interrogate rather than defer, and so that the institution treats an algorithmic flag as one contestable input rather than a verdict. The third obligation concerns the interventions themselves. If restraint, seclusion, and forced sedation are the downstream consequences of a flag, then any hospital using such a model owes its patients rigorous, race-disaggregated monitoring of those very interventions, with the explicit question of whether the algorithm is widening existing gaps. A model that quietly increases the coercion of already over-coerced groups is not a clinical aid. It is a liability dressed as one.
What the Builders Owe You
The developers who build these systems carry obligations of their own, and they cannot offload them onto the hospital that buys the product. The most fundamental is honesty about what the model actually predicts. A system marketed as predicting violence does not predict violence. It predicts recorded labels of aggression, which is a different and far more contaminated quantity. Developers know this, or should, and the gap between the marketing claim and the statistical reality is where much of the harm hides. A tool sold as objective when it is in fact a mirror of historical bias is mis-sold, and the consequences of that mis-selling are measured in restraints applied to the wrong people.
Beyond honesty comes the duty to test. Fairness auditing of the kind the CAMH team performed should be a precondition of release, not an academic afterthought published years into deployment. That means measuring false positive and true positive rates across racial, gender, housing, and admission-mode subgroups, and across their intersections, because the CAMH data showed that the worst disparities lived at the intersections. It means being transparent about those results to the institutions that buy the tool and, ideally, to the public. And it means accepting that some models should not ship. A system whose errors fall predictably on the most vulnerable patients in the building is not improved by a disclaimer. The CAMH researchers have since secured funding to develop a fairness-aware successor tool, which is the constructive response, but the existence of a better future tool does not retroactively justify deploying a biased one today.
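As a rough picture of what such an audit involves, the sketch below computes false positive and true positive rates for every single attribute and every intersection of attributes in a set of records. It is a minimal illustration under stated assumptions: the field names (`flagged`, `aggressive`, `group`, `admission`), the helper functions, and the toy counts are all hypothetical, and a real audit would also need confidence intervals and a policy for very small subgroups.

```python
from collections import defaultdict
from itertools import combinations


def audit(records, attributes):
    """Return FPR and TPR for each attribute value and each intersection.

    records: list of dicts holding the attribute keys plus boolean
    'flagged' (model output) and 'aggressive' (recorded label).
    """
    results = {}
    for size in range(1, len(attributes) + 1):
        for combo in combinations(attributes, size):
            counts = defaultdict(lambda: {"fp": 0, "tn": 0, "tp": 0, "fn": 0})
            for record in records:
                key = tuple(record[a] for a in combo)
                if record["aggressive"]:
                    cell = "tp" if record["flagged"] else "fn"
                else:
                    cell = "fp" if record["flagged"] else "tn"
                counts[key][cell] += 1
            for key, c in counts.items():
                negatives = c["fp"] + c["tn"]
                positives = c["tp"] + c["fn"]
                results[(combo, key)] = {
                    "n": negatives + positives,
                    "fpr": c["fp"] / negatives if negatives else None,
                    "tpr": c["tp"] / positives if positives else None,
                }
    return results


def rows(n, flagged, aggressive, **attrs):
    """Build n identical toy records (illustrative helper)."""
    return [dict(attrs, flagged=flagged, aggressive=aggressive) for _ in range(n)]


# Invented toy data; the counts carry no empirical meaning.
toy = (
    rows(5, True, False, group="X", admission="police")
    + rows(45, False, False, group="X", admission="police")
    + rows(2, True, False, group="X", admission="self")
    + rows(48, False, False, group="X", admission="self")
    + rows(1, True, False, group="Y", admission="police")
    + rows(49, False, False, group="Y", admission="police")
    + rows(1, True, False, group="Y", admission="self")
    + rows(49, False, False, group="Y", admission="self")
    + rows(8, True, True, group="X", admission="police")
    + rows(2, False, True, group="X", admission="police")
)

report = audit(toy, ["group", "admission"])
worst = sorted(
    (item for item in report.items() if item[1]["fpr"] is not None),
    key=lambda item: item[1]["fpr"],
    reverse=True,
)
for (combo, key), stats in worst[:3]:
    print(combo, key, stats)
```

On this toy data the intersection of group X and police admission has a higher false positive rate than either attribute taken alone, which is the pattern an audit limited to single attributes would miss.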
There is also a harder, more philosophical duty here, one the field has been slow to confront. A growing body of work, including the CAMH team's own framing, suggests that the most honest use of these models may not be to predict individual patients at all, but to detect systemic bias in the institutions that generate the data. Turn the lens around. Instead of asking the algorithm to tell you which patient is dangerous, ask it to tell you where your hospital's own judgements are skewed. That reframing, from individual risk prediction to institutional self-examination, is one of the few genuinely promising paths out of the trap, because it uses the model's pattern-finding power against the bias rather than in service of it.
What the Regulators Owe You
Then there are the regulators, who are, at present, mostly absent from the bedside. The regulatory architecture for clinical AI is being built in real time, and it is being built largely around the wrong questions. Under the European Union's AI Act, AI systems used in healthcare are designated high-risk, a classification that brings obligations around transparency, documentation, and human oversight, with major requirements scheduled to phase in across 2026 and beyond. In May 2026, however, the Council of the EU and the European Parliament reached a provisional agreement, under the Digital Omnibus initiative, to postpone the high-risk obligations. The agreement defers the requirements for standalone high-risk systems under Annex III until 2 December 2027, and those for high-risk AI embedded in regulated products such as medical devices under Annex I until 2 August 2028. High-risk status is the right instinct, but a designation is only as good as its enforcement, and the Act's transparency requirements run into the same wall that defeats the patient: meaningful explanation of an opaque model remains, by the regulators' own admission, largely undefined in practice.
The data protection regime offers a sharper, if narrower, tool. Article 22 of the General Data Protection Regulation gives people the right not to be subject to decisions based solely on automated processing where those decisions have significant effects, along with rights to obtain human intervention, to express their view, and to contest the outcome. On paper, a violence prediction that channels someone towards restraint is exactly the kind of significant automated decision the provision was written for. In the psychiatric ward, however, the law's assumptions break down. Article 22 applies only to decisions made solely by automation, and a hospital can defeat the protection simply by keeping a clinician nominally in the loop, even one who, as the December 2025 study showed, may be doing little more than ratifying the machine's bias. The right to contest presupposes a person able to exercise it, which the acute patient frequently is not. The regulation was built for the credit application and the recruitment filter, not for the locked ward, and it shows.
What patients actually need from regulators is more specific than anything currently on the books. They need a right to know, in plain terms, that an algorithm has assessed them and what it concluded, with that disclosure made not in the moment of acute crisis but as a matter of standard record that they or their advocate can later examine. They need a presumption that algorithmic risk scores are disclosable in any review of restraint or seclusion, so that a coercive intervention can be challenged on the basis of the evidence that helped trigger it. They need mandated, published fairness audits as a condition of clinical deployment, with the false positive disparities the CAMH team measured treated as the floor of what must be reported, not a research novelty. And they need the human oversight requirement to have teeth, defined not by the presence of a clinician but by demonstrable, structured independence of judgement from the model's output.
The Right to Know You Were Suspected
Underneath all the specific obligations sits a single principle that the law has not yet caught up to, and it is the one a person on the ward would care about most. If a system has assessed you as a threat, you have a right to know it did. That right does not depend on whether you can understand the mathematics, or whether the model was accurate, or whether you were ultimately restrained. It is prior to all of that. It is the difference between being a patient and being a suspect, between care and surveillance, between a person whose treatment is being negotiated and an object whose behaviour is being managed.
The reason this right is so easily denied is the same reason it matters so much. The whole logic of pre-emptive risk prediction is that it works on you before you have done anything, which means it works best when you do not know it is working. A flag that the patient could see and contest is a flag with friction, and friction is precisely what the efficiency case for these tools is designed to remove. So the systems are built to be invisible, and the invisibility is not incidental. It is the point. The acutely unwell patient is the ideal subject for an opaque algorithm precisely because they are in no position to demand the lights be turned on.
There is a version of the future where this technology helps. Used to audit institutions rather than to judge individuals, tested relentlessly for disparate harm, kept subordinate to genuinely independent human judgement, and made visible to the people it assesses, a model could in principle expose the very biases it currently encodes. The CAMH team's pivot towards building a fairness-aware tool and towards using these methods to surface systemic inequity rather than to predict patients is a glimpse of that future, and it deserves to be taken seriously rather than dismissed as naive.
But the present is the present. Right now, in wards on more than one continent, the default trajectory is the opposite one: quiet deployment, accuracy figures that flatter the tool while hiding who it fails, oversight that defers rather than challenges, and patients who will never learn that a number was attached to their name. The people most exposed to that trajectory are the ones who have always been most exposed to the wrong end of psychiatric power: the Black patient brought in by police, the Indigenous patient without stable housing, the Middle Eastern man already statistically twice as likely to be wrongly marked as dangerous. The machine did not invent their predicament. It learned it, from us, and it is now prepared to repeat it with a confidence no human clinician could ever quite muster. The question the CAMH study leaves hanging is not whether the algorithm is biased. We know that it is. The question is whether the people it judges will ever be allowed to know it too.
References
Wang, Y., Sikstrom, L., Xiao, R., Findlay, Z., Zaheer, J., Hill, S. L., & Maslej, M. M. (2026). Fairness analysis of machine learning predictions of aggression in acute psychiatric care. npj Mental Health Research, 5, Article 16. https://www.nature.com/articles/s44184-026-00194-6
Centre for Addiction and Mental Health. (2026, April 7). First-of-its-Kind Study Shows AI Risk Prediction Tools in Psychiatry Can Reinforce Systemic Bias. CAMH News and Stories. https://www.camh.ca/en/camh-news-and-stories/rsch-study-shows-ai-risk-prediction-tools-in-psychiatry-can-reinforce-systemic-bias
News-Medical. (2026, April 7). AI models may amplify bias in psychiatric aggression predictions. https://www.news-medical.net/news/20260407/AI-models-may-amplify-bias-in-psychiatric-aggression-predictions.aspx
MSN / Medical Xpress. (2026, April). AI risk prediction tools in psychiatry can reinforce systemic bias. https://www.msn.com/en-us/health/other/ai-risk-prediction-tools-in-psychiatry-can-reinforce-systemic-bias/ar-AA20n69Y
Vejandla, S., Ray, A., Sikstrom, L., Ratto, M., Zaheer, J., & Maslej, M. M. (2025, December 1). Impacts of cognitive forcing and need for cognition on biased AI-assisted decision making about mental health emergencies. Scientific Reports. https://www.nature.com/articles/s41598-025-30506-3
Smith, C. M., Turner, N. A., Thielman, N. M., Tweedy, D. S., Egger, J., & Gagliardi, J. P. (2022). Association of Black Race With Physical and Chemical Restraint Use Among Patients Undergoing Emergency Psychiatric Evaluation. Psychiatric Services. https://psychiatryonline.org/doi/10.1176/appi.ps.202100474
Kersting, X. A. K., et al. (2019). Physical Harm and Death in the Context of Coercive Measures in Psychiatric Patients: A Systematic Review. Frontiers in Psychiatry. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6580992/
Calver, L. A., et al. (2013). A prospective study of high dose sedation for rapid tranquilisation of acute behavioural disturbance in an acute mental health unit. BMC Psychiatry. https://pmc.ncbi.nlm.nih.gov/articles/PMC3848824/
World Health Organization & Office of the United Nations High Commissioner for Human Rights. (2023). Mental Health, Human Rights and Legislation: Guidance and Practice. https://www.who.int/publications/i/item/9789240080737
Metzl, J. M. (2009). The Protest Psychosis: How Schizophrenia Became a Black Disease. Beacon Press. See also: Schwartz, R. C., & Blankenship, D. M. discussion in American Journal of Psychiatry. https://psychiatryonline.org/doi/full/10.1176/appi.ajp.2009.09101398
American Psychological Association reporting on schizophrenia overdiagnosis, summarised in: Amsterdam News. (2025, September 4). Racial bias in medicine is driving the overdiagnosis of schizophrenia in Black patients. https://amsterdamnews.com/news/2025/09/04/racial-bias-driving-overdiagnosis-of-schizophrenia-in-black-patients/
European Union. Artificial Intelligence Act, Annex III: High-Risk AI Systems. https://artificialintelligenceact.eu/annex/3/
Tandem Health. EU AI Act explained: what healthcare organisations need to know. https://tandemhealth.ai/resources/knowledge/eu-ai-act-explained-what-healthcare-organisations-need-to-know
Goldsteen, A., et al. A health-conformant reading of the GDPR's right not to be subject to automated decision-making. Medical Law Review, 32(3), 373. https://academic.oup.com/medlaw/article/32/3/373/7732100
GDPR Article 22: Automated Decision-Making, Profiling, and Your Rights. https://gdprinfo.eu/gdpr-article-22-explained-automated-decision-making-profiling-and-your-rights

Tim Green
UK-based Systems Theorist & Independent Technology Writer
Tim explores the intersections of artificial intelligence, decentralised cognition, and posthuman ethics. His work, published at smarterarticles.co.uk, challenges dominant narratives of technological progress while proposing interdisciplinary frameworks for collective intelligence and digital stewardship.
His writing has been featured on Ground News and shared by independent researchers across both academic and technological communities.
ORCID: 0009-0002-0156-9795
Email: tim@smarterarticles.co.uk