This post is a part of our Bioethics in the News series
By Tom Tomlinson, PhD
A recently reported study claims to more accurately predict how much longer patients will live. Researchers at Stanford University had a neural network train itself on patient records to build a model that would predict whether a patient would die within 3-12 months of any given date. The network trained on the electronic medical records (EMRs) of 177,011 Stanford patients, 12,587 of whom had a recorded date of death. The model was validated and tested on another 44,273 patient records. You can find the wonky details here.
The model can predict with 90% accuracy whether a patient will die within the window.
Now this is a lot better than individual physicians typically do. The problem is not just that such predictions are fraught with uncertainty, with more complex, interacting factors at work than anyone but a computer can handle. If uncertainty were the only factor, one would expect physicians’ prognostic errors to be randomly distributed. But they are not. Clinicians overwhelmingly err on the optimistic side, so the pessimists among them turn out to be right more often.
The study takes accurately predicting death to be a straightforwardly useful thing. It gives patients, families and clinicians more reliable, trustworthy information that is of momentous significance, better informing critical questions. Will I be around for my birthday? Is it time to get palliative or hospice care involved?
The study’s authors are particularly hopeful that the use of this model will prompt more timely use of palliative care services, and discourage overuse of chemotherapy, hospitalization, and admission to intensive care units in the last months of life—all well-documented problems in the care of terminally ill people, especially those dying of cancer. So this is a potentially very significant use of “big data” AI research methods to address major challenges in end-of-life care.
But making real progress toward these goals will take a lot more than this model can deliver.
Image description: A graphic on a blue gradient background shows the silhouette of an individual in the foreground containing colorful computer motherboard graphics. In the background are silhouettes of twelve more individuals standing in a line and containing black and white computer motherboard graphics. Image source: Maziani Sabudin/Flickr Creative Commons.
The first question is how it could inform decisions about what to do next. The limitation here is that the model uses events from my medical history occurring prior to the time it’s asked to predict my survival. Perhaps the decision I’m facing is whether to go for another round of chemotherapy for metastatic cancer, or instead to enter a Phase 3 clinical trial for a new therapeutic agent. The question (one might think) is what each option will add to my life expectancy.
Now if the training database had some number of patients who took that particular chemotherapy option, then that factor would have somehow been accounted for when the computer built the model. Assuming the model reliably predicted the mortality of those earlier patients, all we’d need to do is add that factor to my medical record as a hypothetical, run the model again, and see whether the prognosis changed.
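The "add it as a hypothetical and rerun" step can be sketched in a few lines. This is a toy illustration only: the logistic scoring function, the feature names, and the weights are all invented for the example and have nothing to do with the Stanford model's internals.

```python
import math

# Hypothetical weights for a toy logistic risk model; NOT the study's model.
TOY_WEIGHTS = {
    "metastatic_cancer": 1.2,
    "recent_icu_admission": 0.9,
    "chemo_regimen_a": -0.6,  # assumed to be a regimen seen in the training data
}
TOY_BIAS = -1.0


def predict_risk(record):
    """Return the toy model's probability of death within the window."""
    z = TOY_BIAS + sum(w for feat, w in TOY_WEIGHTS.items() if record.get(feat))
    return 1.0 / (1.0 + math.exp(-z))


def what_if(record, hypothetical_feature):
    """Score the record as it stands, then with one feature added as a hypothetical."""
    baseline = predict_risk(record)
    modified = dict(record, **{hypothetical_feature: True})
    return baseline, predict_risk(modified)


patient = {"metastatic_cancer": True, "recent_icu_admission": True}
baseline, with_chemo = what_if(patient, "chemo_regimen_a")
print(f"baseline risk: {baseline:.2f}, with hypothetical chemo: {with_chemo:.2f}")
```

Note what happens if the hypothetical is a feature the toy model never saw, such as `what_if(patient, "novel_agent")`: the score does not move at all, because the model has nothing to say about it.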
But is there something about the chemotherapy being offered that is different from the regimens on which the computer trained? Then the model will not be able to assess the significance of that difference for the patient’s survival. Obviously, this limitation will be even more radical for the experimental treatment option. So in the individual case, the model’s helpfulness in making prospective treatment decisions could be quite limited. It would have to be supplemented, or even supplanted, by old-fashioned clinical judgment, or alternative algorithmic prognostic tools.
This may be one reason the study authors imagine a different use: identify patients with 3-12 months life expectancy and refer them for a palliative care consultation. The idea is to push against the tendency already noted for physicians to wait too long in making palliative care or hospice referrals. Assuming the model is running all the time in the background, it could trigger an alert to the attending physician, or even an automatic palliative care referral for all those the model flagged.
Now, in my ethics consultation experience, getting an appropriate palliative care or hospice referral only one month preceding death would be a stunning accomplishment, let alone three months prior. But the key word here is “appropriate,” since the need for palliative care is not dictated by life-expectancy alone, but more importantly, by symptoms. Not every patient with a projected life expectancy between 3 and 12 months will be suffering from symptoms requiring palliative care expertise to manage. Automatic referrals requiring palliative care evaluations could overwhelm thinly staffed palliative care services, drawing time and resources away from patients in greater need.
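To make the contrast concrete, here is a minimal sketch of the difference between referring everyone the model flags and referring only flagged patients who also have a significant symptom burden. The `Patient` fields, the two thresholds, and the three-patient census are all hypothetical, chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    name: str
    predicted_risk: float  # model's probability of death in the 3-12 month window
    symptom_score: int     # hypothetical 0-10 rating of symptom burden


RISK_THRESHOLD = 0.9   # assumed cutoff for the model "flagging" a patient
SYMPTOM_THRESHOLD = 5  # assumed cutoff for needing specialist symptom management


def flagged(patients):
    """Everyone an automatic, prognosis-only rule would refer."""
    return [p for p in patients if p.predicted_risk >= RISK_THRESHOLD]


def referrals(patients):
    """Flagged patients whose symptoms also call for palliative care expertise."""
    return [p for p in flagged(patients) if p.symptom_score >= SYMPTOM_THRESHOLD]


census = [
    Patient("A", 0.95, 8),
    Patient("B", 0.92, 2),
    Patient("C", 0.40, 7),
]
print("flagged by prognosis alone:", [p.name for p in flagged(census)])
print("flagged and symptomatic:", [p.name for p in referrals(census)])
```

In this made-up census, the prognosis-only rule refers patient B, who has few symptoms. Neither rule would pick up patient C, whose symptoms are significant but whose predicted risk is low.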
Part of the problem here is the imprecision of the model, and the effects this may have on patient and provider acceptance of the results. A 90% chance of death within 3-12 months sounds ominous, but it leaves plenty of wiggle-room for unrealistic optimism: lots of patients will be confident that they are going to fall at the further end of that range, or that they will be among the 10% of cases the model got wrong altogether. And it’s not just patients who will be so affected. Their treating physicians will also be reluctant to conclude that there is nothing left to do, and that everything they did to the patient before has been in vain. Patients aren’t the only ones prone to denial.
And the nature of the AI-driven prognosis will make it more difficult to respond to patient skepticism with an explanation anyone can understand. As the authors point out, all we really know is that the model can predict within some range of probability. We don’t know why or how it’s done so. The best we can do is remove a feature of interest from the data (e.g., time since diagnosis), rerun the model, and see what effect it has on the predicted probability for the patient. But the model offers no reasons to explain why there was a change, or why it was of any particular magnitude. The workings of Artificial Intelligence, in other words, are not always intelligible. Acceptable explanations will still be left to the clinician and their patient.
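The remove-and-rerun probe looks roughly like the sketch below. The model here is a toy stand-in with made-up features and weights, not the neural network from the study; the point is only to show what the procedure yields, which is a number per feature and nothing more.

```python
import math

# Hypothetical weights for a toy logistic risk model; NOT the study's model.
TOY_WEIGHTS = {
    "age_decades": 0.3,
    "years_since_diagnosis": 0.4,
    "admissions_last_year": 0.5,
}
TOY_BIAS = -4.0


def predict_risk(record):
    """Return the toy model's probability of death within the window."""
    z = TOY_BIAS + sum(TOY_WEIGHTS[k] * v for k, v in record.items() if k in TOY_WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def ablation_effects(record):
    """Drop each feature in turn, rerun the model, and record the change in risk."""
    full = predict_risk(record)
    effects = {}
    for feature in record:
        reduced = {k: v for k, v in record.items() if k != feature}
        effects[feature] = full - predict_risk(reduced)
    return full, effects


patient = {"age_decades": 7, "years_since_diagnosis": 3, "admissions_last_year": 2}
full, effects = ablation_effects(patient)
print(f"full-record risk: {full:.2f}")
for feature, delta in effects.items():
    print(f"without {feature}: risk changes by {-delta:+.2f}")
```

The output says how much each feature moved the prediction for this one record. It says nothing about why, and with a real neural network there are no readable weights to inspect as there are in this toy.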
Tom Tomlinson, PhD, is Director and Professor in the Center for Ethics and Humanities in the Life Sciences, College of Human Medicine, and Professor in the Department of Philosophy at Michigan State University.
Join the discussion! Your comments and responses to this commentary are welcomed. The author will respond to all comments made by Thursday, March 8, 2018. With your participation, we hope to create discussions rich with insights from diverse perspectives.