Investigators use machine learning to predict suicide risk
Vanderbilt University Medical Center Research News Mar 15, 2017
According to the Centers for Disease Control and Prevention (CDC), in 2013 there were 41,149 suicides in the U.S., making it the 10th leading cause of death that year. The CDC also estimates that in 2013, 2.7 percent of high school students had, at some point in the previous 12 months, made a suicide attempt resulting in an injury, poisoning or overdose that required medical attention.
Could machine learning provide a way forward? What information would be required to improve suicide risk assessment, and could it perhaps already be available in routine electronic health records?
These questions are the starting point for a study published in the journal Clinical Psychological Science by researchers from Vanderbilt and Florida State universities. Results of this study suggest that research into suicide risk is about to change gears. The researchers demonstrate how automated risk analysis based on machine learning and electronic health record data can vastly outperform previous suicide attempt risk-scoring tools.
For discriminating between attempted suicide and non-suicidal self-harm, the predictive models developed by the team ranged from 80 percent accurate at two years prior to attempted suicide, to 84 percent accurate at one week prior to an attempt. The models also had low false-negative rates in these patients, ranging from 3.5 percent down to 1.2 percent.
Predictions got even better when the models were used in the general patient population, with accuracy ranging from 84 percent at two years prior to attempted suicide, to 92 percent at one week prior.
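For readers less familiar with the two measures quoted above, here is a minimal sketch of how accuracy and a false-negative rate are conventionally computed from a model's prediction counts. It assumes the article's "accurate" refers to plain classification accuracy; the study itself may report a different measure. The function names and the counts are invented for illustration and are not taken from the study.

```python
def accuracy(tp, fp, tn, fn):
    """Fraction of all predictions that were correct."""
    return (tp + tn) / (tp + fp + tn + fn)


def false_negative_rate(tp, fn):
    """Fraction of true cases that the model failed to flag."""
    return fn / (tp + fn)


# Hypothetical counts, made up for illustration only (not study data):
# tp = cases flagged, fn = cases missed,
# tn = controls cleared, fp = controls flagged.
tp, fp, tn, fn = 90, 15, 85, 10

print(f"accuracy            = {accuracy(tp, fp, tn, fn):.3f}")
print(f"false-negative rate = {false_negative_rate(tp, fn):.3f}")
```

A low false-negative rate matters in this setting because a missed case is a patient at risk whom the model did not flag.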
"For suicide and attempted suicide, one challenge facing conventional risk modeling has been the sheer cost of prospective clinical studies. With machine learning, we're able to put together large data sets from clinical data that's collected routinely, in order to generate high accuracy predictions in a scalable way. They might be applied to any enterprise with electronic records.
"There are pros and cons to both approaches, and clinical trials for suicide risk shouldn't be replaced, but we think our methods add new tools to capture risk at scale from day-to-day interactions," said Colin Walsh, MD, MA, assistant professor of Biomedical Informatics, Medicine and Psychiatry and Behavioral Science.
Walsh and colleagues started with de-identified records of adult patients seen at Vanderbilt between 1998 and 2015. They found 5,167 patients with billing codes indicating self-injury. A pair of clinical experts undertook separate reviews of this set, finding 3,250 cases, that is, 3,250 patients with a history of attempted suicide, and 1,917 controls, or patients with a history of non-suicidal self-injury.
The de-identified records were pared down to demographics, diagnoses, socioeconomic status, health care utilization and medication information. To find predictors within these data, a machine learning algorithm called "random decision forests" shuffled this set of records repeatedly, each time building a "decision tree" upon comparing the shuffled set to the expert-ordered set's strict segregation of cases and controls.
After thousands of shuffles, the algorithm became expert at predicting whether a randomly selected record from the training set was a case or a control. Finally, with a method called bootstrapping, the team used their training set to synthesize new data sets with which to measure the performance of their predictive models.
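The bootstrapping idea mentioned above can be sketched in a few lines: draw many new data sets of the same size from the original by sampling with replacement, then score the predictions on each one to see how much the performance estimate varies. This is only an illustration of the general technique; the article does not describe the team's exact procedure, and the function name, parameters and toy labels below are all assumptions.

```python
import random


def bootstrap_accuracy(y_true, y_pred, n_resamples=1000, seed=0):
    """Estimate accuracy and a rough 95% interval by bootstrap resampling."""
    rng = random.Random(seed)
    n = len(y_true)
    scores = []
    for _ in range(n_resamples):
        # Draw n record indices with replacement to form one synthetic data set.
        idx = [rng.randrange(n) for _ in range(n)]
        correct = sum(y_true[i] == y_pred[i] for i in idx)
        scores.append(correct / n)
    scores.sort()
    # Percentile interval taken from the sorted resample scores.
    lo = scores[int(0.025 * n_resamples)]
    hi = scores[int(0.975 * n_resamples) - 1]
    return sum(scores) / n_resamples, (lo, hi)


# Toy labels, invented for illustration (1 = case, 0 = control).
y_true = [1] * 50 + [0] * 50
y_pred = [1] * 42 + [0] * 8 + [0] * 45 + [1] * 5

mean_acc, (lo, hi) = bootstrap_accuracy(y_true, y_pred)
print(f"bootstrap accuracy: {mean_acc:.3f} (95% interval {lo:.3f} to {hi:.3f})")
```

Fixing the seed makes the resampling repeatable, which is convenient when comparing models on the same synthetic data sets.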
A second round of testing was set in the general patient population, using an additional 13,000 de-identified records as controls.
Walsh said discussions are proceeding at Vanderbilt about how to put machine-learning-based models of attempted suicide into use.