(C) PLOS One [1]. This unaltered content originally appeared in journals.plosone.org.

Licensed under Creative Commons Attribution (CC BY) license.
url:https://journals.plos.org/plosone/s/licenses-and-copyright

------------

Development and validation of the Durham Risk Score for estimating suicide attempt risk: A prospective cohort analysis

Nathan A. Kimbrel (Durham Veterans Affairs (VA) Health Care System, Durham, North Carolina, United States of America; VA Mid-Atlantic Mental Illness Research, Education, and Clinical Center)

Date: 2021-08

In this study, we observed that the DRS was a strong predictor of future suicide attempts in both the combined development (AUC = 0.91) and validation (AUC = 0.92) cohorts. It also demonstrated good utility in many important subgroups, including women, men, Black, White, Hispanic, veterans, lower-income individuals, younger adults, and LGBTQ individuals. We further observed that 82% of prospective suicide attempts occurred among individuals in the top 15% of DRS scores, whereas 27% occurred in the top 1%. Taken together, these findings suggest that the DRS represents a significant advancement in suicide risk prediction over traditional clinical assessment approaches. While more work is needed to independently validate the DRS in prospective studies and to identify the optimal methods to assess the constructs used to calculate the score, our findings suggest that the DRS is a promising new tool that has the potential to significantly enhance clinicians’ ability to identify individuals at risk for attempting suicide in the future.

Three prospective cohort studies, including a population-based study from the United States [i.e., the National Epidemiologic Survey on Alcohol and Related Conditions (NESARC) study] as well as 2 smaller US veteran cohorts [i.e., the Assessing and Reducing Post-Deployment Violence Risk (REHAB) and the Veterans After-Discharge Longitudinal Registry (VALOR) studies], were used to develop and validate the DRS. From a total sample size of 35,654 participants, 17,630 participants were selected to develop the checklist, whereas the remaining participants (N = 18,024) were used to validate it. The main outcome measure was future suicide attempts (i.e., actual suicide attempts that occurred after the baseline assessment during the 1- to 3-year follow-up period). Measure development began with a review of the extant literature to identify potential variables that had substantial empirical support as longitudinal predictors of suicide attempts and deaths. Next, receiver operating characteristic (ROC) curve analysis was utilized to identify variables from the literature review that uniquely contributed to the longitudinal prediction of suicide attempts in the development cohorts. We observed that the DRS was a robust prospective predictor of future suicide attempts in both the combined development (area under the curve [AUC] = 0.91) and validation (AUC = 0.92) cohorts. A concentration of risk analysis found that across all 35,654 participants, 82% of prospective suicide attempts occurred among individuals in the top 15% of DRS scores, whereas 27% occurred in the top 1%. The DRS also performed well among important subgroups, including women (AUC = 0.91), men (AUC = 0.93), Black (AUC = 0.92), White (AUC = 0.93), Hispanic (AUC = 0.89), veterans (AUC = 0.91), lower-income individuals (AUC = 0.90), younger adults (AUC = 0.88), and lesbian, gay, bisexual, transgender, and queer or questioning (LGBTQ) individuals (AUC = 0.88). 
The primary limitation of the present study was its reliance on secondary data analyses to develop and validate the risk score.

Worldwide, nearly 800,000 individuals die by suicide each year; however, longitudinal prediction of suicide attempts remains a major challenge within the field of psychiatry. The objective of the present research was to develop and evaluate an evidence-based suicide attempt risk checklist [i.e., the Durham Risk Score (DRS)] to aid clinicians in the identification of individuals at risk for attempting suicide in the future.

Abbreviations: APA, American Psychiatric Association; AUC, area under the curve; AUDADIS, Alcohol Use Disorder and Associated Disabilities Interview Schedule; AUDIT, Alcohol Use Disorders Identification Test; BPD, borderline personality disorder; BSS, Beck Scale for Suicide Ideation; CAPS, Clinician-Administered PTSD Scale; CI, confidence interval; C-SSRS, Columbia-Suicide Severity Rating Scale; CTQ, Childhood Trauma Questionnaire; DAST, Drug Abuse Screening Test; DRS, Durham Risk Score; DTS, Davidson Trauma Scale; EHR, electronic health record; FN, false negative; FP, false positive; LEC, Life Events Checklist; LGBTQ, lesbian, gay, bisexual, transgender, and queer or questioning; MINI, Mini International Neuropsychiatric Interview; NESARC, National Epidemiologic Survey on Alcohol and Related Conditions; NPV, negative predictive value; NSSI, nonsuicidal self-injury; OR, odds ratio; PHQ-9, Patient Health Questionnaire-9; PPV, positive predictive value; PTSD, post-traumatic stress disorder; REHAB, Assessing and Reducing Post-Deployment Violence Risk; ROC, receiver operating characteristic; SBQ-R, Suicidal Behaviors Questionnaire-Revised; SCID, Structured Clinical Interview for DSM; SCL-90, Symptom Checklist-90; SHBQ, Self-Harm Behavior Questionnaire; SITBI, Self-Injurious Thoughts and Behaviors Interview; SPS, SAD PERSONS scale; TBI, traumatic brain injury; TLEQ, Traumatic Life Events Questionnaire; TN, true negative; TP, true positive; TRIPOD, Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis; VALOR, Veterans After-Discharge Longitudinal Registry; VR-12, Veterans Rand 12-Item Health Survey

Funding: This work was supported by a grant from the National Institute of Mental Health (NIMH; #R01MH080988) to E.B. and grants from the Department of Defense (DoD; #W81XWH-08-2-0100/W81XWH-08-2-0102) to T.K. N.K. (#I01CX001729) and J.B. (#lK6BX003777) also received support from the VA ORD Clinical Sciences Research and Development Service. In addition, the National Institute on Alcohol Abuse and Alcoholism (NIAAA) funded the National Epidemiologic Survey on Alcohol and Related Conditions (NESARC). N.K. was also supported by the Mental and Behavioral Health Service Line of the Durham VA Health Care System, the VA Mid-Atlantic Mental Illness Research, Education, and Clinical Center (MIRECC), the VA Health Services Research and Development Center of Innovation to Accelerate Discovery and Practice Transformation (ADAPT), and the Department of Psychiatry & Behavioral Sciences at the Duke University School of Medicine. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Data Availability: Data from the NESARC, REHAB, and VALOR datasets cannot be shared publicly because the Institutional Review Board requirements for these studies that the study authors conducted the present analyses under do not allow for public sharing of data; however, the data may be made available to researchers who meet criteria for access to confidential data. Please contact the following individuals (for NESARC: [email protected] ; for REHAB: [email protected] ; for VALOR: [email protected] ) for more information. Researchers interested in obtaining the data can also go to the following websites for more information on how to obtain access to each of the datasets: NESARC ( https://www.niaaa.nih.gov/research/guidelines-andresources/epidemiologic-data ); REHAB ( https://www.durham.va.gov/research/research.asp ); and VALOR ( https://www.boston.va.gov/services/Research.asp ).

This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.

Thus, there remains a pressing need for a risk assessment tool capable of helping clinicians to accurately identify individuals at risk for attempting suicide in the future. The Durham Risk Score (DRS; Fig 1) is a suicide attempt risk checklist developed using both rational and quantitative methods to meet this specific need. This report describes the initial development and validation of the DRS, including its utility in predicting future suicide attempts over a 1- to 3-year period across a large and diverse cohort of participants from the US [ 25 – 27 ]. In creating this measure, our goal was to build a suicide risk calculator similar in nature to the well-known Framingham Risk Score and pooled cohort equations that are widely used to screen individuals for 10-year risk of cardiovascular disease [ 28 ]. We hypothesized that combining a broad array of empirically supported risk factors for suicidal behavior [ 3 – 18 , 21 – 24 , 26 , 27 , 29 – 43 ] into a clinical checklist would significantly enhance clinicians’ ability to identify individuals at risk for attempting suicide in the future.

As a result of concerns over the poor diagnostic accuracy of both clinician prediction [ 10 , 19 ] and existing clinical suicide risk assessments [ 3 – 17 ], a number of statistically driven suicide risk algorithms based on electronic health record (EHR) data have been developed in recent years and are already showing substantial promise [ 21 – 24 ]; however, such approaches have also been criticized for having limited practical utility [ 24 ]. In addition to problems related to low positive predictive values (PPVs) [ 10 ], such models also have pragmatic shortcomings, such as (1) not being available for individuals outside the healthcare systems where they were originally developed; (2) not being able to be applied to first-time patients or patients who do not meet certain criteria (e.g., a history of mental health appointments in the EHR); (3) being impractical for clinicians to calculate on their own; and (4) being difficult for clinicians to interpret because scores are often derived from machine learning approaches that rely on hidden layers, nonlinear models, and complex higher-order interactions. Thus, while machine learning–based algorithms derived from EHR data appear to substantially outperform clinician prediction and traditional clinical assessment approaches in terms of diagnostic accuracy, they also have a number of pragmatic shortcomings that potentially limit their usefulness for practicing clinicians.

Regrettably, there is little reason to believe that clinician prediction is more accurate at predicting future suicidal behavior than structured assessments [ 10 , 19 ]. For example, Randall and colleagues [ 10 ] examined the accuracy of clinician prediction of suicide risk and found that clinician assessment was also only moderately accurate at predicting future suicide attempts (AUC = 0.73). Moreover, clinician prediction of future death by suicide was no better than chance (AUC = 0.55; 95% confidence interval [CI]: 0.36 to 0.73) [ 10 ]. These findings are consistent with a 2019 meta-analysis conducted by Woodford and colleagues [ 19 ] that evaluated the accuracy of clinician prediction in relation to future self-harm (note that the term “self-harm” encompasses both suicidal and nonsuicidal self-injury [NSSI]). This meta-analysis (which did not include the study by Randall and colleagues [ 10 ] cited above) estimated sensitivity for clinician prediction of future self-harm to be 0.31 [ 19 ], indicating that clinician prediction in the included studies failed to identify 69% of the individuals who went on to engage in future self-harm. While specificity (0.85) for clinician prediction of self-harm was markedly better than sensitivity, overall classification remained poor. Woodford and colleagues [ 19 ] did not report the AUC value for clinician prediction of future self-harm in their meta-analysis; however, in preparation for the present work, we utilized Idrees and colleagues’ [ 20 ] approach to calculate AUC from the classification data provided by Woodford and colleagues [ 19 ], which included 1,685 true positives (TPs), 5,996 false positives (FPs), 1,556 false negatives (FNs), and 13,262 true negatives (TNs). This calculation revealed that the AUC value for clinician prediction for future self-harm across the 22,499 cases examined by Woodford and colleagues [ 19 ] was 0.60 (where AUC = (1/2) * [(TP/(TP+FN)) + (TN/(TN+FP))]).
Thus, we concur with Woodford and colleagues’ conclusion that clinician estimation of future self-harm is too inaccurate to be clinically useful [ 19 ].
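The AUC calculation described above can be reproduced with a few lines of Python. This is a minimal sketch of the stated formula, AUC = (1/2) * [sensitivity + specificity], applied to the four counts quoted in the text; the function and variable names are illustrative and do not come from the original analysis.

```python
def auc_from_counts(tp: int, fp: int, fn: int, tn: int) -> float:
    """Single-threshold AUC: the mean of sensitivity and specificity."""
    sensitivity = tp / (tp + fn)  # TP / (TP + FN)
    specificity = tn / (tn + fp)  # TN / (TN + FP)
    return 0.5 * (sensitivity + specificity)


# Pooled classification counts quoted in the text above.
tp, fp, fn, tn = 1685, 5996, 1556, 13262

total_cases = tp + fp + fn + tn
auc = auc_from_counts(tp, fp, fn, tn)

print(total_cases, round(auc, 2))
```

The four counts sum to the 22,499 cases mentioned in the text, and the rounded result matches the reported value of 0.60.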

Given such findings, it is perhaps not surprising that the American Psychiatric Association’s (APA) Practice Guideline for the Assessment and Treatment of Patients with Suicidal Behaviors [ 18 ] recommends that psychiatrists utilize their clinical judgment to estimate patients’ overall level of suicide risk based on a comprehensive psychiatric evaluation, rather than relying on a standardized instrument to estimate suicide risk. The guideline further indicates that psychiatrists should consider no less than 70 different risk and protective factors when attempting to estimate patients’ suicide risk, including history of suicidal thoughts/behaviors (5 factors), psychiatric diagnoses (8 factors), physical illnesses (12 factors), psychosocial features (6 factors), childhood traumas (2 factors), genetic and familial effects (2 factors), psychological features (12 factors), cognitive features (4 factors), demographic features (6 factors), additional features (3 factors), and protective factors (10 factors) [ 18 ].

In England, Quinlivan and colleagues investigated the extent and type of suicide risk scales utilized by emergency department clinicians and mental health staff members from a stratified random sample of 32 hospitals and found that the most frequently used suicide risk assessment instruments were unvalidated, locally developed scales [ 11 ]. Indeed, 22 of 32 (68.8%) English hospitals included in this study used an unvalidated instrument to assess suicide risk, leading the authors to conclude that there is presently little consensus among clinicians and hospital systems regarding the best instrument to use to assess suicide risk [ 11 ]. In the remaining third of English hospitals included in the study, the SAD PERSONS scale (SPS) [ 12 ] emerged as the most frequently used standardized approach to suicide risk assessment [ 11 ]. Unfortunately, recent studies have found that the AUC for the SPS for prediction of future suicide attempts is not better than chance (AUC = 0.51 to 0.57) [ 13 , 14 ]. Two other similarly structured and frequently used clinical risk approaches, including the Manchester Self-Harm Rule [ 15 ] and the ReACT Self-Harm Rule [ 16 ], performed better (AUC = 0.71 for both [ 13 ]), but still well below the level of discrimination typically recommended for clinical decision-making (i.e., AUC ≥0.90). While discouraging, these findings are consistent with a recent systematic review and meta-analysis of currently available suicide risk instruments including (among others) the C-SSRS [ 5 ], BSS [ 9 ], SPS [ 12 ], the Manchester Self-Harm Rule [ 15 ], and the ReACT Self-Harm Rule [ 16 ] that concluded that there is presently “… no scientific support for the use of suicide risk instruments for predicting suicidal acts” [ 17 ].

Suicide accounted for 793,000 deaths worldwide in 2016 and was the second leading cause of death among 15 to 29 year olds [ 1 ]. Moreover, within the US, age-adjusted suicide rates have increased by 33% since 1999 [ 2 ]. Unfortunately, prospective prediction of suicidal behavior remains a major challenge for the field of psychiatry [ 3 ]. For example, a 2017 meta-analysis of longitudinal risk factors for suicidal behavior found the overall weighted odds ratio (OR) for prospective predictors of suicide attempts to be 1.5 [ 3 ]. When diagnostic accuracy was examined, no risk factor category (including suicide screeners) had a weighted area under the curve (AUC) greater than 0.61 for the prediction of future suicide attempts [ 3 ]. Similarly, a 2019 study [ 4 ] designed to prospectively evaluate several of the most commonly used suicide attempt risk instruments in the US, including the Columbia-Suicide Severity Rating Scale (C-SSRS [ 5 ]; a widely used suicide risk assessment instrument recommended for use in drug trials [ 6 ]), the Self-Harm Behavior Questionnaire (SHBQ [ 7 ]), the Suicidal Behaviors Questionnaire-Revised (SBQ-R [ 8 ]), and the Beck Scale for Suicide Ideation (BSS [ 9 ]), found that none of these instruments had an AUC above 0.67 in relation to future suicide attempts [ 4 ]. Similarly, a 2018 study by Randall and colleagues [ 10 ] also found that the C-SSRS was only moderately accurate at predicting future attempts (AUC = 0.67) and death by suicide (AUC = 0.68) [ 10 ].

Methods

Participants

National Epidemiologic Survey on Alcohol and Related Conditions study. The National Epidemiologic Survey on Alcohol and Related Conditions (NESARC) study [25,44,45] is a large, longitudinal general population study conducted by the National Institute on Alcohol Abuse and Alcoholism in the US. The initial NESARC study included a nationally representative sample of 43,093 participants assessed for a wide array of psychiatric and substance use issues in 2001 to 2002 [25]. Wave 2 occurred 3 years later and included follow-up interviews with 34,653 of the participants from Wave 1 (see Grant and colleagues [25,44,45] for additional details regarding study procedures for the NESARC project). The current analyses were restricted to the 34,641 NESARC participants who participated in Waves 1 and 2 and had follow-up suicide attempt data available from Wave 2. The random selection procedure from the IBM SPSS Statistics 24 software package was used to generate 4 random subsets of participants from the NESARC dataset, including 2 for development [NESARC 1 (N = 8,872) and NESARC 2 (N = 8,525)] and 2 for validation [NESARC 3 (N = 8,516) and NESARC 4 (N = 8,728); see Table 1 for sample characteristics]. Sampling was performed without replacement to ensure that no case was selected more than once. Note that the 4 subsets of participants from NESARC did not differ by rate of prospective suicide attempts, p = 0.973; lifetime suicide attempts, p = 0.729; gender, p = 0.541; age, p = 0.448; race, p = 0.814; sexual orientation, p = 0.839; education, p = 0.343; income, p = 0.67; or employment status, p = 0.923.
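The random split described above (4 disjoint subsets drawn without replacement) can be illustrated as follows. This is only a sketch: the study used the random selection procedure in IBM SPSS Statistics 24, not Python, and the participant IDs, the seed, and the function name here are hypothetical. Only the subset sizes are taken from the text.

```python
import random


def split_without_replacement(ids, sizes, seed=0):
    """Shuffle the IDs once, then cut them into consecutive disjoint subsets."""
    ids = list(ids)
    if sum(sizes) != len(ids):
        raise ValueError("subset sizes must sum to the number of participants")
    random.Random(seed).shuffle(ids)  # seeded for reproducibility
    subsets, start = [], 0
    for size in sizes:
        subsets.append(ids[start:start + size])
        start += size
    return subsets


# Hypothetical participant IDs; the subset sizes are those reported for NESARC 1-4.
participant_ids = range(34641)
sizes = [8872, 8525, 8516, 8728]
subsets = split_without_replacement(participant_ids, sizes, seed=24)

print([len(s) for s in subsets])
```

Because each ID appears exactly once in the shuffled list, no participant can land in more than one subset, which is the property the text attributes to sampling without replacement.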


Table 1. Descriptive statistics for the longitudinal samples used to develop and validate the DRS. https://doi.org/10.1371/journal.pmed.1003713.t001

Assessing and Reducing Post-Deployment Violence Risk study. The Assessing and Reducing Post-Deployment Violence Risk (REHAB) sample was comprised of Iraq/Afghanistan-era veterans from the US who participated in a 1-year longitudinal study entitled “Assessing and Reducing Post-Deployment Violence Risk” that focused on examining the association between post-traumatic stress disorder (PTSD), traumatic brain injury (TBI), and violence [26,46]. To be eligible for the present analyses, participants had to have no history of post-deployment suicide attempts at the time of the baseline assessment as well as follow-up suicide attempt data available for analysis. The former inclusion criterion was used to ensure that all prospective suicide attempts reported at the 6- and 12-month follow-up assessments truly represented new instances of suicide attempts, as this study relied exclusively on self-report to assess suicide attempts. Additional details regarding this study’s methodology can be found in Elbogen and colleagues [46] and Adkisson and colleagues [26].

Veterans After-Discharge Longitudinal Registry study. The Veterans After-Discharge Longitudinal Registry (VALOR) sample was comprised of US veterans who participated in the VALOR study [27,47], a 2-year longitudinal study of Iraq/Afghanistan-era veterans. Analyses were limited to participating veterans (N = 780) with complete baseline data and follow-up suicide attempt data available for analysis. Further details regarding this study’s methodology can be found in Rosen and colleagues [47] and Lee and colleagues [27].

Main outcome variable

The present analyses focused on the prediction of future suicide attempts (as opposed to death by suicide) for several reasons. First, death by suicide is an extremely rare event. In the US, the age-adjusted rate of suicide was 13.9/100,000 in 2019 [2]. Suicide attempts are far more common than suicide deaths [2,31,32] and are among the strongest known predictors of death by suicide [3,31,32]. Indeed, Olfson and colleagues [32] found that 1.6% of individuals who attempted suicide died by suicide within 12 months, whereas 3.9% died by suicide within 5 years. Suicide attempts are also routinely assessed in high-quality longitudinal datasets, whereas there are few, if any, longitudinal research databases with sufficiently large sample sizes to study death by suicide that also contain high-quality, systematically assessed data on established predictors of suicidal behavior collected via rigorous research-based assessments. Of note, Belsher and colleagues [24] recently recommended that future suicide risk models target more common outcomes, including suicide attempts specifically, to develop better performing models of suicide risk following their review of existing suicide risk models. It is also important to recognize that suicide attempts are highly serious events in their own right. As noted by the World Health Organization [48], “Suicide attempts result in a significant social and economic burden for communities due to the utilization of health services to treat the injury, the psychological and social impact of the behaviour on the individual and his/her associates and, occasionally, the long-term disability due to the injury.”

Assessment of suicide attempts.
Prospective suicide attempts were assessed by trained interviewers in the NESARC study during the Low Mood portion of the interview with the following question: “During that time since your LAST interview when (your mood was at its lowest/you enjoyed or cared least about things), did you attempt suicide?” Thus, for the vast majority of participants included in the present analyses, the main outcome variable was assessed by an interviewer who was explicitly trained to only record new instances of suicide attempts that occurred after the initial NESARC baseline assessment. Similarly, in the VALOR sample, the Self-Injurious Thoughts and Behaviors Interview (SITBI) [49] was administered at the 2-year follow-up by a trained interviewer who specifically focused on identifying new instances of suicide attempts that had occurred since the time of the baseline assessment. Specifically, Project VALOR participants were asked the following question in relation to the 2-year time period following the baseline assessment: “Have you ever made an actual attempt to kill yourself in which you had at least some intent to die?” Participants’ EHRs were also reviewed for instances of suicide attempts and/or death by suicide as part of Project VALOR. Further details regarding these procedures can be found in Lee and colleagues [27] and Rosen and colleagues [47]. Finally, in the REHAB sample, suicide attempts were assessed via self-report with a study-specific instrument designed to assess pre-deployment suicide attempts, deployment-based suicide attempts, and post-deployment suicide attempts separately [26]. 
Because this was the only study included in the present analyses that relied exclusively on self-report to assess prospective suicide attempts, veterans who reported 1 or more post-deployment suicide attempts at the time of the baseline assessment were excluded from the present analyses to ensure that any new instances of post-deployment suicide attempts reported at the 6- and 12-month follow-up assessments truly reflected new occurrences of suicide attempts and were not the result of a reporting error.

Overview of the analysis plan

The primary analyses underlying the development and validation of the DRS began in April 2018 and ended in July 2020 and were conducted under research protocols approved by the Institutional Review Boards of the Durham Veterans Affairs Health Care System, Duke University School of Medicine, and the VA Boston Healthcare System. Additional analyses requested by reviewers during the peer review process were conducted from March 2021 to April 2021. While a written prospective analysis plan was not developed prior to initiating work on this project, a systematic approach was used to develop and validate the DRS. Specifically, measure development began with a review of the extant literature on risk factors for suicidal behavior [3–18,21–24,26,27,29–43]. After identifying and ranking a wide array of potential longitudinal predictors of death by suicide and suicide attempts from the literature, secondary data analyses were conducted to develop the DRS in the development samples (i.e., NESARC 1, NESARC 2, and REHAB; combined N = 17,630). It was then tested in the validation samples (i.e., NESARC 3, NESARC 4, and VALOR; combined N = 18,024) to determine if it continued to be predictive in separate cohorts of similar size and composition. This study is reported as per the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) reporting guideline (see S1 TRIPOD Checklist).

Development of the Durham Risk Score

Because our primary goal was to develop a suicide attempt risk checklist that could be used by clinicians to reliably discriminate high-risk patients from low-risk patients, receiver operating characteristic (ROC) curve analysis was the primary statistical approach used to develop the DRS in the development samples. Logistic regression was also utilized to help guide variable selection procedures in some instances. ROC curves, correlation matrices, and chi-squared tests were used to evaluate bivariate associations and to identify optimal cut points or iterations of variables that were maximally predictive of suicide attempts. We elected to split the NESARC sample into 4 smaller samples to ensure that we would have (1) 2 large datasets in which to conduct the initial development work; and (2) 2 large datasets of similar size and composition in which to test the performance of the final selected model. That is, consistent with standard holdout cross-validation approaches that utilize a training dataset (T_tr) and a validation dataset (T_v) to avoid overfitting due to limiting the development sample to a single dataset, we utilized 2 large, randomly selected subsets of NESARC participants to develop the DRS. A third sample (REHAB), which was smaller, collected independently, and comprised entirely of veterans (many of whom had psychiatric disorders), was also included in the development phase to further protect against overfitting and to increase generalizability of findings. Thus, from a total sample size of 35,654 participants, 17,630 participants (including NESARC 1, NESARC 2, and REHAB) were utilized to develop the DRS, whereas the remaining samples (NESARC 3, NESARC 4, and VALOR; combined N = 18,024) were held out to test the performance of the DRS in testing datasets (T_t) of similar size and composition. Table 1 provides descriptive statistics for each of the samples included in the present analyses.
Consistent with recommendations for building appropriate and stable predictive models [50,51], independent variable selection was guided by theory [29,30], prior empirical investigations [3–5,7–18,21–24,26,27,29–43], clinical considerations [3–18,21–24,29–31,48], univariate and bivariate statistical analyses, and consideration of multicollinearity among independent variables. Accordingly, independent variable selection and screening began with a review of the relevant literature concerning risk factors for suicidal behavior [3–5,7–18,21–24,26,27,29–43]. An a priori decision was made to prioritize variables that had particularly strong empirical support as longitudinal risk factors in the literature (e.g., recent psychiatric hospitalization), even if their effects were less pronounced in our specific samples, in hopes of increasing the stability and replicability of the checklist in future work. To simplify quantification of the empirical evidence, we relied on Franklin and colleagues’ [3] meta-analysis, which, in our opinion, was the most comprehensive work on this subject available at the time of the analyses. The top 10 broad risk categories for suicide deaths and suicide attempts were assigned scores from 1 to 10, where a score of 10 was assigned to the broad risk categories most strongly associated with suicide deaths and attempts. In addition, the top 5 predictors of suicide deaths and suicide attempts identified in this meta-analysis were also assigned scores from 6 to 10. Thus, potential evidence scores ranged from 0 to 40 (see Table A in S1 File). Table 2 provides the empirical evidence score that we assigned to each of the variables based on the findings from Franklin and colleagues [3] as well as the potential impact of each variable’s entry into the model on the cumulative AUC value for different iterations of the DRS across the 3 development samples.
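One way to read the evidence scoring rule above is sketched below. It rests on an assumed interpretation that is not spelled out in the text: each variable receives four components (category rank for deaths, category rank for attempts, predictor rank for deaths, predictor rank for attempts), a rank of 1 earns 10 points, each lower rank earns 1 point less, and anything outside the top 10 categories or top 5 predictors earns 0. The function names and the example ranks are hypothetical.

```python
from typing import Optional


def rank_points(rank: Optional[int], top_n: int) -> int:
    """Rank 1 earns 10 points, rank 2 earns 9, and so on; unranked earns 0."""
    if rank is None or not 1 <= rank <= top_n:
        return 0
    return 11 - rank


def evidence_score(cat_deaths: Optional[int], cat_attempts: Optional[int],
                   pred_deaths: Optional[int], pred_attempts: Optional[int]) -> int:
    """Sum the four components (assumed interpretation, see lead-in)."""
    return (rank_points(cat_deaths, 10) + rank_points(cat_attempts, 10)
            + rank_points(pred_deaths, 5) + rank_points(pred_attempts, 5))


highest_possible = evidence_score(1, 1, 1, 1)            # top rank everywhere
lowest_possible = evidence_score(None, None, None, None)  # never ranked
tenth_category_only = evidence_score(10, None, None, None)  # hypothetical variable

print(highest_possible, lowest_possible, tenth_category_only)
```

Under this reading, category ranks contribute 1 to 10 points and top-5 predictor ranks contribute 6 to 10 points, which reproduces the stated range of 0 to 40.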


Table 2. Empirical evidence scores and impact on cumulative AUC values for each of the DRS variables across the 3 development samples. https://doi.org/10.1371/journal.pmed.1003713.t002

As can be seen in Table 2 and Table A in S1 File, a history of prior suicide attempts was the variable with the highest total empirical evidence score based on this approach (total empirical evidence score = 35; mean AUC = 0.62), whereas psychosis/schizophrenia was the lowest scoring variable (total empirical evidence score = 1; mean AUC = 0.52) that was considered. As can be seen in Fig A in S1 File, a statistically significant positive correlation (r = 0.37, p < 0.001) was observed between the total empirical evidence score and the mean bivariate AUC value for each construct considered across the 3 development samples, providing support for our general (albeit simplistic) approach to quantifying the empirical evidence for the variables considered. To ensure that scoring and interpretation remained as simple as possible (i.e., to ensure that higher scores would equal higher risk), an a priori decision was also made to only include dichotomous risk factors with obvious main effects. Thus, protective factors, risk factors that only had effects in the presence of other variables (e.g., through interactions), and scaled risk factors were excluded as potential predictors (although in several cases we were able to successfully dichotomize items collected on a scale, e.g., sleep problems and perceived health). Additionally, consistent with Babyak’s recommendations [50], overlapping constructs were aggregated in many instances to increase model stability and reduce the number of variables included in the checklist. As a result, composite variables were created for “mood disorders,” “substance use disorders,” “violence/incarceration,” “sexual abuse/sexual assault,” and “lesbian, gay, bisexual, transgender, and queer or questioning (LGBTQ)”.
An iterative, sequential approach to model building was taken whereby variables expected to have strong and pronounced effects on future risk for suicide based on the extant literature (e.g., prior suicide attempts, hospitalization, NSSI, and suicidal ideation) [3] were entered before variables with less empirical support (e.g., demographic predictors). We began by calculating ROC curves for each of the potential predictors across the 3 development samples (see Table A in S1 File). Then, beginning with the 2 variables we identified as having the strongest empirical support from the literature (i.e., prior suicide attempts and prior psychiatric hospitalization), we evaluated if the combination (i.e., sum) of these 2 variables resulted in an AUC value that was consistently higher across the development samples than the AUC values for the individual variables when examined separately. Utilizing this general approach, we systematically evaluated each new variable for potential inclusion in the checklist until we were unable to identify any additional variables that improved discrimination of high-risk individuals from low-risk individuals in 1 or more of the development samples (see Table 2). The final set of constructs selected for inclusion in the DRS are provided in Table 2, which also shows the impact of each variable’s entry into the model on the cumulative AUC value for different iterations of the DRS across the 3 development samples. It is, however, important to note that an iterative approach was taken to variable selection and that the constructs ultimately selected for inclusion in the DRS were those constructs that not only optimized predictive validity across the 3 development samples, but were also logical from both a theoretical and clinical perspective [18,29,30]. 
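The core check in this sequential procedure, whether the sum of two dichotomous variables discriminates better than either variable alone, can be sketched as follows. The data are a tiny hypothetical sample invented for illustration, not study data, and the AUC is computed with the standard rank-based (Mann-Whitney) definition rather than any particular statistics package. The study applied this kind of check across 3 development samples; the sketch uses a single sample for brevity.

```python
def auc(scores, outcomes):
    """Rank-based AUC: probability that a random case outscores a random non-case,
    counting ties as one half."""
    cases = [s for s, y in zip(scores, outcomes) if y == 1]
    non_cases = [s for s, y in zip(scores, outcomes) if y == 0]
    wins = sum((c > n) + 0.5 * (c == n) for c in cases for n in non_cases)
    return wins / (len(cases) * len(non_cases))


# Hypothetical toy sample of 8 participants (1 = present, 0 = absent).
future_attempt  = [1, 1, 1, 0, 0, 0, 0, 0]
prior_attempt   = [1, 1, 0, 0, 0, 0, 0, 1]
hospitalization = [1, 0, 1, 0, 1, 0, 0, 0]

# Candidate checklist: the simple sum of the two dichotomous items.
sum_score = [a + h for a, h in zip(prior_attempt, hospitalization)]

auc_attempt = auc(prior_attempt, future_attempt)
auc_hospital = auc(hospitalization, future_attempt)
auc_sum = auc(sum_score, future_attempt)

# Retain the combination only if it beats both individual variables.
keep_combination = auc_sum > max(auc_attempt, auc_hospital)
print(round(auc_attempt, 3), round(auc_hospital, 3), round(auc_sum, 3), keep_combination)
```

In this toy sample each item alone yields an AUC of 11/15, while the sum yields 13/15, so the combination would be retained.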
Other variables from the extant literature [3,18,21–23,29–43] were also considered (see Table A in S1 File), but not ultimately selected, including (among others) other psychiatric disorders (e.g., schizophrenia and anxiety disorders), recent life stressors, and various demographic variables (e.g., marital status). Different orders and iterations of variables (e.g., frequency, severity, and time frame of assessment) were also considered in order to optimize the predictive value of variables within the development samples. Please also note that many other potentially important variables (e.g., suicidal intent, access to lethal means, suicide plans, and a psychiatric hospitalization during the past 30 days) were not available for analysis in the samples utilized in the present study. To be retained in the final version of the checklist, each variable needed to (1) have clear empirical support in the literature; (2) demonstrate a positive bivariate association with future suicide attempts in 1 or more of the development samples; (3) evidence incremental validity in 1 or more of the development samples; and (4) show minimal negative impact on incremental validity in the remaining development samples. Utilizing the approach described above, we initially selected 23 items for inclusion in the checklist, each weighted equally. Once we reached the point at which we were no longer able to identify any new variables that further improved the predictive utility of the score, we examined if doubling the weight by adding an additional point to the sum score of any of the items identified as top predictors further improved AUC values. This analysis revealed that doubling the weight of 4 of the top longitudinal predictors identified by Franklin and colleagues [3] (i.e., lifetime history of suicide attempts, psychiatric hospitalization, NSSI, and borderline personality disorder [BPD]) further improved the overall AUC value in 1 or more of the development samples (see Table 2).
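The weighting scheme described above can be illustrated with a short sketch. It assumes each endorsed item contributes 1 point and each of the 4 double-weighted predictors contributes 1 additional point; under that reading, the maximum possible score would be 23 + 4 = 27. The item keys below are hypothetical labels, not the wording of the published checklist.

```python
from typing import Mapping

# The 4 predictors the text says were double weighted (keys are hypothetical).
DOUBLE_WEIGHTED = {
    "suicide_attempt_history",
    "psychiatric_hospitalization",
    "nssi",
    "bpd",
}


def weighted_score(items: Mapping[str, int]) -> int:
    """Sum endorsed 0/1 items, counting double-weighted items as 2 points."""
    return sum(
        2 if name in DOUBLE_WEIGHTED else 1
        for name, present in items.items()
        if present
    )


# Hypothetical participant: two double-weighted items and one ordinary item.
example = {"nssi": 1, "bpd": 1, "sleep_problems": 1, "mood_disorder": 0}
example_score = weighted_score(example)  # 2 + 2 + 1
```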

Evaluation of the Durham Risk Score
ROC curves and logistic regression analyses were used to evaluate the discriminative ability of the DRS across the samples. Signal detection analysis was used to identify an optimal cut score [52] and to develop risk groups to facilitate interpretation of scores. Concentration of risk was evaluated [23], and rates of attempts, risk ratios, ORs, and 95% CIs were calculated for risk groups. ROC curves were also calculated in subgroups of interest, including women, men, Black, White, Hispanic, lower-income individuals, younger adults, veterans, LGBTQ individuals, as well as individuals with and without a history of suicidal thoughts and behaviors.
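To make the concentration of risk idea concrete, the sketch below computes the share of all observed events that falls among the highest-scoring fraction of participants (for example, the share of attempts occurring in the top 15% of scores). It is a simplified illustration rather than the procedure from reference [23]: ties at the cutoff are broken by input order, and the data are invented.

```python
from typing import Sequence


def concentration_of_risk(scores: Sequence[float],
                          outcomes: Sequence[int],
                          top_fraction: float) -> float:
    """Proportion of all events found in the top_fraction highest scores."""
    total_events = sum(outcomes)
    if total_events == 0:
        raise ValueError("no events observed")
    # Rank participants from highest to lowest score (stable for ties).
    ranked = sorted(zip(scores, outcomes), key=lambda pair: pair[0], reverse=True)
    k = max(1, round(len(ranked) * top_fraction))
    return sum(y for _, y in ranked[:k]) / total_events


# Invented data: 10 participants, 3 events.
toy_scores = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
toy_events = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

top_20 = concentration_of_risk(toy_scores, toy_events, 0.20)  # 2 of 3 events
top_40 = concentration_of_risk(toy_scores, toy_events, 0.40)  # 3 of 3 events
```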

Missing data
Although we are strong proponents of multiple imputation and maximum likelihood estimation methods to handle missingness in most situations, we elected to treat missing data as absent (i.e., “0”) in the calculation of DRS scores in the present research because (1) this approach best reflects real-world clinical practice; and (2) some variables were systematically missing across different studies because they were not assessed as part of the study protocol. The only exception to this approach was for the VALOR sample, which was used to validate the DRS. Specifically, because the VALOR study protocol only assessed 15 of the 23 (i.e., 65%) variables used to calculate the DRS, VALOR analyses were limited to participating veterans (N = 780) with follow-up attempt data as well as complete data for all 15 of these variables to ensure that participants in the VALOR analyses had no more than 35% missing data.
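The two missing-data rules described above can be sketched as follows: missing items are coded as 0 before scoring, and, for a sample whose protocol assessed only a subset of items, participants are kept only if every assessed item is present. The function names, variable names, and records are hypothetical; this is an illustration of the stated rules, not the study's code.

```python
from typing import Dict, Mapping, Optional, Sequence


def fill_missing_as_absent(record: Mapping[str, Optional[int]],
                           variables: Sequence[str]) -> Dict[str, int]:
    """Code any missing or unassessed variable as 0 (absent)."""
    return {v: (record.get(v) or 0) for v in variables}


def is_complete(record: Mapping[str, Optional[int]],
                assessed: Sequence[str]) -> bool:
    """True if every variable the protocol assessed has a recorded value."""
    return all(record.get(v) is not None for v in assessed)


# Hypothetical 3-item checklist in which only the first 2 items were assessed.
ALL_ITEMS = ["item_a", "item_b", "item_c"]
ASSESSED = ["item_a", "item_b"]

participants = [
    {"item_a": 1, "item_b": 0},      # complete on assessed items
    {"item_a": 1, "item_b": None},   # missing an assessed item, excluded
]

eligible = [p for p in participants if is_complete(p, ASSESSED)]
filled = [fill_missing_as_absent(p, ALL_ITEMS) for p in eligible]
```

With 15 of 23 items assessed, a participant who passes the completeness check is missing at most 8 of 23 items (about 35%), which matches the ceiling stated in the text.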

[END]

[1] Url: https://journals.plos.org/plosmedicine/article?id=10.1371/journal.pmed.1003713

(C) Plos One. "Accelerating the publication of peer-reviewed science."
Licensed under Creative Commons Attribution (CC BY 4.0)
URL: https://creativecommons.org/licenses/by/4.0/
