Background: Timely, precise, and localized surveillance of nonfatal events is needed to improve response and prevention of opioid-related problems in an evolving opioid crisis in the United States. Records of naloxone administration found in prehospital emergency medical services (EMS) data have helped estimate opioid overdose incidence, including nonhospital, field-treated cases. However, as naloxone is often used by EMS personnel in unconsciousness of unknown cause, attributing naloxone administration to opioid misuse and heroin use (OM) may misclassify events. Better methods are needed to identify OM.
Objective: This study aimed to develop and test a natural language processing method that would improve identification of potential OM from paramedic documentation.
Methods: First, we searched Denver Health paramedic trip reports from August 2017 to April 2018 for keywords naloxone, heroin, and both combined, and we reviewed narratives of identified reports to determine whether they constituted true cases of OM. Then, we used this human classification as reference standard and trained 4 machine learning models (random forest, k-nearest neighbors, support vector machines, and L1-regularized logistic regression). We selected the algorithm that produced the highest area under the receiver operating characteristic curve (AUC) for model assessment. Finally, we compared positive predictive value (PPV) of the highest performing machine learning algorithm with PPV of searches of keywords naloxone, heroin, and combination of both in the binary classification of OM in unseen September 2018 data.
Results: In total, 54,359 trip reports were filed from August 2017 to April 2018. Approximately 1.09% (594/54,359) indicated naloxone administration. Among trip reports with reviewer agreement regarding OM in the narrative, 56.6% (292/516) were considered to include information revealing OM. Approximately 1.63% (884/54,359) of all trip reports mentioned heroin in the narrative. Among trip reports with reviewer agreement, 95.5% (784/821) were considered to include information revealing OM. Combined results accounted for 2.39% (1298/54,359) of trip reports. Among trip reports with reviewer agreement, 77.79% (907/1166) were considered to include information consistent with OM. The reference standard used to train and test machine learning models included details of 1166 trip reports. L1-regularized logistic regression was the highest performing algorithm (AUC=0.94; 95% CI 0.91-0.97) in identifying OM. Tested on 5983 unseen reports from September 2018, the keyword naloxone inaccurately identified and underestimated probable OM trip report cases (63 cases; PPV=0.68). The keyword heroin yielded more cases with improved performance (129 cases; PPV=0.99). Combined keyword and L1-regularized logistic regression classifier further improved performance (146 cases; PPV=0.99).
Conclusions: A machine learning application enhanced the effectiveness of finding OM among documented paramedic field responses. This approach to refining OM surveillance may lead to improved first-responder and public health responses toward prevention of overdoses and other opioid-related problems in US communities.
The more than 47,000 opioid-involved overdose deaths in 2018 in the United States [, ] insufficiently reflect the nonfatal burden associated with prescription opioid misuse and heroin use (OM) by an estimated 10.3 million people [ ]. Timely, precise, and localized surveillance of nonfatal events is needed to define medical treatment trends related to OM and improve response and prevention of overdoses and other opioid-related problems.
Timely information sources about nonfatal opioid-related events include hospitals, emergency departments (EDs), and prehospital emergency medical services (EMS). Paramedics routinely encounter patients with symptoms consistent with drug overdose and administer naloxone (an effective opioid antagonist) to reverse symptoms [ ]. EMS data have helped estimate opioid overdose incidence, including nonhospital, field-treated cases [ - ]. Frequency of naloxone administration has positively correlated with opioid and heroin overdose-related ED visits [ ] and fatal opioid overdose rates [ ], suggesting that naloxone administration might be a relevant proxy to monitor need for interventions.
Opioid misuse and heroin use refer to illicit use and nonmedical prescription opioid use for extended periods or for experience and feelings derived from the medication [ ]. Naloxone, administered by paramedics to reverse opioid-induced respiratory depression [ , ], might serve as a potential OM sentinel, particularly when OM has resulted in an opioid overdose [ , , ]. However, as naloxone is often used by EMS personnel in unconsciousness of unknown cause, attributing naloxone administration to opioid overdose and OM may misclassify events as opioid-related. A study of EMS-administered naloxone reported poor sensitivity and low positive predictive value (PPV) for opioid overdose [ ].
Better methods are needed to accurately identify opioid-related problems and trends of OM. To fill this gap, we sought to develop and test a natural language processing (NLP) method that would improve classification of OM among paramedic trip reports with documentation of naloxone administration or evidence of heroin use.
Denver Health’s (DH) Paramedic Division is the main provider of EMS for the city and county of Denver. Their record system adheres to the National Emergency Medical Services Information System data standard version 3.4.0 [ ]. We processed the following variables for each trip report: free-text narratives, primary impressions, alcohol or drug use note, and list of medications administered by paramedics. The table below summarizes the 3 study phases.
The Quality Improvement Committee of DH, which is endorsed by the Colorado Multiple Institutional Review Board at the University of Colorado, Denver, determined that this work did not constitute human subjects research.
|Phase|Purpose|Description of methods|Time frame|
|---|---|---|---|
|1|Assess performance of keyword search approaches|Searched trip reports for keywords (ie, “naloxone,” “heroin,” and both combined) and reviewed charts of identified reports to assess positive predictive value|August 2017 to April 2018|
|2|Train and test supervised machine learning classification|Guided machine learning models using previous phase’s chart review classification results and selected the highest performing algorithm in binary classification of opioid misuse and heroin use|August 2017 to April 2018|
|3|Validate performance measures across approaches|Compared the highest performing machine learning algorithm with the performance of searches of keywords “naloxone,” “heroin,” and combination of both|September 2018|
Phase 1: Assess Text String Search Approaches
Naloxone administrations have been previously used to flag potential OM resulting in opioid overdoses [, , ], and heroin use implies OM. To reduce the DH EMS dataset to a prescreened subset of all paramedic reports, we searched for presence of keywords naloxone (or narcan) among administered medications or heroin (or misspelled variations herion and heroine) in trip report narratives between August 1, 2017, and April 30, 2018. No opioid brand names (eg, Oxycontin or Tramadol) were used to identify opioid-related events. Trip reports that included the keywords were reviewed by 2 independent reviewers, both DH paramedics, to answer the question: “Is there narrative evidence (yes, no or unsure) of illicit opioid use or prescription OM (ie, use beyond clinical needs, for extended periods, or for experience and feelings derived from the medication)?” If unsure or when adverse events from opioids did not imply misuse, reviewers were to classify that report as negative. We hypothesized lower false-positive rates for the heroin vs naloxone methods because heroin use implies OM. To visualize trends, weekly potential OM paramedic trip report counts for each search approach were calculated. Pearson correlation coefficients (r) assessed correlation between weekly OM paramedic trip report counts by search approach and reviewer assessments.
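The keyword prescreening described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the record layout (a `medications` list and a `narrative` string), the function name `flag_report`, and the sample reports are all hypothetical, and only the keywords named in the text are used.

```python
import re

# Keywords follow the text; the record layout below is a hypothetical stand-in.
NALOXONE_RE = re.compile(r"\b(naloxone|narcan)\b", re.IGNORECASE)
HEROIN_RE = re.compile(r"\b(heroin|herion|heroine)\b", re.IGNORECASE)


def flag_report(report):
    """Return which keyword searches flag a trip report.

    `report` is a dict with a 'medications' list and a 'narrative' string
    (assumed field names, not actual NEMSIS element names).
    """
    naloxone = any(NALOXONE_RE.search(med) for med in report["medications"])
    heroin = bool(HEROIN_RE.search(report["narrative"]))
    return {"naloxone": naloxone, "heroin": heroin, "combined": naloxone or heroin}


# Invented example reports, for illustration only.
reports = [
    {"medications": ["Narcan 2 mg IN"], "narrative": "Pt found unresponsive, cause unknown."},
    {"medications": [], "narrative": "Pt admits to herion use one hour ago."},
    {"medications": ["Oxygen"], "narrative": "Fall from standing, no drug use reported."},
]
flags = [flag_report(r) for r in reports]
```

The first sample report shows why naloxone alone can misclassify: it is flagged by the medication search even though the narrative gives no evidence of OM, which is the case reviewers were asked to classify as negative.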
Phase 2: Train and Test Supervised Machine Learning Classification
Trip reports with naloxone among administered medications or heroin in narratives, plus reviewer agreement regarding OM in the narrative, served as our reference standard classification for training and validation of machine learning models; trip reports without reviewer agreement were omitted (examples are given in the supplementary file of narrative excerpts without reviewer agreement). We removed the blank space between words in all variables, except in narratives, to create single-text entities (eg, DenverHealth instead of Denver Health). We stemmed words and removed stop words (eg, the, a, or and). To guard against overfitting, the data were split into an 80% training set and a 20% test set. The training corpus was converted into a document term matrix (terms as columns and documents as rows) that described the frequency of terms that occurred in narratives. To classify trip reports (OM evidence: yes or no), we used machine learning models available from the caret package [ ] in R version 3.4.1 (ie, random forest, k-nearest neighbors, support vector machines, and L1-regularized logistic regression). Hyperparameter values for each model were chosen from default configurations (ie, no custom hyperparameter tuning), selected with 3 repeats of 5-fold cross-validation, and each model was then fit to the entire training set. We assessed performance of each model by calculating PPV, negative predictive value (NPV), true-positive rates (TPRs), true-negative rates (TNRs), and areas under the receiver operating characteristic curves (AUCs), and we selected the binary classification algorithm with the highest AUC for subsequent model assessment. Details can be found in the authors' R code, provided as a supplementary file.
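As a rough illustration of the preprocessing steps, the sketch below builds a document term matrix and an 80/20 split in plain Python. It is not the authors' R code: the function names and sample narratives are hypothetical, the stop-word set is a tiny placeholder, and stemming is deliberately omitted to keep the example short.

```python
import random
from collections import Counter

# Tiny placeholder stop-word set; real stop-word lists are much longer.
STOP_WORDS = {"the", "a", "or", "and"}


def tokenize(narrative):
    """Lowercase, keep alphabetic tokens, and drop stop words (no stemming here)."""
    words = "".join(c.lower() if c.isalpha() else " " for c in narrative).split()
    return [w for w in words if w not in STOP_WORDS]


def document_term_matrix(narratives):
    """Return (vocabulary, rows): documents as rows, term counts as columns."""
    counts = [Counter(tokenize(n)) for n in narratives]
    vocabulary = sorted(set().union(*counts))
    rows = [[c[term] for term in vocabulary] for c in counts]
    return vocabulary, rows


def train_test_split(items, train_fraction=0.8, seed=0):
    """Shuffle a copy of `items` and split it into training and test sets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Invented narratives, for illustration only.
narratives = [
    "Pt admits to heroin use and the needle was found",
    "Pt fell at home, no drug use",
]
vocabulary, rows = document_term_matrix(narratives)
train, test = train_test_split(range(10))
```

Each row of `rows` lines up with `vocabulary`, so a classifier sees every narrative as a fixed-length vector of term frequencies; the held-out 20% is used only to estimate performance after training.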
Phase 3: Validate Performance Measures Across Approaches
We searched for presence of the keywords naloxone (or narcan) among administered medications or heroin (or misspelled variations herion and heroine) in narratives of unseen September 2018 trip reports. Resulting trip reports were manually assessed following the same methodology as in phase 1. We then applied the machine learning classifier selected in phase 2 of the study to the reduced dataset of September 2018 trip reports. We hypothesized that machine learning models would decrease false-positive classifications of the combined naloxone and heroin search method because the algorithm would have learned and benefited from agreement in human assessments in phase 1. Reviewers’ assessment was used as a reference standard to calculate PPV for each approach.
Phase 1 Findings
In total, 54,359 trip reports were filed, and 1.09% (594/54,359) indicated naloxone administration; reviewers agreed on assessment in 86.9% (516/594) of reports. Among trip reports with agreement, 56.6% (292/516) were considered to include information revealing OM.
Approximately 1.63% (884/54,359) of all trip reports mentioned heroin in the narrative. Reviewers agreed on potential OM assessment in 92.9% (821/884) of these. Among trip reports with agreement, almost all (784/821, 95.5%) were considered to include information revealing OM.
Combined results, where naloxone was administered by paramedics or heroin was mentioned in the narrative, accounted for 2.39% (1298/54,359) of trip reports. Reviewers agreed on potential OM assessment in trip reports in 89.83% (1166/1298) of these. Among trip reports with agreement, more than three-quarters (907/1166, 77.79%) included information consistent with OM.
Weekly counts of keyword mentions varied by approach; the accompanying figure is annotated to show periods of divergent trends between weekly sums of flagged reports and those affirmed by reviewer assessment. The naloxone approach was not consistent with reviewer assessment trends (r=0.60); the heroin and combined approaches were consistent with reviewer assessment trends (r=0.88 and r=0.90, respectively).
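For readers who want to reproduce this kind of trend comparison, the Pearson correlation coefficient between weekly flagged counts and reviewer-affirmed counts can be computed as sketched below. The weekly counts are invented placeholders, not study data, and `pearson_r` is a hypothetical helper name.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical weekly counts, for illustration only (not study data).
flagged_by_keyword = [12, 15, 9, 20, 17]
affirmed_by_review = [10, 14, 8, 18, 15]
r = pearson_r(flagged_by_keyword, affirmed_by_review)
```

A value of r near 1 means that weeks with more flagged reports were also weeks with more reviewer-affirmed OM reports, which is the sense in which a search approach is described as consistent with reviewer assessment trends.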
Phase 2 Findings
The reference standard used to train and test machine learning models included details of the 1166 naloxone- and heroin-flagged trip reports with reviewer agreement on OM assessment in phase 1. L1-regularized logistic regression was the highest performing algorithm (AUC=0.94; PPV=0.95; TPR=0.91; NPV=0.72; and TNR=0.84), followed by support vector machines (AUC=0.91; PPV=0.92; TPR=0.92; NPV=0.73; and TNR=0.73), random forest (AUC=0.91; PPV=0.91; TPR=0.95; NPV=0.79; and TNR=0.65), and k-nearest neighbors (AUC=0.81; PPV=0.79; TPR=1; NPV=0.1; and TNR=0.08). Further statistical analyses, confusion matrices, and the features that scored highest are presented in the supplementary file of additional statistical analysis.
Phase 3 Findings
Among 5983 September 2018 trip reports, naloxone identified 63 events, and chart review revealed 20 false positives (PPV=0.68). Examples of false positives are presented in the supplementary file of phase 3 false-positive narrative excerpts. Keyword heroin identified 129 trip reports, and chart review revealed 1 false positive (PPV=0.99). Combined naloxone and heroin searches identified 171 trip reports with 20 false positives (PPV=0.88).
L1-regularized logistic regression, the highest performing machine learning algorithm from phase 2, did not identify the one true negative of OM in reports flagged by heroin but identified 18 of the 20 true negatives of OM in reports flagged by naloxone administrations. The classifier identified 146 potential OM events from the 171 trip reports flagged by the combined text search with only 2 false positives. Results are summarized in the table below. The machine learning classifier produced counts closer to those from reviewer assessment (the accompanying figure shows counts for weeks 36 to 39 of 2018).
|Approach|Number of identified trip reports by approach (N)|Positive predictive value, n (%)|Correlation^a between weekly opioid misuse counts and chart review assessment|
|---|---|---|---|
|Naloxone search among administered medications|63|43 (68.3)|0.86|
|Heroin search in narratives|129|128 (99.2)|0.99|
|Combined search approach (naloxone or heroin)|171|151 (88.3)|1|
|Machine learning^b on combined search approach|146|144 (98.6)|0.99|
^a Pearson correlation coefficient.
^b L1-regularized logistic regression.
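The PPV figures in the table above follow directly from the flagged counts and false positives reported for September 2018. The short sketch below redoes that arithmetic; the function name `ppv` and the dictionary layout are illustrative choices, while the counts are those stated in the text.

```python
def ppv(flagged, false_positives):
    """Positive predictive value: true positives divided by all flagged reports."""
    if flagged <= 0 or not 0 <= false_positives <= flagged:
        raise ValueError("require flagged > 0 and 0 <= false_positives <= flagged")
    return (flagged - false_positives) / flagged


# (flagged reports, false positives) as reported for September 2018.
approaches = {
    "naloxone": (63, 20),
    "heroin": (129, 1),
    "combined": (171, 20),
    "machine learning on combined": (146, 2),
}
ppv_percent = {name: round(100 * ppv(n, fp), 1) for name, (n, fp) in approaches.items()}
```

The classifier's gain comes from removing 18 of the 20 false positives in the combined search, at the cost of also discarding some reports that reviewers had affirmed (171 flagged reports reduced to 146).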
This study sought to better understand documentation in paramedic trip reports as a tool to support more effective nonfatal OM surveillance. Accurate detection of potential OM events in survivors of EMS runs can reflect short-term trends in OM-related events at the community and national levels. These are potential leading indicators for assessing the nonfatal magnitude of the opioid crisis in an area.
Fluctuating supplies and introduction of powerful, illicitly manufactured opioids may rapidly change local morbidity and mortality patterns [, ]. Availability of near real-time data of opioid-related problems from the field may guide prevention and intervention efforts of emergency responders, health care providers, and public health practitioners [ ]. Our methods, similar to those used to identify opioid overdose risk [ ], could be applied to enhance information accuracy of EMS data for state and local public health departments, an important goal in the Centers for Disease Control and Prevention (CDC) Emergency Response Cooperative Agreement [ ].
Public health agencies in the United States are seeking data sources and data-driven indicators for early warning systems to identify medical consequences of misuse of prescription and illicit opioids. Our study found that naloxone administrations inaccurately identified and underestimated opioid-related paramedic trip events in Denver. This result is compatible with recent findings that naloxone administration was a poor proxy for opioid overdose [ ]. Our study also found that EMS-administered naloxone did not reflect trends (rise or fall) in OM-related EMS runs assessed by chart review. By itself, EMS naloxone administration was a poor stand-alone indicator and would benefit from additional information embedded in EMS records.
As a simple alternative, the keyword heroin increased about 2-fold (from 63 flagged by the current standard [ie, naloxone administrations] to 129) the number of records with potential OM. This strategy flagged OM reports accurately, with only 1 false positive. Combined naloxone and heroin NLP search increased sensitivity but with substantial false positives. To improve this, we applied a machine learning algorithm that produced both higher sensitivity and specificity. This same tactic, previously employed to identify alcohol misuse in clinical notes of electronic health records, could be extended to include more opioid-related terms such as prescription opioid names. New studies should try to assess the effects of including records flagged by keywords such as heroin or opioid brand names in model training, testing, and validation.
Two main limitations were present in this study. First, we used data from only 1 EMS system. Although DH paramedics adhere to a widely used data standard, implementation may vary between organizations. Second, calculation of the probability that cases not flagged by NLP methods were truly negative cases (NPV) was impossible as manual chart review of all trip reports would require human effort beyond our capacity.
JTP was hosted by Denver Public Health for his CDC Public Health Informatics Fellowship. The authors would like to thank Chad M Heilig (CDC), Scott H Lee (CDC), Matthew J Maenner (CDC), and Emily Bacon (DPH) for machine learning and statistics review and advice. The authors are grateful to the Journal of Medical Internet Research reviewers who made valuable suggestions to strengthen this manuscript and who highlighted new research avenues, which the authors hope to explore in forthcoming work. The authors of this study have no financial disclosures to report. This study is part of DH’s Center for Addiction Medicine. The findings and conclusions in this study are those of the authors and do not necessarily represent the official position of the CDC.
JTP devised the study and led analysis, interpretation of data and results, and draft writing. KS contributed substantially to design and analysis. AJD contributed substantially to interpretation of data and results and draft writing. All authors contributed to interpretation of results and revision and approval of the final version.
Conflicts of Interest
Excerpts of narratives in paramedic trip reports without reviewer agreement. DOCX File, 32 KB
R code used in phase 2. TXT File, 6 KB
Additional statistical analysis, confusion matrices, and feature scores by machine learning classifiers. DOCX File, 46 KB
Excerpts of narratives of false positive results in phase 3. DOCX File, 32 KB
- Scholl L, Seth P, Kariisa M, Wilson N, Baldwin G. Drug and opioid-involved overdose deaths - United States, 2013-2017. MMWR Morb Mortal Wkly Rep 2019;67(5152):1419-1427.
- Ahmad FB, Escobedo LA, Rossen LM, Spencer MR, Warner M, Sutton P. Centers for Disease Control and Prevention (CDC), National Center for Health Statistics. 2019. Provisional Drug Overdose Death Counts URL: https://www.cdc.gov/nchs/nvss/vsrr/drug-overdose-data.htm [accessed 2019-12-04]
- Substance Abuse and Mental Health Services Administration. Substance Abuse and Mental Health Services Administration. Rockville, MD: Center for Behavioral Health Statistics and Quality, Substance Abuse and Mental Health Services Administration; 2018. Key Substance Use and Mental Health Indicators in the United States: Results From the 2018 National Survey on Drug Use and Health URL: https://www.samhsa.gov/data/sites/default/files/cbhsq-reports/NSDUHNationalFindingsReport2018/NSDUHNationalFindingsReport2018.pdf [accessed 2019-11-26]
- Vivolo-Kantor AM, Seth P, Gladden RM, Mattson CL, Baldwin GT, Kite-Powell A, et al. Vital signs: trends in emergency department visits for suspected opioid overdoses - United States, July 2016-September 2017. MMWR Morb Mortal Wkly Rep 2018 Mar 9;67(9):279-285.
- Faul M, Dailey MW, Sugerman DE, Sasser SM, Levy B, Paulozzi LJ. Disparity in naloxone administration by emergency medical service providers and the burden of drug overdose in US rural communities. Am J Public Health 2015 Jul;105(Suppl 3):e26-e32.
- Knowlton A, Weir BW, Hazzard F, Olsen Y, McWilliams J, Fields J, et al. EMS runs for suspected opioid overdose: implications for surveillance and prevention. Prehosp Emerg Care 2013;17(3):317-329.
- Ray BR, Lowder EM, Kivisto AJ, Phalen P, Gil H. EMS naloxone administration as non-fatal opioid overdose surveillance: 6-year outcomes in Marion County, Indiana. Addiction 2018 Dec;113(12):2271-2279.
- Merchant RC, Schwartzapfel BL, Wolf FA, Li W, Carlson L, Rich JD. Demographic, geographic, and temporal patterns of ambulance runs for suspected opiate overdose in Rhode Island, 1997-2002. Subst Use Misuse 2006;41(9):1209-1226.
- Lindstrom HA, Clemency BM, Snyder R, Consiglio JD, May PR, Moscati RM. Prehospital naloxone administration as a public health surveillance tool: a retrospective validation study. Prehosp Disaster Med 2015 Aug;30(4):385-389.
- Cash RE, Kinsman J, Crowe RP, Rivard MK, Faul M, Panchal AR. Naloxone administration frequency during emergency medical service events - United States, 2012-2016. MMWR Morb Mortal Wkly Rep 2018 Aug 10;67(31):850-853.
- NIH National Institute on Drug Abuse. 2019. Opioid Overdose Crisis: Revised January 2019 URL: https://www.drugabuse.gov/drugs-abuse/opioids/opioid-overdose-crisis [accessed 2019-12-02]
- Hemsing N, Greaves L, Poole N, Schmidt R. Misuse of prescription opioid medication among women: a scoping review. Pain Res Manag 2016;2016:1754195.
- Pasero C, McCaffery M. Reversing respiratory depression with naloxone. Am J Nurs 2000 Feb;100(2):26.
- Lewanowitsch T, Irvine RJ. Naloxone methiodide reverses opioid-induced respiratory depression and analgesia without withdrawal. Eur J Pharmacol 2002 Jun 7;445(1-2):61-67.
- Grover JM, Alabdrabalnabi T, Patel MD, Bachman MW, Platts-Mills TF, Cabanas JG, et al. Measuring a crisis: questioning the use of naloxone administrations as a marker for opioid overdoses in a large US EMS system. Prehosp Emerg Care 2018;22(3):281-289.
- Gabow P, Eisert S, Wright R. Denver Health: a model for the integration of a public hospital and community health centers. Ann Intern Med 2003 Jan 21;138(2):143-149.
- National Emergency Medical Services Information System. National Emergency Medical Services Information System. Salt Lake City; 2016 Jul 13. NEMSIS Data Dictionary NHTSA v3.4.0 - EMS Data Standard URL: https://nemsis.org/media/nemsis_v3/release-3.4.0/DataDictionary/PDFHTML/DEMEMS/NEMSISDataDictionary.pdf [accessed 2019-11-26]
- Kuhn M. CiteSeerX. Institute for Statistics and Mathematics of WU (Wirtschaftsuniversität Wien): The Comprehensive R Archive Network; 2015 Jul 16. A Short Introduction to the caret Package URL: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.696.4901&rep=rep1&type=pdf [accessed 2019-12-02]
- Ciccarone D. Fentanyl in the US heroin supply: a rapidly changing risk environment. Int J Drug Policy 2017 Aug;46:107-111.
- O'Donnell J, Gladden RM, Mattson CL, Kariisa M. Notes from the field: overdose deaths with carfentanil and other fentanyl analogs detected - 10 states, July 2016-June 2017. MMWR Morb Mortal Wkly Rep 2018 Jul 13;67(27):767-768.
- Lo-Ciganic W, Huang JL, Zhang HH, Weiss JC, Wu Y, Kwoh CK, et al. Evaluation of machine-learning algorithms for predicting opioid overdose risk among medicare beneficiaries with opioid prescriptions. JAMA Netw Open 2019 Mar 1;2(3):e190968.
- Centers for Disease Control and Prevention. Centers for Disease Control and Prevention. CDC-RFA-TP18-1802 Cooperative Agreement for Emergency Response: Centers for Disease Control and Prevention; 2018 Jun 20. CDC-RFA-TP18-1802 Cooperative Agreement for Emergency Response: Public Health Crisis Response 2018 Opioid Overdose Crisis Cooperative Agreement Supplemental Guidance URL: https://www.cdc.gov/cpr/readiness/00_docs/TP18-1802OpioidSupplementalGuidance-508.pdf [accessed 2019-12-04]
- Prieto JT, McEwen D, Davidson AJ, Al-Tayyib A, Gawenus L, Sangareddy SR, et al. Monitoring opioid addiction and treatment: do you know if your population is engaged? Drug Alcohol Depend 2019 Sep 1;202:56-60.
- Afshar M, Phillips A, Karnik N, Mueller J, To D, Gonzalez R, et al. Natural language processing and machine learning to identify alcohol misuse from the electronic health record in trauma patients: development and internal validation. J Am Med Inform Assoc 2019 Mar 1;26(3):254-261.
|AUC: area under the receiver operating characteristic curve|
|CDC: Centers for Disease Control and Prevention|
|DH: Denver Health|
|ED: emergency department|
|EMS: emergency medical services|
|NLP: natural language processing|
|NPV: negative predictive value|
|OM: opioid misuse|
|PPV: positive predictive value|
|TNR: true-negative rate|
|TPR: true-positive rate|
Edited by CL Parra-Calderón; submitted 26.07.19; peer-reviewed by D Epstein, M Torii; comments to author 30.08.19; revised version received 05.09.19; accepted 08.10.19; published 03.01.20
Copyright
©José Tomás Prieto, Kenneth Scott, Dean McEwen, Laura J Podewils, Alia Al-Tayyib, James Robinson, David Edwards, Seth Foldy, Judith C Shlay, Arthur J Davidson. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 03.01.2020.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included.