This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included.
Adverse drug reactions (ADRs) are common and are the underlying cause of over a million serious injuries and deaths each year. The most familiar method of detecting ADRs relies on spontaneous reports. Unfortunately, the low reporting rate of such reports is a serious limitation of pharmacovigilance.
The objective of this study was to identify a method to detect potential ADRs of drugs automatically using a deep neural network (DNN).
We designed a DNN model that utilizes the chemical, biological, and biomedical information of drugs to detect ADRs. This model aimed to fulfill two main purposes: identifying the potential ADRs of drugs and predicting the possible ADRs of a new drug. To improve detection performance, we used a word-embedding approach to process a large body of biomedical literature and represent the target drugs in a vector space that captures the relationships between drugs. Moreover, we built a mapping function to address new drugs that do not appear in the dataset.
Using the drug information and the ADRs reported up to 2009, we predicted the ADRs of drugs recorded up to 2012. The dataset comprised 746 drugs and 232 new drugs that were recorded only in 2012, with 1325 ADRs. The experimental results showed that our model achieved an overall mean average precision at top-10 of 0.523 and an area under the receiver operating characteristic curve (AUC) of 0.844 for ADR prediction on the dataset.
Our model is effective in identifying the potential ADRs of a drug and the possible ADRs of a new drug. Most importantly, it can detect potential ADRs irrespective of whether they have been reported in the past.
An adverse drug reaction (ADR) [
Spontaneous reporting in the pre- and postmarket stages is the most familiar method of identifying ADRs early on. Specifically, safety reports from clinical trials are used to list common ADRs in the premarket stage [
The fundamental method for identifying ADRs pertains to identifying the relationship between drugs and their side effects from diverse sources of information [
Several studies have utilized either chemical or molecular pathways of drugs to predict ADRs [
However, most of those approaches rely on heavily handcrafted features and treat ADR identification as a classification problem, which does not take the order of the ADRs discovered into consideration. Therefore, the process tends to be more expensive and leads to the loss of significant information on drug-ADR relationships in the model training phase. Furthermore, these approaches are unable to predict the ADR of new drugs, thus rendering the detection of ADR more difficult [
To address these limitations, we used a deep neural network (DNN) model for the detection of ADRs of drugs. The model has 2 purposes: the identification of ADRs, which entailed the discovery of potential ADRs of a drug from known ADR records, and the prediction of ADRs, which pertained to predicting the possible ADRs for a new drug. We used the word-embedding approach and mapping function to process new drugs that did not appear in the dataset. Furthermore, we examined the overall performance of the model with various feature combinations and the number of hidden layers in the DNN architecture.
To develop and evaluate a DNN model, we used data from Side Effect Resource (SIDER) [
For enriching the scientific evidence and enhancing the detection of ADRs, we collected millions of papers from the Medical Literature Analysis and Retrieval System Online (MEDLINE) [
Molecular Weight
XLogP3
Hydrogen Bond Donor Count
Hydrogen Bond Acceptor Count
Rotatable Bond Count
Exact Mass
Monoisotopic Mass
Topological Polar Surface Area
Heavy Atom Count
Formal Charge
Complexity
Isotope Atom Count
Defined Atom Stereocenter Count
Undefined Atom Stereocenter Count
Defined Bond Stereocenter Count
Undefined Bond Stereocenter Count
Covalently-Bonded Unit Count
We treated ADR identification as an information retrieval problem, such that our model could discover the potential relationships between each drug and the 1325 side effects recorded. We represented the prediction target of 1325 dimensions with a binary profile of elements corresponding to the presence or absence of side effects with 1 or 0,
The biomedical literature played an important role in this study because it contains a large amount of information related to drugs and ADRs such as clinical notes and case reports. However, one of the issues in extracting the drug information from biomedical literature is the uncertainty regarding which words or documents represent the drug. Therefore, we trained the model to understand the semantic features of drugs from 2.3 million biomedical papers on 764 drugs introduced before 2009 by utilizing one of the most popular embedding methods Word2Vec [
The architecture of the deep neural network model for predicting and identifying the possible adverse drug reactions (ADRs) of a drug. After predicting, we generated a list of possible ADRs of a drug by ranking the probability of ADRs from the output in the model.
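The ranking step described in this caption can be illustrated with a minimal sketch. It assumes that each output node produces a raw score that is converted into a probability with a sigmoid function; the sigmoid, the function names, and the example scores are illustrative assumptions and are not taken from the study.

```python
import math


def sigmoid(z: float) -> float:
    """Map a raw output score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def rank_adrs(scores: dict[str, float], top_k: int = 10) -> list[tuple[str, float]]:
    """Return the top_k ADRs sorted by descending predicted probability."""
    probs = {adr: sigmoid(z) for adr, z in scores.items()}
    return sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]


# Hypothetical raw scores for one drug (illustrative values only).
example_scores = {"nausea": 2.0, "anemia": -1.0, "headache": 0.5}
ranked = rank_adrs(example_scores, top_k=2)
```

In the setting described here, the dictionary would hold one entry per ADR in the dataset (1325 in total), and the ranked list would be truncated at the desired cutoff.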
Feature representation of adverse drug reaction (ADR) identification and prediction.
We designed a DNN model that can identify and predict the ADRs of a drug with different requirements. This model (
We treated the identification task as an information retrieval problem because drugs may have more than one ADR. Therefore, we designed the last layer with 1325 hidden nodes, which was equal to the number of ADRs in the dataset. Evaluating the probability of ADR
In this study, we present a detailed analysis of the performance of our DNN model. Let
Moreover, we removed the D2V and kept the other features to train the model. The results showed that the D2V was most informative, possibly because the D2V learned the valuable information from millions of papers. We then focused on method comparison with several common methods. We compared the abilities of 3 machine learning methods, namely, probability matrix factorization (PMF), Linear Support Vector Classifier, and Gaussian Naïve Bayes [
Subsequently, we investigated whether our model could process the specific tasks of prediction and identification. The performance on the prediction task (
In addition, we plotted the performance of the model with a different number of hidden layers (
To evaluate our mapping function, we examined the drug expansion through the transfer of drug description to the D2V. The results, shown in
Performance of the models evaluated by area under the receiver operating characteristic curve (AUC).
Model | AUC |
Probability matrix factorization | 0.500 |
Linear Support Vector Classifier | 0.523 |
Gaussian Naïve Bayes | 0.597 |
Deep neural network adverse drug reaction (DNN ADR) without hidden layer | 0.641 |
DNN ADR with 1 hidden layer | 0.823 |
DNN ADR with 2 hidden layers | |
DNN ADR with 3 hidden layers | 0.814 |
DNN ADR without Bio features | 0.823 |
DNN ADR without Chem features | 0.837 |
DNN ADR without drug2vec features | 0.803 |
DNN ADR |
The italicized values indicate the best results in this comparison.
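For readers unfamiliar with the metric in this table, the following sketch computes AUC as the probability that a randomly chosen positive drug-ADR pair receives a higher score than a randomly chosen negative pair, with ties counted as one half. This is the standard pairwise formulation, not necessarily the exact procedure used in the study; the function name and toy data are assumptions.

```python
def roc_auc(labels: list[int], scores: list[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: 1 = ADR recorded for the drug, 0 = not recorded.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.1]
auc = roc_auc(labels, scores)
```

Under this definition, a value of 0.5 corresponds to a ranking no better than chance, and 1.0 corresponds to a ranking that places every positive above every negative.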
Left: Effects of different feature combinations to detect the adverse drug reactions (ADRs) of drugs; right: A comparison of our deep neural network (DNN) model with various machine learning approaches. PMF: probability matrix factorization; LinearSVC: Linear Support Vector Classifier; GaussianNB: Gaussian Naïve Bayes.
Left: Performance of the deep neural network (DNN) model on the adverse drug reaction (ADR) identification and prediction tasks and the overall performance; right: Performance of the model with different numbers of hidden layers. GaussianNB: Gaussian Naïve Bayes; LinearSVC: Linear Support Vector Classifier.
Ability of the mapping function to transfer the drug description to drug2vec, measured by mean average precision at top N (MAP@N).
MAP@N | 1 | 3 | 5 | 10 | 15 | 20 |
Mapping function | ||||||
drug2vec | 0.065 | 0.174 | 0.267 | 0.453 | 0.453 | 0.453 |
The italicized values indicate the best results in this comparison.
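The MAP@N metric reported in this table can be sketched as follows. Several variants of average precision at a cutoff exist; this sketch assumes the common one that normalizes by the smaller of N and the number of relevant items, which may differ from the definition used in the study. The function names and toy rankings are hypothetical.

```python
def average_precision_at_n(ranked: list[str], relevant: set[str], n: int) -> float:
    """Average of precision@k over the positions k <= n that hold a relevant item."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for k, item in enumerate(ranked[:n], start=1):
        if item in relevant:
            hits += 1
            total += hits / k  # precision at position k
    return total / min(n, len(relevant))


def mean_average_precision_at_n(
    rankings: list[list[str]], relevants: list[set[str]], n: int
) -> float:
    """Mean of AP@n over all queries (here, one query per drug)."""
    aps = [average_precision_at_n(r, rel, n) for r, rel in zip(rankings, relevants)]
    return sum(aps) / len(aps)


# Toy example with two drugs and hypothetical ADR labels.
rankings = [["a", "b", "c"], ["x", "y"]]
relevants = [{"a", "c"}, {"y"}]
map_at_3 = mean_average_precision_at_n(rankings, relevants, n=3)
```

For the first toy drug, the relevant items appear at ranks 1 and 3, giving (1/1 + 2/3) / 2; for the second, the single relevant item at rank 2 gives 1/2.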
In this study, we aimed to increase the diversity of information on drugs to improve our ability to detect ADRs. Accordingly, we extracted information from the chemical and biological properties of drugs and from the existing biomedical literature. MEDLINE was selected as the source of biomedical literature for identifying important auxiliary data because it contains several types of biomedical papers related to drugs, such as clinical trials, case reports, and observational studies. However, it was difficult to use keywords to identify specific drugs among millions of papers and words. Therefore, we used 2.3 million biomedical papers to learn the semantic features of drugs using the skip-gram model in Word2Vec. In particular, for a central word
Relationship between drugs using the semantic feature (drug2vec) of the deep neural network model. There were 746 nodes in this graph, each representing a drug. The clusters indicated the drugs with a specific treatment. Top: The cluster comprised antidepressants; middle: The cluster contained antibiotics; bottom: The cluster included ophthalmic medications.
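One simple way to explore how drugs group together in an embedding space like the one shown in this figure is to look up nearest neighbors by cosine similarity. The sketch below is only an illustration: the use of cosine similarity, the function names, and the two-dimensional toy vectors are assumptions, not the procedure or data used to produce the figure.

```python
import math


def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def nearest_drugs(
    query: str, vectors: dict[str, list[float]], top_k: int = 1
) -> list[tuple[str, float]]:
    """Return the top_k drugs most similar to the query drug, excluding itself."""
    sims = [
        (name, cosine(vectors[query], vec))
        for name, vec in vectors.items()
        if name != query
    ]
    return sorted(sims, key=lambda kv: kv[1], reverse=True)[:top_k]


# Hypothetical 2-dimensional embeddings (real drug2vec vectors would be larger).
toy_vectors = {
    "drug_a": [1.0, 0.0],
    "drug_b": [0.9, 0.1],
    "drug_c": [0.0, 1.0],
}
neighbors = nearest_drugs("drug_a", toy_vectors, top_k=1)
```

Drugs whose vectors point in similar directions would appear as close neighbors, which is one way clusters of drugs with a shared therapeutic use could be inspected.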
For instance, the drugs in the cluster with the blue circle shown in
Subsequently, we examined the ability of this model to perform its identification and prediction functions with reference to serious ADRs defined by the Micromedex. Using the identification function of the model, we ranked the potential ADRs in a list by the probability of their occurrence (
The adverse drug reaction (ADR) prediction and identification results of the model.
Drug | Serious ADR | Rank | Probability | |
Dantrolene | Anemia | 12 | 0.012 | |
Dantrolene | Congestive heart failure | 15 | 0.009 | |
Hydroxychloroquine | Muscle Cramp | 1 | 0.997 | |
Hydroxychloroquine | Photophobia | 16 | 0.017 | |
19-nortestosterone | Serum cholesterol raised | 4 | 0.150 | |
Carbachol | Retinal detachment | 3 | 0.690 | |
Atazanavir | Anemia | 17 | 0.920 | |
Carbinoxamine maleate | Agranulocytosis | 14 | 0.453 | |
Carbinoxamine maleate | Anemia, Hemolytic | 16 | 0.340 | |
Darunavir | Hyperglycemia | 20 | 0.750 | |
Temsirolimus | Infection | 20 | 0.974 | |
Zoladex | Myocardial infarction | 7 | 0.961 | |
Zoladex | Hypersensitivity | 12 | 0.920 |
Findings revealed that our model has the capacity to predict the serious ADRs of new drugs. For instance, the model predicted that Zoladex could lead to a serious ADR, myocardial infarction, which is one of the commonest causes of death in developing countries.
This study has several limitations that need to be addressed in future studies. First, data diversity plays an important role in the model. We only used the data published in SIDER. Our model would be more persuasive and reliable if we could include more data from different datasets. Because the chemical and biological properties of drugs contribute most to their effects on humans, including more databases of drug properties should improve the performance of our model. Likewise, access to more open-source data, including clinical trials, spontaneous reporting systems, and EMRs, with support from government and the pharmaceutical industry, should improve the model's predictions. Furthermore, our model focused on ADR prediction and identification. To estimate the probability of occurrence of each ADR, we set 1325 nodes in the output layer, equal to the total number of ADRs in the dataset. In other words, although we had a mapping function to address new drugs, this model could only predict existing ADRs. Therefore, in future work, we plan to utilize more detailed features such as drug-ADR interaction [
We developed a novel ADR detection model based on the biological and chemical properties of drugs and the D2V (the semantic feature). After discussing the drug similarities with domain experts from the National Cheng Kung University Hospital and the Institute of Clinical Pharmacy and Pharmaceutical Sciences, we found that the D2V can represent a characteristic of the drug. Our model could not only discover the potential ADRs of drugs but also predict the possible ADRs of new drugs. To discover potential ADRs based on previous records, our model could identify hidden ADR-ADR relationships. Furthermore, to predict the possible ADRs of a new drug without any previous ADR records, our mapping function performed well in transferring the drug description into the D2V. The model exhibited good performance on both tasks. It will help pharmacists and health care providers understand the potential risk of side effects of drugs and address the issue of underreporting in spontaneous reports. Above all, our model will aid pharmacovigilance by identifying and predicting potential ADRs.
ADE: adverse drug events
ADR: adverse drug reaction
AUC: area under the receiver operating characteristic curve
DNN: deep neural network
EMR: electronic medical records
MAP: mean average precision
MEDLINE: Medical Literature Analysis and Retrieval System Online
PMF: probability matrix factorization
SIDER: Side Effect Resource
The authors would like to express their great appreciation to the Ministry of Science and Technology (MOST) and the Ministry of Health and Welfare (MOHW) for their support (grant numbers: MOST 104-2923-E-006-003-MY3, MOST 107-2634-F-006-006, and MOHW105-FDA-D-113-000416). The authors would also like to thank Yung-Hsin Tseng, a master’s student, who was involved in the conventional test and case study.
None declared.