This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included.
Low adherence to recommended treatments is a multifactorial problem for patients in rehabilitation after myocardial infarction (MI). In a nationwide trial of internet-delivered cognitive behavior therapy (iCBT) for the high-risk subgroup of patients with MI also reporting symptoms of anxiety, depression, or both (MI-ANXDEP), adherence was low. Since low adherence to psychotherapy wastes therapeutic resources and leads to risky treatment discontinuation in MI-ANXDEP patients, identifying early predictors for adherence is potentially valuable for effective targeted care.
The goal of the research was to use supervised machine learning to investigate both established and novel predictors for iCBT adherence in MI-ANXDEP patients.
Data were from 90 MI-ANXDEP patients recruited from 25 hospitals in Sweden and randomized to treatment in the iCBT trial Uppsala University Psychosocial Care Programme (U-CARE) Heart study. Time point of prediction was at completion of the first homework assignment. Adherence was defined as having completed more than 2 homework assignments within the 14-week treatment period. A supervised machine learning procedure was applied to identify the most potent predictors for adherence available at the first treatment session from a range of demographic, clinical, psychometric, and linguistic predictors. The internal binary classifier was a random forest model within a 3×10–fold cross-validated recursive feature elimination (RFE) resampling which selected the final predictor subset that best differentiated adherers versus nonadherers.
Patient mean age was 58.4 years (SD 9.4), 62% (56/90) were men, and 48% (43/90) were adherent. Out of the 34 potential predictors for adherence, RFE selected an optimal subset of 56% (19/34; Accuracy 0.64, 95% CI 0.61-0.68,
For developing and testing effective iCBT interventions, investigating factors that predict adherence is important. Adherence to iCBT for MI-ANXDEP patients in the U-CARE Heart trial was best predicted by cardiac-related fear and sex, consistent with previous research, but also by novel linguistic predictors from written patient behavior which conceivably indicate verbal ability or therapeutic alliance. Future research should investigate potential causal mechanisms and seek to determine what underlying constructs the linguistic predictors tap into. Whether these findings replicate for other interventions outside of Sweden, in larger samples, and for patients with other conditions who are offered iCBT should also be investigated.
ClinicalTrials.gov NCT01504191; https://clinicaltrials.gov/ct2/show/NCT01504191 (Archived at Webcite at http://www.webcitation.org/6xWWSEQ22)
Myocardial infarction (MI) afflicts more than 7 million individuals each year, making it the most common acute cardiac event caused by cardiovascular disease (CVD)—the leading cause of death in the world [
A substantial subgroup of patients with MI also suffer from symptoms of anxiety, depression, or both (MI-ANXDEP). MI-ANXDEP patients have a higher risk factor burden and worse prognosis compared to MI patients in general [
The multicenter Uppsala University Psychosocial Care Programme (U-CARE) Heart study was the first randomized controlled trial to test the effectiveness of a therapist-supported iCBT treatment for MI-ANXDEP patients [
Treatment adherence is in general a multifactorial phenomenon. Adherence to and effectiveness of iCBT have been associated with higher education, older age, and female sex [
The present iCBT U-CARE Heart study design offered a group of additional predictors that have not been assessed in this way, namely linguistic variables based on the texts that patients wrote in response to their standardized homework assignments. Syntactic structure and word use have to some extent been investigated before with regard to anxiety and depression [
The objective of our study was to investigate whether predictors available up to the start of treatment (initial homework assignment response) would predict adherence to iCBT treatment at first follow-up in MI-ANXDEP patients. To this end, we applied a contemporary machine learning procedure to U-CARE Heart data to manage the relatively large number of predictors and their complex covariance structure. We hypothesized that symptom severity, age, sex, education, and linguistic behavior would predict adherence to treatment. We also hypothesized that more severe symptoms, younger age, being a woman, having a higher education, and using more words in the assignment response would be positively associated with adherence to iCBT.
The recruitment, treatment, and follow-up of patients have been described in detail elsewhere [
The outcome variable was dichotomous: adherence was defined as completing 3 or more homework assignments (≥21% of total treatment), and nonadherence was defined as completing fewer than 3. This cutoff was chosen in part because it is clinically relevant to ascertain who continues with the self-tailored part of the U-CARE Heart treatment after completing the initial 2 standardized homework assignments versus who does not continue. Furthermore, the chosen cutoff rendered fairly balanced classes for the machine learning procedure, which is important for it to work properly with moderately sized data [
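The dichotomization described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the total of 14 assignments is an assumption inferred from the stated cutoff share (3 of 14 is about 21%), not a figure given explicitly in this section.

```python
# Minimal sketch of the adherence outcome, under stated assumptions.
ADHERENCE_CUTOFF = 3    # from the text: 3 or more completed assignments
TOTAL_ASSIGNMENTS = 14  # ASSUMPTION: inferred from ">=21% of total treatment"


def is_adherent(completed: int, cutoff: int = ADHERENCE_CUTOFF) -> bool:
    """Return True when at least `cutoff` homework assignments were completed."""
    return completed >= cutoff


def class_balance(completed_counts):
    """Return (number of adherers, number of nonadherers) for a list of counts."""
    labels = [is_adherent(c) for c in completed_counts]
    n_adherent = sum(labels)
    return n_adherent, len(labels) - n_adherent


# Share of the treatment that the cutoff corresponds to (3/14).
cutoff_share = ADHERENCE_CUTOFF / TOTAL_ASSIGNMENTS

# Hypothetical completion counts for five patients, for illustration only.
example_counts = [0, 1, 2, 3, 5]
example_balance = class_balance(example_counts)
```

Checking the class balance in this way makes it easy to see how a different cutoff would shift the proportion of adherers before any model is fitted.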
The linguistic predictors were extracted from the patients’ answers to the first standardized homework assignment, which consisted of an introductory text and 8 questions designed for the patient to describe their MI, associated psychological reaction, present psychological state, present social support, and what the patient wanted from iCBT treatment. In effect, patients had access to the same material prior to carrying out their homework assignment [
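As a rough illustration of how simple linguistic predictors of this kind could be derived from a written response, the following Python sketch counts words, computes mean sentence length, and counts personal pronouns. It is not the preprocessing used in the study: the tokenization rules, the function name, and the short Swedish pronoun list are all assumptions made for the example.

```python
import re

# ASSUMPTION: a deliberately short, hypothetical pronoun list for illustration.
PERSONAL_PRONOUNS = {"jag", "du", "han", "hon", "vi", "ni", "de", "mig", "dig"}


def linguistic_features(text: str) -> dict:
    """Extract a few simple text features from one homework response."""
    # Split into sentences on runs of terminal punctuation; drop empty pieces.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Words are runs of Unicode letters (digits and underscores are excluded).
    words = re.findall(r"[^\W\d_]+", text.lower())
    n_words = len(words)
    n_sentences = len(sentences)
    return {
        "n_words": n_words,
        "mean_sentence_length": n_words / n_sentences if n_sentences else 0.0,
        "n_personal_pronouns": sum(w in PERSONAL_PRONOUNS for w in words),
    }


# Hypothetical two-sentence response, for illustration only.
example = linguistic_features("Jag mår bra. Jag vill må bättre!")
```

Because every patient answered the same 8 questions, features computed this way are comparable across patients without further normalization for prompt content.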
Five of the 34 predictors had missing data, in the order of proportion missing: number of standard glasses of alcohol consumed per week, 11% (10/90); BMI, 10% (9/90); heart rate, 7% (6/90); systolic blood pressure (SBP), 7% (6/90); and the number of days between hospital admission for MI and study randomization, 4% (4/90). Missing values were thus relatively few and were not considered missing completely at random (MCAR); instead, their missingness was assumed to be related to the other measured variables, that is, missing at random (MAR). We also did not impute the outcome. Thus,
Adherence is a multifactorial problem [
Descriptive statistics for all treated patients with myocardial infarction, overall and stratified by adherence to internet-delivered cognitive behavioral therapy.

Variables | All (n=90) | Adherers (n=43) | Nonadherers (n=47) | P value | Missing
Age (years), mean (SD) | 58.4 (9.4) | 57.0 (10.4) | 60.0 (8.3) | .17 | 0
Women, n (%) | 34 (38) | 23 (54) | 11 (23) | .006 | 0
 | | | | .80 | 0
  Single | 15 (17) | 8 (19) | 7 (15) | |
  Cohabitant/married | 72 (80) | 34 (79) | 38 (81) | |
  Not single but living alone | 3 (3) | 1 (2) | 2 (4) | |
 | | | | .79 | 0
  Elementary | 14 (16) | 5 (12) | 9 (19) | |
  High school | 31 (34) | 16 (37) | 15 (32) | |
  University ≤3 years | 20 (22) | 10 (23) | 10 (21) | |
  University >3 years | 25 (28) | 12 (28) | 13 (28) | |
Country of birth, n (%) | 17 (19) | 8 (19) | 9 (19) | >.99 | 0

Heart rate, mean (SD) | 77.0 (20.4) | 77.6 (21.3) | 76.5 (19.7) | .81 | 6
SBP^a, mean (SD) | 149.5 (32.0) | 150.5 (28.2) | 148.5 (35.6) | .78 | 6
BMI^b, mean (SD) | 27.9 (5.0) | 27.9 (5.8) | 28.0 (4.3) | .89 | 9
Alcohol (glasses/week), median (IQR^c) | 2.0 (0.0, 7.3) | 2.0 (0.0, 8.5) | 2.0 (0.0, 5.0) | .44 | 10
Current smoker, n (%) | 4 (4) | 2 (5) | 2 (4) | >.99 | 0
CVD^d medication adherence, n (%) | 18 (20) | 11 (26) | 7 (15) | .32 | 0
 | | | | .45 | 0
  None | 75 (83) | 34 (79) | 41 (88) | |
  As needed | 6 (7) | 3 (7) | 3 (6) | |
  Regularly | 7 (8) | 4 (9) | 3 (6) | |
  Regularly and as needed | 2 (2) | 2 (5) | 0 (0) | |
 | | | | .48 | 0
  No | 67 (74) | 31 (72) | 36 (77) | |
  ≥Once per year, <once per month | 9 (10) | 6 (14) | 3 (6) | |
  ≥Once per month | 14 (16) | 6 (14) | 8 (17) | |

CAQ^e fear | 12.7 (6.0) | 14.6 (5.4) | 11.0 (6.0) | .004 | 0
CAQ avoidance | 7.3 (4.4) | 7.4 (4.2) | 7.1 (4.7) | .74 | 0
CAQ attention | 5.7 (3.2) | 6.4 (3.4) | 5.1 (3.0) | .05 | 0
CAQ total | 25.7 (10.0) | 28.4 (9.8) | 23.2 (9.6) | .01 | 0
ESSI^f total | 20.1 (4.4) | 20.4 (4.0) | 19.7 (4.7) | .49 | 0
EQ5D^g VAS^h | 66.0 (16.8) | 64.7 (15.6) | 67.2 (17.9) | .48 | 0
EQ5D emotional distress | 1.0 (0.5) | 1.0 (0.5) | 1.0 (0.4) | .84 | 0
MADRS^i total | 14.9 (6.2) | 14.9 (5.7) | 15.0 (6.7) | .96 | 0
BADS^j total | 21.4 (6.1) | 22.4 (5.7) | 20.6 (6.3) | .15 | 0
HADS^k anxiety | 10.3 (3.0) | 10.5 (2.7) | 10.2 (3.2) | .71 | 0
HADS depression | 7.9 (3.0) | 8.0 (2.7) | 7.9 (3.4) | .92 | 0
HADS total | 18.3 (4.7) | 18.4 (4.0) | 18.2 (5.3) | .77 | 0

Number of words, mean (SD) | 306.8 (246.7) | 376.8 (257.2) | 242.7 (220.5) | .009 | 0
Number of mutual words, mean (SD) | 6.2 (5.7) | 7.6 (5.9) | 4.9 (5.2) | .02 | 0
Sentence length, mean (SD) | 13.0 (5.5) | 13.6 (5.0) | 12.4 (5.9) | .28 | 0
Adjectives/adverbs, mean (SD) | 193.2 (43.6) | 187.4 (39.9) | 198.5 (46.6) | .23 | 0
Possessive pronouns, mean (SD) | 13.1 (10.0) | 12.8 (8.1) | 13.4 (11.5) | .78 | 0
Personal pronouns, mean (SD) | 64.6 (27.1) | 70.2 (24.3) | 59.4 (28.8) | .06 | 0
Mentions the MI^l, n (%) | 69 (77) | 35 (81) | 34 (72) | .44 | 0

Days from MI to allocation, mean (SD) | 70.5 (14.9) | 70.3 (15.0) | 70.7 (14.9) | .91 | 4
 | | | | .59 | 0
   | 63 (70) | 29 (67) | 34 (72) | |
  Telephone | 11 (12) | 5 (12) | 6 (13) | |
  SMS^m | 15 (17) | 9 (21) | 6 (13) | |
   | 1 (1) | 0 (0) | 1 (2) | |

^a SBP: systolic blood pressure.
^b BMI: body mass index.
^c IQR: interquartile range.
^d CVD: cardiovascular disease.
^e CAQ: Cardiac Anxiety Questionnaire [
^f ESSI: ENRICHD Social Support Instrument [
^g EQ5D: European Quality of Life Questionnaire–Five Dimensions.
^h VAS: visual analog scale.
^i MADRS: Montgomery-Asberg Depression Rating Scale [
^j BADS: Behavioral Activation for Depression Scale–Short Form [
^k HADS: Hospital Anxiety and Depression Scale [
^l MI: myocardial infarction.
^m SMS: short message service.
Although random forest already has built-in cross-validation control for overfitting through its “out-of-bag” predictions, we added a second wrapper layer around the classifier in the form of backwards algorithmic predictor selection via recursive feature elimination (RFE) resampled with 3×10–fold cross-validation [
If not stated differently, we report categorical variables as count (%), numerical variables as arithmetic mean (SD),
The linguistic data preprocessing was carried out with the corpus tool AntConc version 3.4.4m (Waseda University) [
Descriptive data are available in
After imputation, the RFE feature selection procedure was applied to extract the most potent predictors for classifying adherers versus nonadherers.
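The general shape of recursive feature elimination inside repeated k-fold cross-validation can be sketched as follows. This is a simplified, dependency-free illustration rather than the procedure used in the study: the random forest is replaced by a nearest-centroid classifier, predictor importance is approximated by an absolute standardized mean difference, the folds are not stratified, and the data are synthetic. All function names are hypothetical.

```python
import random
from statistics import mean, pstdev


def repeated_kfold(n, k=10, repeats=3, seed=0):
    """Yield (train, test) index lists for `repeats` reshuffled k-fold splits."""
    rng = random.Random(seed)
    for _ in range(repeats):
        idx = list(range(n))
        rng.shuffle(idx)
        for fold in range(k):
            test = idx[fold::k]
            held_out = set(test)
            yield [i for i in idx if i not in held_out], test


def importance(X, y, rows, j):
    """Stand-in importance: absolute standardized mean difference of feature j."""
    a = [X[i][j] for i in rows if y[i] == 1]
    b = [X[i][j] for i in rows if y[i] == 0]
    spread = pstdev(a + b) or 1.0
    return abs(mean(a) - mean(b)) / spread


def accuracy(X, y, train, test, feats):
    """Nearest-centroid classifier (stand-in for the random forest)."""
    cents = {c: [mean(X[i][j] for i in train if y[i] == c) for j in feats]
             for c in (0, 1)}

    def predict(i):
        dist = {c: sum((X[i][j] - m) ** 2 for j, m in zip(feats, cents[c]))
                for c in (0, 1)}
        return min(dist, key=dist.get)

    return sum(predict(i) == y[i] for i in test) / len(test)


def rfe_cv(X, y, k=10, repeats=3, seed=0):
    """Backward elimination, scored on held-out folds for every subset size."""
    n_feat = len(X[0])
    scores = {size: [] for size in range(1, n_feat + 1)}
    for train, test in repeated_kfold(len(X), k, repeats, seed):
        feats = list(range(n_feat))
        while feats:
            scores[len(feats)].append(accuracy(X, y, train, test, feats))
            # Rank on training rows only, then drop the weakest feature.
            # (A model-based importance would be recomputed after each drop;
            # this univariate stand-in gives the same ranking at every step.)
            feats.remove(min(feats, key=lambda j: importance(X, y, train, j)))
    mean_acc = {size: mean(v) for size, v in scores.items()}
    # Highest mean held-out accuracy wins; ties go to the smaller subset.
    best_size = max(mean_acc, key=lambda s: (mean_acc[s], -s))
    # Final subset: repeat the elimination on all rows down to best_size.
    feats, rows = list(range(n_feat)), list(range(len(X)))
    while len(feats) > best_size:
        feats.remove(min(feats, key=lambda j: importance(X, y, rows, j)))
    return sorted(feats), mean_acc


# Synthetic data: 90 rows, features 0 and 1 carry signal, features 2-5 are noise.
rng = random.Random(1)
y = [i % 2 for i in range(90)]
X = [[rng.gauss(1.5 * label, 1), rng.gauss(-1.5 * label, 1)]
     + [rng.gauss(0, 1) for _ in range(4)] for label in y]
selected, mean_acc = rfe_cv(X, y)
```

The key design point the sketch tries to convey is that the ranking and elimination happen inside each resample, using training rows only, so that the accuracy recorded for each subset size is not inflated by the selection step itself.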
Predictor selection result with recursive feature elimination.
Relative importance of each predictor for adherence sorted by group. BADS: Behavioral Activation for Depression Scale–Short Form; BMI: body mass index; CAQ: Cardiac Anxiety Questionnaire; EQ5D: European Quality of Life Questionnaire–Five Dimensions; HADS: Hospital Anxiety and Depression Scale; MI: myocardial infarction; VAS: visual analog scale.
Our study tested and compared established and novel predictors for adherence to 14 weeks of therapist-supported iCBT using data from 90 MI-ANXDEP patients recruited from 25 hospitals in Sweden and randomized to treatment in the U-CARE Heart clinical trial. The time point of prediction was after completion of the first homework assignment, which therefore allowed the study of previously untested linguistic predictors extracted from actual written behavior together with previously established predictors. A robust machine learning procedure sifted out the most potent predictors for adherence assessed at the end of treatment, which were found to be self-assessed cardiac fear, sex, number of words, self-assessed general cardiac anxiety, average sentence length, and number of mutual words used.
Both symptoms of general cardiac anxiety and specific cardiac fear were among the strongest predictors, and to the extent of symptom and mechanistic overlap, this corroborates previous findings that depression is associated with increased adherence to cardiac rehabilitation [
On the other hand, our findings do not replicate other previously identified predictors for adherence to iCBT such as education and age [
We also discovered that novel linguistic predictors based on written verbal responses predicted adherence. The number of words may be a proxy for verbal fluency and degree of patient effort in therapy, and the number of mutual words might be a proxy for the degree of therapeutic alliance, which in part corroborates previous research on therapeutic alliance and other interlinked concepts that promote adherence to iCBT [
Although more work is arguably needed, the data collection, preprocessing, and analysis of written responses can be automated to a considerable degree, so the current lack of off-the-shelf clinical utility might not be a future obstacle. An automated tool for predicting adherence could be constructed and then possibly used as a decision support tool by the clinician. Moreover, the tool could also determine the risk of low adherence in patients, which could possibly inform the tailoring of treatment for the MI-ANXDEP patient more objectively and accurately than the guesswork and crude cutoffs often applied to counter low adherence in clinical research and care today. Artificial intelligence and the related supervised machine learning applications that are now being rapidly researched and broadly implemented would likely also help to solve the clinically relevant problem of predicting adherence to internet-delivered treatments.
A limitation of this study is the sample size. Although the present U-CARE Heart study is the largest iCBT trial for MI-ANXDEP patients to date, it provides limited reliability estimates. The sample is too small to subdivide for more detailed analyses of those exclusively depressed or anxious. Although the present sample size did not allow for an external validation data set, the generalizability of findings is nevertheless quite good given that (a) the applied predictive modeling procedure was robustly cross-validated, (b) national coverage was very good with recruitment from 25 hospitals, and (c) patients were recruited in a manner very similar to routine clinical care.
Although we used expert content knowledge to select predictors and tested a range of common and domain-specific predictors, other predictors could also have been used. This might explain the room for improvement in terms of classification accuracy. The study adds novelty in that it examined a whole new class of predictors consisting of actual written behavior selected by domain experts. The confirmation of some previously known predictors for adherence to psychotherapy in the scarcely studied but very common MI-ANXDEP patient group indicates potential clinical utility for these patients. The study was conducted in Sweden, and we cannot readily extrapolate our findings beyond our national and linguistic borders. The MI-ANXDEP population is also a distinct subgroup of MI patients, and the iCBT intervention is specifically tailored to these patients. Hence, replication outside of Sweden with different patients and for other psychotherapeutic treatments would be valuable.
Another limitation concerns how the outcome was operationalized. This can be done in several ways, with the strictest definition of adherence being completion of all treatment modules [
For developing and testing effective iCBT interventions, investigating factors that predict adherence is important. Using a supervised machine learning approach, adherence to iCBT treatment in a multicenter trial for MI-ANXDEP patients was best predicted by a diverse set of predictors. The most potent predictors also included novel linguistic predictors from written patient behavior at the start of treatment. Our findings may improve the tailoring of iCBT for these high-risk patients. Future research should also investigate possible causal mechanisms and determine if these findings replicate outside of Sweden, in larger samples, and for other patient groups that might benefit from iCBT.
Supplemental material.
CONSORT-EHEALTH checklist (V 1.6.1).
CAQ: Cardiac Anxiety Questionnaire
CBT: cognitive behavioral therapy
CVD: cardiovascular disease
HADS: Hospital Anxiety and Depression Scale
iCBT: internet-based cognitive behavioral therapy
kNN: k nearest neighbor
MI: myocardial infarction
MI-ANXDEP: myocardial infarction with comorbid symptoms of depression, anxiety, or depression and anxiety
MAR: missing at random
MCAR: missing completely at random
RFE: recursive feature elimination
SBP: systolic blood pressure
U-CARE: Uppsala University Psychosocial Care Programme
We are grateful to the U-CARE Heart patients. This study was supported by the Swedish Research Council for Health, Working Life, and Welfare (2014-4947), the Vårdal Foundation (2014-0114), and the strategic research program U-CARE (2009-1093).
JW, EMGO, and EG designed the study. JW, EMGO, EG, CH, GM, FN, and LvE interpreted the findings, critically revised the manuscript, and approved its final form and submission. EG, JW, FN, and EMGO preprocessed data. JW analyzed data and drafted the manuscript.
None declared.