Background: While many digital mental health interventions (DMHIs) have been found to be efficacious, patient engagement with DMHIs has increasingly emerged as a concern for implementation in real-world clinical settings. To address engagement, we must first understand what standard engagement levels are in the context of randomized controlled trials (RCTs) and how these compare with other treatments.
Objective: This scoping review aims to examine the state of reporting on intervention engagement in RCTs of mobile app–based interventions intended to treat symptoms of depression. We sought to identify what engagement metrics are and are not routinely reported as well as what the metrics that are reported reflect about standard engagement levels.
Methods: We conducted a systematic search of 7 databases to identify studies meeting our eligibility criteria, namely, RCTs that evaluated use of a mobile app–based intervention in adults, for which depressive symptoms were a primary outcome of interest. We then extracted 2 kinds of information from each article: intervention details and indices of DMHI engagement. A 5-element framework of minimum necessary DMHI engagement reporting was derived by our team and guided our data extraction. This framework included (1) recommended app use as communicated to participants at enrollment and, when reported, app adherence criteria; (2) rate of intervention uptake among those assigned to the intervention; (3) level of app use metrics reported, specifically number of uses and time spent using the app; (4) duration of app use metrics (ie, weekly use patterns); and (5) number of intervention completers.
Results: Database searching yielded 2083 unique records. Of these, 22 studies were eligible for inclusion. Only 64% (14/22) of studies included in this review specified rate of intervention uptake. Level of use metrics were only reported in 59% (13/22) of the studies reviewed. Approximately one-quarter of the studies (5/22, 23%) reported duration of use metrics. Only half (11/22, 50%) of the studies reported the number of participants who completed the app-based components of the intervention as intended or other metrics related to completion. Findings in those studies reporting metrics related to intervention completion indicated that between 14.4% and 93.0% of participants randomized to a DMHI condition completed the intervention as intended or according to a specified adherence criterion.
Conclusions: Findings suggest that engagement was underreported and widely varied. It was not uncommon to see completion rates at or below 50% (11/22) of those participants randomized to a treatment condition or to simply see completion rates not reported at all. This variability in reporting suggests a failure to establish sufficient reporting standards and limits the conclusions that can be drawn about level of engagement with DMHIs. Based on these findings, the 5-element framework applied in this review may be useful as a minimum necessary standard for DMHI engagement reporting.
Digital mental health interventions (DMHIs) are a promising avenue for accessible treatment for people with widespread and debilitating mental health issues such as depression. The field of psychiatry continues to struggle with an insufficient supply of highly trained providers able to offer evidence-based services who are accessible in terms of location and cost. While face-to-face, evidence-based psychotherapy remains the first-line treatment option for mild to moderate depression , emerging literature on DMHIs suggests that these too could be an effective stand-alone or supplemental treatment option [ , ]. These interventions have, therefore, generated significant public interest as they are more accessible and lower cost than face-to-face psychotherapy.
As interest has mounted, however, so too have concerns about low patient engagement with these interventions. In the last 10 years, several large implementation studies of DMHIs have shown that the majority of patients offered these interventions do not engage at the recommended frequency or complete the full course of treatment [- ]. In a large implementation study, Gilbody et al [ ] concluded that “while [DMHIs] have been shown to be efficacious in developer led trials, [they were] not effective in usual NHS care settings. The main reason for this was low adherence and engagement with treatment rather than lack of efficacy.” Such low engagement rates threaten the clinical viability of these treatments.
DMHI engagement has been defined as a patient’s initial adoption and sustained interactions with an intervention [- ]. Within the broader construct of engagement, intervention adherence refers to the extent to which participants engage in the content of the intervention as intended. In the context of randomized controlled trials (RCTs), intervention adherence can be reported as the number of intervention completers, with the criteria for completion being clearly specified. However, within the broader construct of engagement, other metrics, such as the rate at which participants download and initiate intervention use (ie, uptake), degree or level of use of the intervention, and duration of use of the intervention, are also relevant.
Engagement is particularly important to consider in RCTs because low intervention engagement poses a threat to the validity of conclusions drawn. It could lead to underestimating the intervention effect especially if a dose-response relationship exists . Furthermore, as discussed by Eysenbach [ ], if a participant did not significantly engage with an intervention, it is difficult to conclude that the intervention produced a positive outcome even if such outcomes were observed. In these cases, we are left with questions about the extent to which confounding variables, such as attention from study staff, could have produced any observed intervention effect. Finally, when degree of intervention engagement is not clearly described in manuscripts, we lose information on how an intervention must be used to achieve observed effects. For example, if an 8-week intervention period was studied and a positive intervention effect was observed, but 70% of participants only used the intervention for the first 2 weeks of the intervention period, we may conclude that just 2 weeks of use may be producing positive results. Alternatively, we may conclude that a certain level of effect could be expected after 2 weeks of use, whereas a different, perhaps more pronounced effect, could be expected after 8 weeks of use.
The concept of what constitutes sufficient engagement with DMHIs is inherently messier than for some other types of mental health interventions. For example, sufficient engagement with antidepressant medications typically means taking a daily pill. In psychotherapy, sufficient engagement is typically defined as attending all planned psychotherapy sessions. Use of medication and appointment attendance are clear quantitative adherence metrics. In the case of DMHIs, however, heterogeneity in intervention design leaves us with considerably less clarity on appropriate intervention adherence metrics. Some DMHIs, such as the Get Happy Program , consist of a series of lessons or modules that are designed to be completed in a sequential fashion over a specified number of weeks. These programs mirror face-to-face therapy programs where there is an assumption of some established weekly content review or dedicated time commitment. Other DMHIs are designed to be used more frequently for briefer periods. For example, IntelliCare [ ] is designed to be used on a daily basis, but length of time in the app is not prescribed. Still, other interventions (eg, the MONARCA System [ ]) consist primarily of symptom monitoring and are designed to be used frequently to inform and support clinician-based care.
This inherent heterogeneity of DMHIs makes engagement difficult to compare across studies. It also calls for consideration of what constitutes appropriate reporting related to both the larger construct of engagement and the narrower construct of adherence. To date, reviews and meta-analyses related to engagement with DMHIs have tended to focus on related, but distinct concepts. For example, study dropout or study attrition has been evaluated as a proxy for treatment dropout, with findings suggesting significant dropout [- ]. Similarly, user-rated acceptability and feasibility have been evaluated as proxies for engagement [ ]. Finally, several recent reviews have explored variables related to user engagement with DMHIs [ , , ]. However, to date, no review to our knowledge has explored the actual level of user engagement in RCTs of DMHIs. Therefore, the objective of this scoping review was to examine reporting on user engagement in RCTs of mobile app–based interventions for symptoms of depression. Specifically, we sought to identify (1) the extent to which key engagement metrics are routinely reported and (2) what the metrics that are reported reflect about standard levels of engagement.
The creation of this report was guided by the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) Extension for Scoping Reviews () [ ].
Information Sources and Search Strategy
A systematic search was conducted using OvidSP to search 7 electronic databases, MEDLINE, Embase, Cochrane Central Register of Controlled Trials, Allied and Complementary Medicine, Health Management Information Consortium, Health Technology Assessment, and PsycINFO, for articles published through May 1, 2020 (). The search was conducted on May 7, 2020. In brief, the search strategy combined synonyms for the population of interest (patients with mental illness), the intervention modality (mobile phone apps), and the type of study (RCT). Search results were limited to the English language and studies of humans.
|Search category||Search terms|
|Population||“depression” OR “depressive” OR “mental illness” OR “mental health” OR “mood disorder” OR “affective disorder” OR “anxiety” OR “panic disorder” OR “phobia” OR “bipolar” OR “psychosis” OR “schizophr*” AND|
|Intervention||“smartphone*” OR “mobile phone*” OR “cell phone*” OR “iphone” OR “android” OR “mhealth” OR “mobile application” OR “phone application” AND|
|Type of study||“randomised” OR “randomized” OR “randomly” OR “random assignment” OR “controlled trial” OR “clinical trial” OR “control group” OR “intervention”|
|Databases selected for search||Ovid MEDLINE, Embase, Cochrane Central Register of Controlled Trials, Allied and Complementary Medicine, Health Management Information Consortium, Health Technology Assessment, and PsycINFO|
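The way the three term categories combine can be sketched in a few lines of Python. This is only an illustration of the logic described above (terms joined by OR within a category, categories joined by AND); the function names `or_block` and `build_query` are hypothetical, and the output is a generic boolean string, not the exact OvidSP syntax used for the search.

```python
# Term lists copied from the search-strategy table above.
population = [
    "depression", "depressive", "mental illness", "mental health",
    "mood disorder", "affective disorder", "anxiety", "panic disorder",
    "phobia", "bipolar", "psychosis", "schizophr*",
]
intervention = [
    "smartphone*", "mobile phone*", "cell phone*", "iphone", "android",
    "mhealth", "mobile application", "phone application",
]
study_type = [
    "randomised", "randomized", "randomly", "random assignment",
    "controlled trial", "clinical trial", "control group", "intervention",
]


def or_block(terms):
    """Join the synonyms of one category with OR (hypothetical helper)."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"


def build_query(*categories):
    """Join the category blocks with AND (hypothetical helper)."""
    return " AND ".join(or_block(c) for c in categories)


# Generic boolean form; the real OvidSP syntax may differ (assumption).
query = build_query(population, intervention, study_type)
print(query)
```

A record matches only if it contains at least one term from each of the three categories.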
Inclusion and Exclusion Criteria
Only articles published in peer-reviewed journals were included. Articles were deemed eligible if they were RCTs of mobile app–based interventions targeting adults (aged >18) with clinical depression, in which depressive symptoms were a primary outcome of interest, and retention in posttreatment assessments was reported. We defined a mobile app–based intervention as one that required use of a mobile device app as part of the treatment.
We defined studying a “clinically depressed” sample as meeting at least one of the following criteria: (1) eligibility criteria requiring participants to have scores on a depression self-report measure over an established clinical cutoff; (2) eligibility criteria requiring participants to have a psychiatric diagnosis per their medical record or per a structured clinical interview; or (3) reported average baseline scores on a depression self-report measure above an established clinical cutoff in all groups. When there was ambiguity on the established clinical cutoff for a self-report measure, we used the lowest published cutoff score.
At least two independent reviewers judged article eligibility (JML, JGL, or RVB), with any disagreements resolved through mediation with a third reviewer (TPH). The screening process began with title and abstract review followed by a full-text review of any articles that appeared potentially relevant based on the abstract/title review or where there was insufficient information in the abstract to determine eligibility.
Data Extraction and Synthesis
Data extraction occurred in 3 parts. First, data were extracted by one author (JGL or RVB). Next, the rationale for each datapoint and where it came from in the original articles were reviewed with JML. Finally, all datapoints considered ambiguous or disagreements between the authors who completed the initial data extraction and JML were reviewed with one additional author (TPH).
Two kinds of information were extracted from each article. First, intervention details were extracted, including the (1) clinical population, (2) length of the treatment period, (3) a description of the study conditions, (4) total sample size in each condition, and (5) whether human support by a coach or licensed clinician was offered as part of the intervention.
Second, a 5-element framework of minimum necessary DMHI engagement reporting, developed by our study team, was used to extract key descriptive and numeric indices of participant engagement with the intervention. Elements in this framework were as follows: (1) recommended intervention app use as communicated to participants at enrollment and, when reported, intervention app adherence criteria; (2) rate of uptake, defined as the number and percentage of participants randomized to the intervention who engaged with their assigned app at all; (3) level of intervention app use metrics, specifically number of times participants used the app and amount of time participants spent in the app; (4) duration of intervention app use metrics (ie, whether weekly use patterns were reported and the number and percentage of participants who used the app in the final week of the intervention period); and (5) number and percentage of participants randomized who could be considered intervention completers. Furthermore, for context, we identified whether studies used backend data or other methods (such as self-report) to quantify app usage and extracted any additional data presented on intervention engagement.
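As an illustration only, the 5-element framework can be expressed as a simple record with a check for unreported elements. The sketch below is not the extraction tool used in this review; the class name, field names, and helper functions are hypothetical, and the example values are illustrative rather than extracted data. Two simplifying assumptions are labeled in the comments: either level-of-use metric counts as reported, and duration of use requires both a weekly pattern and final-week use.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EngagementReport:
    """What one trial arm reports; None or False means 'not reported'."""
    n_randomized: int
    use_instructions: Optional[str] = None   # element 1
    n_uptake: Optional[int] = None           # element 2
    mean_uses: Optional[float] = None        # element 3
    mean_minutes: Optional[float] = None     # element 3
    weekly_pattern_reported: bool = False    # element 4
    n_used_final_week: Optional[int] = None  # element 4
    n_completers: Optional[int] = None       # element 5


def pct(n: int, denominator: int) -> float:
    """Percentage of randomized participants, rounded to 1 decimal place."""
    return round(100 * n / denominator, 1)


def missing_elements(r: EngagementReport) -> List[str]:
    """List the framework elements that the report leaves out."""
    missing = []
    if r.use_instructions is None:
        missing.append("use instructions or adherence criteria")
    if r.n_uptake is None:
        missing.append("rate of uptake")
    # Assumption: either level-of-use metric counts as reported.
    if r.mean_uses is None and r.mean_minutes is None:
        missing.append("level of use")
    # Assumption: duration needs both the weekly pattern and final-week use.
    if not (r.weekly_pattern_reported and r.n_used_final_week is not None):
        missing.append("duration of use")
    if r.n_completers is None:
        missing.append("intervention completers")
    return missing


# Illustrative values, not extracted data.
example = EngagementReport(
    n_randomized=22,
    use_instructions="Complete 6 lessons and associated homework",
    n_uptake=15,
    mean_uses=5.1,
    n_completers=10,
)
print(missing_elements(example))
print(pct(example.n_uptake, example.n_randomized))
print(pct(example.n_completers, example.n_randomized))
```

Using the number randomized as the denominator throughout keeps uptake and completion percentages comparable across elements.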
Selection and Inclusion of Studies
The full systematic search retrieved a total of 3137 records (). Following the removal of duplicate articles across electronic databases, 2083 articles were screened at the title-and-abstract phase. This identified 150 articles as potentially eligible, which were subsequently screened in full. Full-text screening resulted in the exclusion of 128 articles for reasons specified in . A total of 22 independent studies [ , , - ] were ultimately eligible for inclusion.
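The screening counts above can be checked with simple arithmetic. In the sketch below, the four input numbers come from the text; the number of duplicates removed and the number excluded at title-and-abstract screening are derived here by subtraction and are not figures reported in the article.

```python
# Counts stated in the text.
retrieved = 3137                # records returned by the full search
screened_title_abstract = 2083  # unique records after duplicate removal
full_text_screened = 150        # potentially eligible, screened in full
excluded_full_text = 128        # excluded at full-text screening

# Derived by subtraction (not reported directly in the article).
duplicates_removed = retrieved - screened_title_abstract
excluded_title_abstract = screened_title_abstract - full_text_screened
included = full_text_screened - excluded_full_text

print(duplicates_removed, excluded_title_abstract, included)
```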
Characteristics of Included Studies
Detailed study characteristics are presented in the table below. While all 22 studies included a clinically depressed sample and symptoms of depression as a primary outcome, the target populations differed. Of the 22 eligible studies, the following target populations were recruited: depression (n=13); suicidal ideation (n=1); depression or anxiety (n=3); bipolar disorder (n=1); medical population with clinically significant symptoms of depression (n=2); community sample (n=1); and college students (n=1). Intervention periods ranged from 2 weeks to 6 months and sample sizes ranged from 30 to 720. Interventions evaluated included a range of human support: 11 were entirely self-help interventions involving no human support, 9 involved a licensed clinician, 1 involved a clinical coach, and 1 included clinical support from research staff for whom licensure status was not specified. For descriptive purposes, apps studied were assigned to 1 of 3 categories: those intended to be used as daily self-management/skill-building tools (n=13); those intended to provide support in the context of clinician-administered care or to facilitate communication with clinicians (n=5); and treatments involving a discrete number of lessons/modules typically to be completed on a weekly basis (n=4).
|Study||Clinical population||Treatment period||Conditions||Sample size||Human contact||App category|
|Arean et al ||Daily self-management/skill building|
|Bakker et al ||Daily self-management/skill building|
|Birney et al ||Daily self-management/skill building|
|Borjalilu et al ||Daily self-management/skill building|
|Dahne et al ||Daily self-management/skill building|
|Dahne et al ||Daily self-management/skill building|
|Faurholt-Jepsen et al ||Support for appointments/interaction with clinician|
|Fitzpatrick et al ||Daily self-management/skill building|
|Guo et al ||Discrete number of lessons/modules|
|Lüdtke et al ||Daily self-management/skill building|
|Ly et al ||Support for appointments/interaction with clinician d|
|Ly et al ||Support for appointments/interaction with clinician|
|Mantani et al ||Discrete number of lessons/modules|
|Moberg et al ||Daily self-management/skill building|
|Mohr et al ||Daily self-management/skill building|
|Motter et al ||Daily self-management/skill building|
|O’Toole et al ||Support for appointments/interaction with clinician|
|Place et al ||Support for appointments/interaction with clinician|
|Proudfoot et al ||Discrete number of lessons/modules|
|Roepke et al f||Daily self-management/skill building|
|Stiles-Shields et al ||Daily self-management/skill building|
|Watts et al ||Discrete number of lessons/modules|
aCBT: cognitive behavioral therapy.
bTAU: treatment as usual.
cN/A: no treatment administered.
dThe intervention in Ly et al  contained elements of daily self-management/skill building, but completion was defined by interactions with a clinician so this was deemed primarily an intervention to support appointments/interaction with a clinician.
eMohr et al  was a 2 × 2 factorial trial design. Group sample sizes specified here are not mutually exclusive.
fRoepke et al  reported that the SuperBetter intervention was targeted to occur on the iPhone, but could be used via a website on computers. This study was deemed eligible because the intention was for it to be smartphone based.
gStiles-Shields et al  involved coaching, but is categorized as involving a clinician (not a coach) because the coach was a licensed clinician.
Reporting on Participant Engagement
Data extracted based on our 5-element framework are presented in the table below (with additional details presented in ). With the exception of Lüdtke et al [ ], all studies that reported on app usage indicated using backend data from the app to monitor app usage in the test condition(s). Lüdtke et al [ ] only offered self-reported app usage data. A total of 14 of the 22 papers (64%) reported the rate of app uptake, defined as the number of participants randomized to the intervention condition(s) who engaged with the app at least once. Findings in those studies reporting the rate of app uptake indicated that between 42% and 100% of those participants randomized to an app-based DMHI condition engaged with the app at least once.
With regard to ongoing use, reports were varied. A total of 13 papers (59%) reported a level-of-use metric. The most common level-of-use metric was number of sessions/launches (n=12). Time spent in the app was a less popular level-of-use metric (n=4). Fewer papers reported metrics on duration of use. Only 5 studies (23%) reported weekly use patterns over the course of the intervention and the number of participants who were still using the intervention during the last week of the treatment period.
With regard to questions of whether participants completed the intervention as intended, reporting was also varied. The table below describes the app intervention instructions given to participants and app adherence criteria to the extent that these were specified in each article. Only 3 studies clearly reported the number of participants randomized to the DMHI who were considered to have completed the app-based components of the intervention as intended per specified intervention instructions. An additional 4 studies (footnote i in ) reported the number of participants who met a specified adherence threshold such as using the intervention app once per week. A further 4 studies reported metrics related to intervention completion, including percentage of patients who used the app on a daily basis (n=1; footnote m); percentage of patients completing the intervention based on a criterion defined by clinician contact rather than app use (n=2; marked by footnote o); and percentage of participants who downloaded all the intervention content (n=1; marked by footnote s). Findings in those studies reporting metrics related to intervention completion indicated that between 14.4% and 93.0% of participants randomized to a DMHI condition completed the intervention as intended or according to a specified app adherence criterion. Among the 11 studies reporting this metric, 6 reported that less than or equal to 50% of participants completed the intervention.
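The reporting rates quoted in this section follow directly from the study counts, as the short sketch below shows. The counts are taken from the text; the variable names are hypothetical, and the final share (studies at or below 50% completion among those reporting a completion metric) is derived here and is not a figure stated in the article.

```python
TOTAL_STUDIES = 22

# Number of studies reporting each framework element (from the text).
reported = {
    "rate of uptake": 14,
    "level of use": 13,
    "duration of use": 5,
    "completion": 11,
}
rates = {name: round(100 * n / TOTAL_STUDIES) for name, n in reported.items()}

# Breakdown of the studies reporting a completion-related metric.
completion_breakdown = {
    "completed as instructed": 3,
    "met a specified adherence threshold": 4,
    "other completion-related metric": 4,
}
n_completion = sum(completion_breakdown.values())

# Derived here, not reported in the text.
share_at_or_below_half = round(100 * 6 / n_completion, 1)

print(rates, n_completion, share_at_or_below_half)
```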
|Study and intervention name||App use instructions or adherence criteria||Rate of uptakeb, n (%)||Level of use||Duration of use||Completersc, n (%) or %|
|App uses||Minutes spent using the app, mean (SD)||Reported weekly use pattern||Used in the final week, n (%) or %|
|Arean et al |
|Project: EVO||Use app 6 times/week for 30 minutes/day (3 or more times/week considered adherent)||177 (42.1)d||Mean 10.78 (SD 11.4)e||NRf||Yes||42 (20.1)g||30 (14.4)g,h,i|
|iPST||Use app as often as possible (1 or more times/week considered adherent)||—j||—j||NR||Yes||40 (19.0)g||40 (19.0)g,h,i|
|Health Tips App||No specific instructions, but daily advice was provided||NR||NR||NR||No||NR||NR|
|Bakker et al |
|MoodKit||No specific instructions reported||NR||NR||NR||No||NR||NR|
|MoodPrism||No specific instructions reported||NR||NR||NR||No||NR||NR|
|MoodMission||No specific instructions reported||NR||NR||NR||No||NR||NR|
|Birney et al |
|MoodHacker||Daily app use||NR||Mean 16.0 (SD 13.3)||78 (78)||No||NR||NR|
|Borjalilu et al |
|Aramgar app||Complete recommended app exercises daily||NR||NR||NR||No||NR||NR|
|Aramgar app with face-to-face therapy||Twice/week face-to-face workshops plus daily app exercises||NR||NR||NR||No||NR||NR|
|Dahne et al |
|¡Apívate!||Use app once/day (1 or more times/week considered adherent)||22 (100)||Mean 61.4 (SD 91.7)||65.8 (82.8)||Yes||11 (50)||11 (50)h,i|
|iCouch CBT||Use app once/day (1 or more times/week considered adherent)||NR||NR||NR||Yes||33g||33g,h,i|
|Dahne et al |
|Moodivate||Use the app once/day (1 or more times/week considered adherent)||21 (100)k||Mean 46.8 (SD 30.1)||120.8 (101.0)||Yes||9 (50)k||9 (50)h,i,k|
|MoodKit||Use app once/day||NR||NR||NR||No||NR||NR|
|Faurholt-Jepsen et al |
|MONARCA||Use app for self-monitoring daily||34 (87.2)l||NR||NR||No||NR||93.0m|
|Fitzpatrick et al |
|Woebot||Daily monitoring and “regular check-ins”||34 (100)||Mean 12.14 (SD 2.23)||NR||No||NR||NR|
|Guo et al |
|Run4Love||Complete 9 cognitive behavioral stress management sessions, 3 review sessions, and set weekly physical activity goal||NR||NR||NR||No||NR||NR|
|Lüdtke et al |
|Be Good to Yourself app||Use app “several times a week”||26 (59.1)n||NR||NR||No||NR||19 (43.2)n|
|Ly et al |
|Behavioral activation smartphone app||Add at least two behavioral goals to the app and register/write a reflection in the app when these goals were completed||81 (96.4)d||NR||NR||No||NR||25 (63.0)e,o|
|Mindfulness smartphone app||Use audio tracks with exercises to facilitate the practice of mindfulness||—j||—j||NR||No||NR||32 (78.0)e,o|
|Ly et al |
|Blended treatment||No specific instructions reported||NR||NR||NR||No||NR||42 (91.3)o|
|Mantani et al |
|CPT-Kokoro app||Complete 8 mobile app sessions, 1 per week||80 (98.76)||Mean 7.01 (SD 1.5)p||NR||Yes||43 (53.1)||43 (53.1)p|
|Moberg et al |
|Pacifica||No specific instructions reported||246 (97.2)||Median 19 (range 1-286)e||NR||No||NR||NR|
|Mohr et al |
|IntelliCare: coached||No specific instructions reported (last app use at or after 7 weeks considered adherent)||143 (95.3)l||Median 215 (IQR 141-330.8)||NR||Yes||136 (90.7)q||136 (90.7)m,p|
|IntelliCare: self-guided||No specific instructions reported (last app use at or after 7 weeks considered adherent)||151 (100)l||Median 218 (IQR 113-310)||NR||Yes||126 (83.4)q||126 (83.4)m,p|
|IntelliCare: recommendations||No specific instructions reported (last app use at or after 7 weeks considered adherent)||146 (98.0)l||Median 232 (IQR 126-356)||NR||Yes||132 (88.6)q||132 (88.6)m,p|
|IntelliCare: no recommendations||No specific instructions reported (last app use at or after 7 weeks considered adherent)||148 (97.4)l||Median 201.5 (IQR 125.8-285.5)||NR||Yes||130 (85.5)q||130 (85.5)m,p|
|Motter et al |
|Executive function/processing speed–focused CCTr||Use app 15 minutes/day 5 days/week||NR||NR||168.3 (69.0)||No||NR||NR|
|Verbal ability–focused CCT||Use app 15 minutes/day 5 days/week||NR||NR||363.8 (253.4)||No||NR||NR|
|O’Toole et al |
|LifeApp’tite||At discretion of therapists to decide frequency of app use||50 (83.3)||NR||NR||No||NR||NR|
|Place et al |
|Cogito||Record weekly audio notes on mood and complete weekly self-reports||NR||NR||NR||No||NR||NR|
|Proudfoot et al |
|MyCompass||Complete a minimum of 2 modules and monitor at least three moods or behaviors||NR||Mean 14.7 (SD 16.7)p||NR||No||NR||NR|
|Roepke et al |
|CBT-PPT SuperBetter||Use app 10 minutes/day||72 (77.4)||Mean 21.5 (SD 34.3), median 9.5d,e||NR||No||NR||31 (33.3)s|
|General SuperBetter||Use app 10 minutes/day||72 (74.23)||—j||NR||No||NR||64 (66.0)s|
|Stiles-Shields et al |
|Boost Me||No specific instructions reported||10 (100)l||Mean 97.7||NR||No||NR||NR|
|Thought Challenger||No specific instructions reported||7 (70)l||Mean 33.5||NR||No||NR||NR|
|Watts et al |
|Get Happy Program Mobile App||Complete 6 lessons and associated homework||15 (68.2)l||Mean 5.1 (SD 1.6)e,t||NR||Nou||NRu||10 (45.5)p|
aTable includes all treatment conditions that involved a mobile app component.
bRate of uptake: number of participants randomized to the intervention who used it at least once.
cCompleter: participants who completed the intervention as intended per intervention instructions or per specified adherence criteria.
dReported metric cut across treatment groups.
eOnly included participants who logged onto the app at least once.
fNR: not reported.
gEstimate based on figure, exact number not reported.
hAssumes participants who met adherence criteria during the last week also met adherence criteria in previous weeks. For example, in a 4-week intervention, those reported to have used the app in week 4 also used in weeks 1-3.
iCompletion refers to meeting a specified adherence criterion involving app use, not to complying with intervention use instructions.
jMetrics were only reported across conditions rather than for each group independently; all numbers are rounded to 1 decimal place.
kOwing to technical issues, data on rate of uptake were only available in 21 participants and data on ongoing use were only available in 18 participants. To calculate percentages presented, the number of people for whom data were available was used as the denominator.
lBased on reported numbers of participants who were randomized to the condition, but never started treatment. Reasons were not always related to willingness/interest in trying the relevant app. For example, reason may have been that the participant was unresponsive to outreach to inform them of their assigned treatment.
mArticle reports “93.03% (SD 15.6) of patients randomized to the intervention group evaluated the subjective items in the MONARCA system on a daily basis.” Unclear if this refers to participants using the system an average of 93.03% of days or if it refers to 93.03% of the participants in the intervention using it every day of the 6-month intervention period.
nAs use data were self-reported, these metrics only include those participants who completed the posttreatment assessment. To calculate percentages presented, the total size of the treatment group was used as the denominator.
oCompletion was defined by clinician contact not app use.
pMetric takes into account all participants randomized to the condition even if they did not log onto the app.
qNumber represents the number of participants whose last use was week 7 or after.
rCCT: computerized cognitive training.
sRefers to the number of participants who downloaded all content.
tUses refers to lessons completed.
uNumber of lessons completed was reported, but lessons were not precisely 1 per week.
This scoping review has revealed that reporting on engagement with DMHIs in RCTs is highly variable. A number of basic metrics of intervention engagement, such as rate of intervention uptake, weekly use patterns, and number of intervention completers, were routinely not reported. When intervention engagement metrics were reported, it was common to see low levels of engagement. The variability in reporting and frequency of low engagement when reported highlight the importance of establishing minimum necessary reporting standards for engagement in DMHI research.
Only 64% (14/22) of studies included in this review specified rate of uptake, defined as the number of participants randomized to the intervention condition who used the app at least once. Past research suggests that rate of uptake cannot be assumed, especially in the context of fully remote, self-guided digital interventions. Those studies that did report this metric showed varied levels of uptake. For example, Arean et al found that over one-half of participants did not download their assigned app, whereas Roepke et al [ ] and Watts et al [ ] found that closer to one-quarter of participants did not download their assigned app. The studies reviewed here varied in the type of app and design, so different rates of uptake may be expected, but the extent of inconsistent reporting was surprising.
Level of use metrics, defined as both the number of app launches and the amount of time the intervention was used, were only reported in 59% (13/22) of the studies reviewed. These metrics (specifically, average number of uses and average time spent in the app) should be feasible to calculate when researchers have access to activity log data of the tested app, which was the case in most of the studies included. There can be some complications in reporting these metrics. For example, it can be difficult to accurately report time spent in the app when participants leave an app open on their device for longer than they are actively using it. Similarly, apps can be launched only to be closed in a matter of seconds. However, in cases where these metrics are not appropriate for the intervention being evaluated, we would have expected to see alternative metrics such as number of clicks reported, but this was only the case in 1 of the reviewed studies.
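To make the two complications concrete, the following sketch shows one possible way to derive level-of-use metrics from session logs. It is not a method used in any of the reviewed studies. The log format (start and end timestamps per session), the function name, and both thresholds are assumptions chosen for illustration: very short launches are discarded, and long sessions are capped to limit the effect of an app left open.

```python
from datetime import datetime, timedelta

# Hypothetical thresholds; neither value comes from the reviewed studies.
MIN_SESSION = timedelta(seconds=10)  # shorter launches are ignored
MAX_SESSION = timedelta(minutes=30)  # longer sessions are capped


def level_of_use(sessions):
    """Return (number of uses, minutes in app) for one participant.

    sessions: list of (start, end) datetime pairs from an activity log.
    """
    kept = [(s, e) for s, e in sessions if e - s >= MIN_SESSION]
    total = sum((min(e - s, MAX_SESSION) for s, e in kept), timedelta())
    return len(kept), total.total_seconds() / 60


log = [
    # Launched and closed within 5 seconds: dropped.
    (datetime(2020, 1, 6, 9, 0, 0), datetime(2020, 1, 6, 9, 0, 5)),
    # A 12-minute session: counted in full.
    (datetime(2020, 1, 7, 20, 0), datetime(2020, 1, 7, 20, 12)),
    # App left open for 3 hours: capped at 30 minutes.
    (datetime(2020, 1, 8, 8, 0), datetime(2020, 1, 8, 11, 0)),
]
print(level_of_use(log))
```

Whatever thresholds are chosen, reporting them alongside the metrics would let readers compare level of use across trials.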
Approximately one-quarter of studies (5/22, 23%) reported on participant duration of use, defined as reporting both weekly use patterns and the number of participants who used the app in the final week of the intervention period. It is well documented that, in general, mobile apps tend to be used heavily when first downloaded and that use decreases over time . Similarly, concerns related to sustained engagement with web-based psychiatric interventions have been reported in routine-care implementation studies [ - , ]. Inconsistent use of psychiatric intervention apps over time is an issue that needs to be addressed if our field is to mature; however, addressing this issue will be all the more difficult if such variations in use are not adequately reported in our published literature. Data from Dahne et al [ ] provide an excellent example of how this metric is useful to report alongside level of use. They reported that 81.8% of participants in the intervention condition used the app at least eight times (an average of at least once per week), but only 51% of participants used the app during the last week. Much like patterns of use with other popular apps, these data suggest high initial use that declines over time.
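A weekly use pattern of this kind can be computed from dated use records. The sketch below is a minimal illustration under stated assumptions: the input format (a set of use dates per participant), the function name, and the example data are all hypothetical and are not drawn from any reviewed study.

```python
from datetime import date


def weekly_active(use_dates_by_participant, start, n_weeks):
    """Count participants with at least one app use in each study week."""
    active = [0] * n_weeks
    for dates in use_dates_by_participant.values():
        # Weeks (0-indexed from the start date) in which this person used the app.
        weeks = {(d - start).days // 7 for d in dates}
        for w in weeks:
            if 0 <= w < n_weeks:
                active[w] += 1
    return active


# Hypothetical 4-week intervention with 3 participants.
start = date(2020, 1, 6)
uses = {
    "p1": [date(2020, 1, 6), date(2020, 1, 14), date(2020, 1, 21), date(2020, 1, 28)],
    "p2": [date(2020, 1, 7), date(2020, 1, 8)],
    "p3": [date(2020, 1, 10), date(2020, 1, 15)],
}
pattern = weekly_active(uses, start, n_weeks=4)
final_week_pct = round(100 * pattern[-1] / len(uses), 1)
print(pattern, final_week_pct)
```

In this toy example all participants use the app in week 1 but only one third do so in the final week, the kind of decline that a single average number of uses would hide.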
In the context of intervention research, it is important to include some clear metric of intervention adherence or completion. Yet only 50% (11/22) of studies in this review clearly reported the number of participants considered to have completed the app-based components of the intervention as intended or other metrics related to completion such as percentage of patients who met a specified adherence threshold; percentage of patients completing clinician-based components of the intervention; and percentage of participants who downloaded all the intervention content. Just like psychotherapy or medication use, mobile app–based interventions incorporate some expected efficacious dose into the instructions for use. The fact that use can be accurately and objectively tracked from backend metrics is highly encouraging, and distinguishes our field from other treatment research (such as medication trials) where adherence has historically been extremely difficult to reliably measure. Further, completion need not be full use exactly as intended. For example, Arean et al specified that 50% compliance with intervention instructions was considered completion. Simply not discussing who uses mobile app depression interventions as intended, however, will limit the potential for insight into and utility of these interventions.
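A completion metric based on an explicit adherence threshold could be computed along the following lines. This is a sketch only: the function name, the notion of a prescribed "unit" (eg, a session or module), and the input format are assumptions for illustration, and the default threshold of 0.5 simply mirrors the 50% example above rather than a recommended value.

```python
def completion_rate(completed_units, n_randomized, n_prescribed, threshold=0.5):
    """Return (number of completers, percentage of randomized participants).

    completed_units: dict mapping participant_id to the number of prescribed
    units completed (hypothetical format). Participants who never used the
    app can be absent from the dict; they still count in n_randomized.
    """
    needed = threshold * n_prescribed
    completers = sum(1 for done in completed_units.values() if done >= needed)
    return completers, 100 * completers / n_randomized
```

Using all randomized participants as the denominator keeps the figure comparable with uptake and avoids inflating completion by counting only those who downloaded the app.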
Finally, one of our objectives in this review was to quantify standard level of engagement in RCTs of mobile app–based depression interventions. Our data extraction led us to conclude that with the current state of reporting, this is nearly impossible to do. What we did conclude is that engagement at all points (uptake, level, duration, and completion) varies widely. Moreover, it was not uncommon to see completion rates at or below 50% of those participants randomized to a treatment condition (n=6) or to simply see engagement rates not reported at all (n=5).
This scoping review has several limitations. First, this review illustrates an important dilemma in the field of DMHI research, but findings are limited to a subset of the DMHI literature, specifically mobile app–based depression interventions in psychiatric samples. While we expect our proposed reporting guidelines to be useful across DMHIs, the extent to which the findings of this review carry through to mobile app interventions in other areas of mental health remains unclear. Second, our original goal in approaching this scoping review was to quantify typical engagement with DMHIs in RCTs; however, as we began the literature review, we ascertained that this goal would be difficult given the variability (and often absence) of metrics reported. This study, therefore, represents a shift in objectives. Third, we only reviewed papers from academic sources, which limits the kinds of mental health apps we took into account. The quality and objectivity of the data contained within independently published reports from private industries on their own mental health apps have yet to be reviewed. Finally, this review only evaluated literature through May 2020. While there is no reason to expect that reporting on engagement has improved, this work should be conceptualized as only a starting point for a discussion of appropriate reporting guidelines, and future reviews or meta-analyses on this topic are warranted.
The emerging field of DMHIs has reached a critical juncture: intervention engagement has been widely recognized as the key factor limiting DMHI clinical utility. This review illustrates that engagement is variable and frequently underreported. Adopting a set of reporting guidelines that specify the minimum necessary information when publishing RCTs of DMHIs will provide new insights into how to improve engagement in mental health apps; allow for clear comparisons between DMHIs and other treatment options; and offer benchmarks upon which further research must improve. Such reporting standards will complement the expanding literature on user-centered evaluations of engaging with digital health tools and interventions.
To this end, we suggest the 5-element framework applied in this study be used to guide minimum necessary DMHI engagement reporting standards. This framework includes the following: (1) intervention instructions or adherence criteria, defined as an explicit statement of what it means for participants to have used an intervention as intended or met some minimum intervention threshold; (2) rate of uptake, defined as the number of participants randomized to the intervention who downloaded the associated app(s) and used them at least once; (3) level of use metrics, defined as both the number of app launches and the amount of time the intervention was used (with alternative metrics such as number of clicks appropriate if more suitable for the intervention and justified); (4) duration of use, defined as participants’ weekly use patterns; and (5) number of completers, defined as the number of participants who completed the intervention as intended per intervention instructions or per specified adherence criteria. We believe this framework could be a useful starting point to promote standards of reporting within the field, with room for future iterations.
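For teams that wish to audit a manuscript against these 5 elements before submission, the framework could be represented as a simple record. The sketch below is illustrative only; the class name, field names, and field types are hypothetical and are not part of the framework itself.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class EngagementReport:
    """Engagement reporting for one intervention arm; None marks an unreported element."""
    adherence_criteria: Optional[str] = None  # (1) instructions or adherence criteria
    uptake_n: Optional[int] = None            # (2) downloaded and used at least once
    level_of_use: Optional[dict] = None       # (3) launches and time, or a justified alternative
    weekly_use: Optional[list] = None         # (4) weekly use patterns
    completers_n: Optional[int] = None        # (5) completed as intended

    def missing_elements(self):
        """List the framework elements that have not been reported."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]
```

For example, a report that supplies only adherence criteria, uptake, and completers would return `["level_of_use", "weekly_use"]` from `missing_elements()`.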
Certainly, complexities exist when identifying and reporting engagement with DMHIs given that these interventions vary widely in content and format. The reporting guidelines that we have suggested in response to our findings are intended both to be broadly applicable across DMHIs and to challenge the field to move past these complexities toward greater transparency and rigor. We hope this begins an important discussion on reporting standards that will improve our understanding of how to evaluate and optimize DMHIs.
JML was partially supported by an NIMH Mentored Patient-Oriented Career Development Award (K23MH120324) and a NARSAD Young Investigator Grant from the Brain and Behavior Research Foundation. The authors acknowledge Britney Gluskin for her assistance with title and abstract screening.
Conflicts of Interest
JF is supported by a UK Research and Innovation Future Leaders Fellowship (MR/T021780/1) and has received honoraria or consultancy fees from Atheneum, Informa, Gillian Kenny Associates, Big Health, Nutritional Medicine Institute, ParachuteBH, Richmond Foundation, and Nirakara, independent of this work.
Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews (PRISMA-ScR) Checklist. DOCX File, 48 KB
Other DMHI use data reported. DMHI: digital mental health intervention. DOCX File, 27 KB
- American Psychiatric Association. Practice Guideline for the Treatment of Patients With Major Depressive Disorder. Washington, DC: American Psychiatric Association; 2010:1-152.
- Firth J, Torous J, Nicholas J, Carney R, Pratap A, Rosenbaum S, et al. The efficacy of smartphone-based mental health interventions for depressive symptoms: a meta-analysis of randomized controlled trials. World Psychiatry 2017 Oct;16(3):287-298.
- Linardon J, Cuijpers P, Carlbring P, Messer M, Fuller-Tyszkiewicz M. The efficacy of app-supported smartphone interventions for mental health problems: a meta-analysis of randomized controlled trials. World Psychiatry 2019 Oct 09;18(3):325-336.
- Gilbody S, Brabyn S, Lovell K, Kessler D, Devlin T, Smith L, REEACT collaborative. Telephone-supported computerised cognitive-behavioural therapy: REEACT-2 large-scale pragmatic randomised controlled trial. Br J Psychiatry 2017 May;210(5):362-367.
- Cavanagh K, Seccombe N, Lidbetter N. The Implementation of Computerized Cognitive Behavioural Therapies in a Service User-Led, Third Sector Self Help Clinic. Behav Cogn Psychother 2011 Feb 22;39(4):427-442.
- Hensel JM, Shaw J, Ivers NM, Desveaux L, Vigod SN, Cohen A, et al. A Web-Based Mental Health Platform for Individuals Seeking Specialized Mental Health Care Services: Multicenter Pragmatic Randomized Controlled Trial. J Med Internet Res 2019 Jun 04;21(6):e10838.
- Gilbody S, Littlewood E, Hewitt C, Brierley G, Tharmanathan P, Araya R, REEACT Team. Computerised cognitive behaviour therapy (cCBT) as treatment for depression in primary care (REEACT trial): large scale pragmatic randomised controlled trial. BMJ 2015 Nov 11;351:h5627.
- Borghouts J, Eikey E, Mark G, De Leon C, Schueller SM, Schneider M, et al. Barriers to and Facilitators of User Engagement With Digital Mental Health Interventions: Systematic Review. J Med Internet Res 2021 Mar 24;23(3):e24387.
- Arnold C, Farhall J, Villagonzalo K, Sharma K, Thomas N. Engagement with online psychosocial interventions for psychosis: A review and synthesis of relevant factors. Internet Interv 2021 Sep;25:100411.
- Baltierra NB, Muessig KE, Pike EC, LeGrand S, Bull SS, Hightow-Weidman LB. More than just tracking time: Complex measures of user engagement with an internet-based health promotion intervention. J Biomed Inform 2016 Feb;59:299-307.
- Donkin L, Christensen H, Naismith SL, Neal B, Hickie IB, Glozier N. A systematic review of the impact of adherence on the effectiveness of e-therapies. J Med Internet Res 2011 Aug 05;13(3):e52.
- Eysenbach G. The law of attrition. J Med Internet Res 2005 Mar 31;7(1):e11.
- Watts S, Mackenzie A, Thomas C, Griskaitis A, Mewton L, Williams A, et al. CBT for depression: a pilot RCT comparing mobile phone vs. computer. BMC Psychiatry 2013 Feb 07;13(1):49.
- Mohr DC, Tomasino KN, Lattie EG, Palac HL, Kwasny MJ, Weingardt K, et al. IntelliCare: An Eclectic, Skills-Based App Suite for the Treatment of Depression and Anxiety. J Med Internet Res 2017 Jan 05;19(1):e10.
- Faurholt-Jepsen M, Frost M, Ritz C, Christensen EM, Jacoby AS, Mikkelsen RL, et al. Daily electronic self-monitoring in bipolar disorder using smartphones – the MONARCA I trial: a randomized, placebo-controlled, single-blind, parallel group trial. Psychol Med 2015 Jul 29;45(13):2691-2704.
- Torous J, Lipschitz J, Ng M, Firth J. Dropout rates in clinical trials of smartphone apps for depressive symptoms: A systematic review and meta-analysis. J Affect Disord 2020 Feb 15;263:413-419.
- Melville KM, Casey LM, Kavanagh DJ. Dropout from Internet-based treatment for psychological disorders. Br J Clin Psychol 2010 Nov;49(Pt 4):455-471.
- Amagai S, Pila S, Kaat AJ, Nowinski CJ, Gershon RC. Challenges in Participant Engagement and Retention Using Mobile Health Apps: Literature Review. J Med Internet Res 2022 Apr 26;24(4):e35120.
- Ng MM, Firth J, Minen M, Torous J. User Engagement in Mental Health Apps: A Review of Measurement, Reporting, and Validity. Psychiatr Serv 2019 Jul 01;70(7):538-544.
- Tricco AC, Lillie E, Zarin W, O'Brien KK, Colquhoun H, Levac D, et al. PRISMA Extension for Scoping Reviews (PRISMA-ScR): Checklist and Explanation. Ann Intern Med 2018 Sep 04;169(7):467.
- Arean PA, Hallgren KA, Jordan JT, Gazzaley A, Atkins DC, Heagerty PJ, et al. The Use and Effectiveness of Mobile Apps for Depression: Results From a Fully Remote Clinical Trial. J Med Internet Res 2016 Dec 20;18(12):e330.
- Bakker D, Kazantzis N, Rickwood D, Rickard N. A randomized controlled trial of three smartphone apps for enhancing public mental health. Behav Res Ther 2018 Oct;109:75-83.
- Birney AJ, Gunn R, Russell JK, Ary DV. MoodHacker Mobile Web App With Email for Adults to Self-Manage Mild-to-Moderate Depression: Randomized Controlled Trial. JMIR Mhealth Uhealth 2016 Jan 26;4(1):e8.
- Borjalilu S, Mazaheri MA, Talebpour A. Effectiveness of Mindfulness-Based Stress Management in The Mental Health of Iranian University Students: A Comparison of Blended Therapy, Face-to-Face Sessions, and mHealth App (Aramgar). Iran J Psychiatry Behav Sci 2019 May 12;13(2):84726.
- Dahne J, Collado A, Lejuez CW, Risco CM, Diaz VA, Coles L, et al. Pilot randomized controlled trial of a Spanish-language Behavioral Activation mobile app (¡Aptívate!) for the treatment of depressive symptoms among United States Latinx adults with limited English proficiency. J Affect Disord 2019 May 01;250:210-217.
- Dahne J, Lejuez C, Diaz VA, Player MS, Kustanowitz J, Felton JW, et al. Pilot Randomized Trial of a Self-Help Behavioral Activation Mobile App for Utilization in Primary Care. Behav Ther 2019 Jul;50(4):817-827.
- Fitzpatrick KK, Darcy A, Vierhile M. Delivering Cognitive Behavior Therapy to Young Adults With Symptoms of Depression and Anxiety Using a Fully Automated Conversational Agent (Woebot): A Randomized Controlled Trial. JMIR Ment Health 2017 Jun 06;4(2):e19.
- Guo Y, Hong YA, Cai W, Li L, Hao Y, Qiao J, et al. Effect of a WeChat-Based Intervention (Run4Love) on Depressive Symptoms Among People Living With HIV in China: A Randomized Controlled Trial. J Med Internet Res 2020 Feb 09;22(2):e16715.
- Lüdtke T, Pult LK, Schröder J, Moritz S, Bücker L. A randomized controlled trial on a smartphone self-help application (Be Good to Yourself) to reduce depressive symptoms. Psychiatry Res 2018 Nov;269:753-762.
- Ly KH, Trüschel A, Jarl L, Magnusson S, Windahl T, Johansson R, et al. Behavioural activation versus mindfulness-based guided self-help treatment administered through a smartphone application: a randomised controlled trial. BMJ Open 2014 Jan 09;4(1):e003440.
- Ly KH, Topooco N, Cederlund H, Wallin A, Bergström J, Molander O, et al. Smartphone-Supported versus Full Behavioural Activation for Depression: A Randomised Controlled Trial. PLoS One 2015 May 26;10(5):e0126559.
- Mantani A, Kato T, Furukawa TA, Horikoshi M, Imai H, Hiroe T, et al. Smartphone Cognitive Behavioral Therapy as an Adjunct to Pharmacotherapy for Refractory Depression: Randomized Controlled Trial. J Med Internet Res 2017 Nov 03;19(11):e373.
- Moberg C, Niles A, Beermann D. Guided Self-Help Works: Randomized Waitlist Controlled Trial of Pacifica, a Mobile App Integrating Cognitive Behavioral Therapy and Mindfulness for Stress, Anxiety, and Depression. J Med Internet Res 2019 Jun 08;21(6):e12556.
- Mohr DC, Schueller SM, Tomasino KN, Kaiser SM, Alam N, Karr C, et al. Comparison of the Effects of Coaching and Receipt of App Recommendations on Depression, Anxiety, and Engagement in the IntelliCare Platform: Factorial Randomized Controlled Trial. J Med Internet Res 2019 Aug 28;21(8):e13609.
- Motter JN, Grinberg A, Lieberman DH, Iqnaibi WB, Sneed JR. Computerized cognitive training in young adults with depressive symptoms: Effects on mood, cognition, and everyday functioning. J Affect Disord 2019 Feb 15;245:28-37.
- O'Toole MS, Arendt MB, Pedersen CM. Testing an App-Assisted Treatment for Suicide Prevention in a Randomized Controlled Trial: Effects on Suicide Risk and Depression. Behav Ther 2019 Mar;50(2):421-429.
- Place S, Blanch-Hartigan D, Smith V, Erb J, Marci CD, Ahern DK. Effect of a Mobile Monitoring System vs Usual Care on Depression Symptoms and Psychological Health: A Randomized Clinical Trial. JAMA Netw Open 2020 Jan 03;3(1):e1919403.
- Roepke AM, Jaffee SR, Riffle OM, McGonigal J, Broome R, Maxwell B. Randomized Controlled Trial of SuperBetter, a Smartphone-Based/Internet-Based Self-Help Tool to Reduce Depressive Symptoms. Games Health J 2015 Jun;4(3):235-246.
- Stiles-Shields C, Montague E, Kwasny MJ, Mohr DC. Behavioral and cognitive intervention strategies delivered via coached apps for depression: Pilot trial. Psychol Serv 2019 May;16(2):233-238.
- Proudfoot J, Clarke J, Birch M, Whitton AE, Parker G, Manicavasagar V, et al. Impact of a mobile phone and web program on symptom and functional outcomes for people with mild-to-moderate depression, anxiety and stress: a randomised controlled trial. BMC Psychiatry 2013 Nov 18;13:312.
- Overall App Benchmarks H2 2017. Localytics. 2017. URL: https://uplandsoftware.com/localytics/resources/cheat-sheet/overall-app-benchmarks-h2-2017/#:~:text=User%20Retention,-Average%20Three%20Month& text=According%20to%20our%20data%2C%2043,month%20after%20they%20downloaded%20it [accessed 2022-09-01]
- Nievas-Soriano BJ, García-Duarte S, Fernández-Alonso AM, Bonillo-Perales A, Parrón-Carreño T. Users evaluation of a Spanish eHealth pediatric website. Comput Methods Programs Biomed 2021 Nov;212:106462.
- Schueller SM, Neary M, O'Loughlin K, Adkins EC. Discovery of and Interest in Health Apps Among Those With Mental Health Needs: Survey and Focus Group Study. J Med Internet Res 2018 Jun 11;20(6):e10141.
- Lipschitz J, Miller CJ, Hogan TP, Burdick KE, Lippin-Foster R, Simon SR, et al. Adoption of Mobile Apps for Depression and Anxiety: Cross-Sectional Survey Study on Patient Interest and Barriers to Engagement. JMIR Ment Health 2019 Jan 25;6(1):e11334.
AMED: Allied and Complementary Medicine
DMHI: digital mental health intervention
HMIC: Health Management Information Consortium
HTA: Health Technology Assessment
PRISMA: Preferred Reporting Items for Systematic Reviews and Meta-Analyses
RCT: randomized controlled trial
Edited by T Leung; submitted 02.05.22; peer-reviewed by C Bedard, B Nievas Soriano; comments to author 29.05.22; revised version received 20.07.22; accepted 19.08.22; published 14.10.22
Copyright
©Jessica M Lipschitz, Rachel Van Boxtel, John Torous, Joseph Firth, Julia G Lebovitz, Katherine E Burdick, Timothy P Hogan. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 14.10.2022.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included.