This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included.
Despite the growing interest in the experience sampling method (ESM) as a data collection tool for mental health research, the absence of methodological guidelines for its use has resulted in considerable heterogeneity of designs. Moreover, the potential effects of the design on the response behavior of the participants remain largely unknown.
The objective of this meta-analysis was to investigate the associations between various sample and design characteristics and the compliance and retention rates of studies using ESM in mental health research.
ESM studies investigating major depressive disorder, bipolar disorder, and psychotic disorder were considered for inclusion. Besides the compliance and retention rates, a number of sample and design characteristics of the selected studies were collected to assess their potential relationships with the compliance and retention rates. Multilevel random/mixed effects models were used for the analyses.
Compliance and retention rates were lower for studies with a higher proportion of male participants (
The findings demonstrate that ESM studies can be carried out in mental health research, but the quality of the data collection depends upon a number of factors related to the design of ESM studies and the samples under study that need to be considered when designing such protocols.
The experience sampling method (ESM) [
Although ESM presents several advantages over conventional clinical assessment tools, the very nature of this method, requiring multiple self-evaluations over time in daily life, also introduces some challenges. One major challenge is to achieve high compliance and retention rates. The compliance rate can be defined as the ratio of the number of self-evaluations that participants actually completed to the theoretical maximum number of self-evaluations allowed by the protocol (0%-100% when expressed as a percentage), whereas the retention rate refers to the proportion (or percentage) of participants included in the final analyses (eg, a participant who withdraws from a study because the data collection procedure is experienced as too burdensome would be excluded). These two rates are often inherently linked in ESM research, as participants providing an insufficient number of responses are conventionally excluded from the analyses [
In the framework of ESM, compliance and retention rates are often reported to describe the quantity of data collected and to provide an indication of the quality of the data collection procedures. ESM studies are naturalistic investigations, inevitably leading to missing data. When people are engaging in certain sport, leisure, or work activities, driving in their car, or taking a nap, they will not be able to fill out the ESM questionnaire (either because they do not hear the notification of the data collection device or because responding would be inconvenient, unsafe, or inappropriate to do in a given situation). Compliance rates close to 100% are therefore unlikely. Yet, ideally, one wants to reach the highest compliance possible, as this alleviates concerns about selective reporting at moments that are most convenient for the study participants (which could lead to bias). At the same time, we also need a sufficient number of data points to investigate, for example, variability over time, and to estimate stable associations between variables measured using this method. It is, therefore, important to identify how characteristics of both the ESM design and the samples under investigation influence compliance and retention. Using this information, we might be able to identify designs that are more acceptable to a given group of study participants.
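The two rates defined above can be made concrete with a minimal sketch. The function names and the example numbers (60 scheduled prompts, 30 enrolled participants) are illustrative assumptions rather than values from any included study, and individual studies may operationalize both rates differently.

```python
def compliance_rate(completed: int, scheduled: int) -> float:
    """Proportion of scheduled self-evaluations that were actually completed."""
    if scheduled <= 0:
        raise ValueError("scheduled must be a positive count")
    if not 0 <= completed <= scheduled:
        raise ValueError("completed must lie between 0 and scheduled")
    return completed / scheduled


def retention_rate(n_final: int, n_baseline: int) -> float:
    """Proportion of baseline participants included in the final analysis."""
    if n_baseline <= 0:
        raise ValueError("n_baseline must be a positive count")
    if not 0 <= n_final <= n_baseline:
        raise ValueError("n_final must lie between 0 and n_baseline")
    return n_final / n_baseline


# Hypothetical protocol: 10 prompts per day for 6 days = 60 scheduled prompts.
example_compliance = compliance_rate(completed=45, scheduled=60)  # 0.75
# Hypothetical sample: 30 participants enrolled, 27 kept in the analysis.
example_retention = retention_rate(n_final=27, n_baseline=30)     # 0.9
```

Multiplying either value by 100 gives the percentage form used in the tables of this article.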
To our knowledge, whether design and sample characteristics influence retention has not been the focus of prior research, but several studies have examined this question with respect to compliance. Compliance tends to decrease over the duration of the ESM follow-up [
The ESM literature displays a rather heterogeneous methodological landscape. Designs vary from 2 [
The compliance rate in ESM studies may also be influenced by the individual characteristics of the study samples. Indeed, compliance appears to drop in relation to the ratio of male participants [
Therefore, both design- and participant-related factors may influence compliance. Fortunately, compliance is typically reported within the ESM literature, making this information highly accessible for a meta-analysis over a large sample of studies. To date, two studies have addressed this question through a meta-analysis. Morren et al [
This meta-analysis, therefore, aims to fill this gap by examining compliance and retention in ESM studies focusing on severe mental disorders. It investigates the effect of a large set of design- and participant-related factors with the aim of providing, if achievable, empirically based guidelines that could support researchers’ choices in designing ESM protocols.
This study was based on the PRISMA-P (Preferred Reporting Items for Systematic Review and Meta-Analysis Protocols) guidelines [
A systematic literature search was performed until February 2017 without publication time limit in PubMed and Web of Science (ie, Web of Science Core Collection). The search strategy was designed to include relevant terms for identifying studies using momentary assessment methods (eg, “experience sampling method” and “ecological momentary assessment”) and terms related to the clinical diagnosis of the participants under study (eg, “psychotic disorder”, “major depressive disorder”, and “bipolar disorder”). The search strategy used either Medical Subject Heading or keyword headings. A concept plan was built with the identified keywords and descriptors to run the search (see
Studies using ESM/EMA designs in adults with a psychotic disorder, major depressive disorder, or bipolar disorder, or at high risk for these disorders, as well as samples of the general population including individuals with or at high risk for these disorders, were included in this review to cover a broader range of the continuum of mood and psychotic disorders. Observational and randomized controlled studies were included. Case studies, case reports, protocols, descriptions of study designs, systematic reviews, and studies published in a language other than English were not considered. When available within the included studies, data from nonpsychopathological/healthy control groups were also considered to serve as a reference group. Studies with only a single daily assessment were excluded, as this form of time sampling is qualitatively distinct from the repeated momentary assessments within a day that define ESM research. To determine the eligibility of the original studies, two researchers (HV and AR) independently screened the studies in the title/abstract and full-text phases based on the inclusion and exclusion criteria. Screening results were compared to identify any discrepancies. In case of a disagreement, a third researcher (IM-G) was consulted and the discrepancy was resolved through group consensus.
When available, data were extracted for the following items: (1) general study characteristics (ie, authors, title, year, and study design); (2) sample characteristics (ie, number of participants included in the study/analysis, mean age, gender composition, clinical status, ethnicity, educational status, employment status, marital status, cohabiting status, and medication use); (3) design characteristics (ie, number of momentary assessments per day, number of assessment days, number of assessment periods as continuous or intermittent assessment, delay between assessment periods, sampling method [fixed, semirandom, or random sampling], time intervals between the assessments within a day, time intervals between the first and the last assessment within a day, time of the start and the end of the assessments within a day, number of items in the questionnaire, approximate mean duration of the questionnaire, type of scales used in the questionnaire, type of method used to perform the assessment, type of incentive, and amount of the incentive); and (4) the compliance rate (proportion of self-evaluations completed by the participants compared with the theoretical maximal number of self-evaluations allowed by the design) and the retention rate (proportion of individuals included in the final analysis out of the number of individuals included at baseline). For studies that included multiple groups (eg, a psychotic disorder group and a healthy control group), sample/design characteristics and the compliance and retention rates were coded at the group level. Studies that fulfilled the inclusion criteria were examined for overlapping samples (
According to the PRISMA guidelines, risk of bias should be assessed for each study (eg, lack of blinding, lack of randomization). However, the current review did not investigate randomized controlled trials and neither compliance nor retention rates were primary outcomes within the sample of studies included in the meta-analysis. Additionally, there is to date no standardized risk of bias assessment guideline for ambulatory studies. The evaluation of the risk of bias was therefore not performed (although we did examine the data for potential publication bias; see further below).
For compliance, there is, in principle, a proportion of completed self-evaluations per participant (eg, 0.80 for the first subject, 0.65 for the second subject, and so on), but this information is never reported. Instead, we analyzed the mean proportions (equation [a],
Equations.
For the analysis of the retention rates, the reported/calculated proportions (of individuals included in the final analysis compared with the number of individuals included at baseline) were first transformed using the (variance-stabilizing) arcsine transformation before the analysis (equation [d],
As a study may include multiple groups, we used a multilevel random/mixed effects model [
Heterogeneity was assessed using the Q-test [
After screening based on title and abstract, a total of 220 studies were considered for inclusion (
Flow chart of study inclusion protocol.
Descriptive statistics of the sample of studies (N=79).
Characteristics | Study level, n (%) | Group level, n (%)
Year of publication | |
<2000 | 4 (5) | N/Aa
2000-2004 | 4 (5) | N/A
2005-2009 | 10 (13) | N/A
2010-2014 | 41 (52) | N/A
≥2015 | 20 (25) | N/A
Sample size (n) | |
0-49 | 24 (30) | 80 (61)
50-99 | 26 (33) | 32 (24)
100-149 | 14 (18) | 9 (7)
150-199 | 6 (8) | 4 (3)
≥200 | 9 (11) | 7 (5)
Groups per study (n) | |
1 | 42 (53) | N/A
2 | 26 (33) | N/A
3 | 7 (9) | N/A
4 | 3 (4) | N/A
5 | 1 (1) | N/A
Mean age (years) | |
18-29 | 29 (37) | 39 (30)
30-39 | 23 (29) | 45 (34)
40-49 | 15 (19) | 27 (21)
≥50 | 3 (4) | 5 (4)
Unavailable | 9 (11) | 16 (12)
Female participants (%) | |
<25 | 4 (5) | 11 (8)
25-49 | 18 (23) | 28 (21)
50-74 | 34 (43) | 57 (43)
≥75 | 17 (22) | 26 (20)
Unavailable | 6 (8) | 10 (8)
Clinical status | |
Healthy controls | N/A | 33 (25)
General population | N/A | 19 (14)
High risk for a severe mental disorder | N/A | 10 (8)
Major depressive disorder | N/A | 30 (23)
Bipolar disorder | N/A | 9 (7)
Psychotic disorder | N/A | 31 (24)
Assessment days (n) | |
1-5 | 12 (15) | 20 (15)
6-10 | 54 (68) | 94 (71)
>10 | 13 (17) | 18 (14)
Evaluations per day (n) | |
2-3 | 11 (14) | 17 (13)
4-5 | 23 (29) | 33 (25)
6-7 | 6 (8) | 11 (8)
8-9 | 9 (11) | 13 (10)
10 | 27 (34) | 54 (41)
>10 | 2 (3) | 3 (2)
Unavailable | 1 (1) | 1 (1)
Sampling scheme | |
Fixed | 14 (18) | 23 (17)
Semirandom | 32 (41) | 55 (42)
Random | 31 (39) | 51 (39)
Unavailable | 2 (3) | 3 (2)
Items per questionnaire (n) | |
<20 | 36 (46) | 58 (44)
20-39 | 24 (30) | 36 (27)
40-59 | 8 (10) | 15 (11)
≥60 | 1 (1) | 4 (3)
Unavailable | 10 (13) | 19 (14)
Scale type | |
Likert scale | 46 (58) | 80 (61)
Visual analogue scale | 8 (10) | 11 (8)
Mixed | 23 (29) | 38 (29)
Unavailable | 2 (3) | 3 (2)
Assessment method | |
Paper | 27 (34) | 50 (38)
Personal digital assistant | 42 (53) | 66 (50)
Other | 11 (14) | 16 (12)
Compliance rate (%) | |
50-59 | 3 (4) | 7 (5)
60-69 | 8 (10) | 12 (9)
70-79 | 24 (30) | 39 (30)
80-89 | 21 (27) | 35 (27)
≥90 | 9 (11) | 16 (12)
Unavailable | 14 (18) | 23 (17)
Retention rate (%) | |
50-59 | 1 (1) | 2 (1)
60-69 | 4 (5) | 6 (4)
70-79 | 10 (13) | 13 (10)
80-89 | 11 (14) | 19 (14)
≥90 | 46 (58) | 76 (58)
Unavailable | 7 (9) | 16 (12)
aN/A: not applicable.
The final sample of studies comprised 8013 individuals from 132 different groups (with 1-5 groups per study). The mean age of the individuals was 31.7 years (SD 10.3, range of the mean age of the groups=18-71.9), and 62.79% (5032/8013) of the participants were female (SD 23.1, range of the percentage of females in the groups=6.7%-100%). Overall, 1282 (1282/8013, 16.00%) were individuals without a diagnosis of psychiatric illness, 3456 (3456/8013, 43.13%) were recruited from the general population, 1423 (1423/8013, 17.76%) were diagnosed with a psychotic disorder, 1326 (1326/8013, 16.55%) were diagnosed with major depressive disorder, 266 (266/8013, 3.32%) were diagnosed with bipolar disorder, and 260 (260/8013, 3.24%) were identified as being at high risk for one of the mental disorders under study.
From a design perspective, ESM studies included in the meta-analysis involved a mean of 6.9 evaluations per day (SD 3.0, range 2‑14) for 11.2 days (SD 19.0, range 1‑150) for a total mean number of 60.2 evaluations per study (SD 45.0, range 8‑300). Successive evaluations within a day were separated by an average of 131.2 min (SD 92.8, range 45‑720) and participants were required to fill in evaluations during a mean total time window of 13.5 h per day (SD 2.2, range 3-17). The sampling scheme was random in 39.2%, semirandom in 40.5%, and fixed in 17.7% of the studies. On average, 22.5 items per questionnaire were collected by the ESM studies (SD 18.6, range 2‑135). As compensation, the mean value of the incentives for the completion of the ESM studies was €63.6 (SD 69, range 0‑350).
Other variables such as ethnicity, education level, marital status, or other design parameters (eg, continuous or intermittent assessment, approximate mean duration of the questionnaire, type of incentive, and strategies taken by the researchers to maintain/increase retention and compliance) may be relevant for the association with compliance and retention, but were reported inconsistently or by too few studies to be taken into account.
Mean compliance was reported in 65 (65/79, 82%) of the studies, whereas retention rate was reported in 73 (73/79, 92%) of the studies, and 58 (58/79, 73%) of the studies reported both compliance and retention rates. All studies included in the analysis reported at least one of these main outcomes. At the group level, compliance rates were available for 109 (109/132, 82.6%), and retention rates were available for 116 (116/132, 87.9%) of the groups (see
The underlying true effects were heterogeneous, showing Q104=3398.31,
Visual inspection of the funnel plots did not reveal any marked asymmetry (
Funnel plots for compliance and retention.
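As a rough gauge of the heterogeneity reported above (Q=3398.31 on 104 degrees of freedom), the conventional I² statistic expresses the share of total variability that exceeds what sampling error alone would produce. The sketch below applies the standard formula I² = (Q − df)/Q. It is an illustration only: the article does not state that I² was computed this way, and multilevel models usually partition heterogeneity by level instead.

```python
def i_squared(q: float, df: int) -> float:
    """Conventional I-squared (in %) from a Q statistic and its degrees of freedom."""
    if df < 0:
        raise ValueError("df must be non-negative")
    if q <= 0:
        return 0.0
    # Truncate at zero: Q below its expectation (df) implies no excess variability.
    return max(0.0, (q - df) / q) * 100.0


# Q statistic and degrees of freedom as quoted in the text above.
heterogeneity_pct = i_squared(q=3398.31, df=104)
print(f"I^2 = {heterogeneity_pct:.1f}%")
```

A value this close to 100% would indicate that almost all observed variation between groups reflects real differences rather than chance, which is what motivates the moderator analyses that follow.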
The results of the meta-regression analyses of the sample characteristics are shown in
The analyses revealed significant relationships between some of the characteristics of the participants and the mean compliance and retention rates. Specifically, the proportion of women in ESM studies was found to be a significant predictor of both compliance (
Results of the meta-regression analyses of the sample characteristics.
Sample characteristics | k | Estimate | SE | P value | 95% CI | QM test (df) | R² study (%) | R² group (%)
Compliance | | | | | | | |
Mean age | 98 | | | | | —a | 34 | 0
Intercept | | 85.65 | 3.44 | | 78.91-92.39 | | |
Beta | | −0.18 | 0.1 | .08 | −0.38 to 0.02 | | |
Female participants (%) | 99 | | | | | — | 0 | 44
Intercept | | 68.41 | 2.51 | | 63.49-73.33 | | |
Beta | | 0.18 | 0.04 | <.001 | 0.11-0.25 | | |
Clinical status | 105 | | | | | 41.48 (5) | 0 | 54
Intercept (HCb) | | 82.61 | 1.53 | | 79.61-85.6 | | |
Beta (GPc) | | −1.55 | 2.6 | .55 | −6.64 to 3.54 | | |
Beta (HRd) | | −1.67 | 2.36 | .48 | −6.30 to 2.96 | | |
Beta (MDDe) | | −0.77 | 1.8 | .67 | −4.31 to 2.76 | | |
Beta (BDf) | | 0.57 | 2.44 | .82 | −4.21 to 5.36 | | |
Beta (PDg) | | −10.77 | 1.75 | <.001 | −14.2 to −7.34 | | |
Retention | | | | | | | |
Mean age | 102 | | | | | — | 0 | 42
Intercept | | 1.382 | 0.067 | | 1.250-1.514 | | |
Beta | | −0.00 | 0.002 | .35 | −0.006 to 0.002 | | |
Female participants (%) | 107 | | | | | — | 12 | 0
Intercept | | 1.183 | 0.055 | | 1.075-1.290 | | |
Beta | | 0.002 | 0.001 | <.01 | 0.001-0.004 | | |
Clinical status | 112 | | | | | 26.27 (5) | 0 | 41
Intercept (HC) | | 1.405 | 0.031 | — | 1.344-1.466 | | |
Beta (GP) | | −0.081 | 0.047 | .09 | −0.173 to 0.011 | | |
Beta (HR) | | −0.123 | 0.064 | .06 | −0.249 to 0.004 | | |
Beta (MDD) | | −0.035 | 0.041 | .39 | −0.114 to 0.045 | | |
Beta (BD) | | −0.098 | 0.064 | .13 | −0.224 to 0.028 | | |
Beta (PD) | | −0.192 | 0.039 | <.001 | −0.268 to −0.116 | | |
aNot applicable.
bHC: healthy control.
cGP: general population.
dHR: high risk for a severe mental disorder.
eMDD: major depressive disorder.
fBD: bipolar disorder.
gPD: psychotic disorder.
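The retention estimates in the table above are on the arcsine scale, so they are easier to interpret once converted back to proportions. The sketch below assumes the standard variance-stabilizing form arcsin(√p); the article's own equation is not reproduced here, so this form is an assumption, although it is consistent with intercepts near 1.3 corresponding to retention rates above 90%. The function names are illustrative.

```python
import math


def arcsine_transform(p: float) -> float:
    """Arcsine square-root transform of a proportion p in [0, 1] (assumed form)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    return math.asin(math.sqrt(p))


def back_transform(x: float) -> float:
    """Inverse of arcsine_transform; input is clamped to the valid range [0, pi/2]."""
    x = min(max(x, 0.0), math.pi / 2)
    return math.sin(x) ** 2


# Retention model by clinical status, coefficients taken from the table above.
intercept_hc = 1.405  # healthy controls (reference group)
beta_pd = -0.192      # psychotic disorder, relative to healthy controls

retention_hc = back_transform(intercept_hc)
retention_pd = back_transform(intercept_hc + beta_pd)
print(f"healthy controls: {retention_hc:.1%}, psychotic disorder: {retention_pd:.1%}")
```

Because the transformation is nonlinear, coefficients must be added on the arcsine scale first and back-transformed afterwards; back-transforming a beta on its own does not give a difference in retention rates.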
The results of the meta-regression analyses of the design characteristics are shown in
Second, the duration of the time interval between successive evaluations within a day was also found to be a significant predictor of compliance (
Results of the meta-regression analyses of the design characteristics.
Design characteristics | k | Estimate | SE | P value | 95% CI | QM test (df) | R² study (%) | R² group (%)
Compliance | | | | | | | |
Evaluations | 104 | | | | | —a | 19 | 0
Intercept | | 86.23 | 2.75 | | 80.84-91.61 | | |
Beta | | −0.99 | 0.38 | <.01 | −1.73 to −0.25 | | |
Days | 103 | | | | | — | 1 | 0
Intercept | | 78.69 | 1.86 | | 75.04-82.34 | | |
Beta | | 0.14 | 0.18 | .43 | −0.21 to 0.49 | | |
Daily time window (h) | 76 | | | | | — | 36 | 0
Intercept | | 74.8 | 10.41 | | 54.39-95.21 | | |
Beta | | 0.28 | 0.76 | .71 | −1.21 to 1.78 | | |
Interval between evaluations | 71 | | | | | — | 51 | 0
Intercept | | 71.43 | 3.4 | | 64.76-78.10 | | |
Beta | | 0.06 | 0.02 | .02 | 0.01-0.11 | | |
Items | 83 | | | | | — | 0 | 0
Intercept | | 81.96 | 2.39 | | 77.27-86.65 | | |
Beta | | −0.15 | 0.1 | .14 | −0.34 to 0.05 | | |
Sampling scheme | 103 | | | | | 6.78 | 22 | 0
Intercept (semirandom) | | 78.5 | 1.64 | | 75.27-81.72 | | |
Beta (random) | | −0.63 | 2.29 | .78 | −5.13 to 3.86 | | |
Beta (fixed) | | 6.7 | 2.95 | .02 | 0.90-12.50 | | |
Assessment method | 105 | | | | | 14.98 | 27 | 0
Intercept (PDAb) | | 81.14 | 1.38 | | 78.45-83.84 | | |
Beta (paper-pencil) | | −2.90 | 2.24 | .20 | −7.29 to 1.49 | | |
Beta (calls) | | 6.89 | 4.75 | .15 | −2.43 to 16.20 | | |
Beta (SMS) | | −0.91 | 6.06 | .88 | −12.79 to 10.97 | | |
Beta (voicemail) | | −12.64 | 8.19 | .12 | −28.69 to 3.41 | | |
Beta (Web-based) | | −13.99 | 6.49 | .03 | −26.70 to −1.27 | | |
Beta (mixed) | | −16.5 | 7.79 | .03 | −31.77 to −1.23 | | |
Scale type | 102 | | | | | 0.28c | 7 | 0
Intercept (LSd) | | 79.03 | 1.45 | | 76.19-81.87 | | |
Beta (VASe) | | −0.84 | 3.45 | .81 | −7.60 to 5.93 | | |
Beta (mixed) | | 0.98 | 2.48 | .69 | −3.87 to 5.83 | | |
Incentive | 43 | | | | | — | 23 | 0
Intercept | | 75.36 | 2.23 | | 70.99-79.73 | | |
Beta | | 0.04 | 0.02 | .02 | 0.01-0.09 | | |
Retention | | | | | | | |
Evaluations | 111 | | | | | | 0 | 1
Intercept | | 1.275 | 0.053 | | 1.171-1.379 | | |
Beta | | 0.007 | 0.007 | .34 | −0.007 to 0.020 | | |
Days | 109 | | | | | — | 0 | 0
Intercept | | 1.329 | 0.036 | | 1.259-1.399 | | |
Beta | | −0.000 | 0.004 | .96 | −0.007 to 0.007 | | |
Daily time window (h) | 87 | | | | | — | |
Intercept | | 1.358 | 0.186 | | 0.994-1.722 | | |
Beta | | −0.001 | 0.014 | .92 | −0.028 to 0.025 | | |
Interval between evaluations | 86 | | | | | — | 2 | 17
Intercept | | 1.36 | 0.06 | | 1.243-1.478 | | |
Beta | | −0.000 | 0 | .71 | −0.001 to 0.001 | | |
Items | 92 | | | | | — | 0 | 2
Intercept | | 1.274 | 0.044 | | 1.188-1.360 | | |
Beta | | 0.002 | 0.002 | .35 | −0.00 to 0.01 | | |
Sampling scheme | 111 | | | | | 0.17c | 0 | 0
Intercept (semirandom) | | 1.322 | 0.031 | | 1.263-1.382 | | |
Beta (random) | | −0.007 | 0.045 | .88 | −0.095 to 0.082 | | |
Beta (fixed) | | 0.018 | 0.058 | .76 | −0.095 to 0.131 | | |
Assessment method | 112 | | | | | 7.22c | 7 | 0
Intercept (PDA) | | 1.342 | 0.026 | | 1.291-1.393 | | |
Beta (paper-pencil) | | −0.039 | 0.043 | .36 | −0.124 to 0.046 | | |
Beta (calls) | | −0.123 | 0.114 | .28 | −0.346 to 0.101 | | |
Beta (SMS) | | 0.082 | 0.121 | .50 | −0.155 to 0.318 | | |
Beta (Web-based) | | −0.153 | 0.089 | .09 | −0.328 to 0.022 | | |
Beta (mixed) | | 0.229 | 0.164 | .16 | −0.093 to 0.550 | | |
Scale type | 111 | | | | | 1.55c | 0 | 0
Intercept (LS) | | 1.3 | 0.026 | | 1.248-1.352 | | |
Beta (VAS) | | 0.062 | 0.07 | .37 | −0.074 to 0.198 | | |
Beta (mixed) | | 0.047 | 0.045 | .30 | −0.042 to 0.135 | | |
Incentive | 52 | | | | | — | 0 | 19
Intercept | | 1.272 | 0.041 | | 1.193-1.352 | | |
Beta | | 0 | 0 | .62 | −0.001 to 0.001 | | |
aData not applicable.
bPDA: personal digital assistant.
cNot significant.
dLS: Likert scale.
eVAS: visual analogue scale.
Graphical representation of the relationship between the compliance of experience sampling method studies and the frequency of daily self-evaluations.
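The relationship shown in this figure can be explored numerically. The sketch below assumes that the first compliance model in the design-characteristics table (intercept 86.23, slope −0.99) describes compliance as a linear function of the number of evaluations per day, which is how the Discussion interprets it. Because it uses the rounded table coefficients, its results may differ slightly from the figures quoted in the Discussion, and extrapolating beyond the observed range of 2 to 14 evaluations per day is not advised.

```python
INTERCEPT = 86.23  # predicted compliance (%) at 0 evaluations/day, from the table
SLOPE = -0.99      # change in compliance (percentage points) per extra evaluation/day


def predicted_compliance(evals_per_day: float) -> float:
    """Predicted mean compliance (%) under the assumed linear model."""
    if evals_per_day <= 0:
        raise ValueError("evals_per_day must be positive")
    return INTERCEPT + SLOPE * evals_per_day


def expected_completed(evals_per_day: float) -> float:
    """Expected number of completed evaluations per day."""
    return evals_per_day * predicted_compliance(evals_per_day) / 100.0


# Compare sampling frequencies within the range observed in the included studies.
for n in (4, 7, 8, 10):
    print(n, round(predicted_compliance(n), 2), round(expected_completed(n), 2))
```

Within the observed range, the expected number of completed evaluations keeps rising with the sampling frequency even though the compliance percentage falls, which is the trade-off the Discussion describes.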
The aim of the present meta-analysis was to investigate compliance and retention rates in ESM studies including subjects across the spectrum of severe mental disorders and to examine how these outcomes are related to various person characteristics and design aspects. First, we found relatively high mean levels of compliance (ie, 78.7%) and retention (ie, 93.1%) across the included ESM studies. This is in line with previous findings in individuals with chronic pain [
Both the gender composition and the clinical status of the groups were found to predict the degree of compliance and retention in ESM studies. First, the proportion of male participants within a sample was negatively associated with compliance, supporting similar findings in adolescents [
In sum, ESM studies in individuals with a psychotic disorder or in samples with a higher proportion of male participants are at risk for lower compliance and retention rates. To increase compliance and retention, researchers could engage in procedures that aim to maintain the compliance of the participants as described in the review of Morren et al [
We also found a number of design characteristics that were associated with the compliance and retention rates. First, the number of evaluations per day was associated with compliance levels in the ESM studies. On average, for each additional evaluation per day, mean compliance is predicted to fall by approximately 1 percentage point. However, a lower compliance rate with a higher number of evaluations may still result in more data points. For example, according to our results, an ESM study involving 8 evaluations per day would result in an estimated mean of 6.18 completed evaluations/day, whereas a sampling frequency of 7 evaluations per day would result in only 5.48 evaluations/day. This result does not corroborate the findings of previous single studies investigating samples with different characteristics [
Second, the current meta-analysis found no significant association between the number of data collection days and the compliance and retention rates. This result corroborates the absence of an effect of study duration on compliance observed in substance users [
Third, our analyses revealed an association between the ESM sampling strategy and the compliance and retention rate, with fixed sampling schemes resulting on average in higher compliance and retention rates. Although this seems to favor fixed over random sampling schemes to improve the quantity of the data, the choice is not so simple. For instance, Husky et al [
Fourth, we found a positive association between the value of the incentives and the compliance rates in ESM studies, similar to what was reported by Morren et al [
Finally, no significant differences in compliance or retention rates were found between studies using a PDA compared with paper-and-pencil diaries. A similar result was recently reported in a meta-analysis of ESM studies in substance users [
In fact, this point underscores a more general lack of clarity in the description of the methods used in ESM research, an issue previously underlined by Morren et al [
Overall, this systematic review and meta-analysis demonstrate that both the characteristics of the samples under study and the design of ESM studies may influence compliance and retention rates in ESM research. On the basis of these findings, we propose the following recommendations:
There is evidence that compliance and retention rates depend on the characteristics of the individuals under investigation. Samples of individuals with psychosis and samples with a higher proportion of male participants appear to be at higher risk of lower compliance and retention. The potentially higher loss of data should be taken into account in the sample size calculation preceding any ESM study investigating individuals with these characteristics.
The evidence also suggests that the degree of compliance depends on various design choices in ESM studies.
A higher number of evaluations per day and smaller time intervals between successive evaluations are associated with lower compliance, whereas this is not the case for the number of days in an ESM study. Therefore, it may be worthwhile to decrease the number of evaluations per day while increasing the number of days, thereby obtaining a similar number of data points while maximizing compliance.
The total amount of the incentive was associated with better compliance. Therefore, increasing the amount of the incentive may have a beneficial effect on the compliance of the participants with an ESM study.
The relative lack of transparency in reporting ESM protocols is likely to undermine the replicability of ESM studies and the assessment of their feasibility in severe mental disorders.
We recommend disclosing clearly all aspects of the procedures used in a given ESM study, regardless of their relevance for a given study, including but not limited to the actual number of ESM items participants answered, the amount of time between a signal and the answer of a participant that experimenters used to define compliance with a momentary evaluation, and any exclusion reasons, especially if experimenters exclude participants based on a predefined minimal mean compliance level.
We advise reporting both the mean compliance level and its SD, as well as the retention rate. When possible, this information should be provided at the group level.
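The first recommendation, accounting for expected data loss in the sample size calculation, can be sketched as a simple inflation of the enrolment target. The function name and the example values (50 analyzable participants, 88% expected retention) are hypothetical and chosen for illustration only; a real calculation would also need to consider compliance within retained participants.

```python
import math


def enrolment_target(n_required: int, expected_retention: float) -> int:
    """Number of participants to enrol so that n_required remain after attrition."""
    if n_required <= 0:
        raise ValueError("n_required must be positive")
    if not 0.0 < expected_retention <= 1.0:
        raise ValueError("expected_retention must lie in (0, 1]")
    # Round up: enrolling a fraction of a participant is not possible.
    return math.ceil(n_required / expected_retention)


# Hypothetical study: 50 analyzable participants needed, 88% retention expected.
print(enrolment_target(n_required=50, expected_retention=0.88))  # 57
```

Lower expected retention, as observed here for samples with a psychotic disorder, translates directly into a larger enrolment target for the same analyzable sample.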
This is the first review to systematically investigate predictors of compliance and retention rates in ESM research in severe mental disorders. However, despite its strengths, this review is not without limitations. First, the inconsistent reporting of essential information on the design of the ESM studies is likely to have introduced statistical errors in the estimation of the associations.
Second, compliance and retention rates are differently operationalized across studies in the literature. For compliance, evaluations are considered unanswered if the participants responded after 15 min following the trigger in some studies [
Finally, it would have been of interest to examine to what degree potential participants are willing to participate in a study using ESM as a data collection method in the first place (and whether this is associated with certain participant or design characteristics). A brief search of the literature revealed considerable heterogeneity in reported
This meta-analysis constitutes a first step toward the optimization of ESM research. Compliance and retention were associated with the gender and clinical status of the participants. Compliance, but not retention, was also associated with a number of design characteristics. In particular, compliance was lower with higher sampling frequencies but not with the duration of ESM studies, a finding that stands in contrast with current practices in ESM research. This review also demonstrates that ESM studies can be carried out in mental health research, but the quality of the data collection may depend upon a number of factors related to the design of the studies and samples under investigation that need to be considered when designing such protocols.
Supplementary material.
BD: bipolar disorder
EMA: ecological momentary assessment
ESM: experience sampling method
GP: general population
HC: healthy control
HR: high risk for a severe mental disorder
LS: Likert scale
MDD: major depressive disorder
PD: psychotic disorder
PDA: personal digital assistant
PRISMA-P: Preferred Reporting Items for Systematic Review and Meta-Analysis Protocols
VAS: visual analogue scale
None declared.