This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included.
Mobile device-based ecological momentary assessment (mobile-EMA) is increasingly used to collect participants' data in real time and in context. Although EMA offers methodological advantages, these advantages can be diminished by participant noncompliance. However, evidence on how well participants comply with mobile-EMA protocols and how study design factors are associated with participant compliance is limited, especially in the youth literature.
To systematically and meta-analytically examine youths’ compliance with mobile-EMA protocols and moderators of participant compliance in clinical and nonclinical settings.
Studies using mobile devices to collect EMA data among youth (age ≤18 years old) were identified. A systematic review was conducted to describe the characteristics of mobile-EMA protocols and author-reported factors associated with compliance. Random effects meta-analyses were conducted to estimate the overall compliance across studies and to explore factors associated with differences in youths’ compliance.
This review included 42 unique studies that assessed behaviors, subjective experiences, and contextual information. Mobile phones were used as the primary mode of EMA data collection in 48% (20/42) of the reviewed studies. In total, 12% (5/42) of the studies used wearable devices in addition to the EMA data collection platforms. More than half of the studies (62%, 26/42) recruited youth from nonclinical settings. Most (98%, 41/42) studies used a time-based sampling protocol. Among these studies, most (95%, 39/41) prompted youth 2-9 times daily, for a study length ranging from 2 to 42 days. Sampling frequency and study length did not differ between studies with participants from clinical versus nonclinical settings. Most (88%, 36/41) studies with a time-based sampling protocol defined compliance as the proportion of prompts to which participants responded. In these studies, the weighted average compliance rate was 78.3%. The average compliance rates were not different between studies with clinical (76.9%) and nonclinical (79.2%;
The compliance rate among mobile-EMA studies in youth is moderate but suboptimal. Study design may affect protocol compliance differently between clinical and nonclinical participants; including additional wearable devices did not affect participant compliance. More consistent reporting of compliance-related results can facilitate understanding and improvement of participant compliance with EMA data collection among youth.
There is a growing interest in studying the dynamic relationship among individuals’ experiences, social or physical environments, and behaviors. The assessment of these dynamic relationships is enhanced by the development of momentary data collection strategies, such as experience sampling methods (ESM) and ecological momentary assessment (EMA) [
EMA studies can be broadly categorized into (1) time-based and (2) event-based designs. These strategies provide different insights about the study participants. The time-based strategy usually aims to acquire representative characteristics and patterns of behaviors and experiences across time, whereas a study using an event-based strategy aims to examine antecedents and consequences of specific behaviors or experiences [
Although collecting momentary data using mobile technologies offers many advantages, these advantages depend on the quality of the collected data. Incorporating mobile technologies in EMA studies can facilitate momentary data collection with improved measurement of compliance, and possibly at a higher frequency than more conventional collection techniques allow. Although this provides an opportunity to understand behavior on a more granular level, systematic missing data (eg, from participant noncompliance or engagement in competing activities) still threaten data quality. As stated above, several features of mobile technologies can minimize the impact of some types of noncompliance behavior (eg, backfilling) on data quality. Nonetheless, as EMA study protocols usually involve participants being repeatedly interrupted and asked to provide self-reported information, these demands on study participants can lead to high perceived participant burden, and to noncompliance [
The first aim of this review is to describe the characteristics of EMA protocols conducted among pediatric populations across a wide spectrum of health behaviors. The second aim is to quantify overall compliance rates and to examine the association between study design factors (length of monitoring period and daily sampling frequency) and reported compliance using a meta-analytic approach. Studies using clinical and nonclinical samples were both included in the review; however, given that study populations and objectives often differ quite substantially for these types of studies, they were examined separately and the results were compared. The exploratory aim of this study is to examine the association between participant compliance and other pertinent study design variables (eg, inclusion of additional wearable devices and incentive structure) on a post hoc basis. Finally, this study will also provide recommendations for future research that incorporates mobile devices in collecting real-time self-reported data to maximize the advantages of EMA methodologies.
A comprehensive literature search was conducted using the publicly accessible academic literature search engines (PubMed, PsycINFO,
Abstracts and full articles of the retrieved titles were screened for relevance. The article selection strategies, inclusion criteria, and exclusion criteria were determined in consensus meetings among authors. To be included in the systematic review, studies were required to (1) be an empirical study; (2) employ EMA strategies, including diary methods with more than one entry per day, ESM, and event sampling methods; (3) utilize mobile technologies for EMA data collection (cell phones, PDAs, smartphones, and so on); and (4) include child or adolescent (age ≤18 years) participants. Studies that involved adult participants (age >18 years), in addition to children and adolescents, were only included in the review if separate analytic or descriptive results were presented for the child or adolescent subgroups. Studies were excluded if they met any of the following 5 exclusion criteria: (1) did not utilize any electronic, wearable, or mobile technology; (2) utilized paper-based diaries to collect momentary data; (3) collected momentary or diary data once or less than once per day during the monitoring period; (4) utilized call-based (phone interview) data collection; or (5) did not collect data in free-living natural settings.
A subset of studies meeting the criteria for the systematic review was included in the meta-analysis portion of the study. To meet the criteria for meta-analysis, studies were required to report (1) sufficient information that permitted the calculation of an average compliance rate (eg, percentage of EMA prompts answered) to be used as effect size (ES), (2) number of participants in the study, and (3) daily prompting frequency and length of monitoring period.
Random effects meta-analyses were conducted to (1) examine the average rate of compliance with EMA protocols pooled across all included studies and then across studies with clinical participants and nonclinical participants separately, and (2) to explore potential between
In this review, an adequate calculation of the standard errors (and hence the inverse variance weights used in meta-analysis) is complicated by the fact that EMA studies involve a nested study design, with multiple observations (prompts) clustered within participants. In this case, the effective sample size of each study needs to account for the clustered design. Following the methods recommended in the Cochrane Handbook for Systematic Reviews of Interventions [
Calculation of effect size.
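The clustering adjustment described above can be sketched with the design-effect approach commonly used for clustered data, in which the total number of prompts is divided by 1 + (m − 1) × ICC, where m is the average number of prompts per participant. The sketch below is illustrative only: the function names, the ICC value of 0.2, the example study, and the choice of a logit-transformed proportion as the effect size are all assumptions, not the formula used in this review.

```python
import math

def effective_sample_size(n_participants, prompts_per_participant, icc):
    """Total prompts divided by the design effect 1 + (m - 1) * ICC."""
    total_prompts = n_participants * prompts_per_participant
    design_effect = 1 + (prompts_per_participant - 1) * icc
    return total_prompts / design_effect

def logit_effect_size(compliance, n_eff):
    """Logit of a proportion and its approximate sampling variance."""
    es = math.log(compliance / (1 - compliance))
    var = 1 / (n_eff * compliance) + 1 / (n_eff * (1 - compliance))
    return es, var

# Hypothetical study: 50 youths, 28 prompts each, 78% of prompts answered.
# The ICC of 0.2 is an assumed value chosen only for illustration.
n_eff = effective_sample_size(50, 28, 0.2)
es, var = logit_effect_size(0.78, n_eff)
weight = 1 / var  # inverse variance weight used when pooling studies
```

With an ICC of 0 the effective sample size equals the total number of prompts, and with an ICC of 1 it collapses to the number of participants, which is why the assumed ICC strongly influences each study's weight.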
The second goal of this meta-analysis was to examine the association between EMA study design characteristics (average daily prompting frequency and length of monitoring period) and participant compliance rates. Both average daily prompting frequency and length of monitoring were coded based on information described in the reviewed publications. For studies that employed different frequencies for weekends and weekdays, an average daily prompting frequency was calculated by dividing the total number of times participants were prompted by the number of study days.
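As a small worked example of the averaging rule for mixed weekday and weekend schedules, the sketch below divides the total number of prompts by the number of study days. The function name and the 7-day protocol are hypothetical.

```python
def average_daily_prompts(weekday_prompts, weekend_prompts, weekdays, weekend_days):
    """Total prompts over the protocol divided by the number of study days."""
    total_prompts = weekday_prompts * weekdays + weekend_prompts * weekend_days
    total_days = weekdays + weekend_days
    return total_prompts / total_days

# Hypothetical 7-day protocol: 3 prompts on each of 5 weekdays,
# 7 prompts on each of 2 weekend days, so (15 + 14) / 7 prompts per day.
avg = average_daily_prompts(3, 7, 5, 2)
```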
The associations between study design variables (ie, length of EMA protocols and sampling frequency) and reported compliance were examined using random effects analysis of variance (ANOVAs) with inverse variance weights. Models that examine the association of compliance with (1) study length and (2) daily prompting frequency were estimated separately for studies with participants from nonclinical or clinical populations. The length of study protocol was operationalized as study length ≤1 week, >1 week and ≤2 weeks, and >2 weeks, and the prompting frequency was operationalized as prompting frequency of 2-3 times per day, 4-5 times per day, and ≥6 times per day to ensure that each category included a sufficient number of studies for the purposes of comparison. We considered testing the interaction term of study length and prompting frequency, but did not conduct this analysis because there would have been no or very few studies in several categories comprising the interaction effect. All meta-analysis procedures were conducted using Comprehensive Meta-Analysis (version 3, Englewood NJ, USA).
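The categorization and pooling steps can be sketched as follows. This is a minimal illustration, assuming logit-scale effect sizes and the DerSimonian-Laird method-of-moments estimator for the between-study variance; it is not a reproduction of the software procedure used in this review, and it omits the between-category test. The function names, the cut points applied to non-integer average frequencies, and the three example studies are hypothetical.

```python
import math

def length_category(days):
    """Bin the monitoring period into the three study-length categories."""
    if days <= 7:
        return "<=1 week"
    if days <= 14:
        return ">1 and <=2 weeks"
    return ">2 weeks"

def frequency_category(prompts_per_day):
    """Bin average daily prompts; cut points for non-integer averages are assumed."""
    if prompts_per_day < 4:
        return "2-3 per day"
    if prompts_per_day < 6:
        return "4-5 per day"
    return "6+ per day"

def pool_random_effects(effects, variances):
    """DerSimonian-Laird random effects pooled estimate, SE, and tau squared."""
    w = [1 / v for v in variances]
    fixed = sum(wi * y for wi, y in zip(w, effects)) / sum(w)
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, effects))
    df = len(effects) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_star = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * y for wi, y in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return pooled, se, tau2

def inv_logit(x):
    """Back-transform a logit to a proportion."""
    return 1 / (1 + math.exp(-x))

# Hypothetical compliance rates and logit-scale variances for three studies
logits = [math.log(p / (1 - p)) for p in (0.70, 0.80, 0.85)]
variances = [0.02, 0.03, 0.025]
pooled, se, tau2 = pool_random_effects(logits, variances)
pooled_rate = inv_logit(pooled)
ci = (inv_logit(pooled - 1.96 * se), inv_logit(pooled + 1.96 * se))
```

Running the pooling function separately on the studies in each category gives category-level compliance estimates with confidence intervals on the proportion scale.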
A total of 6826 nonduplicate titles were identified. Of these, 6803 were identified using search engines and 23 were identified through cited work from the articles screened. After reviewing abstracts and full-text articles for inclusion and exclusion criteria, 91 empirical articles representing 42 unique studies were included in the qualitative systematic review and 36 studies were included in the meta-analysis portion of the study. The detailed study selection process is outlined in
Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) diagram.
Among studies included in the systematic review, the average number of participants in the analytic sample across studies was 98.81 (SD 130.66; range 5-562). Across all studies, the average proportion of female participants was 56.4%, and 3 studies (7.1%) recruited only female participants. Excluding these 3 studies, the average proportion of female participants was 52.7% (SD 18.7%; range 7.6-86.7%). A majority of the included studies (n=26, 61.9%) recruited only participants from community or nonclinical settings. The 16 studies with clinical populations focused on youths with various health conditions: attention deficit/hyperactivity disorder [
The length of EMA protocols ranged from 2 to 42 days (mean 13.27 [SD 9.08]). The average length of monitoring was not statistically different between studies with nonclinical participants (mean 11.4 days [SD 6.9; range 4-30 days]) and clinical participants (mean 16.3 days [SD 11.2; range 2-42 days];
A total of 33 (78.6%) studies utilized only time-based sampling protocols, 1 study (2.4%) used only an event-based sampling protocol, and 8 studies (19.0%) used a combination of both time- and event-based sampling protocols.
Among the 41 studies with a time-based sampling component, prompting schedules included random prompts during predetermined time intervals (n=31, 75.6%; eg, one random prompt for each 2 h interval), prompts at a fixed schedule (n=8, 19.5%; eg, every 30 min during waking hours), and prompts at a personalized time (n=1, 2.4%; eg, participant’s own blood glucose check schedule); one study did not report the prompting scheme. The prompting schemes used are shown in
Among the 9 studies with an event-based protocol, participants in 8 studies were asked to initiate self-report after occurrence of certain thoughts or emotions such as positive feelings [
A majority of the included studies (95%, 39/41) prompted their participants 2-9 times during each day of EMA data collection. Two studies (4.9%) prompted participants more than 25 times each day, and 10 studies (24.4%) reported prompting participants at different frequencies on weekdays versus weekend days. Excluding the two studies that prompted participants more than 25 times each day, the average prompting frequency was 4.2 times per day (range 2-9) for studies with nonclinical participants and 3.6 times per day (range 2-7) for studies with clinical samples.
EMA data was collected using electronic diaries (n=1, 2.3%), wearable platforms (n=1, 2.3%), iPods (n=2, 4.6%), PDAs (n=10, 23.8%), palmtop computers (n=12, 28.4%), and mobile phones (n=16, 48.1%). A small proportion of studies (n=5, 11.9%) reportedly used participants’ own phone or mobile phones to implement EMA data collection. Four of these studies sent text messages to participants’ own phones or mobile phones for EMA data collection and one allowed participants to choose between using a mobile phone provided by the study and their own smart devices. A small proportion of studies (n=5, 11.9%) used wearable devices in addition to the EMA data collection platform. Devices utilized in addition to the EMA data collection platform included accelerometers [
A majority of the reviewed studies (n=28, 66.67%) reported the strategy used for incentivizing study participants. Among these studies, most of them (n=26, 92.86%) provided monetary incentive to their participants. Two studies reported using other nonmonetary incentive strategies, for example, raffle [
Among studies with a time-based sampling protocol component (N=41), the majority (n=36, 87.8%) defined participant compliance as percentage of prompts to which participants responded. Two studies included response latency, or the time difference between a prompt and participant’s response to that particular prompt, as part of the definition of compliance (eg, percentage of prompts responded within 30 min of the first notification [
Among studies with an event-based sampling protocol component (n=9), the majority (n=8, 88.9%) asked participants to initiate self-report. These events to initiate self-reports included occurrence of behaviors [
Among studies with clinical participants (n=17), 8 examined correlates of compliance and reported no significant association between prompt completion rate and day of the week [
Reported significant correlates of compliance among studies with nonclinical participants included gender [
A total of 36 studies with a time-based EMA protocol were included in the meta-analysis portion of the study. After accounting for the cluster effect of momentary assessments within participants, the average compliance rate across the included studies was 78.26% (95% CI 75.49-80.78%), and the average compliance rate was not associated with the average age or gender proportion of the study participants. The average compliance rates were not statistically different between (1) studies with EMA data collected using one mobile platform (77.44%, 95% CI [73.59-80.88%]) compared with studies using a mobile platform with additional wearable devices (n=5, 73.00%, 95% CI [61.75-81.91%];
Daily prompting frequency significantly moderated the compliance rates among clinical (
Prompting frequency by intensity category.
Setting | Prompting frequency (per day) | n | Compliance, % (95% CI) | | Qresidual (df) | I² (%)
Clinical | 2-3 times | 11 | 73.47 (67.45-78.73)d | 0.74 | 14.97 (df=12) | 19.82
 | 4-5 times | 4 | 66.94 (53.50-78.09)e | | |
 | 6+ times | 2 | 89.28 (78.83-94.90)c | | |
Nonclinical | 2-3 times | 6 | 91.73 (85.48-95.44)g | 0.44 | 38.27 (df=18) | 52.96
 | 4-5 times | 13 | 74.42 (59.37-85.29)e | | |
 | 6+ times | 5 | 75.00 (59.21-86.12)f | | |
Length of monitoring by week.
Setting | Length of EMAa monitoring (number of weeks) | n | Compliance, % (95% CI) | | Qresidualb (df) | I² (%)
Clinical | 1 | 6 | 78.13 (64.37-87.61) | 0 | 26.67 (df=12) | 55.01
 | 2 | 5 | 73.46 (53.74-86.84) | | |
 | 3+ | 6 | 75.47 (56.86-87.78) | | |
Nonclinical | 1 | 14 | 75.81 (70.39-80.52) | 0.11 | 51.43 (df=18) | 65
 | 2 | 5 | 76.77 (61.30-87.33) | | |
 | 3+ | 5 | 83.95 (74.69-90.71) | | |
aEMA: ecological momentary assessment.
bQresidual: test for residual between-study variance (not explained by the moderator) against zero.
There were no significant differences in reported compliance between studies that engaged participants in an EMA protocol for 2 and 3 or more weeks compared with studies that engaged participants for 1 week or less, among both studies with participants from clinical (
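The heterogeneity figures in the two moderator tables above can be cross-checked with the standard relation I² = (Q − df) / Q × 100. The sketch below assumes that the last column of each table is I² (in percent) and that the preceding column is the residual Q statistic with its degrees of freedom; that reading of the columns is an assumption, and the function name is hypothetical.

```python
def i_squared(q, df):
    """Higgins I-squared as a percentage, floored at zero."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100

# (Q residual, df, value in the last table column), copied from the tables
rows = {
    "prompting frequency, clinical": (14.97, 12, 19.82),
    "prompting frequency, nonclinical": (38.27, 18, 52.96),
    "monitoring length, clinical": (26.67, 12, 55.01),
    "monitoring length, nonclinical": (51.43, 18, 65.0),
}
for label, (q, df, reported) in rows.items():
    print(f"{label}: computed {i_squared(q, df):.2f}, reported {reported}")
```

Under that assumption, the computed values agree with the tabulated ones to within a few hundredths of a percentage point, a difference that is consistent with rounding of Q.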
The aim of this study was to provide an up-to-date review of evidence on youths’ compliance with real-time EMA protocols operated on mobile platforms. Interest in using EMA with mobile technology in youth is growing rapidly, as documented by the sizable number of mobile-EMA studies conducted to capture various aspects of youths’ lives. In the reviewed studies, we estimated an average compliance rate of 78.3% across studies using time-based prompting protocols. Although this rate is comparable with the rate of 71% (range 44-96%) observed by Liao et al [
The study designs varied considerably both in terms of the overall length of EMA monitoring and in terms of the frequency with which youths were prompted to complete momentary assessments per day. This allowed us to examine whether these specific EMA study design factors moderate compliance rates. Our meta-analytic findings provided evidence that compliance rates differ significantly among studies with different daily assessment frequencies. Importantly, although the effect was significant for both nonclinical and clinical samples, it was in opposite directions for clinical and nonclinical participants. Among the 17 studies with participants from
We can only speculate on the potential reasons for this result. One possibility is that studies in nonclinical and clinical settings differ in the content of the questions and how meaningful they are to respondents. Clinical studies commonly tap into medical and disease-related aspects of daily life that may be intrinsically relevant to the young patients. On the other hand, the content of EMA prompts in nonclinical studies may appear less intrinsically relevant to respondents, which may decrease compliance when the assessments are more frequent.
On the other hand, the meta-analytic results indicate that the overall compliance rates were similarly moderate among studies with different lengths (number of weeks) of EMA monitoring in either setting. However, as several reviewed studies with clinical [
Several studies with a time-based sampling protocol in this review incorporated wearable or deployable devices such as accelerometers [
Among the event-based protocols reviewed, only 1 study (12.5%) used a protocol that emits event-based prompts based on objectively measured behavior of interest using wearable devices [
Another finding from this review is that there are areas where compliance-related results and procedures were inadequately or inconsistently reported. First, among the time-based protocols, participant compliance was considered to be synonymous with average prompt completion rate (ie, mean percentage of prompts answered). The distribution of compliance rates is often negatively skewed (with the mass of the distribution concentrated at the higher end). If this is the case, the arithmetic mean is pulled toward the low-compliance tail and provides a conservative representation of overall compliance in the sample, so robust measures of central tendency (median or geometric mean) should be reported. In addition, this relatively vague definition of compliance does not account for important information, such as response latency, that could allow for assessment of response patterns or approximation of item cognitive load. Response latency, or the time difference between a prompt and its corresponding response, is especially relevant when assessing experiences that are time-varying and context-dependent (eg, pain, emotion, and hunger). For example, past emotions or experiences like pain are prone to be distorted by events or experiences that occurred during the active reconstruction process of recall [
Second, although a number of studies examined correlates of compliance (ie, quantitative assessment of compliance) and participant-reported reasons for noncompliance (qualitative assessment of compliance), results of both quantitative and qualitative assessment of compliance were inconsistently reported across studies, in part because the data may not have been collected in the study. Obtaining and reporting information about how individual (eg, age and gender), technological (eg, software malfunction, device power depletion, and network connectivity), or time-varying (eg, time of day, environmental factors, and activities) factors relate to compliance and to missing data is important for at least two reasons. First, identifying these factors is necessary for improvement of compliance in future EMA data collection. For example, by understanding which participant groups need to be specifically targeted (and when they need to be targeted), improved retention strategies can be formulated to address the challenges unique to participants of specific demographic groups and to improve overall compliance rates. Second, without this information one cannot determine if missing data in EMA studies is merely random “noise” or if it is systematically linked to individual or situational characteristics. Systematic noncompliance is clearly a major threat to the validity of conclusions, and analytic steps can be taken to attenuate the bias if the contributing factors are known. Therefore, we encourage future EMA studies to report both quantitative and qualitative compliance results.
Third, several studies reported compliance rates only among those in the final analytic sample, after removing participants with low compliance. Many studies provided rationales for excluding low- or noncompliant participants. Nonetheless, the compliance rate reported with these subsamples can be viewed as inflated and would be likely to affect our ability to accurately estimate the average compliance across studies. Therefore, we recommend that future studies report compliance rates before and after removing the participants from analyses to enhance transparency of the analysis process.
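A minimal sketch of the recommended before-and-after reporting is given below. The data layout, the function names, and the 50% exclusion threshold are hypothetical, and the rate is pooled at the prompt level (total answered divided by total delivered) rather than averaged across participants.

```python
def compliance_rate(records):
    """records: list of (prompts_delivered, prompts_answered), one per participant."""
    delivered = sum(d for d, _ in records)
    answered = sum(a for _, a in records)
    return answered / delivered

def report_before_after(records, min_rate):
    """Compliance in the full sample and after dropping low-compliance participants."""
    kept = [(d, a) for d, a in records if a / d >= min_rate]
    return {
        "full_sample": compliance_rate(records),
        "analytic_sample": compliance_rate(kept),
        "n_excluded": len(records) - len(kept),
    }

# Hypothetical data: four participants, 28 prompts delivered to each
records = [(28, 26), (28, 24), (28, 20), (28, 5)]
summary = report_before_after(records, min_rate=0.5)
```

In this made-up example, excluding the single low-compliance participant raises the reported rate from about 67% to about 83%, which illustrates how a rate based only on the analytic sample can overstate compliance.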
A major strength of this study was that we were able to quantitatively assess the compliance rates for mobile-EMA studies of various health-related behaviors and the association between the reported prompt completion rate and some design factors. Our findings, however, only pertain to two aspects of a real-time EMA protocol (ie, prompting frequency and sampling length) that may affect participants’ compliance [
Using mobile technologies as data collection platforms in EMA studies has demonstrated generally moderate, but suboptimal, compliance rates among the youth population. In this review, we have further identified that sampling intensity, a possible proxy for participant burden, might impact compliance of participants from different settings. The study results suggest that youth from nonclinical settings may comply better with mobile-EMA protocols with a lower daily prompting frequency, whereas youth from clinical settings may comply better with a higher daily prompting frequency. Nonetheless, the nonexperimental nature of this review limits our ability to make further recommendations and highlights the need for experimental studies to investigate the impact of these study design factors on participant compliance. Moreover, this review identified several areas of compliance-related results that are currently inconsistently or inadequately reported among the reviewed studies. This suggests the need for thorough reports of participant compliance, which would potentially advance the current understanding of participants’ compliance with EMA protocols and aid the development of future EMA study designs. Therefore, we suggest that future studies use the proposed reporting guidelines by Liao et al (
We further emphasize the importance for future studies to report results in several areas that have been most inconsistently reported. These areas include (1) the reporting of EMA design features that were used to reduce participant burden or potentially improve data quality (eg, minimizing item “over exposure” by administering items in rotated order); (2) the number of prompts delivered and actually received by the participants, and whether nonresponse was due to technical issues or participant noncompliance; (3) response latency, or the amount of time from prompt signal to prompt answering; (4) distributional characteristics of compliance rates (ie, standard deviation and skewness of participant prompt completion rates) and participant compliance results based on the full sample, to improve the transparency and consistency in reporting prompt response rates; and (5) demographic and time-varying correlates of EMA compliance. Furthermore, we suggest that future studies incorporate the time-frame information when defining participant compliance. As one of the central promises of EMA is the collection of data with reduced recall bias, providing this information could help future studies and meta-analytic reviews determine the effect of latency on the data collected, which may further improve the current understanding of participant compliance.
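The distributional and latency-based summaries recommended above could be computed along the following lines. This is a sketch under stated assumptions: the function names, the per-participant completion rates, the latency values, and the 30-minute response window are all hypothetical, and the skewness is the simple population moment coefficient.

```python
import statistics

def summarize_compliance(rates):
    """Mean, median, SD, and skewness of per-participant completion rates."""
    n = len(rates)
    mean = statistics.mean(rates)
    sd = statistics.pstdev(rates)
    skew = sum((r - mean) ** 3 for r in rates) / (n * sd ** 3) if sd > 0 else 0.0
    return {
        "mean": mean,
        "median": statistics.median(rates),
        "sd": sd,
        "skewness": skew,
    }

def compliance_within_window(latencies_min, window_min):
    """Share of prompts answered within the window; None marks an unanswered prompt."""
    on_time = sum(1 for t in latencies_min if t is not None and t <= window_min)
    return on_time / len(latencies_min)

# Hypothetical per-participant completion rates with one low-compliance participant
rates = [0.95, 0.92, 0.90, 0.88, 0.85, 0.40]
summary = summarize_compliance(rates)

# Hypothetical minutes from prompt to response for eight prompts
latencies = [2, 5, None, 45, 12, None, 28, 3]
on_time = compliance_within_window(latencies, window_min=30)
```

In this made-up sample the skewness is negative and the mean falls below the median, which is the pattern that motivates reporting a robust measure of central tendency alongside the mean.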
Study characteristics.
Ecological momentary assessment (EMA) study with time-based design.
Ecological momentary assessment (EMA) study with event-based design.
ANOVA: analysis of variance
EMA: ecological momentary assessment
ES: effect size
ESM: experience sampling methods
GPS: Global Positioning System
high function autism and Asperger’s syndrome
ICC: intraclass correlation coefficient
IQ: intelligence quotient
JIA: juvenile idiopathic arthritis
PDA: personal digital assistant
PRISMA: preferred reporting items for systematic reviews and meta-analyses
T1D: type 1 diabetes
Author CKFW is supported by the University of Southern California PhD Fellowship and the National Institutes of Health (NCCIH: R01AT008330). Authors SS and AAS are supported by the National Institutes of Health (NIAMS: R01AR066200 and NIA: R01AG042407). Author DSM is supported by the National Science Foundation CISE/SCH (1521722) and the National Institutes of Health (U24OD23176).
None declared.