Original Paper
Abstract
Background: Accurate measurement of physical activity (PA) is key to identifying determinants of health and developing appropriate interventions. Self-reports of PA (eg, in surveys or diary studies) often suffer from measurement error. Providing study participants with wearable devices that passively track PA reduces reactivity and recall error but participants’ noncompliance and high device costs are problematic. Many older adults now have smartphones that track PA. Based on legal requirements, data controllers (eg, health apps) must provide users with access to their data, and individuals can request and donate these data for research. This user-centric approach provides researchers with access to individual-level data, and it gives users control over what data are shared.
Objective: We conduct a first test of the data donation approach for PA data among older adults. We study (1) how willing and successful older adults are to donate their PA data from different smartphone apps, (2) what drives donation of PA data at the different stages of participation, and (3) what biases arise from selective data donation.
Methods: To answer our research questions, we use cross-sectional observational data from a probability-based online panel of the Dutch general population. A total of 2086 members of the Longitudinal Internet Studies for the Social Sciences panel aged 50 years and older completed a web survey in 2024. All iPhone and Android smartphone owners were asked to download passively collected PA data from their devices (Apple Health, Google Location History, or Samsung Health) and donate them via the Port platform.
Results: Out of the 2086 survey participants, 1889 (91%) reported owning an iPhone or Android phone compatible for data donation, 606 (29%) reported willingness to donate PA data, 354 (17%) started the data donation, and 256 (12%) successfully provided a data package. Gender, age, educational attainment, monthly personal net income, smartphone usage behavior, privacy- and trust-related attitudes, and type of health app from which the data were requested correlated with behavior at the different stages of study participation. Self-reported reasons for nonwillingness to donate related mainly to expected technical issues, privacy concerns, and perceived usefulness. Compared with the entire sample, data donors reported better health, fewer health-related limitations, fewer difficulties performing tasks, and more PA.
Conclusions: Our study shows that data donation from smartphones as part of a probability-based web survey of older adults is a feasible alternative for the measurement of PA, especially for iPhone owners younger than 70 years. Limitations relate to nonparticipation which correlates strongly with characteristics of smartphone ownership and comfort with device use. Substantive bias in health and PA outcomes persists for those who donated in comparison with all survey respondents.
doi:10.2196/69799
Keywords
Introduction
Engagement in physical activity (PA) is the foundation of a healthy lifestyle, elevated immune and psychological function, and decreased mortality [,], especially for aging populations []. While the accurate measurement of PA is key to identifying determinants of health and developing appropriate interventions, reliable and valid measurement of PA is not trivial.
Traditional Methods of Measuring PA
To measure PA, researchers usually rely on participants’ self-reports []. Large population studies often use standardized PA inventories such as the International Physical Activity Questionnaire []. In the Netherlands, the Short QUestionnaire to Assess Health enhancing physical activity is commonly used in the assessment of PA levels [,]. While these instruments allow for a relatively fine-grained measurement of different types of PA, they are prone to measurement error due to social desirability, forgetting, and wrongful estimation of the frequency and duration of activities [,]. To reduce respondent burden and save questionnaire space, sometimes global measures of PA (eg, average daily hours of moderate or vigorous activity and sedentary behavior) are applied. Apart from the loss of information compared with more fine-grained instruments, one major limitation of these global measures is that they can suffer from misclassification (eg, walking the dog not considered PA) [,].
As an alternative to survey-based PA measures, researchers instruct study participants to document all behaviors at fine-grained temporal levels in a diary, the so-called day reconstruction method []. While this method should, in theory, provide more accurate estimates of PA than retrospective survey measures, people’s tendency to postpone filling out the diary can produce recall error []. Diary studies also induce high respondent burden leading to participant fatigue and dropout [], one of the reasons why they suffer from particularly low response rates []. To reduce burden, diary studies are usually restricted to short reference periods (eg, 1 week).
Recent technological advancement in wearable technology has led to an increase in the use of sensor-based measurements of PA. Researchers now often provide study participants with wrist-, waist-, or thigh-worn accelerometers [-], or they specifically recruit people who already own consumer-grade wearables such as smartwatches and fitness trackers []. The advantage of passively tracking PA via sensors is that it allows for continuous, fine-gained measurement that alleviates reactivity and error due to forgetting leading to higher accuracy compared with self-reports []. However, handing out devices to study participants comes with high costs and noncompliance among participants who are not used to wearing an unfamiliar device [,]. Relying entirely on samples of people who bring their own tracking devices leads to the underrepresentation of certain subpopulations, such as men, older people, and those with formally lower educational attainment and lower income, as well as minorities [,] and an overrepresentation of people with better health status [].
In this paper, we will introduce the concept of data donation [] as an alternative for the measurement of PA in older adults. Data donation builds on the legal requirement of data-collecting platforms (such as health apps) and devices (such as fitness trackers) to give their users access to their own data. The approach combines the advantages of objective measuring PA with sensors from a device that is widely used in the general population—the smartphone—with giving study participants full control over what data are shared with the researchers. While originally developed and applied in the field of communication research [], we will assess its feasibility for the measurement of PA among older adults in an empirical study with a large sample of members of a probability-based online panel in the Netherlands.
Data Donation as an Alternative for PA Measurement
Overview
Data donation describes a data collection method where people voluntarily contribute their retrospective personal data that were originally generated for a different purpose to research []. This data collection approach has gained traction through recent legislative changes such as the European Union’s General Data Protection Regulation [] and the California Consumer Privacy Act [] that grant data subjects (ie, individual users) the right of access and transportability to a copy of their personal data from a data controller (eg, health apps, social media platforms, etc) in a machine-readable format. In the literature, the term “Data Download Packages” (DDPs) [] is now commonly used for the copies of data that a data subject can retrieve from a data controller and share with others, including researchers.
Going directly to the users and asking them to retrieve and donate their DDPs from smartphone apps has several advantages over other methods of digital trace data collection. First, data donation is a user-centered approach that allows researchers to circumvent the increasingly restrictive access that data-controlling platforms have put on Application Programming Interfaces, which used to be a convenient resource for researchers to access individual-level data in the past [-]. Second, asking for data donation from smartphones, a device widely used in the general population across the globe [-], provides access to data from a much broader population than recruiting people who own smartwatches and other wearable devices []. Third, implementing the data donation task as part of a survey allows researchers to further link attitudinal data and other information not recorded in the DDP []. Finally, the active involvement of the study participants when downloading and donating retrospective data for research gives them control over what data are actually shared with the researcher. Designated data donation platforms such as the Open-Source Data Donation Framework [] and Port [] enable researchers to execute scripts that preprocess DDPs (eg, aggregate data and delete all personally identifiable information) locally on a user’s device and give study participants the option to review and curate their data before donation. Data donation is thus considered a more ethical way of collecting personal digital data than other passive data collection methods that continuously collect prospective data in the background of the device [].
When conceptualizing data donation as a technology, we can use the Technology Acceptance Model (TAM) [,] and its extension [] to explain participation behavior, similar to other studies that use smartphone apps for data collection []. According to the TAM, technology adoption is a function of perceived ease of use, perceived usefulness, and perceived trustworthiness of the technology. Data donation requires participants to perform a series of steps: requesting and retrieving their DDPs from a data controller and subsequently donating them for research []. Depending on the platform from which the DDPs are downloaded, these tasks might vary in complexity, meaning that researchers will need to provide clear and detailed guidance, especially for less tech-savvy participants [,,]. The potentially rather onerous multistep process of data donation (ease of use) and concerns about the sensitivity of the data (trustworthiness) might reduce the willingness of individuals to donate PA data for research. At the same time, individuals might be motivated by the information they receive as part of the data donation process about their own user behavior or a monetary incentive (usefulness). However, if certain subgroups are more or less likely to donate their data, then systematic nonparticipation error (ie, bias) might arise.
So far, the literature on data donation as a method for digital data collection has focused primarily on applications in communications research with generally younger users of specific social media platforms, including Facebook [-], Instagram [,,], and WhatsApp [,]. It is well known that digital skills are negatively correlated with age [-]. However, the adoption of technology, in particular smartphones, among older age groups has grown markedly recently []. It is thus crucial to understand whether data donation is feasible in populations of older adults, what drives the decision to donate data in this population, and what consequences selective nonparticipation has for data quality.
Data Donation Participation Rates
From research with users of specific social media platforms, we know that while a relatively large proportion of individuals initially states that they are willing to donate their data, a much smaller proportion usually successfully does so. For example, in a study with 388 adolescent Instagram users in the Netherlands, 287 (74%) obtained parental assent, 148 (38%) gave informed assent, and 102 (26%) eventually uploaded 110 useable Instagram DDPs []. In a study with 913 German Facebook users, 725 (79%) reported being willing to donate Facebook data, and eventually 345 (38%) provided at least 1 DDP []. Another 2 studies in Germany showed that while 63% and 52% of participants reported an intention to donate data from 1 of 4 social media platforms, only 20% and 12% actually donated their data [].
So far, few studies have collected PA data using data donation. Toepoel et al [] asked 1119 Dutch smartphone and fitness tracker owners about their willingness to copy daily PA data from the device into a questionnaire or upload it to a portal. Willingness to upload varied between 33% and 49% depending on whether participants used only the smartphones to monitor PA or had an activity tracker. Kompier et al [] invited 615 members of the Longitudinal Internet Studies for the Social Sciences (LISS), a probability-based online panel in the Netherlands, to wear an ActivPAL PA tracker for 1 week. A total of 503 (82%) panel members did so of whom 221 (45%) indicated that they also wore their own PA tracker (Apple watch, Samsung watch, Garmin watch, or a Fitbit) during the study period. Those who wore their own devices were asked to manually copy the data from their personal PA tracker for 1 week into a web survey, of whom 155 (221, 70%) did so. Sixty-one individuals owned a Fitbit and were then also asked to provide a DDP from Fitbit as part of the survey, which 34 (15% of those who wore their own device) individuals did. Silber et al [] asked 2040 German smartphone owners to donate their health app data. Out of 1020 iPhone owners, 374 (37%) consented to donate Apple Health data, and 94 (9%) successfully provided a DDP. Out of 1020 Samsung smartphone owners, 425 (42%) consented to donate Samsung Health data, and 26 (3%) successfully provided a DDP.
In summary, earlier research finds different donation rates ranging from 3% to 49%. These studies differed in a number of characteristics, including the type of data requested and the study population. With our first research question (RQ), we extend the existing research on data donation in 2 directions: first, we specifically study willingness and data donation behavior among older adults, and second, we focus on PA data from health apps on smartphones.
- RQ1: How willing and able are older adults to donate their PA data from different smartphone apps?
Predictors of Data Donation at Different Stages
Several studies have shown that different sociodemographic (eg, age, educational attainment, and income), behavioral (eg, frequency of technology and media use), and attitudinal characteristics (eg, privacy concerns and trust in research organizations) predict whether individuals donate their data for research. However, the findings are not always consistent in how these variables influence the donation decision [,,,]. One possible explanation for this inconsistency in findings is that the global assessment of the data donation outcomes might obscure the mechanisms at play at the individual stages of the process. Consenting to a request to data donation for research and actually going through the process of downloading data and successfully donating them are conceptually separate and independent steps that might be influenced by different factors in line with the TAM. Studies that focused on the first step by asking about the (hypothetical) willingness of people to provide digital personal data for research (either through data donation or other means of data collection) show that concerns about privacy of the data and security of the data-handling process as well as trust in the organization that collects the data (ie, trustworthiness of the technology) play an important role in the initial decision to consent to study participation [,,-]. This is corroborated by self-reports of people who declined the request to participate in actual data donation studies [,]. The success of the second step, that is, donation conditional on consent, is likely influenced by how cumbersome the process is and whether respondents encounter any technical problems [,,] (ie, ease of use of the technology). Thus, factors such as age, education, and familiarity and comfort with technology might be predictive of successful data donation at this stage.
In our study, we expand the existing literature by deliberately separating out the individual steps in the data donation process and studying the factors that influence participation outcome in a more nuanced way.
- RQ2. What drives donation of PA data at different stages of participation?
As part of the investigation into the factors that drive data donation, we are also interested in the effect of showing respondents a stylized example of the data they will be donating as part of the request. Studies on the willingness to share and actual sharing of prospective data from apps and wearables show that participants are more likely to do so if they are promised to have agency over their data, for example, when given the option to view data, revoke consent, receive feedback, and temporarily turn off data collection [,,]. We might find similar effects for donation of retrospective PA data. Seeing an example of what data will be shared with the researcher might also increase the perceived usefulness of the data donation task in line with the TAM. At the same time, showing a visualization of PA data to potential study participants might have an adverse effect: seeing how these data look may trigger privacy concerns among participants that lead them to decide not to donate altogether.
- RQ2a. What effect does showing respondents an example of the type of PA data they will be donating as part of the request have on the data donation behavior?
Bias Due to Selective Data Donation Participation
Eventually, low rates of willingness to donate and actual donation are just 1 of 2 components determining whether the results of a study suffer from bias. Another factor to consider is whether the people who successfully donate their data differ systematically from the target population in the variables of interest of a study [,]. However, few data donation studies have assessed whether the sample of donors is biased in the variables that should be measured with the donated data. In a study among Facebook users, no systematic differences in self-reported frequency of Facebook use were found between donors and nondonors []. Another media and communications study found that politically interested and left-leaning respondents and those with higher social media use were more likely to donate data from social media platforms []. Studies in the health domain often suffer from a “healthy volunteer bias,” that is, those who decide to participate in health research tend to be, on average, more health conscious than nonparticipants [-]. With our study, we expand the current literature on data donation by exploring not only bias regarding sociodemographics and privacy-related concerns but also substantive PA and health outcomes.
- RQ3. What bias, if any, does arise from selective data donation participation?
Methods
Web Survey
To answer our RQs, we implemented PA data donation in a web survey among members of the LISS panel. The LISS panel is a probability-based online panel in the Netherlands that administers monthly web surveys to respondents who were recruited through drawing an initial random sample from the Dutch population register in 2007 []. Since the original recruitment, the LISS panel has been refreshed several times by drawing stratified and random samples of the Dutch population []. In January and March of 2024, 2345 members of the LISS panel aged 50 years or older were invited to participate in the web survey. The questionnaire contained questions about sociodemographics, self-rated health, chronic illness, health-related difficulties and limitations, PA in the past 7 days, general privacy concerns, perceived privacy of different types of data, trust in various institutions, and smartphone ownership and use (see for the full questionnaire). Respondents who indicated that they own an iPhone or an Android smartphone were asked at the end of the questionnaire whether they would be willing to donate PA data from their smartphones.
Out of the 2345 invited LISS panel members 2094 (89.3%) started the survey. A total of 2086 (89.9%) members completed the questionnaire up until at least the section on smartphone ownership and use right before the data donation section—the analysis sample for our study (see for descriptive statistics of the sample). To test whether our analysis sample differs from the gross sample invited to the survey, we conducted additional analyses using data from the LISS panel sociodemographic profile data. We find small effects only for age and employment status and correcting for them using weights in our analytical models does not change the substantive conclusions we draw from the findings ().
Data Donation
Respondents were asked to donate PA data from 1 of 3 apps, depending on their smartphone. Respondents who reported owning an iPhone were asked to donate daily step count data from the Apple Health app. Android smartphone owners who had indicated owning a Samsung smartphone and using the Samsung Health app were asked to donate daily step count data from the Samsung Health app. All remaining Android smartphone respondents were asked to donate daily traveled distance data from Google Location History. Note that while Google Location History also includes information on tracks, locations, and places [], we requested information only on the daily distance traveled in kilometers. Depending on how long respondents had used an app, data from up to 2017 until the day of the request made to the data controller (ie, Apple, Samsung, or Google) could be donated.
As part of the introduction to the data donation task, we implemented an experiment varying whether or not respondents saw an example screenshot of the PA data they would donate (RQ2a; ). The screenshot shows a stylized example of a dummy dataset with dates and number of steps per day (for Apple Health and Samsung Health) or daily distance traveled (for Google Location History). The image shown in the experiment has the exact same design as the last screen that study participants would see right before donating their data. For all 3 app conditions, one random half of the respondents received the data donation introduction with and the other half without the screenshot.

Those who were not willing to donate were asked for their reason for refusal in an open-ended question. Those who agreed to donate their PA data were guided through the data donation process via multiple pages including screenshots and detailed instructions in the web survey. Depending on the app from which the data were to be downloaded, the description included 9 (Apple Health), 11 (Google Location History), or 17 (Samsung) steps. After walking respondents through the data donation process, respondents were asked whether they were able to successfully request the DDP from the data controller and store it on their phone (for Apple and Samsung). For Google, respondents needed to wait to receive an email that the data were ready until they could continue with the questionnaire. The LISS panel sent up to 2 reminders to panel members who had not completed all parts of the study.
Those who had successfully accessed a DDP received a URL and a QR code that sent them to the data donation portal. The data donation was implemented on the Port platform [] where the DDP was locally processed on the users’ device, meaning that the data did not leave respondents’ device at this stage. At this point, the DDP was stripped of all personally identifiable information and data were aggregated to daily step counts (Apple Health and Samsung Health) or daily distance traveled in kilometers (Google Location History). Before the actual data donation step, that is, before uploading the data extracted from the DDP to the server, respondents were shown a table with all the information that would be shared with the researchers, and respondents were allowed to delete specific entries. To finally send the data to the researchers, respondents needed to click “donate” or they could abandon the data donation at this stage. After the data donation, respondents were automatically rerouted to the web survey, and they were asked whether they were successful in donating the data, whether they experienced any problems when trying to donate the data, and if so, what kind of problems.
Ethical Considerations
All procedures of the study were approved by the ethics committee of the Faculty of Social and Behavioural Sciences of Utrecht University (23-0494). Informed consent was obtained twice: first, when participants joined the LISS panel (covering panel participation and data use), and again at the start of the web survey (covering this specific study). Participants could opt out of the study at any point. PA data from smartphones were collected only from those participants who requested and downloaded their data from the platform and then donated their DDPs. In a preprocessing step locally executed on the participants device, DDPs were stripped of all personally identifiable information and only aggregated data (daily step counts or daily distance traveled in kilometers) could be donated. Anonymized data for this project are available in the LISS Data Archive []. Respondents received an equivalent of €15 per hour for completing the survey. As an additional incentive, respondents received €5 for a successfully donated data package. On January 1, 2024, €1 was valued at US $1.08.
Statistical Analysis
All analysis was conducted using R (version 4.5.1; R Core Team) [], and all analysis code can be found on the web []. To answer RQ1 about participation rates in the data donation study, we calculate the share of respondents who (1) reported owning an iPhone or an Android smartphone necessary to participate in the data donation study, (2) reported being willing to share their PA data, (3) started the data donation process on the Port platform, and (4) successfully provided data.
To answer RQ2 about the predictors of participation, we estimate 4 logistic regression models, 1 for each of the 4 outcomes from RQ1 as the dependent variable. We use five sets of predictors in our models: (1) sociodemographic characteristics (gender, age, educational attainment, employment status, monthly personal net income, household size, and urbanicity), (2) privacy- and trust-related measures (general privacy concern and 3 indexes for perceived privacy of personal information, trust in government and research institutions, and trust in technology companies), (3) health measures (self-rated health, chronic illness, classification based on BMI [], and indexes for health-related limitations in 3 domains and for difficulties performing 10 different tasks in everyday life), and (4) PA-related variables (number of days with strenuous and moderate PA in the past 7 days; number of days with walking, biking, and running in the past 7 days; number of hours of sedentary behavior per day in the past 7 days; and dummy for having spent time outside the home yesterday). Except for model 1 (predicting ownership of a smartphone) we also included (5) number of smartphone activities and type of data requested as predictors. To answer RQ2a about the effect of showing an example of the data on donation behavior, we added a dummy variable to models 2-4 that indicated whether or not a respondent was shown an example of data before the willingness question.
For some categorical variables, we collapsed categories due to small cell counts. provides an extensive overview of all original and recoded variables and the creation of indexes. We used casewise deletion for missing values in our models. For ease of interpretation, we calculated average marginal effects (AMEs) using the margins package in R [], and we report AMEs and their 95% CIs to quantify the relationship between the independent and the dependent variables in our models. Given the large number of predictors in our models, we also specify more parsimonious models that include only the sociodemographic variables together with all the variables from 1 of the 4 other thematic blocks (privacy and trust, health, PA, and smartphone-related variables). The results of these models () show that our findings are robust, and we thus interpret the results of the full models here.
Based on the answers to an open-ended question asked to people who were not willing to donate their PA data, we further analyze reasons for refusal to participate in the study. We code the answers using an updated coding scheme developed for an earlier data donation study [].
To study bias due to selective nonparticipation (RQ3), we compare the full sample including all respondents to the web survey with the individuals who provided PA data on all variables that were used as predictors in models 1-4. To facilitate interpretation of bias, we created terciles (low, medium, and high) for all noncategorical variables (ie, all index measures, number of days respondents had performed different PAs, and number of sedentary hours; ) to show potential overrepresentation and underrepresentation among donors in these groups. Following earlier work [-], we calculate bias in a variable of interest (y) as the difference in proportions between the full sample (f) and the donors (d) as
The SE of these differences is calculated as
where nd denotes nondonors.
The results of the multivariate models provide information about which variables predict participation when controlling for various factors, allowing for theoretical exploration of participation decisions (RQ2). The bias measures will demonstrate from a practical perspective what systematic errors arise in these variables, in particular, the ones that ought to be measured with the donated data, given selective participation.
Results
Data Donation Participation Rate
As shown in , out of the 2086 respondents in the survey, 1889 (91%) reported owning an iPhone or an Android smartphone necessary for PA data collection. Between the question about smartphone ownership and the willingness to donate data question, 6 participants dropped out of the survey, leaving 1883 who were asked to donate PA data. About one-third of these respondents (606, 29% of the full survey sample) indicated willingness to donate data and were then guided through the data donation process. A total of 354 respondents (17% of the full survey sample or 58% of the willing sample) started the data donation process on the Port platform. About three-quarters of respondents who started the donation process successfully donated data (256, 12% of the full survey sample).
| Sample | Values, n (%) |
| Full sample | 2086 (100) |
| Owns iPhone or Android phone | 1889 (91) |
| Willingness question asked | 1883 (90) |
| Willing to donate data | 606 (29) |
| Started donation | 354 (17) |
| DDPa received | 256 (12) |
aDDP: Data Download Package.
Predictors of Data Donation at Different Stages
Four logistic regression models predict participation behavior in the various stages of data donation conditional on the previous stage: owning an iPhone or an Android smartphone (model 1), willing to donate data conditional on owning an iPhone or an Android smartphone (model 2), starting data donation conditional on willingness (model 3), and successfully donating data conditional on starting data donation (model 4). The main results of the 4 models are summarized in . The full regression results for all 4 models can be found in .
Model 1 (Table S1 in ) shows that for our sample of adults aged 50 years and older in the Netherlands, the predicted probabilities of owning an iPhone or an Android smartphone among men are 5 percentage points (PP) lower than those among women (AME –0.05, 95% CI –0.08 to –0.02). Device ownership decreases significantly with age; compared with the youngest age group in our sample (aged 50-54 years), the predicted probabilities of owning an iPhone or an Android smartphone are 8 PP (AME –0.08, 95% CI –0.14 to –0.03) and 19 PP (AME –0.19, 95% CI –0.27 to –0.12) lower for the 2 oldest groups, those aged 75-79 years and 80 years and older, respectively. While household size, urbanicity, and employment status are not significantly associated with owning the necessary device, ownership significantly increases with educational attainment and monthly personal net income. For example, the predicted probabilities of owning the device among the highest income group (more than €3000 per month) are 11 PP (AME 0.11, 95% CI 0.05-0.17) higher than those among the lowest income group (up to €1000 per month). For people with high educational attainment, the predicted probabilities are 6 PP (AME 0.06, 95% CI 0.03-0.10) higher compared with people with low educational attainment. Interestingly, we find that people who say that they are not very concerned (AME 0.06, 95% CI 0.01-0.11) and those who say they are a little concerned (AME 0.08, 95% CI 0.03-0.14) about their privacy in general have a significantly higher likelihood of owning an iPhone or an Android smartphone than those who are not at all concerned. We find very little relationship between owning an iPhone or an Android smartphone and health-related and PA-related measures. One exception is that people who score higher on the index based on difficulties with 10 tasks have significantly lower predicted probabilities of owning a smartphone; with every additional point on the index, the predicted probability decreases by 4 PP (AME –0.04, 95% CI –0.07 to –0.02).
Based on model 2 (Table S2 in ), we find that men are significantly more likely to be willing to donate their PA data than women (AME 0.07, 95% CI 0.02-0.11). Likelihood to positively respond to the data donation request significantly decreases with age (75-79 years: AME –0.14, 95% CI –0.24 to –0.03; 80 years and older: AME –0.14, 95% CI –0.26 to –0.02), household size (2-person household: AME –0.07, 95% CI –0.12 to –0.01; 3- and more person household: AME –0.10, 95% CI –0.16 to –0.03), and degree of urbanicity (moderately urban: AME –0.08, 95% CI –0.15 to –0.00; strongly urban: AME –0.11, 95% CI –0.18 to –0.05; and very strongly urban: AME –0.10, 95% CI –0.18 to –0.03). Compared with people with low educational attainment, the predicted probabilities of reporting willingness to donate are 10 PP higher for people with high educational attainment (AME 0.10, 95% CI 0.04-0.16). Willingness to donate decreases with higher values on the index for perceived privacy of personal information (AME –0.04, 95% CI –0.06 to –0.02). Higher scores on the trust toward government and research organizations index are associated with more willingness to donate (AME 0.05, 95% CI 0.01-0.09). We find significantly higher likelihood of willingness among people who report chronic illness (AME 0.06, 95% CI 0.01-0.11), but none of the other health-related or PA-related measures showed a significant correlation with willingness to donate data in the multivariate model. With every additional activity someone reports using their smartphone, the predicted probabilities of being willing to donate increase by 3 PP (AME 0.03, 95% CI 0.02-0.04). Compared with iPhone owners who were asked to donate Apple Health data, the predicted probabilities of being willing to donate data from Google Location History among Android smartphone owners are 14 PP lower (AME –0.14, 95% CI –0.18 to –0.09). This difference is also visible in . Interestingly, respondents who were shown a stylized example of how the donated PA data would look like were significantly less likely to be willing to donate (AME –0.05, 95% CI –0.09 to –0.01).
Model 1: Owning an iPhone or an Android phone
- Female
- Younger age
- Higher monthly personal net income
- Higher educational attainment
- Medium general privacy concern
- More difficulties with tasks
Model 2: Willingness to donate physical activity data conditional on being asked
- Male
- Younger age
- Single-person household
- Less urban
- Higher educational attainment
- Lower perceived privacy of personal information
- Higher trust in government and research institutions
- Chronic illness
- More smartphone activities
- Request for Apple Health data versus Google Location History data
- Not seeing data example
Model 3: Starting to donate physical activity data conditional on willingness
- Younger age
- Higher monthly personal net income
- Higher trust in research and government organizations
- Lower trust in technology companies
- Not being overweight
- Fewer limitations by health in activities
- More smartphone activities
- Request for Apple Health data
Model 4: Successful data donation conditional on starting donation process
- Younger age
A total of 1252 survey respondents who were not willing to donate their PA data answered an open-ended question on the reason for their refusal to donate. shows that 27% (339/1252) of the responses pertained to anticipated difficulties with the data donation process, followed by the users’ wish to protect their privacy (269/1252, 21%) and doubts about the usefulness of the data donation for research (257/1252, 20%; see Table S3 in for coded subcategories).
Model 3 (Table S4 in ) shows that among those who were willing to donate PA data, the likelihood to start the donation process significantly and sharply decreased after age 65 years (65-69 years: AME –0.20, 95% CI –0.34 to –0.06; 70-74 years: AME –0.19, 95% CI –0.35 to –0.04; 75-80 years: AME –0.41, 95% CI –0.59 to –0.24; and 80 years and older: AME –0.44, 95% CI –0.65 to –0.23). We also see that the likelihood to start the data donation process increases with personal monthly net income, with people earning €2501 to €3000 having predicted probabilities that are 18 PP higher (AME 0.18, 95% CI 0.00-0.35) compared with people in the lowest income group (up to €1000). None of the other sociodemographic characteristics show a significant association with the likelihood of starting the data donation process. Trust in government and research organizations is significantly positively associated with starting the data donation process (AME 0.17, 95% CI 0.10-0.23) while trust in technology companies is negatively associated (AME –0.13, 95% CI –0.19 to –0.07). Again, few of the health-related and PA-related predictors show a significant correlation with starting the data donation conditional on willingness. People with a BMI that would categorize them as overweight have a significantly lower likelihood of starting the data donation process (AME –0.09, 95% CI –0.17 to –0.01) than people with a BMI indicating underweight or healthy weight. Lower scores on the index for health-related limitations are associated with higher likelihood to start the data donation (AME –0.05, 95% CI –0.11 to –0.00). The more activities someone reports to do on their smartphone, the higher the likelihood of starting the data donation (AME 0.02, 95% CI 0.00-0.03). Android smartphone owners who were willing to donate PA data from Google Location History or from Samsung Health have predicted probabilities of starting the data donation process that are 24 PP (AME –0.24, 95% CI –0.32 to –0.17) and 22 PP (AME –0.22, 95% CI –0.34 to –0.10) lower, respectively, compared with iPhone owners who were willing to donate Apple Health data ().
| Code | Reason | Values, n (%)a |
| 1 | Need to protect privacy | 269 (21.3) |
| 2 | Concern about data misuse or data breach | 44 (3.5) |
| 3 | Anticipating difficulties with data donation | 339 (26.8) |
| 4 | Perceived usefulness of data or research | 257 (20.3) |
| 5 | Personal reasons | 106 (8.4) |
| 6 | Other reasons | 96 (7.6) |
| 7 | Unconcrete refusal | 238 (18.8) |
| 8 | No reasons provided | 100 (7.9) |
aEach answer could be coded in multiple categories; thus, percentages can add up to more than 100.

Finally, model 4 (Table S5 in ) predicts successful data donation conditional on starting the data donation process. The results again indicate a decrease in the likelihood of donating data by age. The predicted probability for people aged 70-74 years (AME –0.30, 95% CI –0.55 to –0.06) and 75-79 years (AME –0.38, 95% CI –0.72 to –0.05) is significantly lower than that for the youngest reference group (aged 50-54 years). Neither of the other sociodemographic nor any of the privacy-related, trust-related, and smartphone-related variables are significantly correlated with successful donation at the 5% level. Similarly, successfully completing the data donation is not significantly associated with any of the health-related and PA-related measures in our model.
Bias Due to Selective Data Donation Participation
The analysis of predictors of participation in our multivariate models sheds light on the correlates of the participation behavior at the individual stages of the data donation process. From a practical perspective, it is also important to understand how the attrition of different subgroups at the individual stages adds up to a systematic deviation in characteristics between those who donated their data and those who did not, that is, bias. Thus, we next move to comparing the full survey sample to those who successfully donated their PA data. We find bias in several sociodemographic, privacy-related, trust-related, smartphone-related, health-related, and PA-related measures (see Table S6 in for detailed data).
In terms of sociodemographics (), men are overrepresented among donors by almost 7 PP (95% CI 1.2-12.2), while women are underrepresented. We also see a strong positive bias for the 2 younger age groups (50-54 years: +11.4 PP, 95% CI 7.1-15.7; 55-59 years: +7.7 PP, 95% CI 3.4-12.0) and a negative bias for the 3 oldest age groups (70-74 years: –5.0 PP, 95% CI –8.9 to –1.1; 75-80 years: –10.0 PP, 95% CI –12.5 to –7.5; and 80 years and older: –8.0 PP, 95% CI –9.9 to –6.1). Donors also overrepresent people who are employed for pay by almost 23 PP (+22.9 PP, 95% CI 17.8-28.0), people with high educational attainment by 17 PP (+17.4 PP, 95% 12.0-22.7), and people who live in households with 3 or more people (+5.4 PP, 95% CI 0.7-10.1). At the same time, the group of donors underrepresents those in unpaid work (–5.3 PP, 95% CI –8.5 to –2.2), those unemployed, retired, or disabled (–17.5 PP, 95% CI –23.2 to –11.8), and those with low educational attainment (–17.5 PP, 95% CI –21.5 to –13.5). We find a similar trend for personal monthly net income; those in low-income groups are underrepresented by up to 9 PP (no income/not available: –4.2 PP, 95% CI –7.0 to –1.4; up to €1000: –6.5 PP, 95% CI –9.2 to –3.7; and €1001 to €1500: –8.6 PP, 95% CI –11.8 to –5.4) and those in higher income groups are overrepresented by up to 15 PP (€2501 to €3000: +6.7 PP, 95% CI 2.6-10.9; more than €3000: +15.5 PP, 95% CI 10.7-20.3). We find no significant bias in urbanicity.

As shown in , there is very little consistent bias in the measures related to privacy. Compared with the entire sample, people who score in the highest tertile of the trust toward government and research organizations index are overrepresented (+17.1 PP, 95% CI 11.8-22.4) and people who score in the lowest tertile are underrepresented (–13.4 PP, 95% CI –18.4 to –8.4). The bias estimates for the trust measures on technology companies have CIs that include 0.

As shown in , donors are overrepresenting people who report excellent or very good health by 6.3 PP (95% CI 1.7-10.9) and underrepresenting people reporting moderate or bad health by 5.0 PP (95% CI –9.7 to –0.2). We find more people who report low limits by their health in activities (+8.5 PP, 95% CI 3.1-13.9) and low difficulties with different tasks (+10.0 PP, 95% CI 4.6-15.4) among donors compared with the entire sample. People who fall into the highest groups of health-related activity limits (–11.0 PP, 95% CI –16.0 to –5.9) and difficulties performing tasks due to health issues (–13.7 PP, 95% CI –18.6 to –8.7) are underrepresented among donors. We find no bias in BMI and chronic illness.

shows that when asked about PA over the course of the past 7 days, fewer donors than people in the entire sample report a number in the lowest tertile for days with moderate PA (–5.9 PP, 95% CI –11.2 to –0.6), strenuous PA (–9.6 PP, 95% CI –14.7 to –4.4), walking (–7.6 PP, 95% CI –12.8 to –2.3), and biking (–6.8 PP, 95% CI –12.1 to –1.5). At the same time, donors overrepresent people in the medium tertile of number of days with moderate PA (+6.5 PP, 95% CI 1.0-11.9) and, interestingly, the highest tertile of hours with sedentary time (+8.5 PP, 95% CI 3.1-13.9). Donors are also overproportionally reporting to have spent some time outdoors yesterday (+6.2 PP, 95% CI 2.7-9.6). The estimates for days with running are not biased.

Discussion
Principal Results
We have shown that donation of PA data from smartphones as part of a probability-based online panel is feasible for older adults with certain limitations. In particular, PA data donation works well for iPhone owners younger than 70 years when requesting Apple Health data. Overall, the rates of donation differ substantially between 7% and 23%, depending on what digital platform the data were requested from: Apple Health works much better than Google Location History and Samsung Health. The donation rate for Apple Health in our study is as high, if not higher, compared with studies that have asked people in the general population to download apps that continuously track their activity in the background of the phone [,,]. The large range in participation outcomes between Apple and the other 2 providers might be a reflection of differences in how many steps are necessary and how much time it takes on the individual platforms to request and download PA data. Nonparticipation in data donation also correlates strongly with characteristics related to smartphone ownership and comfort with device use. At all stages of data donation, we observe age effects within the age group of people aged 50 years and older: the older an individual, the less likely to own a smartphone, the less willing to donate, and the less likely to start and to successfully complete the donation process. In sum, these effects lead to stark differences in overall donation rates by age groups. While around 36% of 50- to 54-year-old iPhone owners donated their Apple Health data, the share of people who donated Apple Health data drops to around 17% for people aged 70-74 years and to a mere 3% for people aged 80 years and older. Educational attainment and personal monthly net income are also positively correlated with the likelihood of donating PA data.
Controlling for sociodemographic characteristics, we still see that experience with technology (measured as the number of different smartphone activities users report) is positively correlated with being willing to and starting the donation. Those who are not willing to participate often say that they anticipate technical problems or find the task too difficult. These findings are in line with the idea of perceived ease of technology use from the TAM [-]. The more private older adults perceive their PA data to be, the less willing they are to donate, and privacy concerns are the second most named reason for not donating. Again, these findings align with the notion of trustworthiness in the technology postulated by the TAM. Providing a visualization of how the extracted data would look reduces willingness to donate. This finding differs from previous research where showing respondents at the stage of asking for consent how the donated data about their transportation behavior from Google Location History would look like had no effect on donation []. Whether this difference is driven by the type of data requested (PA vs mobility), the target population (older adults vs general population), or something else should be the concern of future research.
From a theoretical perspective, it is noteworthy to highlight that the TAM seems to be particularly useful in this age group to explain participation in PA data donation. Reasons related to perceived ease of use, trustworthiness, and usefulness of the data donation all came up prominently in responses to an open-ended question on why people were not willing to donate their PA data. Interestingly, a recent qualitative study with a younger sample did not detect perceived ease of use as one of the main explanations of successful requests to donate data []. We could not find that highlighting the usefulness of data donation by showing potential participants a stylized example of the data to be donated would increase participation, quite the opposite. Thus, future research should further test TAM assumptions in different age groups and populations.
Beyond differences in sociodemographics and other covariates, we found that selectivity in participation in the data donation produces substantive bias in health and PA outcomes. Compared with the entire sample of older adults, people who donated their smartphone-based PA data report fewer health-related limitations, less difficulty in performing various tasks, higher levels of PA behavior, and better self-reported health. This finding indicates that while smartphone sensor–based PA measures can potentially improve measurement quality compared with self-report, a “healthy donor bias” exists. Our next step is thus to evaluate the quality of the donated PA data compared with and in combination with self-reports to predict health outcomes. Given the limitations of both data sources (ie, donated sensor–based PA data and self-reports), we currently recommend to integrate data from multiple sources where one source “borrows strength” from the other to address both issues of representation [] and measurement [].
From a study design perspective, those who wish to implement data donation should strive to reduce participant burden. Our study shows that more burden correlates with lower participation (in line with earlier research []). We see that the additional steps required to request and download data from Samsung Health and especially from Google Location History come with substantially reduced data donation rates in comparison with Apple Health. Self-selection through additional burden might further contribute to differences in biases that stem from differential device brand ownership and use of these platforms. Researchers need to be aware of the multiple sources of bias when interpreting substantive results, especially in studies where data from more than 1 platform are collected. Standard adjustment procedures such as poststratification weighting based on sociodemographic variables or adding these variables as controls to regression models might need to be extended by attitudinal measures of comfort and familiarity with a specific technology to account for these differences [,]. Regardless of the platform, there are recommendations on how to implement data donation studies [], which aim at improving the participants experience of the data donation process. At the same time, data donation through DDPs (ie, user-centric data donation) will have limitations imposed by data controllers who can decide on how they would like to shape their compliance with the law []. For the effective use of data donation for research purposes, Hase et al [] provide a set of potential solutions to challenges that result from platforms’ decisions on how to comply with the law.
Comparison With Prior Work
Most methodological studies around data donation have focused on the collection of usage data from social media platforms in the domain of communication research []. Only 3 studies so far have focused on PA-related and health-related measures: Kompier et al [] and Toepoel et al [] have asked individuals in the general population to donate data from their personal fitness trackers, either by uploading data from Fitbit users or by copying the information from their Garmin, Apple, and Samsung watches into the questionnaire. Silber et al [] asked smartphone owners in the general German population to donate their health app data. To our knowledge, our study is the first to concentrate on measurement of PA using data donation from smartphones from older adults. Strikingly, we find participation rates that fall right into the middle of the range of participation rates reported in various earlier research (see section “Data Donation Participation Rates”). In general, our findings are in line with research that showed that initial stated willingness to donate data is strongly correlated with privacy concerns about the requested data [,,,,-]. We confirm that the actual donation conditional on consent is related to technical limitations and perceived burden [,,]. In our case, requesting and donating data from Google Location History and Samsung Health included more steps than doing so from Apple Health. These additional steps might be one of the factors why we find significantly lower donation rates for Google Location History and Samsung Health.
For PA data donation, Kompier et al [] concluded that given the low prevalence and selective group of fitness tracker owners as well as the technical challenges participants faced, data donation may not yet be a feasible way to collect PA data in the general population. Our results are more optimistic, in particular, in light of using a more user-friendly method for data donation. In the study by Kompier et al [], respondents were asked to create a .csv data export from their wearables themselves that then had to be sent to the researchers. These steps created unrealistic technical barriers for many brands (except for Fitbit). A similar issue led to an extremely small donation rate in the study by Silber et al []. In this study, we requested 1 DDP and asked participants to donate through Port offering a much smoother experience for study participants. While we see selectivity in sociodemographic characteristics, the willingness to donate is high, especially among Apple smartphone owners younger than 70 years. Further optimizing the data donation process and making it more user-friendly, especially for those with low tech skills, will have the potential to provide researchers with objective PA data.
Limitations
Older adults in our sample might be somewhat more technologically savvy than the general older population in the Netherlands and older populations in other countries. Although the LISS panel draws their sample from the Dutch population registry and non–internet users are provided with internet access and devices, having participated in monthly web surveys might have decreased the barriers of using technology in everyday life. In addition, the smartphone ownership rate in the Netherlands is quite high, although comparable with other Western European countries and the United States [,]. At the same time, PA data donation is feasible only for users of smartphones and the digital platforms that collect these data, and the implications of excluding certain populations should be further studied []. We also acknowledge that device ownership and, in particular, which brand of smartphones people own and thus from which health app they were asked to donate data were not randomly assigned in our study. We thus cannot make any causal claims about the relationship between the platform and participation rates. Furthermore, there were initially some technical problems during data collection with Google Location History DDPs that were solved timely. Nevertheless, this might have resulted in the loss of a few DDPs. Finally, not all LISS panel members invited to the survey that included the data donation request also responded. Additional analysis reported in shows that those who did not respond tend to be slightly younger and more likely to be employed than our analytical sample (Table S1 in ). Adjusting the results of our original model 1 predicting smartphone ownership with weights that account for these differences does not substantively change the results (Table S2 in ).
Conclusions
Data donation affords researchers the opportunity to collect PA data from smartphones and digital platforms (eg, Google) that are already automatically recorded on the user’s device, thus reducing the need to design and implement lengthy and burdensome questionnaires on PA. Importantly, even when donation rates are modest, the amount of history-rich data obtained from a single donation can be substantial. Participants who donate data from Apple Health, Samsung Health, or Google Location History can often provide months or even years of retrospective data. In contrast, app-based data collection requires participants to prospectively install and maintain an application, often yielding shorter observation windows and higher dropout rates. This makes data donation not only a feasible but also an efficient and comparatively less burdensome approach for collecting large volumes of relevant behavioral data.
PA data collection via data donation could thus be of interest not only for researchers who study PA in older adults but also for government agencies and NGOs who want to determine whether citizens adhere to the World Health Organization norms. On a societal level, understanding PA patterns of older adults in their natural setting will help inform better, more targeted public policies. For example, differences in PA patterns of different groups (based on gender, race and ethnicity, socioeconomic background, and neighborhood) that will be uncovered in the donated data can inform specific policies for these groups. Our study findings contribute another step to the general understanding of whether data donation can indeed replace self-report of PA over time. In particular, we show that PA data donation from the Apple Health app on smartphones is feasible for people younger than 70 years, but bias toward a healthier sample has to be considered. A next step is to study the quality of the data provided in the DPPs to better understand the full potential of data donation to achieve true cost savings and better quality (more detailed data and higher validity as activities will not be misreported or forgotten) compared with traditional self-reports. In the meantime, PA data donation can be considered a valuable complement to self-reported PA in surveys, providing a more comprehensive picture of individuals’ PA.
Acknowledgments
Research reported in this paper was supported by the National Institute on Aging of the National Institutes of Health (award R24AG077012). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. The authors are thankful for a pilot project grant from the Network for Innovative Methods in Longitudinal Aging Studies. This study is part of the Digital Data Donation Infrastructure Project (D3I), made possible by the Platform Digitale Infrastructuur Social Science and Humanities, from which Bella Struminskaya received funding. The authors thank Adriënne Mendrik for support with the data collection. The authors would also like to acknowledge the help of their student research assistants Ella Häcker and Leonard Bek. The publication of this paper was funded by the University of Mannheim.
Authors' Contributions
FK and BS contributed to conceptualization, funding acquisition, and methodology. JM and SJ contributed to data curation and investigation. FK contributed to the formal analysis, visualization, and writing the original draft. BS, JM, and SJ contributed to writing, review, and editing. BS contributed to project administration.
Conflicts of Interest
None declared.
Questionnaire.
PDF File (Adobe PDF File), 255 KBDescriptive statistics.
PDF File (Adobe PDF File), 348 KBAdditional analysis.
PDF File (Adobe PDF File), 221 KBSensitivity analysis.
PDF File (Adobe PDF File), 718 KBAnalysis results.
PDF File (Adobe PDF File), 464 KBReferences
- Pate RR, Pratt M, Blair SN. Physical activity and public health. JAMA. 1995;273(5):402. [CrossRef]
- Warburton DER, Nicol CW, Bredin SSD. Health benefits of physical activity: the evidence. CMAJ. 2006;174(6):801-809. [FREE Full text] [CrossRef] [Medline]
- DiPietro L. Physical activity in aging: changes in patterns and their relationship to health and function. J Gerontol A Biol Sci Med Sci. 2001;56 Spec No 2:13-22. [CrossRef] [Medline]
- Strath SJ, Kaminsky LA, Ainsworth BE, Ekelund U, Freedson PS, Gary RA, et al. American Heart Association Physical Activity Committee of the Council on LifestyleCardiometabolic HealthCardiovascular‚ Exercise‚ Cardiac RehabilitationPrevention Committee of the Council on Clinical Cardiology‚Council. Guide to the assessment of physical activity: clinical and research applications: a scientific statement from the American Heart Association. Circulation. 2013;128(20):2259-2279. [CrossRef] [Medline]
- Craig CL, Marshall AL, Sjöström M, Bauman AE, Booth ML, Ainsworth BE, et al. International physical activity questionnaire: 12-country reliability and validity. Med Sci Sports Exerc. 2003;35(8):1381-1395. [CrossRef] [Medline]
- Wendel-Vos GCW, Schuit AJ, Saris WHM, Kromhout D. Reproducibility and relative validity of the short questionnaire to assess health-enhancing physical activity. J Clin Epidemiol. 2003;56(12):1163-1169. [CrossRef] [Medline]
- de Hollander EL, Zwart L, de Vries SI, Wendel-Vos W. The SQUASH was a more valid tool than the OBiN for categorizing adults according to the Dutch physical activity and the combined guideline. J Clin Epidemiol. 2012;65(1):73-81. [CrossRef] [Medline]
- Helmerhorst HHJ, Brage S, Warren J, Besson H, Ekelund U. A systematic review of reliability and objective criterion-related validity of physical activity questionnaires. Int J Behav Nutr Phys Act. 2012;9:103. [FREE Full text] [CrossRef] [Medline]
- Sallis JF, Saelens BE. Assessment of physical activity by self-report: status, limitations, and future directions. Res Q Exerc Sport. 2000;71 Suppl 2:1-14. [CrossRef] [Medline]
- Bauman A, Bull F, Chey T, Craig CL, Ainsworth BE, Sallis JF, et al. The international prevalence study on physical activity: results from 20 countries. Int J Behav Nutr Phys Act. 2009;6:21. [FREE Full text] [CrossRef] [Medline]
- Farrell L, Hollingsworth B, Propper C, Shields MA. The socioeconomic gradient in physical inactivity: evidence from one million adults in England. Soc Sci Med. 2014;123:55-63. [CrossRef] [Medline]
- Kahneman D, Krueger AB, Schkade DA, Schwarz N, Stone AA. A survey method for characterizing daily life experience: the day reconstruction method. Science. 2004;306(5702):1776-1780. [CrossRef] [Medline]
- Elevelt A, Bernasco W, Lugtig P, Ruiter S, Toepoel V. Where You at? Using GPS locations in an electronic time use diary study to derive functional locations. Soc Sci Comput Rev. 2019;39(4):509-526. [CrossRef]
- Schmidt T. Consumers' Recording behaviour in payment diaries—empirical evidence from Germany. Surv Methods Insights Field SMIF. 2014. [CrossRef]
- Jäckle A, Wenz A, Burton J, Couper M. Increasing participation in a mobile app study: the effects of a sequential mixed-mode design and in-interview invitation. J Surv Stat Methodol. 2022;10(4):922. [CrossRef]
- Lauderdale DS, Philip Schumm L, Kurina LM, McClintock M, Thisted RA, Chen J, et al. Assessment of sleep in the national social life, health, and aging project. J Gerontol B Psychol Sci Soc Sci. 2014;69 Suppl 2(Suppl 2):S125-S133. [FREE Full text] [CrossRef] [Medline]
- Chaturvedi RR, Angrisani M, Troxel WM, Gutsche T, Ortega E, Jain M, et al. American life in realtime: a benchmark registry of health data for equitable precision health. Nat Med. 2023;29(2):283-286. [FREE Full text] [CrossRef] [Medline]
- Doherty A, Jackson D, Hammerla N, Plötz T, Olivier P, Granat MH, et al. Large scale population assessment of physical activity using wrist worn accelerometers: the UK biobank study. PLoS One. 2017;12(2):e0169649. [FREE Full text] [CrossRef] [Medline]
- All of Us Research Program Investigators, Denny JC, Rutter JL, Goldstein DB, Philippakis A, Smoller JW, et al. The "All of Us" Research Program. N Engl J Med. 2019;381(7):668-676. [FREE Full text] [CrossRef] [Medline]
- Kapteyn A, Banks J, Hamer M, Smith JP, Steptoe A, van Soest A, et al. What they say and what they do: comparing physical activity across the USA, England and the Netherlands. J Epidemiol Community Health. 2018;72(6):471-476. [FREE Full text] [CrossRef] [Medline]
- Montoye AHK, Pivarnik JM, Mudd LM, Biswas S, Pfeiffer KA. Validation and comparison of accelerometers worn on the hip, thigh, and wrists for measuring physical activity and sedentary behavior. AIMS Public Health. 2016;3(2):298-312. [FREE Full text] [CrossRef] [Medline]
- Schneller M, Bentsen P, Nielsen G, Brønd JC, Ried-Larsen M, Mygind E, et al. Measuring children's physical activity: compliance using skin-taped accelerometers. Med Sci Sports Exerc. 2017;49(6):1261-1269. [CrossRef] [Medline]
- Vogels E. About one-in-five Americans use a smart watch or fitness tracker. Pew Research Center. 2020. URL: https://www.pewresearch.org/fact-tank/2020/01/09/about-one-in-five-americans-use-a-smart-watch-or-fitness-tracker/ [accessed 2021-07-30]
- Guyer H, Moakley M, Keusch F, Struminskaya B. Provide or bring your own wearable device? An assessment of compliance, adherence, and representation in a national study. 2023. Presented at: BigSurv23 (Big Data Meets Survey Science); October 26-29, 2023; Quito, Ecuador.
- Boeschoten L, Mendrik A, van der Veen E, Vloothuis J, Hu H, Voorvaart R, et al. Privacy-preserving local analysis of digital trace data: a proof-of-concept. Patterns (N Y). 2022;3(3):100444. [FREE Full text] [CrossRef] [Medline]
- Loecherbach F, Otto L. Introduction to the special issue on participant-centered behavioral traces. Comput Commun Res. 2024;6(2):1. [CrossRef]
- Bietz M, Patrick K, Bloss C. Data donation as a model for citizen science health research. Citiz Sci Theory Pract. 2019;4(1). [CrossRef]
- Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation) (Text with EEA relevance). European Union. 2016. URL: http://data.europa.eu/eli/reg/2016/679/oj/eng [accessed 2023-03-22]
- California Consumer Privacy Act of 2018 [1798.100-1798.199.100]. California State Legislature. 2018. URL: https://leginfo.legislature.ca.gov/faces/codes_displayText.xhtml?division=3.&part=4.&lawCode=CIV&title=1.81.5 [accessed 2024-06-03]
- Boeschoten L, Voorvaart R, Van Den Goorbergh R, Kaandorp C, De Vos M. Automatic de-identification of data download packages. Data Sci. 2021;4(2):101-120. [CrossRef]
- Bruns A. After the ‘APIcalypse’: social media platforms and their fight against critical scholarly research. Inform Commun Soc. 2019;22(11):1544-1566. [CrossRef]
- Calma J. Scientists say they can't rely on Twitter anymore. The Verge. 2023. URL: https://www.theverge.com/2023/5/31/23739084/twitter-elon-musk-api-policy-chilling-academic-research [accessed 2023-08-06]
- Ledford H. Researchers scramble as Twitter plans to end free data access. Nature. 2023;614(7949):602-603. [CrossRef] [Medline]
- Sarraf I. Briefing: third-party Reddit app to shut down. The Information. 2023. URL: https://www.theinformation.com/briefings/third-party-reddit-app-apollo-to-shut-down [accessed 2023-08-06]
- Silver L, Smith A, Johnson C, Jiang J, Anderson M, Rainie L. Mobile connectivity in emerging economies. PEW Research Center. 2019. URL: https://www.pewresearch.org/internet/2019/03/07/mobile-connectivity-in-emerging-economies/ [accessed 2020-07-22]
- Digital economy and society: ICT usage in households and by individuals. Eurostat. 2024. URL: https://ec.europa.eu/eurostat/statistics-explained/index.php?title=Digital_economy_and_society_statistics_-_households_and_individuals [accessed 2025-09-06]
- Mobile fact sheet. Pew Research Center. 2024. URL: https://www.pewresearch.org/internet/fact-sheet/mobile/ [accessed 2025-02-11]
- Ohme J, Araujo T, Boeschoten L, Freelon D, Ram N, Reeves BB, et al. Digital trace data collection for social media effects research: APIs, data donation, and (Screen) tracking. Commun Methods Measures. 2023;18(2):124-141. [CrossRef]
- Araujo T, Ausloos J, Atteveldt WV, Loecherbach F, Moeller J, Ohme J, et al. OSD2F: an open-source data donation framework. Comput Commun Res. 2022;4(2):372-387. [CrossRef]
- Boeschoten L, de Schipper NC, Mendrik AM, van der Veen E, Struminskaya B, Janssen H, et al. Port: a software tool for digital data donation. JOSS. 2023;8(90):5596. [CrossRef]
- Halavais A. Overcoming terms of service: a proposal for ethical distributed research. Inform Commun Soc. 2019;22(11):1567-1581. [CrossRef]
- Davis FD. A technology acceptance model for empirically testing new end-user information systems: theory and results. In: MIT Sloan School of Management. Cambridge, MA. Massachusetts Institute of Technology; 1986.
- Davis FD. Perceived usefulness, perceived ease of use, and user acceptance of information technology. MIS Q. 1989;13(3):319. [CrossRef]
- Gefen, Karahanna, Straub. Trust and TAM in online shopping: an integrated model. MIS Q. 2003;27(1):51-90. [CrossRef]
- Wenz A, Keusch F. Increasing the acceptance of smartphone-based data collection. Public Opin Q. 2023;87(2):357-388. [FREE Full text] [CrossRef] [Medline]
- Boeschoten L, Ausloos J, Möller J, Araujo T, Oberski D. A framework for privacy preserving digital trace data collection through data donation. Comput Commun Res. 2022;4(2):423. [CrossRef]
- van Driel II, Giachanou A, Pouwels JL, Boeschoten L, Beyens I, Valkenburg PM. Promises and pitfalls of social media data donations. Commun Methods Measures. 2022;16(4):266-282. [CrossRef]
- Carrière TC, Boeschoten L, Struminskaya B, Janssen HL, de Schipper NC, Araujo T. Best practices for studies using digital data donation. Qual Quant. 2025;59(Suppl 1):389-412. [CrossRef] [Medline]
- Keusch F, Pankowska PK, Cernat A, Bach RL. Do you have two minutes to talk about your data? Willingness to participate and nonparticipation bias in Facebook data donation. Field Methods. 2024;36(4):279-293. [CrossRef]
- Breuer J, Kmetty Z, Haim M, Stier S. User-centric approaches for collecting Facebook data in the ‘post-API age’: experiences from two studies and recommendations for future research. Inform Commun Soc. 2022;26(14):2649-2668. [CrossRef]
- Kmetty Z, Németh R. Which is your favorite music genre? A validity comparison of Facebook data and survey data. Bull Sociol Methodol. 2022;154(1):82-104. [CrossRef]
- Hase V, Struminskaya B, Araujo T, Boeschoten L, Ozornina N, Lechner M, et al. Why Do People Self-Select Out of Data Donation Studies? Cross-National Insights from Germany and the Netherlands. Amsterdam, Netherlands. University of Oxford; 2024.
- Razi A, Alsoubai A, Kim S, Naher N, Ali S, Stringhini G, et al. Instagram data donation: a case study on collecting ecologically valid social media data for the purpose of adolescent online risk detection. New York, NY. Association for Computing Machinery; 2022. Presented at: Extended Abstracts of the 2022 CHI Conference on Human Factors in Computing Systems; April 29, 2022:1-9; New Orleans, LA. [CrossRef]
- Kohne J, Montag C. ChatDashboard: a framework to collect, link, and process donated WhatsApp chat log data. Behav Res Methods. 2024;56(4):3658-3684. [FREE Full text] [CrossRef] [Medline]
- Corten R, Boeschoten L, Carrière T, Jongerius S, Struminskaya B, Mulder J, et al. Assessing mobile instant messenger networks with donated data. OSF. 2024. [CrossRef]
- van Deursen A, van Dijk J. Internet skills and the digital divide. New Media Soc. 2010;13(6):893-911. [CrossRef]
- van DJ. The Digital Divide. Hoboken, NJ. John Wiley & Sons; 2020.
- Hargittai E, Dobransky K. Old dogs, new clicks: digital inequality in skills and uses among older adults. Can J Commun. 2017;42(2):195-212. [CrossRef]
- Faverio M. Share of those 65 and older who are tech users has grown in the past decade. Pew Research Center. 2022. URL: https://www.pewresearch.org/fact-tank/2022/01/13/share-of-those-65-and-older-who-are-tech-users-has-grown-in-the-past-decade/ [accessed 2022-10-07]
- Hase V, Haim M. Can we get rid of bias? Mitigating systematic error in data donation studies through survey design strategies. Comput Commun Res. 2024;6(2):1-29. [CrossRef]
- Toepoel V, Luiten A, Zandvliet R. Response, willingness, and data donation in a study on accelerometer possession in the general population. Surv Pract. 2021;14(1):1-19. [CrossRef]
- Kompier M, Elevelt A, Luiten A, Mulder J, Toepoel V. Data donation of personal physical activity trackers. Surv Pract. 2024;17(1):1-25. [CrossRef]
- Silber H, Breuer J, Beuthner C, Gummer T, Keusch F, Siegers P, et al. Linking surveys and digital trace data: insights from two studies on determinants of data sharing behaviour. J R Stat Soc Ser A Stat Soc. 2022;185(Suppl 2):S387-S407. [CrossRef]
- Ohme J, Araujo T, de Vreese CH, Piotrowski JT. Mobile data donations: assessing self-report accuracy and sample biases with the iOS Screen Time function. Mobile Media Commun. 2020;9(2):293-313. [CrossRef]
- Pfiffner N, Friemel TN. Leveraging data donations for communication research: exploring drivers behind the willingness to donate. Commun Methods Measures. 2023;17(3):227-249. [CrossRef]
- Keusch F, Struminskaya B, Antoun C, Couper MP, Kreuter F. Willingness to participate in passive mobile data collection. Public Opin Q. 2019;83(Suppl 1):210-235. [FREE Full text] [CrossRef] [Medline]
- Struminskaya B, Toepoel V, Lugtig P, Haan M, Luiten A, Schouten B. Understanding willingness to share smartphone-sensor data. Public Opin Q. 2020;84(3):725-759. [FREE Full text] [CrossRef] [Medline]
- Gomez Ortega A, Bourgeois J, Hutiri WT, Kortuem G. Beyond data transactions: a framework for meaningfully informed data donation. AI Soc. 2023;40(2):1-18. [CrossRef]
- Tourangeau R. Presidential address: paradoxes of nonresponse. Public Opin Q. 2017;81(3):803-814. [CrossRef]
- Groves RM, Peytcheva E. The impact of nonresponse rates on nonresponse bias: a meta-analysis. Public Opin Q. 2008;72(2):167-189. [CrossRef]
- Stamatakis E, Owen KB, Shepherd L, Drayton B, Hamer M, Bauman AE. Is cohort representativeness passé? Poststratified associations of lifestyle risk factors with mortality in the UK biobank. Epidemiology. 2021;32(2):179-188. [FREE Full text] [CrossRef] [Medline]
- Davis KAS, Coleman JRI, Adams M, Allen N, Breen G, Cullen B, et al. Mental health in UK biobank—development, implementation and results from an online questionnaire completed by 157 366 participants: a reanalysis. BJPsych Open. 2020;6(2):e18. [FREE Full text] [CrossRef] [Medline]
- Fry A, Littlejohns T, Sudlow C, Doherty N, Adamska L, Sprosen T, et al. Comparison of sociodemographic and health-related characteristics of UK biobank participants with those of the general population. Am J Epidemiol. 2017;186(9):1026-1034. [FREE Full text] [CrossRef] [Medline]
- Andreeva VA, Salanave B, Castetbon K, Deschamps V, Vernay M, Kesse-Guyot E, et al. Comparison of the sociodemographic characteristics of the large nutriNet-santé e-cohort with French Census data: the issue of volunteer bias revisited. J Epidemiol Community Health. 2015;69(9):893-898. [CrossRef] [Medline]
- Brown WJ, Bryson L, Byles JE, Dobson AJ, Lee C, Mishra G, et al. Women's health Australia: recruitment for a national longitudinal cohort study. Women Health. 1998;28(1):23-40. [CrossRef] [Medline]
- Couper MP, Jäckle A. Participation and selection bias in the SHARE Accelerometer Study (SAS). 2023. URL: https://share-eric.eu/fileadmin/user_upload/SHARE_Working_Paper/SHARE_WP_92-2024.pdf [accessed 2025-09-06]
- Scherpenzeel A, Das M. "True" longitudinal and probability-based internet panels: evidence from the Netherlands. In: Social and Behavioral Research and the Internet. Oxfordshire. Routledge/Taylor & Francis Group; 2011:77-104.
- Methodology. LISS Panel. URL: https://www.lissdata.nl/methodology [accessed 2024-09-09]
- Struminskaya B, Keusch F. Health app data donation and physical activity of older adults. LISS Data Archive. 2024. [CrossRef]
- R: a language and environment for satistical computing. Core Team. 2024. URL: https://www.R-project.org/ [accessed 2025-09-06]
- Keusch F. Fkeusch0. 2024. URL: https://github.com/fkeusch01/PA_data_donation [accessed 2024-12-09]
- Adult BMI categories. CDC BMI. 2024. URL: https://www.cdc.gov/bmi/adult-calculator/bmi-categories.html [accessed 2025-06-27]
- Leeper T, Arnold J, Arel-Bundock V, Long J, Bolker B. Margins: marginal effects for model objects. 2024. URL: https://cran.r-project.org/web/packages/margins/ [accessed 2025-08-07]
- Keusch F, Bähr S, Haas G, Kreuter F, Trappmann M. Coverage error in data collection combining mobile surveys with passive measurement using apps: data from a German national survey. Sociol Methods Res. 2020;52(2):841-878. [CrossRef]
- Struminskaya B, Lugtig P, Toepoel V, Schouten B, Giesen D, Dolmans R. Sharing data collected with smartphone sensors: willingness, participation, and nonparticipation bias. Public Opin Q. 2021;85(Suppl 1):423-462. [FREE Full text] [CrossRef] [Medline]
- Couper MP, Gremel G, Axinn W, Guyer H, Wagner J, West BT. New options for national population surveys: the implications of internet and smartphone coverage. Soc Sci Res. 2018;73:221-235. [FREE Full text] [CrossRef] [Medline]
- McCool D, Lugtig P, Mussmann O, Schouten B. An app-assisted travel survey in official statistics: possibilities and challenges. J Off Stat. 2021;37(1):149-170. [FREE Full text] [CrossRef]
- Keusch F, Bähr S, Haas G, Kreuter F, Trappmann M, Eckman S. Non-participation in smartphone data collection using research apps. J R Stat Soc Ser A Stat Soc. 2022;185(S2):S225-S245. [CrossRef]
- Struminskaya B. Willingness and nonparticipation biases in data donation. Utrecht, The Netherlands. 2022. URL: https://odissei-data.nl/en/2022/09/odissei-conference-for-social-science-in-the-netherlands-2022/ [accessed 2025-09-06]
- Groot Kormelink T, Houwing F, Struminskaya B, Boeschoten L, de Schipper N, Welbers K. Meaningful informed consent? How participants experience and understand data donation. Inf Commun Soc. 2025:1-18. [CrossRef]
- Salvatore C, Biffignandi S, Sakshaug J, Wiśniowski A, Struminskaya B. Bayesian integration of probability and nonprobability samples for logistic regression. J Surv Stat Methodol. 2024;12(2):492. [CrossRef]
- Cernat A, Keusch F, Bach RL, Pankowska PK. Estimating measurement quality in digital trace data and surveys using the multitrait multimethod model. Soc Sci Comput Rev. 2024. [CrossRef]
- Kmetty Z, Stefkovics Á, Számely J, Deng D, Kellner A, Pauló E, et al. Determinants of willingness to donate data from social media platforms. Inform Commun Soc. 2025;28(7):1324-1349. [CrossRef]
- Janssen H, Carrière T, Ausloos J, Struminskaya B, Boeschoten L. Interpretive Entrepreneurship Shaping Data Access Under the GDPR: An Empirical Analysis of Platform Data Download Packages' Deficiencies Under the GDPR. Rochester, NY. Social Science Research Network; 2025.
- Hase V, Ausloos J, Boeschoten L, Pfiffner N, Janssen H, Araujo T. Fulfilling data access obligations: how could (and should) platforms facilitate data donation studies? Internet Policy Review. 2024. URL: https://policyreview.info/articles/analysis/fulfilling-data-access-obligations [accessed 2024-10-31]
Abbreviations
| AME: average marginal effect |
| DDP: Data Download Package |
| LISS: Longitudinal Internet studies for the Social Sciences |
| PA: physical activity |
| RQ: research question |
| TAM: Technology Acceptance Model |
Edited by J Sarvestan; submitted 08.Dec.2024; peer-reviewed by G-C Haas, B Schouten, M Doerr; comments to author 30.Apr.2025; revised version received 30.Jun.2025; accepted 21.Aug.2025; published 26.Sep.2025.
Copyright©Florian Keusch, Bella Struminskaya, Joris Mulder, Stein Jongerius. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 26.Sep.2025.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research (ISSN 1438-8871), is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included.



