Published in Vol 23, No 10 (2021): October

Understanding Uptake of Digital Health Products: Methodology Tutorial for a Discrete Choice Experiment Using the Bayesian Efficient Design



1Behavioural and Implementation Science Group, School of Health Sciences, University of East Anglia, Norwich, United Kingdom

2Norwich Medical School, University of East Anglia, Norwich, United Kingdom

3National Institute for Health Research, Applied Research Collaboration East of England, Cambridge, United Kingdom

4Department of Behavioural Science and Health, University College London, London, United Kingdom

5SPECTRUM Consortium, London, United Kingdom

Corresponding Author:

Dorothy Szinay, MSc

Behavioural and Implementation Science Group

School of Health Sciences

University of East Anglia

Norwich Research Park Earlham Road

Norwich, NR4 7TJ

United Kingdom

Phone: 44 1603593064


Understanding the preferences of potential users of digital health products is beneficial for digital health policy and planning. Stated preference methods can elicit individuals’ preferences in the absence of observational data. A discrete choice experiment (DCE) is a commonly used stated preference method: a quantitative methodology built on the premise that, when making a decision, individuals make trade-offs and choose the alternative of a product or service that offers them the greatest utility, or benefit. This methodology is widely used in health economics in situations in which revealed preferences are difficult to collect but is much less used in the field of digital health. This paper outlines the stages involved in developing a DCE. As a case study, it uses a DCE designed to elicit preferences relevant to the uptake of smoking cessation apps. It describes the establishment of attributes, the construction of choice tasks of 2 or more alternatives, and the development of the experimental design. This tutorial offers a guide for researchers with no prior knowledge of this research technique.

J Med Internet Res 2021;23(10):e32365



Understanding how the public values different aspects of digital health tools, such as smoking cessation or physical activity apps, can help providers of the tools to identify functionality that is important to users, which may improve uptake (ie, selection, download, and installation of apps) [1]. This is important because uptake of digital tools is generally low. More information regarding the preferences of users when selecting a digital health tool, for example via an app store, may allow providers to present their products in such a way that may increase their uptake. However, pragmatic challenges, such as examining how each potentially modifiable aspect of a digital health product (eg, presentation, design, and features that it offers) or intervention design will impact preference or the choice of uptake, often mean this is not feasible or practical [2]. Therefore, increasing attention is being paid toward stated preference methods to understand preferences when designing digital health products and services, with examples including COVID-tracing apps [3,4], sun protection apps to prevent skin cancer [5], and the uptake of health apps in general [6].

Stated preference methods are survey-based methods aiming to elicit individuals’ preferences toward a specific behavior, particularly those that are not well understood. The most widely used type of stated preference method is the discrete choice experiment (DCE) [7]. According to Spinks et al [8], Louviere and Hensher (1982) and Louviere and Woodworth (1983) originally developed DCEs to study the marketing and economics of transport, and the fields of psychology and economics have profoundly influenced the DCE methodology since it was developed. In recent years, DCEs have been increasingly used in health and health care settings [9,10], as well as in addiction research [11] and digital health [4-6]. The increasing number of DCEs in digital health highlights their potential, although they are currently underused.

A DCE differs from other stated preference methods in the way responses are elicited [12]. A DCE uses a survey-based experimental design in which participants are presented with a series of hypothetical scenarios, known as choice tasks. To mimic real-world decision making, in each choice task participants must choose a product or a service from two or more options, known as alternatives [13]. Each alternative consists of a set of characteristics, known as attributes, each taking at least two values, known as attribute levels [13]. Participants are asked to choose their preferred alternative in each choice task, which allows researchers to quantify the relative strength of preferences for improvements in particular attributes [8,14].

The outputs from statistical models developed using DCE data can be beneficial for estimating uptake of new products or services, including digital health tools, where observational data are not available or are difficult to obtain otherwise [15,16]. Lack of observational data often implies a requirement to seek scientific views and comments from experts in order to generate predictions of a target behavior [17]. However, DCEs can provide an empirical alternative to expert opinions, while accounting for possible interactions between attributes (eg, design of a product and brand name), which are otherwise often ignored [18].

In our research, we wanted to understand how to present health apps on curated health app portals to increase their uptake. This paper describes the development of a DCE in digital health that aims to elicit potential user preferences on smoking cessation app uptake. It explains how the attributes and their levels are selected and describes the construction of choice tasks and the experimental design. The study protocol of the research this paper is based on is registered on the Open Science Framework [19].

The development of a DCE should follow published recommendations, including the checklist for good research practices [9], guides on the development of a DCE [13,20], recommendations on how to construct the experimental design [7,20-23], and which statistical methods can be used [24].

Establishing Attributes

An important step in designing a DCE is the identification of the relevant attributes for the subject matter. Attributes in a DCE can be quantitative, such as cost, or qualitative, such as the design of a product [25]. The identification of attributes is typically based on primary and secondary data collection to ensure that the DCE is tailored to the study setting [13]. It should ideally commence with a literature review that informs qualitative research to identify relevant attributes [26]. Although there is no set limit on the number of attributes that can be included in a DCE, to keep participants’ cognitive load manageable, it should be fewer than 10 [13], with 5-7 attributes typically included [27].

Our DCE was based on a comprehensive systematic review investigating factors influencing the uptake and engagement with health and well-being smartphone apps [28] and a qualitative research component that consisted of a think-aloud and interview study to examine further the previously identified factors or attributes [29]. The importance of qualitative research lies in ensuring inclusion of attributes that are relevant to most participants [25]. Of the 14 factors initially identified as being relevant for the uptake of health and well-being apps, 5 were retained and included in the DCE: the monthly price of the app, who developed the app, the star ratings of the app, the description of the app, and images shown. These factors were chosen due to their perceived importance during our previous qualitative research and for pragmatic reasons, including how easily measurable and presentable they were within the DCE.

Careful identification of the attributes relevant to the subject matter is also central to ensuring the content validity of the instrument. Following administration of the survey, methods are available to measure and assess content validity, although their use is not widely reported [30].

Establishing Attribute Levels

The next step is to establish attribute levels. The levels of an attribute must span a range wide enough to ensure trade-offs between attributes. A trade-off is an exchange in which a participant gives up some amount of one attribute to gain more of another. It has been suggested that increasing the number of levels of an attribute increases the relative importance of that attribute [31] and that imbalance in the numbers of levels across attributes raises the importance of the attributes with more levels [32]. Yang et al [32] suggested that a balance must be struck between simpler designs with fewer levels, which reduce respondent burden (and consequently measurement error) and are useful for identifying attribute rankings, and more complex designs with more levels, which offer higher statistical precision and are more sensitive for identifying trade-offs between attributes. Based on this, and on commonly adopted practice in the field, we aimed to include at least three levels for each attribute.

If the range is not suitable, participants might consider the differences between levels unimportant [25]. For example, the difference between star ratings of 4.8 and 4.7 for a smoking cessation app is not as salient as the difference between 4.8 and 4. In our research, a survey with 34 participants was conducted to refine the attribute levels. The survey helped us specify the levels of the two attributes we were unsure of (the monthly price of the app and the star ratings) over a sufficiently wide range that the differences between levels would be likely to affect responses. When a range is not wide enough, there is a risk that participants will ignore the attribute because they judge the difference between levels to be insignificant [20]. See Figure 1 for the final list of attributes and levels included in our DCE.

Figure 1. Attributes and attribute levels in our DCE. DCE: discrete choice experiment.

Choice Tasks

Once the attributes and their levels are identified, the decision to develop full- or partial-profile tasks with or without an opt-out option needs to be made. A full profile refers to the display of all five attributes in both alternatives in each choice set. A partial profile DCE will not present certain attributes for certain alternatives. For example, if a DCE is used to investigate the trade-off between a higher number of attributes (eg, a total of nine attributes), it could be beneficial to limit the number of attributes shown at one time (eg, five attributes) to limit participant cognitive load. Five attributes are generally considered low enough to complete a full-profile choice task, which consequently maximizes the information about trade-offs [33]. Hence, in our research, we applied a full-profile DCE.

A neutral option (“Neither of these 2”), known as an opt-out alternative, was included in addition to the two app alternatives. An opt-out option has the potential to make the choices more realistic [34] by simulating a real-world context in which individuals can exercise their right not to take up an app, given the apps on offer [20]. In our DCE, a participant therefore had the option to choose or reject the hypothetical uptake of a smoking cessation app. However, when a participant selects the opt-out option, no information is provided on how they trade off attribute levels or alternatives [13]. In some situations, a forced-choice scenario can be included, in which participants who choose the opt-out option are prompted to make a choice regardless. An example of a scenario with an opt-out option is shown in Figure 2.

Figure 2. An example of a scenario with an opt-out option used in our DCE. DCE: discrete choice experiment.

Experimental Design

An experimental design is a systematic method of generating choice sets that are presented to respondents. This enables the specification of the choice sets that respondents see, with the objective of obtaining a high-quality data set [7]. When creating the experimental design, there are several aspects that need to be taken into consideration, including (1) the analytical model specification, (2) whether the aim is to estimate main effects only or interaction effects as well, (3) whether the design is labeled or unlabeled, (4) the number of choice tasks and blocking options to be used, (5) which type of design of the choice matrix to use (eg, full factorial or fractional factorial, orthogonal or efficient), and (6) how the attribute-level balance will be achieved. These are now considered.

Analytical Model Specification

The first step in the generation of an experimental design is to specify the analytical model to estimate the parameters of the DCE. This step is an important component of choosing the type of choice matrix design, described later in this paper. The approach selected here needs to be accounted for when generating the structure of the experimental design.

A discrete choice model describes the probability that an individual will choose a specific alternative, expressed as a function of the measured attribute levels of that alternative and of the characteristics of the individual making the choice. In the statistical model, the dependent variable (the choice variable) records the choice made by each participant [8], and the attributes are the independent variables [8,13].

As part of the analytical model specification, knowing what type of statistical analysis will be used is key. Data analysis involves regression modeling in a random utility framework [8]. The random utility model is conventionally combined with the Lancaster theory of consumer demand [35]; together, these assume that individuals make trade-offs when making a decision and choose the option that offers the greatest utility [36], determined by how much importance they place on the attributes associated with the product [37].

The multinomial logit (MNL) model has been described as the “workhorse” of DCE estimation [38,39], and it typically serves as a starting point for basic model estimation (although alternative models, such as probit, may be used). It is important to note that the MNL model rests on restrictive assumptions, for example, independence of irrelevant alternatives, homogeneity of preferences, and independence of observed choices [40,41]. Extensions of the MNL model (eg, nested logit, mixed logit, and latent class models) may be used to relax these limitations [39,40].
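As a minimal illustration of this starting point (a hypothetical sketch, not output from our study), the MNL model turns the deterministic utilities of the alternatives in a choice set into choice probabilities through a logit (softmax) transformation:

```python
from math import exp

def mnl_choice_probabilities(utilities):
    """Multinomial logit: P(j) = exp(V_j) / sum_k exp(V_k)."""
    m = max(utilities)                    # subtract the max for numerical stability
    e = [exp(v - m) for v in utilities]
    total = sum(e)
    return [x / total for x in e]

# Hypothetical deterministic utilities for app A, app B, and an opt-out
probs = mnl_choice_probabilities([0.8, 0.3, 0.0])
```

The alternative with the highest utility receives the highest choice probability, but, reflecting the random error term, the other alternatives retain nonzero probabilities.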

Based on the model specified in our DCE, the underlying utility function for alternative j [38] is shown in Textbox 1.


Uj = (βcost × Xjcost) + (βdeveloper × Xjdeveloper) + (βratings × Xjratings) + (βdescription × Xjdescription) + (βimages × Xjimages) + ε


1) Uj is the overall utility derived from alternative j.

2) β is the coefficient attached to Xj estimated in the analysis and represents the part-worth utility attached to each attribute level.

3) ε is the random error of the model—in other words, the unmeasured factors influencing the variation of preferences.

Textbox 1. The utility function used in our DCE research. DCE: discrete choice experiment.
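To make the utility function concrete, the following sketch evaluates Uj for one alternative by summing the part-worth utilities of its attribute levels (the coefficient values and level names are hypothetical, not estimates from our study; the random error ε is omitted):

```python
# Hypothetical part-worth utilities (beta coefficients) for selected attribute levels
part_worths = {
    ("cost", "free"): 0.9,
    ("cost", "monthly_fee"): -0.4,
    ("developer", "health_organization"): 0.6,
    ("developer", "unknown"): -0.2,
    ("ratings", "5_stars"): 0.7,
    ("ratings", "3_stars"): -0.1,
}

def utility(alternative):
    """Deterministic utility Uj: sum of part-worths for the alternative's levels."""
    return sum(part_worths[(attribute, level)]
               for attribute, level in alternative.items())

app = {"cost": "free", "developer": "health_organization", "ratings": "5_stars"}
u = utility(app)   # 0.9 + 0.6 + 0.7 = 2.2
```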
Main Effects or Interaction Effects

The next step in model specification is deciding whether main effects or interaction effects will be investigated. Main effects, the most commonly estimated, capture the effect of each attribute level on the choice variable. An interaction effect refers to the effect on the choice variable of combining two or more attribute levels (eg, app developer and the app's monthly cost) [13]. In our DCE, given the novel nature of research on the uptake of health apps and the lack of empirical evidence to suggest potential interactions between attributes, we decided to examine main effects only.

Labeled or Unlabeled Experiment

In a labeled experiment, the alternatives are specific and different (eg, a smartphone app-based smoking cessation intervention vs a website-based smoking cessation intervention), and alternative-specific attributes can be used (eg, some attributes relevant only to apps and others only to websites). This contrasts with an unlabeled experimental design, in which the alternatives are unspecified (eg, smoking cessation app alternative 1 vs smoking cessation app alternative 2) and must share the same attributes. Given that a DCE model estimates parameters for each of the alternatives being considered, these alternative-specific parameters must be included in the structure of the experimental design (described in the next section) in a labeled experiment; in an unlabeled experiment, because alternative-specific parameters are arbitrary, they are excluded [22,42,43]. In health economics, the unlabeled approach is the most common. It was also the logical choice for our DCE because we were comparing different presentations of the same app.

Generation of the Structure of the Experimental Design

Once the model is specified, the structure of the experimental design can be generated. For this stage, hypothetical alternatives are generated and combined to form choice tasks, based on the chosen attributes and their levels. Several different software packages may be used to generate the experimental design of a DCE, such as Ngene, SAS, SPEED, SPSS, and Sawtooth. For our DCE, Ngene software was used [44].

Number of Choice Tasks and Blocking

The next step in the generation of an experimental design is to decide on the number of choice tasks and on blocking. To minimize respondent cognitive burden and the risk of participants losing interest during the DCE task, consideration must be paid to the target population, the number of tasks, and their complexity [13]. The higher the number of attributes, alternatives, and choice tasks, the higher the task complexity [20]. The literature suggests that a feasible limit is 18 choice sets per participant [45,46]. In the review by Marshall et al [27], most studies included between 7 and 16 choice sets. In our DCE, we administered 12 choice tasks per participant, which was deemed low enough to avoid excessive cognitive load but high enough to achieve sufficient statistical precision.

We developed 48 choice tasks and blocked them into 4 survey versions (12 choice tasks for each). Each block represented a separate survey, and participants were randomly assigned to one of the four survey versions. Blocking is a technique widely used in DCEs to reduce cognitive burden by partitioning large experimental designs into subsets of equal size, thereby reducing the number of choice tasks that any one respondent is required to complete [47]. Blocks were generated in Ngene software, which allows for the minimization of the average correlation between the versions and attributes’ levels [48]. For the blocking to be successful, the number of choice tasks included in one block must be divisible by the number of attribute levels; in our DCE, attributes had either three or four levels.
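Ngene additionally minimizes correlation between versions when forming blocks; the simplified, hypothetical sketch below shows only the basic mechanics of partitioning 48 choice tasks into 4 equal survey versions and randomly assigning a respondent to one of them:

```python
import random

def make_blocks(tasks, n_blocks):
    """Partition a list of choice tasks into equal-sized blocks (survey versions)."""
    if len(tasks) % n_blocks:
        raise ValueError("number of tasks must divide evenly into blocks")
    size = len(tasks) // n_blocks
    return [tasks[i * size:(i + 1) * size] for i in range(n_blocks)]

blocks = make_blocks(list(range(48)), n_blocks=4)   # 4 versions of 12 tasks each
respondent_version = random.choice(blocks)          # random assignment of one respondent
```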

It is noteworthy that, to undertake the sample size calculation, it is crucial to know the number of alternatives per choice set, the largest number of levels of any attribute (for DCEs examining main effects only) or the largest product of the numbers of levels of any two attributes (for DCEs examining interaction effects), and the number of blocks [38]. Therefore, DCEs using blocking require a larger sample size [47].
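One widely cited rule of thumb for a minimum sample size, often attributed to Orme, uses exactly these quantities: N ≥ 500c/(ta), where c is the largest number of levels of any attribute, t is the number of choice tasks per respondent, and a is the number of alternatives. The sketch below (our illustration, not a formula from the study protocol) also multiplies by the number of blocks so that each survey version is adequately represented, which is a common but not universal adjustment:

```python
from math import ceil

def rule_of_thumb_sample_size(largest_levels, tasks_per_respondent,
                              n_alternatives, n_blocks=1):
    """Orme-style minimum sample size: N >= 500 * c / (t * a),
    scaled by the number of blocks (each respondent sees only one block)."""
    per_version = ceil(500 * largest_levels
                       / (tasks_per_respondent * n_alternatives))
    return per_version * n_blocks

# Our DCE: largest attribute has 4 levels, 12 tasks per respondent,
# 2 app alternatives per task, 4 blocks
n_min = rule_of_thumb_sample_size(4, 12, 2, n_blocks=4)
```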

Type of Choice Matrix Design

Depending on the number of attributes and their levels, a full- or fractional-factorial design can be applied. A full-factorial design includes all possible combinations of the attributes’ levels and allows the estimation of all main effects and interaction effects independent of one another [20]. However, this type of design is often considered impractical due to the high number of choice tasks required [20]. To illustrate this, the number of possible unique choice alternatives in a full-factorial design is L^A, where L is the number of levels and A the number of attributes [39]. If the attributes in the DCE have different numbers of levels, the counts are calculated separately and multiplied together. To reduce response burden, in our DCE, we generated a fractional-factorial design in Ngene [44], representing a sample of possible alternatives from the full-factorial design. In this way, we were able to reduce the 432 alternatives of the full design (given by 4^2 × 3^3 for our two 4-level and three 3-level attributes) to a fractional sample of 96 alternatives, arranged in 48 choice pairs.
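The full-factorial count above can be reproduced in a few lines (a hypothetical sketch for illustration):

```python
from math import prod

def full_factorial_profiles(level_counts):
    """Number of unique profiles: the product of the attributes' level counts
    (equal to L**A when all attributes share L levels)."""
    return prod(level_counts)

# Our DCE: two 4-level attributes and three 3-level attributes
n_profiles = full_factorial_profiles([4, 4, 3, 3, 3])   # 4**2 * 3**3 = 432
```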

Systematic approaches for the generation of fractional-factorial designs can be further categorized into orthogonal designs and efficient designs. An orthogonal design is a column-based design built from orthogonal arrays; it exhibits orthogonality (attributes are statistically independent of one another) and level balance (the levels of each attribute appear an equal number of times) and introduces no correlation between the attributes [38]. An orthogonal array is an optimal design that is often used for DCEs examining main effects when the number of attributes and their levels is small.

For studies with five or more attributes with two or more levels, an orthogonal design may not be practical. There has therefore been a recent change in thinking toward a nonorthogonal and statistically more efficient design [38]. When perfect orthogonality and balance cannot be achieved or are not desirable, an efficient design can be applied [20]. In contrast to an orthogonal design, an efficient design aims to increase the precision of parameter estimates for a given sample size (ie, minimizing the standard errors of the estimated coefficients), while allowing some limited correlation between attributes. The most widely used efficiency measure is the D-error, which may be easily estimated using various software packages, such as Ngene, and refers to the efficiency of the experimental design in extracting information from respondents [21]. Experimental designs generated using this approach are known as D-efficient designs. A D-efficient experimental design is also recommended to maximize statistical efficiency and minimize the variability of parameter estimates [7].
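To illustrate what the D-error quantifies, the sketch below (a simplified illustration in our own notation, not Ngene's implementation) computes the D-error of a small MNL design as det(I^-1)^(1/K), where I is the Fisher information summed over choice sets and K is the number of parameters; zero priors yield the Dz-error, and nonzero point priors yield the Dp-error:

```python
import numpy as np

def d_error(design, beta):
    """D-error = det(inverse Fisher information)^(1/K) for an MNL model.

    design: array of shape (choice_sets, alternatives, parameters)
    beta:   prior parameter vector (zeros -> Dz-error, point priors -> Dp-error)
    """
    _, _, K = design.shape
    info = np.zeros((K, K))
    for X in design:                      # X: alternatives x parameters
        v = X @ beta
        p = np.exp(v - v.max())
        p /= p.sum()                      # MNL choice probabilities
        Z = X - p @ X                     # deviations from probability-weighted mean
        info += Z.T @ (p[:, None] * Z)    # Fisher information contribution
    return np.linalg.det(np.linalg.inv(info)) ** (1.0 / K)

# Toy design: 3 choice sets, 2 alternatives, 2 dummy-coded parameters
design = np.array([[[1, 0], [0, 1]],
                   [[1, 1], [0, 0]],
                   [[0, 1], [1, 0]]], dtype=float)
dz = d_error(design, np.zeros(2))         # Dz-error (zero priors)
```

A design-search algorithm such as Ngene's evaluates many candidate designs and retains the one with the lowest D-error.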

An efficient design requires prior information about the parameters (known as priors) to be made available to the algorithm and also requires the analyst to provide the analytical model specification, as described previously. Depending on what information is available, one of three types of D-efficient design can be generated [21]:

  1. Dz-efficient design (z stands for zero priors): used when no prior information about the magnitude or direction of the parameters is available; the parameters are assumed to be zero, which makes the design orthogonal.
  2. Dp-efficient design (p stands for priors): assumes a fixed, certain value and direction for each parameter.
  3. Db-efficient design (b stands for Bayesian): the parameters are not known with certainty but are described by probability distributions.

The best practice is to pilot the DCE. At the pilot phase, limited information is available, so using a Dz-efficient or Dp-efficient design is sensible. In our DCE, we applied a Dp-efficient design because the likely direction of the parameters was known from the previously conducted survey, which had been used to narrow down the attribute levels and to provide prior estimates for them. For example, we expected development by a trusted organization to influence uptake positively and a higher cost to influence it negatively. The priors were accordingly set as small, near-zero negative or positive values in the design.

The pilot phase provided the estimates that we used to generate a Db-efficient design for the final DCE. It is noteworthy that, when the parameter priors differ from zero, efficient designs produce smaller prediction errors than orthogonal designs [21,49,50]. Hence, a D-efficient design will outperform an orthogonal design, and, given reliable priors, a Dp-efficient design will outperform a Dz-efficient design [21]. Further, when reasonable assumptions about the distributions are made, a Db-efficient design will outperform a Dp-efficient design. It may therefore be advisable to start piloting with a Dp-efficient design and to generate a Db-efficient design for the final DCE. The DCE literature provides more detailed and comprehensive descriptions of orthogonal and efficient designs [21] and of the approximation of the Bayesian efficient design [23].

Attribute-Level Balance in the Model

Attribute-level balance aims to ensure that, ideally, all attribute levels appear an equal number of times in the experimental design. The allocation of attribute levels within the experimental design can affect statistical power; if a certain level is underrepresented in the generated choice sets, the coefficient for that level cannot be easily estimated. How attribute levels are distributed is therefore an important consideration when designing the choice sets. Dominant alternatives, in which all attribute levels of one alternative are more desirable than those of the others, provide no information about how trade-offs are made, as individuals will usually select the dominant alternative. Avoiding dominant alternatives in the experimental design is therefore important and can be achieved by consulting the software manual to ensure the correct algorithm is used. The syntax used in Ngene to generate the choice sets of the pilot phase, and more information about the algorithm used, can be accessed on the Open Science Framework [19].
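As a simple automated safeguard (a hypothetical sketch; Ngene handles this within its design algorithm), generated choice sets can be scanned for dominant alternatives once each attribute's levels have been scored by desirability:

```python
def has_dominant_alternative(choice_set):
    """choice_set: list of alternatives, each a list of per-attribute
    desirability scores (higher = more desirable).  An alternative is
    dominant if it is at least as good on every attribute, and strictly
    better on at least one, compared with every other alternative."""
    for j, alt in enumerate(choice_set):
        others = [o for k, o in enumerate(choice_set) if k != j]
        if others and all(
            all(a >= b for a, b in zip(alt, o))
            and any(a > b for a, b in zip(alt, o))
            for o in others
        ):
            return True
    return False

# The first alternative beats the second on both attributes -> flagged
flagged = has_dominant_alternative([[3, 2], [1, 1]])
```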

Piloting the DCE and Generating the Bayesian Design

In addition to providing estimations for the choice matrix design described above, piloting offers an opportunity to ensure that the information is presented clearly and that the choices are realistic and meaningful. It also provides insight into how cognitively demanding it is for respondents to complete. This can be achieved by gathering feedback on the survey completion process. The findings of the pilot may suggest that the DCE needs to be amended, such as reducing the number of choice sets or the number of attributes, so that the responses are a better reflection of the participants’ preferences and improve the precision in the parameter estimates [13].

There is no formal guidance on how large the pilot sample should be; this is largely guided by the budget and the complexity of the experimental design. The accuracy of the priors improves with increasing sample size, but as few as 30 responses may be sufficient to generate usable data [44]. In our pilot study, conducted with 49 individuals, participant feedback suggested that, with the initial ordering of the attributes, there was a tendency to ignore the last two and most text-heavy attributes, the app description and the images of the app. This may have compromised the examination of the relative importance of those two attributes. Therefore, we changed the final order of the attributes from (1) monthly price of the app, (2) ratings of the app, (3) who developed the app, (4) description, and (5) images shown to the order listed in Figures 1 and 2. The longest completion time for the survey was under 12 minutes, so we concluded that the number of choice tasks did not need to be reduced.

In our research, the data from the pilot phase were analyzed using the freely available Apollo package in R [51]. The coefficients and their standard errors from the output were used as priors to generate the final choice sets using the Bayesian efficient design, following the steps described previously. The R syntax used to analyze the pilot data and the Ngene syntax used to generate the Bayesian efficient design can be accessed on the Open Science Framework [19].

Internal Validity

Assessing the internal validity of a DCE can help with understanding the consistency and trade-off assumptions made by participants [52]. There are several ways to examine the internal validity of a DCE. For example, in a stability test, a choice task is repeated later in the sequence to check whether participants consistently choose the same alternative [52]. Another approach uses within-set dominated pairs, in which one alternative is dominant, with the most desirable level on every attribute. The choice sets designed to measure internal validity are excluded from the analysis. Several internal validity tests are built into software packages such as MATLAB [52], although they can also be constructed manually. In our research, we used the stability test to check internal validity by repeating a randomly selected choice task (in our case, the fourth). Participants were therefore shown 12 choice tasks plus an additional hold-out task, and the data from the repeated hold-out task were excluded from the analysis.
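The stability check itself reduces to a simple consistency count; the sketch below (task identifiers and data layout are hypothetical) computes the proportion of respondents who chose the same alternative in the original and repeated presentations of a task:

```python
def stability_rate(responses, original_task, holdout_task):
    """responses: list of dicts mapping task id -> chosen alternative.
    Returns the share of respondents whose hold-out choice matched
    their original choice for the repeated task."""
    consistent = sum(1 for r in responses
                     if r[original_task] == r[holdout_task])
    return consistent / len(responses)

# Illustrative data: task 4 was repeated as hold-out task 13
responses = [
    {4: "app_a", 13: "app_a"},
    {4: "app_b", 13: "app_a"},
    {4: "opt_out", 13: "opt_out"},
    {4: "app_a", 13: "app_a"},
]
rate = stability_rate(responses, original_task=4, holdout_task=13)   # 0.75
```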

Although internal validity checks provide some measure of data quality, it should be noted that answering a repeat choice inconsistently is not a violation of random utility theory [53]. Furthermore, there is no consensus on what to do with the data from responses that fail validity tests. Following the advice of Lancsar and Louviere [54], we did not exclude participants who failed the internal validity check, as that might have caused statistical bias or affected statistical efficiency. However, we reported data on internal validity to enable the reader to make a judgement on likely biases.

All additional study materials used in our example, including the full data set and the results of the DCE, can be accessed on Open Science Framework [19].


This paper describes the development of a DCE, following the stages required to establish attributes and their levels, construct choice tasks, define the utility model, decide on labeled and unlabeled choices to apply, decide on the number of choice tasks that need to be generated, and make decisions on the structure of the experimental design, how to achieve attribute-level balance, how to assess the internal model validity, and how to pilot-test. In doing so, the intention is to advance methodological awareness of the application of stated preference methods in the field of digital health, as well as to provide researchers with an overview of their application using a case study of a DCE of smoking cessation app uptake.

Although DCEs are widely used to understand patient and provider choices in health care [8,10,15,55], they have only recently started to gain popularity in digital health [4-6] and as such represent an underused approach in digital health. With the growing evidence of the benefit of digital health initiatives, there are clear benefits to widening the application of DCEs so that they may more routinely inform digital health development, inform digital tool presentation, and, most importantly, predict uptake and engagement with digital products. Although several attempts have been made to measure engagement with digital tools using a wide range of methodologies [56-58], the insights we have from them that can be translated to uptake are limited. One plausible explanation is that uptake of digital tools is difficult to empirically measure.

Benefits and Limitations of DCEs

DCEs offer several benefits that help overcome the difficulty of measuring uptake in digital health, or in other areas where the predictors of uptake of a good or service must be measured. For example, as illustrated by the case study here, they enable researchers to gain measurable insights into situations in which quantitative measures are otherwise hard to obtain, such as the factors influencing the uptake of health apps on curated health app portals. A DCE also helps quantify preferences to support more complex decisions [59], such as deciding which appearance or features to prioritize when developing an app in order to promote uptake. The DCE methodology is also considered a convenient approach for investigating the uptake of new interventions, including digital health interventions [38], for example, digital behavior change interventions delivered via a health and well-being smartphone app. DCEs can therefore be used in hypothetical circumstances, enabling the measurement of preferences for a potential policy or digital health system change before it is implemented [13], as in the recent investigation of the uptake of a COVID-19 test-and-trace health app [3,4]. The experimental nature of the DCE also means that participants’ preferences are recorded under controlled experimental conditions, in which attributes are systematically varied by researchers to obtain insight into the marginal effect of attribute changes on individuals’ choices [7].
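To make the notion of a marginal effect concrete, a minimal sketch of a multinomial logit (MNL) model is shown below: choice probabilities follow the standard logit form, and varying one attribute level while holding the rival alternative fixed shifts the predicted choice share. The coefficients and attribute codings here are entirely hypothetical, not estimates from the case study:

```python
import math

# Illustrative MNL choice probabilities for a two-alternative choice task.
# Betas and attribute levels are hypothetical, for demonstration only.

def mnl_probabilities(utilities):
    """Standard logit choice probabilities: P_j = exp(V_j) / sum_k exp(V_k)."""
    exp_v = [math.exp(v) for v in utilities]
    total = sum(exp_v)
    return [e / total for e in exp_v]

# Hypothetical linear utility: V = b_price * price + b_rating * star_rating.
b_price, b_rating = -0.4, 0.6

def utility(price, rating):
    return b_price * price + b_rating * rating

# Marginal effect of raising app A's star rating from 3 to 4,
# holding app B fixed at price=2, rating=4.
p_before = mnl_probabilities([utility(1, 3), utility(2, 4)])
p_after = mnl_probabilities([utility(1, 4), utility(2, 4)])
marginal_effect = p_after[0] - p_before[0]  # change in P(choose app A)
```

Under these assumed coefficients, improving app A's rating by one level increases its predicted choice share by roughly 15 percentage points; in a real DCE, the betas would be estimated from the observed choices (eg, with the Apollo package in R [51]).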

Despite their benefits, the application of DCEs presents several challenges. As with all stated preference methods, the hypothetical nature of the DCE choice set raises concerns about external validity and the degree to which real-world decisions equate to those made by study participants under experimental conditions, a phenomenon known as the intention-behavior gap [60]. Participants may believe they would choose a scenario presented and described in a choice task, but in real life other factors might influence their behavior, such as the aesthetics of the app [28]. This limitation can be at least partially overcome by developing convincing and visually appealing choice tasks. Nevertheless, to date, there has been limited progress in testing for external validity because of the difficulty of investigating preferences in the real world [38]. Indeed, a recent systematic review of the literature on DCEs in health care reported that only 2% of the included studies (k=7) reported details of an investigation of external validity [47], while an earlier systematic review and meta-analysis (k=6) found that DCEs have only a moderate level of accuracy in predicting health choice behaviors [61]. To our knowledge, no study has been published that investigates the external validity of a DCE developed in digital health. One opportunity to undertake such testing would be through a curated health app portal, where the same health app is presented in two or more different ways; with the help of website analytics, actual user behavior could then be measured.

A final significant concern associated with the use of a DCE is that any single choice set is unlikely to be able to present the user with all relevant attributes, regardless of how well it has been developed [61]. Choosing the most relevant attributes to test in a DCE, therefore, requires comprehensive preparatory research, which can lengthen the time required to undertake the development phase of any piece of work.


In summary, DCEs have significant potential in digital health research and can serve as an important decision-making tool in a field where observational data are lacking. We hope that the content of this paper provides a useful introduction and guide to those interested in developing such experiments in digital health.


We are grateful to two experts in discrete choice experiments, Prof. Michiel Bliemer from the University of Sydney, Australia, a co-developer of Ngene software, and Prof. Stephane Hess from the University of Leeds, United Kingdom, a co-developer of the Apollo package in R software, for their advice on the syntax used to generate the choice tasks in Ngene and on the code used in the Apollo package. JAW and RC received funding from the National Institute for Health Research (NIHR) Applied Research Collaboration East of England. The views expressed are those of the authors and not necessarily those of the NHS, the NIHR, or the Department of Health and Social Care.

Authors' Contributions

DS prepared the manuscript. All authors have reviewed the draft for important intellectual content and approved the final version.

Conflicts of Interest

JB has received unrestricted funding to study smoking cessation from Pfizer and J&J, who manufacture smoking cessation medications.

  1. Danner M, Hummel JM, Volz F, van Manen JG, Wiegard B, Dintsios C, et al. Integrating patients' views into health technology assessment: analytic hierarchy process (AHP) as a method to elicit patient preferences. Int J Technol Assess Health Care 2011 Oct;27(4):369-375. [CrossRef] [Medline]
  2. Hall J, Viney R, Haas M, Louviere J. Using stated preference discrete choice modelling to evaluate health care programs. Journal of Business Research 2002 Jun;57(9):1026-1032. [CrossRef]
  3. Wiertz C, Banerjee A, Acar OA, Ghosh A. Predicted adoption rates of contact tracing app configurations: insights from a choice-based conjoint study with a representative sample of the UK population. SSRN Journal 2020 Apr 30:1-19. [CrossRef]
  4. Jonker M, de Bekker-Grob E, Veldwijk J, Goossens L, Bour S, Rutten-Van Mölken M. COVID-19 contact tracing apps: predicted uptake in the Netherlands based on a discrete choice experiment. JMIR Mhealth Uhealth 2020 Oct 09;8(10):e20741-e20715 [FREE Full text] [CrossRef] [Medline]
  5. Nittas V, Mütsch M, Braun J, Puhan MA. Self-monitoring app preferences for sun protection: discrete choice experiment survey analysis. J Med Internet Res 2020 Nov 27;22(11):e18889 [FREE Full text] [CrossRef] [Medline]
  6. Leigh S, Ashall-Payne L, Andrews T. Barriers and facilitators to the adoption of mobile health among health care professionals from the United Kingdom: discrete choice experiment. JMIR Mhealth Uhealth 2020 Jul 06;8(7):e17704 [FREE Full text] [CrossRef] [Medline]
  7. Reed Johnson F, Lancsar E, Marshall D, Kilambi V, Mühlbacher A, Regier DA, et al. Constructing experimental designs for discrete-choice experiments: report of the ISPOR Conjoint Analysis Experimental Design Good Research Practices Task Force. Value Health 2013;16(1):3-13 [FREE Full text] [CrossRef] [Medline]
  8. Spinks J, Chaboyer W, Bucknall T, Tobiano G, Whitty JA. Patient and nurse preferences for nurse handover-using preferences to inform policy: a discrete choice experiment protocol. BMJ Open 2015 Nov 11;5(11):e008941 [FREE Full text] [CrossRef] [Medline]
  9. Bridges JFP, Hauber AB, Marshall D, Lloyd A, Prosser LA, Regier DA, et al. Conjoint analysis applications in health—a checklist: a report of the ISPOR Good Research Practices for Conjoint Analysis Task Force. Value Health 2011 Jun;14(4):403-413 [FREE Full text] [CrossRef] [Medline]
  10. Clark MD, Determann D, Petrou S, Moro D, de Bekker-Grob EW. Discrete choice experiments in health economics: a review of the literature. Pharmacoeconomics 2014 Sep;32(9):883-902. [CrossRef] [Medline]
  11. Kotnowski K, Fong GT, Gallopel-Morvan K, Islam T, Hammond D. The impact of cigarette packaging design among young females in Canada: findings from a discrete choice experiment. Nicotine Tob Res 2016 May;18(5):1348-1356. [CrossRef] [Medline]
  12. Lambooij MS, Harmsen IA, Veldwijk J, de Melker H, Mollema L, van Weert YWM, et al. Consistency between stated and revealed preferences: a discrete choice experiment and a behavioural experiment on vaccination behaviour compared. BMC Med Res Methodol 2015 Mar 12;15:19 [FREE Full text] [CrossRef] [Medline]
  13. Mangham LJ, Hanson K, McPake B. How to do (or not to do) ... designing a discrete choice experiment for application in a low-income country. Health Policy Plan 2009 Mar;24(2):151-158. [CrossRef] [Medline]
  14. Trapero-Bertran M, Rodríguez-Martín B, López-Bastida J. What attributes should be included in a discrete choice experiment related to health technologies? A systematic literature review. PLoS One 2019;14(7):e0219905 [FREE Full text] [CrossRef] [Medline]
  15. Hall J, Kenny P, King M, Louviere J, Viney R, Yeoh A. Using stated preference discrete choice modelling to evaluate the introduction of varicella vaccination. Health Econ 2002 Jul;11(5):457-465. [CrossRef] [Medline]
  16. Fiebig DG, Knox S, Viney R, Haas M, Street DJ. Preferences for new and existing contraceptive products. Health Econ 2011 Sep;20 Suppl 1:35-52. [CrossRef] [Medline]
  17. Terris-Prestholt F, Hanson K, MacPhail C, Vickerman P, Rees H, Watts C. How much demand for new HIV prevention technologies can we really expect? Results from a discrete choice experiment in South Africa. PLoS One 2013;8(12):e83193 [FREE Full text] [CrossRef] [Medline]
  18. Terris-Prestholt F, Quaife M, Vickerman P. Parameterising user uptake in economic evaluations: the role of discrete choice experiments. Health Econ 2016 Feb;25 Suppl 1:116-123 [FREE Full text] [CrossRef] [Medline]
  19. Szinay D, Cameron R, Jones A, Whitty J, Chadborn T, Brown J, et al. Adult Smokers' Preferences for the Uptake of Smoking Cessation Apps: A Discrete Choice Experiment. 2021 Mar 12.   URL: [accessed 2020-11-01]
  20. Lancsar E, Louviere J. Conducting discrete choice experiments to inform healthcare decision making: a user's guide. Pharmacoeconomics 2008;26(8):661-677. [CrossRef] [Medline]
  21. Rose JM, Bliemer MCJ. Constructing efficient stated choice experimental designs. Transport Rev 2009 Sep;29(5):587-617. [CrossRef]
  22. de Bekker-Grob EW, Hol L, Donkers B, van Dam L, Habbema JDF, van Leerdam ME, et al. Labeled versus unlabeled discrete choice experiments in health economics: an application to colorectal cancer screening. Value Health 2010;13(2):315-323 [FREE Full text] [CrossRef] [Medline]
  23. Bliemer MC, Rose JM, Hess S. Approximation of bayesian efficiency in experimental choice designs. J Choice Model 2008;1(1):98-126. [CrossRef]
  24. Hauber AB, González JM, Groothuis-Oudshoorn CGM, Prior T, Marshall DA, Cunningham C, et al. Statistical methods for the analysis of discrete choice experiments: a report of the ISPOR Conjoint Analysis Good Research Practices Task Force. Value Health 2016 Jun;19(4):300-315 [FREE Full text] [CrossRef] [Medline]
  25. Kløjgaard ME, Bech M, Søgaard R. Designing a stated choice experiment: the value of a qualitative process. J Choice Model 2012;5(2):1-18. [CrossRef]
  26. Buchanan J, Blair E, Thomson KL, Ormondroyd E, Watkins H, Taylor JC, et al. Do health professionals value genomic testing? A discrete choice experiment in inherited cardiovascular disease. Eur J Hum Genet 2019 Nov 11;27(11):1639-1648 [FREE Full text] [CrossRef] [Medline]
  27. Marshall D, Bridges JFP, Hauber B, Cameron R, Donnalley L, Fyie K, et al. Conjoint analysis applications in health: how are studies being designed and reported? An update on current practice in the published literature between 2005 and 2008. Patient 2010 Dec 01;3(4):249-256. [CrossRef] [Medline]
  28. Szinay D, Jones A, Chadborn T, Brown J, Naughton F. Influences on the uptake of and engagement with health and well-being smartphone apps: systematic review. J Med Internet Res 2020 May 29;22(5):e17572-e17523 [FREE Full text] [CrossRef] [Medline]
  29. Szinay D, Perski O, Jones A, Chadborn T, Brown J, Naughton F. Influences on the uptake of health and well-being apps and curated app portals: think-aloud and interview study. JMIR Mhealth Uhealth 2021 Apr 27;9(4):e27173 [FREE Full text] [CrossRef] [Medline]
  30. Rakotonarivo OS, Schaafsma M, Hockley N. A systematic review of the reliability and validity of discrete choice experiments in valuing non-market environmental goods. J Environ Manage 2016 Dec 01;183:98-109 [FREE Full text] [CrossRef] [Medline]
  31. Ratcliffe J, Longworth L. Investigating the structural reliability of a discrete choice experiment within health technology assessment. Int J Technol Assess Health Care 2002;18(1):139-144.
  32. Yang J, Reed SD, Hass S, Skeen MB, Johnson FR. Is easier better than harder? An experiment on choice experiments for benefit-risk tradeoff preferences. Med Decis Making 2021 Feb;41(2):222-232. [CrossRef] [Medline]
  33. Mühlbacher A, Johnson FR. Choice experiments to quantify preferences for health and healthcare: state of the practice. Appl Health Econ Health Policy 2016 Jun;14(3):253-266. [CrossRef] [Medline]
  34. Watson V, Becker F, de Bekker-Grob E. Discrete choice experiment response rates: a meta-analysis. Health Econ 2017 Jun;26(6):810-817. [CrossRef] [Medline]
  35. Ryan M, Gerard K, Amaya-Amaya M. Discrete choice experiments in a nutshell. In: Ryan M, Gerard K, Amaya-Amaya M, editors. Using Discrete Choice Experiments to Value Health and Health Care. Dordrecht, the Netherlands: Springer; 2008:13-46.
  36. McFadden D. Conditional logit analysis of qualitative choice behavior. In: Zarembka P, editor. Frontiers in Econometrics. Cambridge, MA: Academic Press; 1973:105-142.
  37. Potoglou D, Burge P, Flynn T, Netten A, Malley J, Forder J, et al. Best-worst scaling vs. discrete choice experiments: an empirical comparison using social care data. Soc Sci Med 2011 May;72(10):1717-1727. [CrossRef] [Medline]
  38. de Bekker-Grob EW, Ryan M, Gerard K. Discrete choice experiments in health economics: a review of the literature. Health Econ 2012 Feb;21(2):145-172. [CrossRef] [Medline]
  39. Hensher DA, Rose JM, Greene WH. Applied Choice Analysis: A Primer. Cambridge, UK: Cambridge University Press; 2005:1.
  40. Hensher DA, Greene WH. The mixed logit model: the state of practice. Transportation 2003;30(2):133-176. [CrossRef]
  41. Train K. Logit. Discrete Choice Methods with Simulation. 2 ed. Cambridge: Cambridge University Press; 2009:34-75.
  42. Kruijshaar ME, Essink-Bot M, Donkers B, Looman CW, Siersema PD, Steyerberg EW. A labelled discrete choice experiment adds realism to the choices presented: preferences for surveillance tests for Barrett esophagus. BMC Med Res Methodol 2009 May 19;9(1):31 [FREE Full text] [CrossRef] [Medline]
  43. Jin W, Jiang H, Liu Y, Klampfl E. Do labeled versus unlabeled treatments of alternatives' names influence stated choice outputs? Results from a mode choice study. PLoS One 2017;12(8):e0178826 [FREE Full text] [CrossRef] [Medline]
  44. ChoiceMetrics. Ngene 1.1.1 User Manual & Reference Guide. ChoiceMetrics; 2012:1.
  45. Christofides NJ, Muirhead D, Jewkes RK, Penn-Kekana L, Conco DN. Women's experiences of and preferences for services after rape in South Africa: interview study. BMJ 2006 Jan 28;332(7535):209-213 [FREE Full text] [CrossRef] [Medline]
  46. Hanson K, McPake B, Nakamba P, Archard L. Preferences for hospital quality in Zambia: results from a discrete choice experiment. Health Econ 2005 Jul;14(7):687-701. [CrossRef] [Medline]
  47. Soekhai V, de Bekker-Grob EW, Ellis AR, Vass CM. Discrete choice experiments in health economics: past, present and future. Pharmacoeconomics 2019 Feb;37(2):201-226 [FREE Full text] [CrossRef] [Medline]
  48. Janssen EM, Hauber AB, Bridges JFP. Conducting a discrete-choice experiment study following recommendations for good research practices: an application for eliciting patient preferences for diabetes treatments. Value Health 2018 Jan;21(1):59-68 [FREE Full text] [CrossRef] [Medline]
  49. Ferrini S, Scarpa R. Designs with a priori information for nonmarket valuation with choice experiments: a Monte Carlo study. J Environ Econ Manage 2007 May;53(3):342-363. [CrossRef]
  50. Kessels R, Jones B, Goos P, Vandebroek ML. An efficient algorithm for constructing Bayesian optimal choice designs. SSRN J 2006:1 [FREE Full text] [CrossRef]
  51. Hess S, Palma D. Apollo: a flexible, powerful and customisable freeware package for choice model estimation and application. J Choice Model 2019 Sep;32:100170. [CrossRef]
  52. Johnson FR, Yang J, Reed SD. The internal validity of discrete choice experiment data: a testing tool for quantitative assessments. Value Health 2019 Feb;22(2):157-160 [FREE Full text] [CrossRef] [Medline]
  53. Hess S, Daly A, Batley R. Revisiting consistency with random utility maximisation: theory and implications for practical work. Theory Decis 2018 Jan 2;84(2):181-204 [FREE Full text] [CrossRef] [Medline]
  54. Lancsar E, Louviere J. Deleting 'irrational' responses from discrete choice experiments: a case of investigating or imposing preferences? Health Econ 2006 Aug;15(8):797-811. [CrossRef] [Medline]
  55. Quaife M, Terris-Prestholt F, Eakle R, Cabrera Escobar MA, Kilbourne-Brook M, Mvundura M, et al. The cost-effectiveness of multi-purpose HIV and pregnancy prevention technologies in South Africa. J Int AIDS Soc 2018 Mar;21(3):1 [FREE Full text] [CrossRef] [Medline]
  56. Perski O, Blandford A, Garnett C, Crane D, West R, Michie S. A self-report measure of engagement with digital behavior change interventions (DBCIs): development and psychometric evaluation of the "DBCI Engagement Scale". Transl Behav Med 2020 Feb 03;10(1):267-277 [FREE Full text] [CrossRef] [Medline]
  57. Craig Lefebvre R, Tada Y, Hilfiker SW, Baur C. The assessment of user engagement with eHealth content: the eHealth Engagement Scale. J Comput Mediat Commun 2010 Jul 01;15(4):666-681. [CrossRef]
  58. O'Brien HL, Toms EG. The development and evaluation of a survey to measure user engagement. J. Am. Soc. Inf. Sci 2009 Oct 19;61(1):50-69. [CrossRef]
  59. Brett Hauber A, Fairchild AO, Reed Johnson F. Quantifying benefit-risk preferences for medical interventions: an overview of a growing empirical literature. Appl Health Econ Health Policy 2013 Aug;11(4):319-329. [CrossRef] [Medline]
  60. Ajzen I. The theory of planned behavior. Organ Behav Hum Decis Process 1991 Dec;50(2):179-211. [CrossRef]
  61. Quaife M, Terris-Prestholt F, Di Tanna GL, Vickerman P. How well do discrete choice experiments predict health choices? A systematic review and meta-analysis of external validity. Eur J Health Econ 2018 Nov;19(8):1053-1066. [CrossRef] [Medline]

DCE: discrete choice experiment
MNL: multinomial logit

Edited by G Eysenbach; submitted 24.07.21; peer-reviewed by HL Tam, R Marshall; comments to author 16.08.21; revised version received 24.08.21; accepted 18.09.21; published 11.10.21


©Dorothy Szinay, Rory Cameron, Felix Naughton, Jennifer A Whitty, Jamie Brown, Andy Jones. Originally published in the Journal of Medical Internet Research (, 11.10.2021.

This is an open-access article distributed under the terms of the Creative Commons Attribution License (, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on, as well as this copyright and license information must be included.