This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included.
What is the next frontier for computer-tailored health communication (CTHC) research? In current CTHC systems, study designers who have expertise in behavioral theory and mapping theory into CTHC systems select the variables and develop the rules that specify how the content should be tailored, based on their knowledge of the targeted population, the literature, and health behavior theories. In collective-intelligence recommender systems (hereafter recommender systems) used by Web 2.0 companies (eg, Netflix and Amazon), machine learning algorithms combine user profiles and continuous feedback ratings of content (from themselves and other users) to empirically tailor content. Augmenting current theory-based CTHC with empirical recommender systems could be evaluated as the next frontier for CTHC.
The objective of our study was to uncover barriers and challenges to using recommender systems in health promotion.
We conducted a focused literature review, interviewed subject experts (n=8), and synthesized the results.
We describe (1) limitations of current CTHC systems, (2) advantages of incorporating recommender systems to move CTHC forward, and (3) challenges to incorporating recommender systems into CTHC. Based on the evidence presented, we propose a future research agenda for CTHC systems.
We promote discussion of ways to move CTHC into the 21st century by incorporation of recommender systems.
Are there aspects of the Web 2.0 phenomenon that can be marshaled by public health practitioners to improve community and individual health or advance scientific goals?
Theory-based, computer-tailored health communication (CTHC) is a tool that is frequently used to support behavior change [
John Smith, a 38-year-old smoker, has been smoking for 15 years. He has made multiple quit attempts in the past, but during each attempt he gained between 10 and 20 pounds. Fear of weight gain is a significant barrier to another quit attempt.
John is trying to quit again and registers on the Decide2Quit.org Web-assisted tobacco intervention. For 8 weeks, the system sends 2 tailored emails per week to John to help him quit.
Current CTHC
In this approach, tailoring is based on information that John provides when he registers. For this example, we focus on 1 characteristic only: gender.
Since women are typically more concerned about weight gain after quitting [
After registering on Decide2Quit.org, John receives the first email that targets weight loss in the second week (third message) of the intervention. John likes the message and finds the tips it offers useful. He looks forward to receiving similar messages. However, the next 5 messages he receives focus on other topics. The next message with information on weight gain arrives only in week 5.
John does not think the system helped and fails in his attempt to quit.
Recommender CTHC
In this approach, the message is selected based on the collective-intelligence data, not on preset rules.
After registering on Decide2Quit.org, John visits the weight loss page on the website (implicit data). The system uses these data to select 1 of the messages targeting weight loss and sends it to John in week 2 (third message). John likes the message and rates it highly (explicit data). The system notes both the implicit and the explicit feedback and regularly sends messages targeting weight gain to John. The system also repeats the messages that John rates highly.
Because the intervention targeted his needs more specifically, John finds these messages useful and succeeds in his attempt to quit.
We have kept the example simple so that it is easy to understand, and we have not included how the group’s feedback can help John.
New approaches to tailoring based on collective intelligence may be able to build on the successes and lessons learned from past tailoring efforts, and may overcome the limitations inherent in current CTHC systems. Many people already encounter collective-intelligence tailoring as they interact with companies like Netflix and Amazon. These companies have developed a special class of machine learning algorithms (recommender systems) to tailor content. These systems tailor content based on collective-intelligence data (ie, data derived from the behavior of users as they interact with the system) in addition to user profiles [
Collective-intelligence data include implicit and explicit user feedback. Implicit data are derived from user actions (eg, the website view patterns of each individual accessing the system). Explicit data consist of self-reported item ratings (eg, ratings provided by users for items such as books or movies, often on a 5-star scale). However, in the health-promotion arena, patients could be asked to rate relevance, influence, or other properties of a message or product. Using these data, along with user demographic characteristics, the algorithms underlying the system generate personalized item recommendations for each user. As these systems learn more about the user, they can continually adapt to improve the recommendations.
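To make the distinction between implicit and explicit data concrete, the following is a minimal sketch of how a system might store both kinds of feedback for one user. The class name `FeedbackLog`, its fields, and the topic and message identifiers are hypothetical and chosen only for illustration; they do not describe any existing CTHC system.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FeedbackLog:
    """Collective-intelligence data for one user (hypothetical structure)."""

    user_id: str
    # Implicit data: topics of the pages the user viewed, in order.
    page_views: List[str] = field(default_factory=list)
    # Explicit data: self-reported message ratings on a 1-5 scale.
    ratings: Dict[str, int] = field(default_factory=dict)

    def record_view(self, topic: str) -> None:
        self.page_views.append(topic)

    def record_rating(self, message_id: str, stars: int) -> None:
        if not 1 <= stars <= 5:
            raise ValueError("rating must be between 1 and 5")
        self.ratings[message_id] = stars

    def topic_interest(self, topic: str) -> int:
        """Count page views on a topic (a simple implicit signal)."""
        return sum(1 for viewed in self.page_views if viewed == topic)


log = FeedbackLog(user_id="john")
log.record_view("weight_gain")
log.record_view("cravings")
log.record_view("weight_gain")
log.record_rating("msg_weight_01", 5)
print(log.topic_interest("weight_gain"), log.ratings)
```

In a real intervention, the rated property (eg, relevance or influence) and the implicit signals collected would be design decisions for the study team.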
Recommender systems can be implemented using 3 approaches: a content-based approach [
The lower portion of
We present information gained through a focused literature review and through interviews with subject experts. We begin with a description of the limitations of current CTHC systems. We then describe the potential advantages and challenges of using a recommender systems approach. Based on the evidence presented, we propose a future research agenda. Our goal is to promote discussion of techniques to improve current CTHC through use of recommender systems.
We conducted a focused literature review and interviewed experts to explore whether and how recommender systems might enhance CTHC approaches. This study was conducted between October 2012 and September 2015.
We conducted a focused literature review to identify white papers, conceptual papers, and peer-reviewed papers describing both current CTHC systems and recommender systems, as well as information for the following categories: limitations of current CTHC systems, advantages of recommender systems over rule-based systems, and challenges of implementing recommender systems for health promotion. We excluded papers that only described a specific intervention or a specific method for implementing these systems. Papers published in peer-reviewed journals and conferences between 1985 and 2015 from several disciplines, including clinical, health promotion, behavioral medicine, computer engineering, and recommender systems, were considered for the secondary literature review that was conducted from August 2015 through October 2015. The following databases were searched: PubMed, ACM Digital Library, and IEEE Xplore. Search terms for the Boolean search techniques were computer tailoring, health message tailoring, recommender systems, content-based, collaborative filtering, hybrid systems, and their combinations with health, overview, challenges, and barriers. Additionally, we reviewed the reference lists of all of the identified papers for additional relevant papers (
Literature review study flow diagram.
We interviewed a purposive sample of experts in academia and at the National Institutes of Health (NIH) (n=8). We chose a sample size of 8 to assure representation in the 2 domains of interest (4 each): (1) computer engineering and recommender systems, and (2) health behavioral change, health communication, and computer tailoring. Interviewees were recruited through personal contacts and personal outreach at conferences, such as the Society of Behavioral Medicine, American Medical Informatics Association, and recommender systems annual conferences. We conducted individual interviews and used an open-ended interview format structured around the 3 themes: the limitations of current CTHC systems, potential advantages of recommender systems over rule-based systems, and challenges of implementing recommender systems for health promotion. In the beginning of the interview, the interviewer described the 2 types of systems (current CTHC and recommender systems) to promote discussion. Our literature findings organized around the 3 categories (limitations of current CTHC systems, advantages of recommender systems over rule-based systems, and challenges of implementing recommender systems for health promotion) were presented to the experts. We then used open-ended questions designed to solicit feedback from the experts around the 3 categories. Example questions were (1) Thinking about your last CTHC study, tell us how current CTHC systems limited your efforts in your study? (2) Thinking about your last CTHC study, tell us how you think recommender CTHC systems would have addressed current CTHC limitations? (3) What do you think are the challenges for using recommender systems in health interventions? Prompts were used when necessary. Example prompts were (1) Were you able to implement all the tailoring rules in your current CTHC study? (2) Do you think we have sufficient data to implement recommender systems? Detailed notes of each interview were taken. 
We used a process similar to the literature synthesis to summarize and extract information from the interviews. Specifically, the same 2 authors summarized key points and issues that were raised during the interviews (also organized into limitations of CTHC, potential advantages of recommender systems, and challenges) and presented these to the group for further synthesis.
We present the results of our data synthesis below.
Current CTHC frameworks use theory-driven, rule-based systems to provide different messages to patient subsets [
In the
Once these questions are addressed and the messages written, study designers use metadata to describe and categorize the messages. This step allows the CTHC system to select appropriate messages for a patient subset. Metadata is defined as data about data; it describes the structure or content of a particular resource, object, or entity [
As study designers address the above questions and develop the intervention, they also have to balance several factors, including time and cost. This study designer-driven, rule-based approach may lead to 3 important limitations, detailed in the expert interviews.
Structure of a current rule-based computer-tailored health communication (CTHC) system.
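The rule-based structure can be illustrated with a short sketch. It assumes a hypothetical message library indexed by a single metadata field (topic) and a designer-written rule table that fixes the topic sequence for each patient segment; the names `LIBRARY`, `RULES`, and `plan_messages` are illustrative only and are not the rules used by any actual intervention.

```python
from typing import Dict, List

# Hypothetical message library; the topic is the metadata used for selection.
LIBRARY: Dict[str, List[str]] = {
    "weight_gain": ["w1", "w2"],
    "cravings": ["c1"],
    "motivation": ["m1"],
    "stress": ["s1"],
}

# Designer-written rules: a fixed topic sequence per patient segment.
RULES: Dict[str, List[str]] = {
    "female": ["weight_gain", "cravings", "weight_gain", "motivation"],
    "male": ["cravings", "motivation", "weight_gain", "stress"],
}


def plan_messages(gender: str) -> List[str]:
    """Return the message IDs a user in this segment receives, in order."""
    used: Dict[str, int] = {}
    plan: List[str] = []
    for topic in RULES[gender]:
        index = used.get(topic, 0)
        candidates = LIBRARY[topic]
        # Cycle through the messages available for this topic.
        plan.append(candidates[index % len(candidates)])
        used[topic] = index + 1
    return plan


print(plan_messages("male"))
print(plan_messages("female"))
```

Because the plan depends only on the profile supplied at registration, two users in the same segment receive the same sequence regardless of how they respond to it.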
Leaders in the field of CTHC have demonstrated that high tailoring (tailoring on many variables) is better than low tailoring (using fewer variables) [
While theory provides important guidance to CTHC investigators, current theories may underrepresent the complexity of factors that influence health behaviors [
A user’s personal preferences and behaviors can change over the duration of the intervention. An optimal CTHC system needs to have the capabilities to adapt in order to remain relevant and engaging. While the ability of current systems to collect real-time behavior has improved (eg, ecological momentary assessment and use of sensors), current CTHC rule-based approaches are limited in how they can adapt to this information. CTHC rule-based systems typically adapt only to anticipated and predicted changes in behavior (ie, how the study designers think users will behave). For example, current CTHC systems can be easily programmed to adapt to changes in a smoker’s motivation to quit. However, to adapt to all the behavioral patterns of the individual and the group, existing rules would need to be modified or new rules added. This approach quickly becomes resource intensive and often infeasible.
Using the
The use of complex algorithms to generate the tailoring recommendations based on collective-intelligence data allows tailoring based on the “observed behavior” of the users—how the users are responding to the intervention collected through user feedback, rather than how the study designers predict the users are going to respond. User feedback data can be in the form of explicit or implicit data. As noted, recommender systems can be implemented using 3 approaches: a content-based approach [
In contrast to content-based recommender systems, collaborative filtering recommender systems match users to items by directly leveraging feedback ratings data (implicit or explicit) of the item (ie, messages in the case of CTHC). The simplest examples of this approach are nearest-neighbor methods [
Hybrid recommender systems merge the strengths of content-based and collaborative filtering recommender systems [
When seeking to develop recommender system-based CTHC, study designers must face the following questions (see also
In the aforementioned smoking cessation example (
As in current CTHC, in the recommender system study designers will have to develop metadata describing message characteristics that will be used for message selection. Study designers do not typically have to consider the selection of the tailoring variables and the rules, as these will be derived from the data collected by the algorithms underlying the recommender systems. This data-driven approach has the potential to provide several advantages. These are as follows.
Components of recommender systems for computer tailoring. The primary differences between the 2 systems depicted in
Sophisticated machine learning algorithms are potentially able to consider all of the available user variables and to tailor based on these variables. As noted above, rule-based systems are limited in the number of variables that can be used. The recommender system approach potentially reduces the possibility of any key variable being excluded and allows for tailoring on more variables. How many variables can be effectively incorporated, and how many are meaningful to the participant, has to be empirically tested. Systems also have to be designed to collect all potential user data to take advantage of this ability of the recommender systems.
A recommender systems approach would be an ideal complement to theory-based approaches because it would identify important variables from user data and behavior. The machine learning algorithms of recommender systems recommend messages based on the data and are not limited to the study designer-written rules.
In contrast to rule-based approaches, the machine learning algorithms of the recommender system can more easily adapt to unpredicted changes in individual as well as group user behavior. As noted in the
Rule-based computer-tailored health communication (CTHC) versus recommender systems.
Feature | Rule-based CTHC | Recommender systems |
Intervention development questions | (1) Message writing: What are the important concepts for the targeted population? | (1) Message writing: What are the important concepts for the targeted population? |
 | (2) Tailoring variables: How should the target population be segmented? | (2) Tailoring variables: What collective-intelligence data (implicit and explicit data) should be collected and how? |
 | (3) Rules: How should messages for the particular patient segment be selected? | |
Message selection | Rules-driven: Study designers develop rules based on the literature and theory. These rules link user profiles to the metadata of the messages, selecting messages for a patient subset. | Data-driven: Sophisticated machine learning algorithms derive the tailoring rules from the collective-intelligence data of the individual, as well as the group. |
Complexity (number of variables) | The number of variables incorporated can become quickly unmanageable. It is limited by the sophistication of the study designers in the team, project’s timeline, and budget. | Sophisticated algorithms can potentially consider all the variables collected in the intervention. |
Use of theory | Tailoring is limited to theoretical constructs. | Theory is augmented by deriving recommendations from the user data. |
Adaptation | System is limited to predicted changes in behavior. | System can continuously adapt, potentially improving with each message delivered. Responds to the user’s behavior and to the group’s behavior over time. |
The potential is there, but can recommender systems be adapted to CTHC systems? There are several challenges or potential barriers to widespread adoption of this approach, as described below.
When companies such as Netflix and Amazon deployed their recommender systems, they had already collected collective-intelligence data on several thousands of users. In contrast, most CTHC interventions do not have access to such data sets, such as prior ratings of motivational messages, use and effectiveness data of an intervention, or sensor data from physiological measures of recipients’ reactions. The lack of such collective-intelligence data at the start affects the ability to reach and maintain sufficient momentum in the early stages of an intervention.
The sample size and study timeline of a typical behavioral health intervention impose additional challenges to a recommender system. In 2012, Amazon.com reported having a client base of over 100 million customers worldwide, while Netflix boasted 29.4 million users in that same year. In contrast, CTHC research settings draw on much smaller initial populations, often with limited user interaction. CTHC interventions have shorter timeframes, often due to the dictates of limited research funding. Small study populations and limited data collection may threaten both the generalizability of the messages and the precision of the algorithms.
Attrition rates tend to be very high in technology-assisted health interventions [
There are potential unintended consequences of using a data-driven approach to tailor messages for users. Web 2.0 companies have developed over the years a sophisticated approach to collecting feedback data and channeling these data into their recommender systems. Explicit ratings in the form of “like” functions and implicit ratings, such as user webpage visits or purchase of a product, provide detailed ongoing feedback that informs subsequent messages sent to customers. While effectiveness of a message promoting online merchandise may be measured by users’ purchasing decisions, assessing the effectiveness of behavioral health messages is more complex. For example, users’ preferences could possibly tend toward information that reinforces the behavior that is being targeted for reduction. In other words, a user liking a message may not mean that the message will influence behavior change in the desired direction.
For example, triggers for smoking can vary among smokers [
We propose the following research agenda to respond to the above challenges and to advance the field of CTHC using recommender systems approaches.
As noted, the generation of collective-intelligence data is complicated by the lack of clarity about what constitutes appropriate feedback for health behavioral messages. Studies are needed to evaluate the research questions associated with this issue. We need to understand whether message feedback ratings on a single question (or dimension) are sufficient, or whether we need ratings on multiple questions. For example, a study designed to address this question could recruit users to rate messages on multiple dimensions (
Motivational influence
This message influences me to change my behavior. (yes/no)
Emotional engagement
This message affected me emotionally. (positive and negative emotions)
Relevance
This message was personally relevant to me.
Preference
I would like more messages like this one.
Second, as noted above, using the wrong feedback data might lead to unintended consequences (see Results). Assessing whether a message might lead to unintended consequences could be challenging. One approach is to use technological advancements in data collection (eg, ecological momentary assessment or sensors) to assess the user’s behavior after receiving a message. For example, in a smoking cessation study, smokers who are attempting to quit could be provided with a mobile app to record any smoking and the reasons for smoking during the intervention. This information could then be compared with the messages that were sent immediately preceding the smoking event to assess whether that particular message was correlated with the smoking.
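The comparison described above can be sketched as follows. This is a minimal illustration that assumes a hypothetical log of sent messages, a list of self-reported smoking event times, and an arbitrary 6-hour window for what counts as "immediately preceding"; the function name `preceding_message` and all data are invented for the example.

```python
from bisect import bisect_right
from collections import Counter
from datetime import datetime, timedelta
from typing import List, Optional, Tuple

SentMessage = Tuple[datetime, str]  # (time sent, message ID)


def preceding_message(
    sent: List[SentMessage], event_time: datetime, window: timedelta
) -> Optional[str]:
    """Return the ID of the last message sent within `window` before the event.

    `sent` must be sorted by time. Returns None if no message qualifies.
    """
    times = [sent_time for sent_time, _ in sent]
    i = bisect_right(times, event_time) - 1
    if i < 0:
        return None
    sent_time, message_id = sent[i]
    if event_time - sent_time > window:
        return None
    return message_id


# Hypothetical data for one participant.
sent_log = [
    (datetime(2015, 3, 1, 9, 0), "m1"),
    (datetime(2015, 3, 4, 9, 0), "m2"),
]
smoking_events = [
    datetime(2015, 3, 1, 11, 0),
    datetime(2015, 3, 3, 20, 0),
    datetime(2015, 3, 4, 9, 30),
]
window = timedelta(hours=6)  # assumed cutoff, not taken from the literature

linked = [preceding_message(sent_log, e, window) for e in smoking_events]
counts = Counter(m for m in linked if m is not None)
print(linked, counts)
```

A count of this kind shows only co-occurrence. Whether a message contributed to the smoking event would still need to be assessed against how often each message was sent and against events that followed no message.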
There are also a few strategies that can be studied to overcome the limited availability of collective-intelligence data. For example, the preintervention stage of a study can be used for explicit data collection. Research is necessary to determine the minimum amount of explicit data needed to develop a reasonably functioning CTHC algorithm.
Technological advances can also be used to generate additional collective-intelligence data. Data warehouse technologies can be used to aggregate collective-intelligence data from multiple interventions. A new investigator can then use the data collected by other investigators to initiate a new CTHC intervention. Another interesting development in recent years is the growth of large social networks around health issues. For example, BecomeAnEx and QuitNet are social networks focused on helping smokers [
As noted, recommender systems can be implemented using 3 approaches: content based, collaborative, or hybrid. Each of these has distinct advantages. While content-based systems are similar to rule-based approaches, content-based systems can use the rich metadata that can be developed for a particular message. While metadata is primarily used for flagging the messages to a particular tailoring condition, use of metadata in content-based systems can be more powerful. CTHC messaging can be described in several ways, including its relevance to particular concepts in a behavioral theory (eg, self-efficacy), the message polarity (positive, negative, or neutral sentiment), and the topical content of the message (eg, mentioning weight loss or cravings). In theory, a content-based system can use all this information in developing a matching function. However, in practice the cost of explicitly specifying large amounts of metadata for each message can be prohibitively expensive.
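As an illustration of a content-based matching function, the sketch below scores each message by the cosine similarity between a user interest profile and the message metadata. The metadata tags, the weights, and the names `MESSAGE_METADATA` and `rank_messages` are assumptions made for this example; cosine similarity is one common choice of matching function, not one prescribed by the sources reviewed here.

```python
import math
from typing import Dict, List


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse tag-weight vectors."""
    dot = sum(a[tag] * b[tag] for tag in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical metadata: theory construct, polarity, and topic tags.
MESSAGE_METADATA: Dict[str, Dict[str, float]] = {
    "msg_a": {"weight_loss": 1.0, "self_efficacy": 1.0},
    "msg_b": {"cravings": 1.0, "positive_tone": 1.0},
    "msg_c": {"weight_loss": 1.0, "positive_tone": 1.0},
}


def rank_messages(user_profile: Dict[str, float]) -> List[str]:
    """Order message IDs from best to worst match for this user."""
    scores = {
        message_id: cosine(user_profile, metadata)
        for message_id, metadata in MESSAGE_METADATA.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# A user whose feedback suggests interest in weight loss and positive framing.
user = {"weight_loss": 1.0, "positive_tone": 0.5}
print(rank_messages(user))
```

The quality of such a ranking depends entirely on the metadata, which is why the cost of specifying it for every message matters.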
Collaborative filtering methods can bypass the need to match users to items based on explicitly defined metadata and instead derive recommendations based directly on items that similar users have rated highly. As a result, collaborative filtering recommender systems have been successful in domains such as book and movie recommendation, where enumerating all relevant characteristics of the users and items is difficult, if not impossible. However, there are certain disadvantages to using a purely collaborative filtering approach. This approach would imply that the tailoring is purely data driven and may lead to unintended consequences (see Results).
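A simple user-based nearest-neighbor version of this idea can be sketched as follows. The ratings are invented, and the choice of cosine similarity over co-rated messages with a similarity-weighted average is an assumption made to keep the example short; production systems typically use more robust similarity measures and rating normalization.

```python
import math
from typing import Dict, Optional

# Hypothetical 1-5 ratings: user -> {message ID -> rating}.
RATINGS: Dict[str, Dict[str, int]] = {
    "john": {"m1": 5, "m2": 1},
    "ana": {"m1": 5, "m2": 1, "m3": 5},
    "raj": {"m1": 1, "m2": 5, "m3": 1},
}


def similarity(u: Dict[str, int], v: Dict[str, int]) -> float:
    """Cosine similarity computed over the messages both users rated."""
    shared = u.keys() & v.keys()
    if not shared:
        return 0.0
    dot = sum(u[m] * v[m] for m in shared)
    norm_u = math.sqrt(sum(u[m] ** 2 for m in shared))
    norm_v = math.sqrt(sum(v[m] ** 2 for m in shared))
    return dot / (norm_u * norm_v)


def predict(user: str, message_id: str) -> Optional[float]:
    """Similarity-weighted average of other users' ratings for a message."""
    weighted_sum = 0.0
    total_weight = 0.0
    for other, their_ratings in RATINGS.items():
        if other == user or message_id not in their_ratings:
            continue
        weight = similarity(RATINGS[user], their_ratings)
        weighted_sum += weight * their_ratings[message_id]
        total_weight += weight
    if total_weight == 0:
        return None  # no neighbor has rated this message
    return weighted_sum / total_weight


# John has not rated m3; the estimate leans toward Ana, whose ratings match his.
print(predict("john", "m3"))
```

No message metadata is used here: the prediction comes only from how users with similar rating patterns responded to the message.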
Hybrid systems can bridge theory-based, rule-based tailoring with the recommender empirical tailoring. While this might appear to be the best fit, it might not be feasible to develop hybrid models for all projects, given the limitations of time, content, and available collective-intelligence data. Thus, research is needed to identify the best recommender approach for an intervention and what approach would provide an advance over current rule-based approaches, make the intervention most engaging within the project constraints, and most influence the targeted behavior.
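One simple way to combine the two sources of evidence is a weighted blend of a content-based (or rule-derived) score and a collaborative filtering score. The sketch below is an assumption-laden illustration: the function `hybrid_score`, the weight `alpha`, and the requirement that both scores lie on a common 0 to 1 scale are all hypothetical, and other hybrid designs are possible.

```python
from typing import Optional


def hybrid_score(
    content_score: float, collab_score: Optional[float], alpha: float = 0.5
) -> float:
    """Blend a content-based score with a collaborative filtering score.

    Both scores are assumed to be on a common 0-1 scale. `alpha` is the weight
    given to the content-based score. When no collaborative score is available
    (eg, early in an intervention with few ratings), only the content-based
    score is used.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    if collab_score is None:
        return content_score
    return alpha * content_score + (1.0 - alpha) * collab_score


print(hybrid_score(0.8, None))
print(hybrid_score(0.8, 0.4, alpha=0.25))
```

The fallback branch reflects the limited collective-intelligence data available at the start of a study; how to set or learn the weight is itself an open research question.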
Studies are needed to compare the performance of all 3 approaches. For example, a study could directly compare the performance of all 3 approaches by randomly assigning participants to receive messages tailored by either a content-based, a collaborative, or a hybrid recommender system. Such a study could be evaluated in terms of several different outcomes. In a pilot study, the outcome could simply be a comparison of the ratings provided by participants for a period of time (eg, 30 days) or of the use of the intervention functions. Ratings could be in the form of explicit ratings (eg,
Will recommender systems be better than current CTHC? There is no evidence regarding the use of recommender systems in CTHC. Research is needed to understand the benefit of incorporating recommender approaches into CTHC, in terms of increased engagement as well as behavior change. Comparative effectiveness studies are needed to evaluate the relative impact of rule-based tailoring versus recommender systems tailoring across different health behavior targets. The outcomes of such a study would be the behavior change of interest, as well as engagement and satisfaction with the intervention.
To achieve this agenda, we may need changes in our training and funding models with an increased focus on supporting interdisciplinary research bridging behavioral science and computing. As with any interdisciplinary team, researchers must be conscious of differences between disciplines in terms of terminology to ensure clear communication across team members. More fundamentally, researchers also need to be conscious of differences between disciplines in terms of where research challenges lie. For example, behavioral scientists may not be familiar with the challenges of developing, implementing, and deploying new algorithms and systems. On the other hand, computer scientists may not be familiar with the challenges involved in conducting behavior change intervention research, such as the time and effort needed to recruit subjects and ensure adequate levels of adherence to study protocols.
While this divide between disciplines has decreased with the increasing number of collaborations, additional training would speed up this merging. A model similar to the US National Science Foundation (NSF)/NIH mHealth Summer Institute training model might be a suitable approach to address some of these issues [
As mentioned above, modifying existing funding models should also be considered. Developing recommender systems will require considerable time, which the typical NIH funding model does not facilitate. Substantial preintervention work will be needed to develop these systems, including collective-intelligence data collection through pilot surveys, recommendation algorithm development and validation, Web system design, message creation, and metadata creation. A joint NSF/NIH model similar to the Big Data Request for Applications that provides an additional development cycle and also stresses collaboration across disciplines might be a potential funding model for advancing the research agenda of using recommender systems in CTHC [
The views presented in this paper are limited. Research on the incorporation of recommender systems is in its infancy. Therefore, few papers relevant to this work have been published. We wrote this paper hoping it would start the conversation. Our hope is that the research community will consider the points presented in this paper and respond with additional issues that we have not yet considered.
Recent technological advances and the widespread use of recommender systems outside health care present an incredible opportunity to improve on an already effective CTHC approach, and to reach and affect billions of users through Web and mobile technologies. Multiple challenges must be addressed to adapt recommender systems for CTHC. In this paper, we have attempted to start a discussion that we hope will help to move CTHC into the 21st century through the use of recommender systems.
CTHC: computer-tailored health communication
NIH: National Institutes of Health
NSF: National Science Foundation
Dr Sadasivam is funded by a National Cancer Institute Career Development Award (K07CA172677). Funding for these studies was also received from the Patient-Centered Outcomes Research Institute (PI12-001), National Cancer Institute grant R01 CA129091, and the National Center for Advancing Translational Sciences of the National Institutes of Health under award number UL1TR000161. Dr Houston is also supported by the US Department of Veterans Affairs eHealth Quality Enhancement Research Initiative (eHealth QUERI) that he directs. Dr Cutrona receives grant funding from the Agency for Healthcare Research and Quality (1 R21 HS023661-01), Pfizer Independent Grants for Learning & Change (9713747-01), and the National Center for Advancing Translational Sciences of the National Institutes of Health (KL 2 RR031981). Dr Marlin is also funded by a National Science Foundation CAREER award (1350522). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institute on Aging or the National Institutes of Health, or the Department of Veterans Affairs or the United States government. We thank Erin Borglund, Clinical Research Coordinator at the Department of Quantitative Health Sciences, University of Massachusetts Medical School, for her help in the literature search and proofreading the manuscript.
None declared.