This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included.
In the era of data-rich medicine, an increasing number of domains of people’s lives are datafied and rendered usable for health care purposes. Yet, deriving insights for clinical practice and individual life choices and deciding what data or information should be used for this purpose pose difficult challenges that require tremendous time, resources, and skill. Thus, big data not only promises new clinical insights but also generates new—and heretofore largely unarticulated—forms of work for patients, families, and health care providers alike. Building on science studies, medical informatics, Anselm Strauss and colleagues’ concept of patient work, and subsequent elaborations of articulation work, in this article, we analyze the forms of work engendered by the need to make data and information actionable for the treatment decisions and lives of individual patients. We outline three areas of data work, which we characterize as the work of supporting digital data practices, the work of interpretation and contextualization, and the work of inclusion and interaction. This is a first step toward naming and making visible these forms of work in order that they can be adequately seen, rewarded, and assessed in the future. We argue that making data work visible is also necessary to ensure that the insights of big and diverse datasets can be applied in meaningful and equitable ways for better health care.
Keywords: big data; data work; medical informatics; internet; data interpretation; decision support systems
Introducing Data Work
With health care becoming increasingly data driven, more and more domains of people’s lives are datafied, that is, they are translated into a format that lends itself to automatic processing and computation. Examples range from data generated by individuals using health and lifestyle smartphone apps, the digitalization of health records, data from direct-to-consumer testing or drug trials, to biobanking research and clinical genetic testing. Data from increasingly diverse sources are thus rendered, at least in principle, usable for health care purposes. Yet, deriving insights for clinical practice and individual life choices, and deciding what data or information should be used for these purposes, pose difficult challenges. Indeed, it has been argued that “big data won’t cure us” [1]; turning data into meaningful information for clinical practice requires tremendous time, resources, and skill. Thus, big data not only promises new clinical insights but also generates new, and largely unarticulated, forms of work for patients, families, and health care providers alike.
Building on insights from science studies and medical informatics, as well as on the concept of patient work and subsequent elaborations of articulation work [2-4], we analyze in this article the forms of work engendered by the need to make data and information actionable in the health care context [5]. Doing so brings the perspective of social and ethical studies of biomedicine into conversations around digital medicine, emerging technologies, medical devices, apps, engineering, and informatics. We outline 3 areas of data work, which we characterize as the work of (1) supporting digital data practices; (2) interpretation and contextualization; and (3) inclusion and interaction. We argue that it is necessary to name and make visible these forms of data work for them to be adequately acknowledged, assessed, and rewarded. Making data work visible can also help to ensure that the insights of big and diverse datasets can be applied in meaningful and equitable ways for better health care. Although this paper primarily aims to highlight emerging forms of work in the era of data-rich medicine that have not been explicitly or comprehensively considered heretofore, we close by outlining avenues for future practice and policy.
Data Work: A Persistent Challenge for the Era of Data-Rich Medicine
Emerging Forms of Data Work
Controversies surrounding data use, storage, and sharing illustrate the important ethical questions that emerge when data collection and analyses are applied to new ends. Examples in the news abound. One is the rise of direct-to-consumer genetic testing for diseases such as cancer, illustrated recently by a National Public Radio report about an uninsured American woman concerned about her risk of breast cancer [6]. Upon reading her results from 23andMe, the woman admitted feeling less urgency about getting additional testing or mammograms with her physician, something that geneticists worry could pose problems for individuals carrying variants undetected by tests offered by commercial sources, or for those who receive summary advice from individuals without proper training, possibly leading to clinical harm in the future. Other disputes have emerged when health technologies are applied to new ends, such as the recent identification of the Golden State Killer in California, United States, in April 2018. Detectives were able to identify the perpetrator by matching crime scene evidence with a family member’s DNA profile that the family member had uploaded to a genealogy website. The incident, and subsequent admission that private companies have shared access to their database with law enforcement to find potential suspects, spurred controversy among experts and the public over the legitimacy of using the personal data of volunteers who had not consented to such law enforcement applications [7,8]. Controversies such as these, along with others surrounding privacy (for example, the hacking of medical devices [9]) or matters of justice and fairness in algorithms [10], point to the centrality of data at the heart of negotiations over the public good; the status of data generated outside of official forums of science and medicine; and central ethical questions of privacy, consent, and benefit that are emerging in new configurations [11,12].
By data work, we are referring broadly to the forms of technological, analytical, and emotional work undertaken by all actors within the health care system that are necessary to make data clinically and personally meaningful. Here, we focus on the emerging forms of data work undertaken by patients and health professionals. This work is already occurring, for example in the interpretation of direct-to-consumer genetic tests [13], efforts to improve patient understanding of broad consent in biobanking [14], or as researchers define proteomic markers of risk, such as for ovarian cancer [15], albeit in an often unrecognized and patchwork manner. Although science studies scholarship has explored various determinants and conditions of data production in the health sphere [16-18], the types of work that are necessary to make diverse forms of health data actionable in daily life by patients and health professionals have not been systematically addressed or conceptually analyzed [19]. Data work is ongoing and constitutes a formidable yet underresearched challenge in the era of data-rich medicine. But what kinds of work does this entail, and for whom? What divisions of work or tools would be necessary for addressing ethical and equitable applications of data in everyday life?
Empirical studies examining the organization and structure of medical work from a sociological perspective [20-24] have been helpful to draw attention to the often invisible contributions that patients and their family members make to all aspects of health care. However, conceptualizations of such patient work in the era of data-driven medicine are, as of yet, largely missing [25]. As debate grows in medicine over how to best actualize voluminous and diverse data for better outcomes in health care [26,27], many of the biggest challenges are of a social, rather than technical, nature [1]. In this context, more systematic attention to the ways in which professional and nonprofessional actors within the health care system help, for example, to create and interpret data, would fill an important gap. In the following section, we outline and describe three areas of emerging forms of work that have accompanied the turn toward big data in medicine, identify who does this work, and sketch potential ways of addressing concerns that arise in connection with this work. For each area of data work, we offer one vignette to illustrate the forms of data work that are already ongoing. Although the boundaries between these different types of data work are fluid, we posit that there is analytic value in drawing out the key features that characterize each activity to see what challenges they pose and how we might address these.
More Than a Click Away: Supporting Digital Data Practices
G is excited about a new app that promises to keep track of his heartbeat, steps taken, and minutes slept, and to aggregate these data with his weight, blood pressure, and glucose levels. Yet, after looking at the Terms of Service, he realizes that by using the app he signs the rights to his data over to the company. G wonders if there is another option. Finding himself mired in pages of legalese, he starts to think, “maybe I’m just too uptight—what could they really do with all this data, anyways?”
Advances in mobile devices have changed how health information and support services are being accessed, communicated, monitored, and acted upon [28], offering potential gains ranging from clinical oncology [29] to improving health outcomes for low-income populations [30]. As a result, patients create and engage with health data not only in medical institutions but also in their homes and in other places outside the clinic, via wearable or portable devices, or other tools. Patients and health care professionals alike are faced with ever wider types and larger volumes of data that could potentially be relevant for health care, without a clear understanding of the implications of specific forms of personal data [31]. In the domain of mobile apps, one form of emergent data work is the work done by patients who search through the fine print of the Terms of Service of new devices and apps to decide whether or not to use them. This is often not easy to do; for instance, the interests of a company providing a digital health device or platform might be hard to fathom for a user, posing potential concerns for individuals who are consenting to data use agreements for a health app, or uploading their medical history to a Web portal for a rare-disease patient community.
Furthermore, the ability to learn about genetic traits, which is now available on the internet at ever lower expense, raises profound ethical challenges. As Kung and Wu ask, “if we discover certain genetic risk factors in our genome sequences, do we (or our health care providers) have a responsibility to inform our family members who might have similar genetic risks?” [32]. Matters of privacy and the effects of new health technologies on future generations become important concerns with which individuals have to grapple, often with very little or no guidance available. The work that people are doing when navigating the landscape of available offers, and in deciding what test they should take and what behavior they should track in an attempt to maintain or increase their health, should not be trivialized. There are increasing expectations that individuals make informed decisions as responsible managers of their health, and now also as owners (morally or legally) of their data.
In addition to such new data work for patients, another novel form of data work emerges for health professionals. This consists of assisting patients and their families in navigating the landscapes of available offers for tests, devices, and services, and helping them to decide whether they should datafy certain aspects of their lives and bodies in the first place. This data work includes engaging patients in conversations about the implications of their potential data contributions before patients have had practical experience with these digital practices, and about whether and how they should consider engaging in certain activities. Steering patients through the multitude of options is an important yet complex task. Recent studies have also shown that socioeconomic status, age, English literacy, and digital literacy all play important roles in the uptake of new mobile technologies such as health apps [28,29,33], in engagement in Web-based participatory medical research [34], and in efforts to counter the digital divide [35-37]. Importantly, these differences also influence whose data are missing from the broader evidence base upon which future decisions in medicine might be made [25]. This points to the growing need to ensure that such digital health practices and technologies do not exacerbate existing inequalities in society or health, and to the critical role that health professionals are called upon to play in mediating digital engagements.
Looking forward, we thus anticipate that the data work of professionals in this space will include not only assisting patients in navigating this digitalized network of health-relevant services but also assisting those who cannot, or choose not to, engage digitally [38]. As noted, people who do not make use of digital tools to collect, view, and share data and information about themselves can become missing bodies in today’s health care environments, meaning that their bodies, needs, and behaviors remain unaccounted for in decisions made on the basis of new digital health sources [25,39]. Especially when the stakes are so high, neither offering guidance on patient use of digital tools and new health products nor understanding the advantages and disadvantages of the many new products on the market every day is intuitive. To be effective, these activities require time and appropriate training, which are in very short supply in today’s time-starved health care environment [40-43].
One possibility, as we have argued elsewhere, to better support both patients and providers in the era of data-rich medicine would be the creation of a new, intermediary profession entirely, which we have termed health information counselors (HICs) [44]. With a broad knowledge of various kinds of health data and data quality evaluation techniques, as well as analytic skills in statistics and data interpretation, our vision is that HICs would be trained also in interpersonal communication, health management, insurance systems, and medico-legal aspects of data privacy. Operating as a clinical consultancy, HICs would have the ability to translate the complex language of data into intelligible and actionable information for both patients and physicians. The creation and implementation of such a specialty would enable patients to make educated, truly autonomous choices about how these novel forms of health data can inform their personal care decisions. Although certainly not the only option for addressing the aforementioned concerns, the creation of this new specialty would go a long way in assisting individuals such as G from our opening vignette, as well as health care professionals, to consider their options and make more informed choices about how increasing amounts of health data and information can or should inform health care.
How to Tell It All Apart: The Work of Interpretation and Contextualization
A brother informs his sister, L, that he has done a commercial DNA test that revealed that he could be a carrier for a particular condition. Because L is considering having a child with her partner, she wonders if she should undergo testing, and what this would mean for their decisions going forward. In reading the leaflet provided by a company offering the testing, she is not sure what is meant by the information that carrier reports may vary in detection accuracy by ethnicity (L has Ashkenazi heritage), and that carrier testing does not include all possible variants for a given condition. L wonders: “What would this information mean for me personally? Who could I ask about this?” She is unsure if her primary care physician is the right person to ask, and who else she could turn to.
Testing practices such as the one described in this vignette have become a means through which individuals understand themselves and their relationship to society. For some, the quantified self makes it possible to see new patterns or make changes in their lives: counting steps might lead one person to take the stairs, and tracking sleep patterns might lead another to try to get an extra hour of sleep. For others, finding out the percentages of their global ancestry or the likelihood that they could be a carrier for a genetic condition represents personally significant information. Yet, the effects of health-related data and information are often difficult to anticipate and understand. Randomized controlled trials have studied the clinical impact of patients’ use of mobile and digital health tools, such as the effectiveness of smartphone apps for weight loss and self-applied therapies [45-47]. Other studies have shown the necessity of looking at patient experience of digital tools to understand how mobile health affects self-management of chronic conditions or changes in well-being [48-50]. In some cases, certain forms of health information may have personal utility for some people even if they lack clinical utility [51]. Overall, such research shows that further work, such as prescreening or offering hands-on assistance and consultation, is needed to turn a health app or Web-based service such as direct-to-consumer testing into a meaningful tool for an individual patient [52].
Data science holds the potential to offer important predictive and diagnostic information that can be used to improve decisions taken by clinicians to reduce error or support estimates, such as the likelihood of medication adherence or organ rejection [53,54]. Yet, from body temperature to steps taken, heartbeats, and hydration levels, it is not yet clear what the biometric data collected via devices such as wearables or smartphones will mean for medical practice and health practitioners. The same is true for nonmedical grade testing services. Both the quality of the data and the possibilities of data interpretation are relevant here. Commercial devices are often not calibrated to the standards of medical grade devices, particularly if not used exactly as intended, which means that data collected through them cannot be used as reliable evidence for health care decisions. Internet communities and apps that offer peer-to-peer support can also be problematic when inaccurate or purely anecdotal information is shared, for example, how-to-hack Web-based tutorials or the increasing use of YouTube as a platform for disseminating misleading health information or offering problematic interpretations of existing data on conditions such as anorexia and bulimia [55-58].
The complex task of discerning irrelevant, unreliable, or misleading health information from relevant, valid, and clinically actionable personalized health resources and then interpreting and contextualizing these for specific patients and their families is emerging as a significant, and time-consuming, activity for health care providers. In our survey of health professionals working in the region of Schleswig-Holstein, Germany, providers expressed repeated concerns about the increasing amount of time devoted in patient encounters to explaining why data from a Web-based genetic test are not relevant, or why a novel therapy reported on a patient community website is not the best choice for a family member [59]. These findings are echoed by recent reports that have pointed to the need for new and improved decision aids to situate the most personally relevant and high-quality digital tools for patients [28,60]. Although some standardization work regarding this issue is currently being undertaken by groups such as the Consumer Technology Association, the creation of new devices, apps, and programs and the demands these pose regarding data interpretation and contextualization continue to exceed regulatory processes and physician workloads.
In this context, data work includes deciding which data or information are reliable and relevant for a given context of a specific patient, including contexts outside of the clinic, to decide which intervention, tool, or device might be appropriate or helpful in a given situation, or in the future. Again, this is a complex task. For example, discerning whether data brought in by patients derived from commercial or hacked devices can be clinically relevant involves researching devices, analyzing the information they collect, and deciding if, and how, the information generated could be used to inform individual case decisions. In some instances, such data work could include contacting the company producing the device for more information, or seeking out additional resources to evaluate the reliability of the data generated. The same is true for commercially available genetic testing, or the results derived from nonstandard forms of research occurring on patient platforms, such as in some citizen science initiatives [61].
The work of contextualization also increasingly extends to the analysis of the algorithms used to produce data in the health care context. Algorithms are neither ‘objective’ nor intrinsically neutral, and they can exacerbate societal inequities. Biases regarding race, gender, educational status, body mass index, and so on are programmed into systems, and the characteristics of datasets that these systems use to learn might reproduce inequities [10,62]. As more and more parts of our lives are being datafied, there is an increasing need for contextualization of the health data gained through Web-based tests, mobile, and digital technologies [63]. This includes making the context of data explicit, and asking questions such as: What data were collected, from whom, and how? What do these data represent, and what do they leave out? How have they been made legible for computation, and what has been lost or gained in the process? Such questions are increasingly necessary given the growing ubiquity of domains of everyday life being understood through computational practices. All of the above forms of evaluation require a significant degree of analytical and computational literacy and reflection on whether a particular process of meaning-making relies on evidence that is accurate and reliable in a technical sense, if it is mostly personal and social, or if it is indeed faulty or misleading [64].
Patients, in addition to health care professionals, are also increasingly participating in specific forms of work, including outside of clinical settings. This is the case, for example, when patients do internet searches and seek assistance in making sense of reports or articles found on the internet, thus engaging in the work of sorting, interpreting, and analyzing diverse and often competing sources of information. Often this type of work is undertaken by family members or caregivers to support a patient’s health care choices. The work of contextualization will remain a persistent challenge in years to come as more devices, apps, and health-related services are offered to individuals outside the supervision of medical professionals. As an area that is in need of robust investigation and public debate, it would be productive to have greater involvement by scientific and academic societies in conducting and sharing analysis of how data can and should be used. Although some of this work is already ongoing, such as recent reports addressing the opportunities, risks, and ethical questions associated with use of good artificial intelligence (AI) in health care, or developing specific suggestions that can be taken up by stakeholders and policy makers at national and international levels [65,66], further work is needed on different aspects of the use of big data in medicine. By fostering greater debate, and providing material that is available for lay readership to engage with the stakes of their data engagement, academic scholarship can better support digital literacy in this area.
Facilitating Conversations About Aims and Interests: The Work of Inclusion and Interaction
Upon entering the hospital for an inpatient stay, P, an elderly patient, is asked to opt in to the institution’s efforts to improve efficiency and calculate predictive health and frailty scores for patients [67]. P is not sure what this means, or how his personal information will be stored and used in the future.
The prior areas of data work that we have outlined have emphasized the need for a strong awareness of what new data, tests, and technologies are available and how they work. Data-rich medicine highlights a number of ethical issues [11], not least of which is the cross-cutting work of addressing different aims, goals, and interests. As data are increasingly accessible, distributed, revealing, and reidentifiable, ethical concerns pertaining to digital health, large datasets, and precision medicine are multiplying, including issues of consent, protecting participant privacy, and maintaining public trust [68]. Given that many data-driven practices chart new territory in health, questions of power asymmetries and socioeconomic value are emerging with new relevance [12,69]. An important form of data work thus involves fostering conversations with and across stakeholder groups around these concerns.
As precision medicine moves away from one-size-fits-all approaches to treatment, machine learning approaches are increasingly improving the ability to target patients for specific treatments, such as in the use of DNA methylation to subclassify tumors of the central nervous system [70]. The potential of this work to improve personalized therapies through the use of mathematical models is great, yet both the perceived benefits and the social, economic, and health-related concerns vary by actor [71]. In other words, a provider will likely have a different set of investments in the technology, research, and treatment outcomes than a given patient, a hospital chief executive officer, a pharmaceutical company, or an interested member of the public. A patient might be most concerned about loss of privacy, discrimination, or stigmatization (albeit also interested in disease prevention and better treatment), whereas company representatives might be uneasy about losing exclusive access to datasets and find themselves at odds with community members committed to principles of open access. Thus, a central aspect of data work is creating the spaces for interaction and facilitating conversations between differently motivated parties, such as assisting one actor to understand the concerns of another, or finding novel ways to address specific concerns around discrimination, privacy, or equity.
In the digital era, privacy concerns take on a different configuration than in the paper age [72]. Data work in the context of privacy is not limited to simply informing patients of what happens with their data and information once it has been collected but includes moving beyond the widely accepted ethical principle of respecting patient autonomy [73] to including patients in decisions over what type of information will be collected about them in the first place, and to what end. The General Data Protection Regulation (GDPR) introduced protections that took effect in 2018 across the European Union (EU; including the United Kingdom), but outside of the EU, there is little agreement on regulatory standards for digital health tools or data protection in research, databanks, and big data [61,74-77]. Despite the overall objective of European harmonization, the GDPR gives member states leeway, for instance, in determining whether patient consent is required for secondary data use in medical research, and in which form [74,78]. These national differences have various practical and normative consequences, most of which have not yet been fully analyzed, as well as different implications for research practice across member states. Legislation in countries where data protection is sector specific, rather than general, such as the Health Insurance Portability and Accountability Act (HIPAA) in the United States, has addressed data privacy and security concerns relating to medical information since 1996. Subsequently, the HIPAA omnibus rule of 2013 modified the Act to meet guidelines set by the Health Information Technology for Economic and Clinical Health (HITECH) Act of 2009. Such efforts have expanded the scope of HIPAA beyond providers and insurance companies to also consider the role of business associates.
However, even though concerns surrounding patient privacy and the reuse of health information have long been an important topic, the ability of existing regulation such as the GDPR or HIPAA to fully address the concerns emerging in the age of big data remains unknown [79]. We highlight here that the forms of data work we identify can pose particular challenges for privacy, including the rapid rate of digital innovation; the need for decisions on both the individual and societal levels about which aspects of everyday life should be captured by data in the first place; the possibility of harm from data use that is not necessarily illegal [80]; and broader concerns about data privacy protection legislation.
How to effectively engage a range of stakeholders, including patients, providers, researchers, and insurance companies, in these data work concerns is an ongoing discussion in both clinical practice and biomedical research [81-83]. One critical area of data work for health care providers and researchers is holding conversations with patients about data collection and privacy to better understand the impact of collecting anonymized patient health data in research [14,84]. Data work includes ensuring that patients are party to the decisions about what information will be included in their records, who the gatekeepers for this information are, and for which goals and for whose benefit this information will be used beyond the realm of individual-level health care decisions. It is critical that these discussions include reflections on how data could potentially be reused in the future, for example, the use of predictive health and frailty scores by insurance companies as mentioned in the vignette, as well as the identification of potential protections to guard against uses of data that could be harmful or exclusionary to patients. Specific conditions of access, reuse, and reidentification need to be identified and continually updated in light of new digital advances.
In particular, digital technologies raise important questions over the access of personal information. Each patient’s needs and interests are influenced by their human, natural, and artifactual environments. An individual’s decision to access his or her electronic health records or use a Web-based genetic testing service is not just a choice made by an atomistic individual but an act shaped by the person’s family ties and social relations, his or her connection to others, and the country in which he or she lives [85]. For example, an individual may want to share and discuss this health information with his or her partner or children [82]. This decision to share and discuss information received is not an afterthought but may well have shaped the decision to obtain information in the first place [86]. This layer of dyadic or multilateral forms of decision making can vary significantly across cultural contexts.
In sum, joining distinct datasets of different types, from different locations, and collected under different ethical standards adds further layers of deliberation to well-rehearsed ethical considerations. Recognizing and fostering dialog around aims and goals and the more complex, potentially shared nature of decision making in the era of big data is a critical form of data work. However, how this can be achieved when data are held in dispersed locations and are diverse in nature is entirely unclear. It will require close communication between the patient and the health care provider to ensure that the built-in decisional pathways offered by data-driven practices do not eclipse individual priorities. One potential way of addressing this concern is to reconsider existing methods for ensuring patient privacy and protection and to address them through regulatory measures, for example through the GDPR in Europe. According to the GDPR, for personal data to be processed lawfully, either individual consent is required or a legal authorization has to apply. The most relevant legal authorization in the medical context is the research exemption (Article 89). However, particularly in view of international research collaborations, further work is necessary on how the GDPR is implemented across individual countries. To provide an example, in line with Article 89, Germany now allows the processing of pseudonymized data for scientific or historical research purposes or for statistical purposes, at least prima facie, without requiring individual consent. However, clear guidance does not yet exist on exactly how these purposes are delineated, nor have studies been conducted on how this new legal provision has penetrated research practice and how its effects differ from those in more restrictive countries.
Countries that have long-term experience with more permissive approaches, such as broad or blanket consent (eg, the United Kingdom) and the processing of genetic data, should help to anticipate the implications of the novel practice and to raise the standards for how informed consent can be better operationalized in light of the concerns of big data, including in areas outside of Europe [87].
Who Does Data Work: Patient Work 2.0?
The different kinds of technological, intellectual, social, and emotional work sketched here mean that patients, their families, caregivers, and other health care providers will be faced with an increasing range of tasks in the domain of health care, which we have summarized in a list (Table 1). This list of tasks is not meant to be exhaustive but rather to make explicit some of the principal kinds of work involved in making data matter medically. Many of these concerns overlap; we expect that new forms of expertise will continue to emerge along with clinical and technological advances.
Table 1. Outline of various types of data work with examples.

Type of data work: Supporting digital data practices
Why is this work needed? Engagement with health data is increasingly taking place outside the clinic, and it can also create digital divides; traditional means of managing and evaluating data are increasingly not suited to meet the realities of the digital age; persistent difficulties in assessing accuracy and appropriateness of diverse, unvalidated forms of health data.
Examples of data work in practice (ongoing and possible in the future): Patients research and consider the implications of data; health practitioners assist in navigation of data relationships; creation of guidelines for how to evaluate new digital technologies or assess internet sources; identification of how digital interaction can create new patterns of exclusion.

Type of data work: The work of interpretation and contextualization
Why is this work needed? Unclear what biometric data collected via devices such as wearables or smartphones will mean for medical practice; misleading or false health information is often shared on the internet; the algorithms that produce data are neither objective nor intrinsically fair; the full implications of diverse, unregulated health information are often difficult for users to discern or anticipate.
Examples of data work in practice (ongoing and possible in the future): Expert guidance on how to decide which devices and resulting data are reliable and relevant for a given context; research on reliability of commercial devices; provision of prescreening and assistance to make digital health tools meaningful for individual patients; identification of biases built into algorithms of datasets, devices, and models.

Type of data work: The work of inclusion and interaction
Why is this work needed? Data are increasingly accessible, distributed, revealing, and reidentifiable, creating new ethical concerns; perceived benefits of the data-driven medicine and the social, economic, and health-related concerns vary by actor; patient experience of digital tools affects self-management of chronic conditions and well-being.
Examples of data work in practice (ongoing and possible in the future): Support for patients in determining their priorities, needs, and wishes with regard to their digital health activities and data collection and use; facilitation of conversations between differently motivated parties about aims, goals, and interests.
Yet what is clear is that the problems accompanying these demands are currently underappreciated. This raises the question of who should be tasked with meeting the growing need to interpret data in the health care domain. Visions of data-rich medicine often imply that doctors should or will take on this work, as reflected in frequent calls for better genomic or data literacy for health care professionals. In the past decade, there have been numerous calls for more training in several of the domains mentioned above, such as ethical concerns surrounding the communication of genetic data and related health risks to patients [42], or counseling patients about the advantages and pitfalls of Web-based or commercial sources of health information [69]. Some, such as Celi et al, call for increased training of medical students and residents, arguing that “creat[ing] a medical culture that is aware of and respectful of the importance and potential power of data for supporting and improving both practice and research may be the most important and ultimately effective element” [53]. At the moment, although health care professionals are seen as the first in line to take on this additional work, allowances are not made in schedules or training to accommodate meaningful engagement with the social complexities of data in medicine. Even if actors find the time to engage in the various types of data work, not all can acquire the necessary skills. Finally, many of the tasks described above take place outside health professionals’ sphere of influence entirely.
Throughout this paper, we have proposed a few possible ways of addressing the emerging forms of data work identified here, ranging from the creation of a new profession dedicated to helping both patients and providers assess and understand diverse kinds of health data, to greater involvement and creation of guidelines by scientific and academic societies, to raising expectations through regulatory frameworks for how mechanisms such as informed consent are operationalized across novel research practices. However, none of these approaches alone will be sufficient for taking on the myriad aspects of data work that we have outlined, as well as those that will continue to emerge in the future. Although the focus of this paper has been on the identification of the contours of the phenomenon we are calling data work, further attention is needed to analyze and consider other solutions for addressing these concerns. Importantly, some aspects of data work can neither be delegated to professionals nor addressed completely through better guidelines or greater public discourse. Hence, the current landscape of big data in medicine remains open for new proposals, such as how such work can or should be acknowledged or even reimbursed. What other tools (conceptual, analytic, instructive, or collaborative) would be helpful for navigating increasingly complex data use? What would be a fair division of work? What responsibilities should corporations using health data have, beyond compliance with data protection regulations? Our intent is that by making these forms of work more explicit and transparent, more appropriate ways of addressing data work can be devised in the future.
Conclusions
In addition to the established challenges surrounding data collection, storage, analysis, and security, pressing questions have arisen around how to enable the appropriate use of technologies and engagement with health data outside of the structured environment of health care; what the utility, quality, and possibilities of data collected from wearable devices or smartphones will be for clinical practice; how to avoid the digital health divide; how to distinguish data noise from clinically actionable health resources for patients; how to contextualize health data gained through Web-based tests or digital technologies; and how to foster conversations surrounding the ethical concerns of big data between different stakeholders in health care and society. Of course, the various forms of work included within the categories of supporting digital data practices, interpretation and contextualization, and inclusion and interaction cannot be neatly disentangled. Conversations between different actors in the health care domain are necessary to determine what types of data and data use are feasible, ethical, and cost-effective in particular situations. Although we expect that AI applications such as deep learning will be of great help in matters such as the interpretation of data, the analysis above has shown that the task of interpretation is not something that can be devolved to machines entirely.
A critical thread that runs throughout the forms of data work identified here is that of context: data work does not involve questions of absolutes but rather of contingencies. What is relevant, important, or significant for one individual may not apply to the next. Data, just like the experience of health and illness, are profoundly dependent upon the social world in which they exist. As we have shown in this paper, the turn toward data-rich health care has created new forms of data work and expertise. Data work needs to be named and recognized as the human endeavors that make digital advances meaningful in medicine. We argue that greater attention is needed for the very craft of deriving choices, narratives, and practices from our data and that the current medical system is not equipped to take on this challenge alone. If the great potential of data-rich medicine to improve future clinical care is to be realized, the new data work that patients, health professionals, and other actors increasingly contribute must be recognized as an important and multifaceted task.
Abbreviations
AI: artificial intelligence
EU: European Union
GDPR: General Data Protection Regulation
HIC: health information counselor
HIPAA: Health Insurance Portability and Accountability Act
Conflicts of Interest
None declared.
References
1. Neff G. Why big data won't cure us. 2013;1(3):117-23. doi: 10.1089/big.2013.0029. PMID: 25161827. PMCID: PMC4114418.
2. Reddy MC, Gorman P, Bardram J. Special issue on supporting collaboration in healthcare settings: the role of informatics. 2011;80(8):541-3. doi: 10.1016/j.ijmedinf.2011.05.001. PMID: 21680235.
3. Aarts J. A sociotechnical perspective of health information technology. 2013;82(12):1133-5. doi: 10.1016/j.ijmedinf.2013.10.007. PMID: 24216291.
4. Bjørn P, Kensing F. Special issue on information infrastructures for healthcare: the global and local relation. 2013;82(5):281-2. doi: 10.1016/j.ijmedinf.2013.01.004. PMID: 23422271.
5. Leonelli S. Data interpretation in the digital age. 2014;22(3):397-417. doi: 10.1162/POSC_a_00140. PMID: 25729262. PMCID: PMC4340525.
6. Stein R. Results Of At-Home Genetic Tests For Health Can Be Hard To Interpret. 2018-06-18. https://tinyurl.com/y974fodb [accessed 2018-06-25].
7. Kolata G, Murphy H. The Golden State Killer Is Tracked Through a Thicket of DNA, and Experts Shudder. 2018-04-27. https://tinyurl.com/y8yqd25c [accessed 2019-06-10].
8. Saey TH. What FamilyTreeDNA Sharing Genetic Data With Police Means For You. 2019-02-06. https://www.sciencenews.org/article/family-tree-dna-sharing-genetic-data-police-privacy [accessed 2019-06-10].
9. Sifferlin A. Why Perfectly Healthy People Are Using Diabetes Monitors. 2017-03-16. http://time.com/4703099/continuous-glucose-monitor-blood-sugar-diabetes/ [accessed 2019-06-10].
10. O'Neil C. New York: Crown; 2016.
11. [Big Data and Health - Data Sovereignty as Informational Freedom Design]. 2017. https://tinyurl.com/y6qnjhrb [accessed 2019-06-10].
12. The Collection, Linking and Use of Data in Biomedical Research and Health Care: Ethical Issues. 2014. http://nuffieldbioethics.org/wp-content/uploads/Biodata-a-guide-to-the-report-PDF.pdf [accessed 2019-06-10].
13. Su P. Direct-to-consumer genetic testing: a comprehensive view. 2013;86(3):359-65. PMID: 24058310. PMCID: PMC3767220.
14. Richter G, Krawczak M, Lieb W, Wolff L, Schreiber S, Buyx A. Broad consent for health care–embedded biobanking: understanding and reasons to donate in a large patient sample. 2018;20(1):76-82. doi: 10.1038/gim.2017.82. PMID: 28640237.
15. Whetton A. Protein Biomarkers For Precision Medicine: Development of Large Scale Integrated Platform for Clinical Proteomics. 2019 [accessed 2019-06-10].
16. Neff G, Tanweer A, Fiore-Gartland B, Osburn L. Critique and contribute: a practice-based framework for improving critical data studies and data science. 2017;5(2):85-97. doi: 10.1089/big.2016.0050. PMID: 28632445. PMCID: PMC5515123.
17. Leonelli S. Chicago: University Of Chicago Press; 2016.
18. Ebeling MF. New York: Palgrave Macmillan; 2016.
19. Tempini N, Leonelli S. Genomics and big data in biomedicine. UK: Routledge; 2018:44-51.
20. Strauss AL, Fagerhaugh S, Suczek B, Wiener C. The work of hospitalized patients. 1982;16(9):977-86. doi: 10.1016/0277-9536(82)90366-5. PMID: 7112176.
21. Strauss A. The articulation of project work: an organizational process. 1988;29(2):163-78. doi: 10.1111/j.1533-8525.1988.tb01249.x.
22. Strauss AL. New Jersey: Transaction Publishers; 1997.
23. Clarke A, Mamo L, Fishman JR, Shim JK, Fosket JR. Biomedicalization: technoscientific transformations of health, illness, and US biomedicine. 2003;68(2):161-94. doi: 10.2307/1519765.
24. Stacey M. Who are the health workers? Patients and other unpaid workers in health care. 1984;5(2):157-84. doi: 10.1177/0143831X8452002.
25. Prainsack B. New York: NYU Press; 2017.
26. Knoppers BM, Thorogood AM. Ethics and big data in health. 2017;4:53-7. doi: 10.1016/j.coisb.2017.07.001.
27. Belle A, Thiagarajan R, Soroushmehr SM, Navidi F, Beard DA, Najarian K. Big data analytics in healthcare. 2015;2015:370194. doi: 10.1155/2015/370194. PMID: 26229957. PMCID: PMC4503556.
28. Loiselle CG, Ahmed S. Is connected health contributing to a healthier population? 2017;19(11):e386. doi: 10.2196/jmir.8309. PMID: 29127077. PMCID: PMC5701967.
29. Hesse BW, Greenberg AJ, Rutten LJ. The role of internet resources in clinical oncology: promises and challenges. 2016;13(12):767-76. doi: 10.1038/nrclinonc.2016.78. PMID: 27273045.
30. Ramirez V, Johnson E, Gonzalez C, Ramirez V, Rubino B, Rossetti G. Assessing the use of mobile health technology by patients: an observational study in primary care clinics. 2016;4(2):e41. doi: 10.2196/mhealth.4928. PMID: 27095507. PMCID: PMC4858592.
31. Klerings I, Weinhandl AS, Thaler KJ. Information overload in healthcare: too much of a good thing? 2015;109(4-5):285-90. doi: 10.1016/j.zefq.2015.06.005. PMID: 26354128.
32. Kung J, Wu CT. Leveling the playing field: closing the gap in public awareness of genetics between the well served and underserved. 2016;46(5):17-20. doi: 10.1002/hast.613. PMID: 27649825.
33. Peng W, Kanthawala S, Yuan S, Hussain SA. A qualitative study of user perceptions of mobile health apps. 2016;16(1):1158. doi: 10.1186/s12889-016-3808-0. PMID: 27842533. PMCID: PMC5109835.
34. Del Savio L, Prainsack B, Buyx A. Motivations of participants in the citizen science of microbiomics: data from the British gut project. 2017;19(8):959-61. doi: 10.1038/gim.2016.208. PMID: 28125088.
35. Nguyen A, Mosadeghi S, Almario CV. Persistent digital divide in access to and use of the internet as a resource for health information: results from a California population-based study. 2017;103:49-54. doi: 10.1016/j.ijmedinf.2017.04.008. PMID: 28551001.
36. Hong YA, Zhou Z, Fang Y, Shi L. The digital divide and health disparities in China: evidence from a national survey and policy implications. 2017;19(9):e317. doi: 10.2196/jmir.7786. PMID: 28893724. PMCID: PMC5613190.
37. Lorence DP, Park H, Fox S. Racial disparities in health information access: resilience of the digital divide. 2006;30(4):241-9. doi: 10.1007/s10916-005-9003-y. PMID: 16978003.
38. Fox S, Purcell K. Chronic Disease and the Internet. 2010. http://www.pewinternet.org/2010/03/24/chronic-disease-and-the-internet/ [accessed 2019-06-10].
39. Casper M, Moore LJ. New York: NYU Press; 2009.
40. The Topol Review. Preparing the Healthcare Workforce to Deliver the Digital Future: An Independent Report on Behalf of the Secretary of State for Health and Social Care. 2019-02. https://topol.hee.nhs.uk/wp-content/uploads/HEE-Topol-Review-2019.pdf [accessed 2019-06-10].
41. Gorman D, Kashner TM. Medical graduates, truthful and useful analytics with big data, and the art of persuasion. 2018;93(8):1113-6. doi: 10.1097/ACM.0000000000002109. PMID: 29280752.
42. Badalato L, Kalokairinou L, Borry P. Third party interpretation of raw genetic data: an ethical exploration. 2017;25(11):1189-94. doi: 10.1038/ejhg.2017.126. PMID: 28832567. PMCID: PMC5643961.
43. Annes JP, Giovanni MA, Murray MF. Risks of presymptomatic direct-to-consumer genetic testing. 2010;363(12):1100-1. doi: 10.1056/NEJMp1006029. PMID: 20843242.
44. Fiske A, Buyx A, Prainsack B. Health information counselors: a new profession for the age of big data. 2019;94(1):37-41. doi: 10.1097/ACM.0000000000002395. PMID: 30095453. PMCID: PMC6314498.
45. Granado-Font E, Flores-Mateo G, Sorlí-Aguilar M, Montaña-Carreras X, Ferre-Grau C, Barrera-Uriarte ML, Oriol-Colominas E, Rey-Reñones C, Caules I, Satué-Gracia EM, OBSBIT Study Group. Effectiveness of a smartphone application and wearable device for weight loss in overweight or obese primary care patients: protocol for a randomised controlled trial. 2015;15:531. doi: 10.1186/s12889-015-1845-8. PMID: 26041131. PMCID: PMC4455326.
46. Lewis GK Jr, Langer MD, Henderson CR Jr, Ortiz R. Design and evaluation of a wearable self-applied therapeutic ultrasound device for chronic myofascial pain. 2013;39(8):1429-39. doi: 10.1016/j.ultrasmedbio.2013.03.007. PMID: 23743101.
47. Mummah S, Robinson TN, Mathur M, Farzinkhou S, Sutton S, Gardner CD. Effect of a mobile app intervention on vegetable consumption in overweight adults: a randomized controlled trial. 2017;14(1):125. doi: 10.1186/s12966-017-0563-2. PMID: 28915825. PMCID: PMC5603006.
48. Macdonald GG, Townsend AF, Adam P, Li LC, Kerr S, McDonald M, Backman CL. eHealth technologies, multimorbidity, and the office visit: qualitative interview study on the perspectives of physicians and nurses. 2018;20(1):e31. doi: 10.2196/jmir.8983. PMID: 29374004. PMCID: PMC5807622.
49. Anstey Watkins J, Goudge J, Gómez-Olivé FX, Huxley C, Dodd K, Griffiths F. mHealth text and voice communication for monitoring people with chronic diseases in low-resource settings: a realist review. 2018;3(2):e000543. doi: 10.1136/bmjgh-2017-000543. PMID: 29527356. PMCID: PMC5841508.
50. Banbury A, Nancarrow S, Dart J, Gray L, Parkinson L. Telehealth interventions delivering home-based support group videoconferencing: systematic review. 2018;20(2):e25. doi: 10.2196/jmir.8090. PMID: 29396387. PMCID: PMC5816261.
51. Turrini M, Prainsack B. Beyond clinical utility: the multiple values of DTC genetics. 2016;8:4-8. doi: 10.1016/j.atg.2016.01.008. PMID: 27047753. PMCID: PMC4796703.
52. Anderson K, Burford O, Emmerton L. Mobile health apps to facilitate self-care: a qualitative study of user experiences. 2016;11(5):e0156164. doi: 10.1371/journal.pone.0156164. PMID: 27214203. PMCID: PMC4876999.
53. Celi LA, Davidzon G, Johnson AE, Komorowski M, Marshall DC, Nair SS, Phillips CT, Pollard TJ, Raffa JD, Salciccioli JD, Salgueiro FM, Stone DJ. Bridging the health data divide. 2016;18(12):e325. doi: 10.2196/jmir.6400. PMID: 27998877. PMCID: PMC5209608.
54. Graber ML. The incidence of diagnostic error in medicine. 2013;22(Suppl 2):ii21-7. doi: 10.1136/bmjqs-2012-001615. PMID: 23771902. PMCID: PMC3786666.
55. Madathil KC, Rivera-Rodriguez AJ, Greenstein JS, Gramopadhye AK. Healthcare information on YouTube: a systematic review. 2015;21(3):173-94. doi: 10.1177/1460458213512220. PMID: 24670899.
56. Murray E, Lo B, Pollack L, Donelan K, Catania J, White M, Zapert K, Turner R. The impact of health information on the internet on the physician-patient relationship: patient perceptions. 2003;163(14):1727-34. doi: 10.1001/archinte.163.14.1727. PMID: 12885689.
57. Fernandez-Luque L, Karlsen R, Bonander J. Review of extracting information from the social web for health personalization. 2011;13(1):e15. doi: 10.2196/jmir.1432. PMID: 21278049. PMCID: PMC3221336.
58. Syed-Abdul S, Fernandez-Luque L, Jian WS, Li YC, Crain S, Hsu MH, Wang YC, Khandregzen D, Chuluunbaatar E, Nguyen PA, Liou DM. Misleading health-related information promoted through video-based social media: anorexia on YouTube. 2013;15(2):e30. doi: 10.2196/jmir.2237. PMID: 23406655. PMCID: PMC3636813.
59. Fiske A, Prainsack B, Buyx A. 2019.
60. Aitken M, Lyle J. Patient Adoption of mHealth: Use, Evidence and Remaining Barriers to Mainstream Acceptance. 2015-09. https://www.iqvia.com/-/media/iqvia/pdfs/institute-reports/patient-adoption-of-mhealth.pdf
61. The Ethical Implications of New Health Technologies and Citizen Participation. 2015. http://ec.europa.eu/research/ege/pdf/opinion-29_ege.pdf [accessed 2019-06-10].
62. Pasquale F. London, England: Harvard University Press; 2016.
63. Becker S, Miron-Shatz T, Schumacher N, Krocza J, Diamantidis C, Albrecht UV. mHealth 2.0: experiences, possibilities, and perspectives. 2014;2(2):e24. doi: 10.2196/mhealth.3328. PMID: 25099752. PMCID: PMC4114478.
64. Harris A, Kelly S, Wyatt S. UK: Routledge; 2016.
65. Floridi L, Cowls J, Beltrametti M, Chatila R, Chazerand P, Dignum V, Luetge C, Madelin R, Pagallo U, Rossi F, Schafer B, Valcke P, Vayena E. AI4People-an ethical framework for a good AI society: opportunities, risks, principles, and recommendations. 2018;28(4):689-707. doi: 10.1007/s11023-018-9482-5. PMID: 30930541. PMCID: PMC6404626.
66. Fiske A, Henningsen P, Buyx A. Your robot therapist will see you now: ethical implications of embodied artificial intelligence in psychiatry, psychology, and psychotherapy. 2019;21(5):e13216. doi: 10.2196/13216. PMID: 31094356.
67. Ruckenstein M, Schüll ND. The datafication of health. 2017;46:261-78. doi: 10.1146/annurev-anthro-102116-041244.
68. Kaye J, Curren L, Anderson N, Edwards K, Fullerton SM, Kanellopoulou N, Lund D, MacArthur DG, Mascalzoni D, Shepherd J, Taylor PL, Terry SF, Winter SF. From patients to partners: participant-centric initiatives in biomedical research. 2012;13(5):371-6. doi: 10.1038/nrg3218. PMID: 22473380. PMCID: PMC3806497.
69. Medical Profiling and Online Medicine: The Ethics of ‘Personalised Healthcare’ in a Consumer Age. 2010. https://tinyurl.com/zpq4pwb [accessed 2019-06-10].
70. Capper D, Jones DT, Sill M, Hovestadt V, Schrimpf D, Sturm D, Koelsche C, Sahm F, Chavez L, Reuss DE, Kratz A, Wefers AK, Huang K, Pajtler KW, Schweizer L, Stichel D, Olar A, Engel NW, Lindenberg K, Harter PN, Braczynski AK, Plate KH, Dohmen H, Garvalov BK, Coras R, Hölsken A, Hewer E, Bewerunge-Hudler M, Schick M, Fischer R, Beschorner R, Schittenhelm J, Staszewski O, Wani K, Varlet P, Pages M, Temming P, Lohmann D, Selt F, Witt H, Milde T, Witt O, Aronica E, Giangaspero F, Rushing E, Scheurlen W, Geisenberger C, Rodriguez FJ, Becker A, Preusser M, Haberler C, Bjerkvig R, Cryan J, Farrell M, Deckert M, Hench J, Frank S, Serrano J, Kannan K, Tsirigos A, Brück W, Hofer S, Brehmer S, Seiz-Rosenhagen M, Hänggi D, Hans V, Rozsnoki S, Hansford JR, Kohlhof P, Kristensen BW, Lechner M, Lopes B, Mawrin C, Ketter R, Kulozik A, Khatib Z, Heppner F, Koch A, Jouvet AA, Keohane C, Mühleisen H, Mueller W, Pohl U, Prinz M, Benner A, Zapatka M, Gottardo NG, Driever PH, Kramm CM, Müller HL, Rutkowski S, von Hoff K, Frühwald MC, Gnekow A, Fleischhack G, Tippelt S, Calaminus G, Monoranu CM, Perry A, Jones C, Jacques TS, Radlwimmer B, Gessi M, Pietsch T, Schramm J, Schackert G, Westphal M, Reifenberger G, Wesseling P, Weller M, Collins VP, Blümcke I, Bendszus M, Debus J, Huang A, Jabado N, Northcott PA, Paulus W, Gajjar A, Robinson GW, Taylor MD, Jaunmuktane Z, Ryzhova M, Platten M, Unterberg A, Wick W, Karajannis MA, Mittelbronn M, Acker T, Hartmann C, Aldape K, Schüller U, Buslei R, Lichter P, Kool M, Herold-Mende C, Ellison DW, Hasselblatt M, Snuderl M, Brandner S, Korshunov A, von Deimling A, Pfister SM. DNA methylation-based classification of central nervous system tumours. 2018;555(7697):469-74. doi: 10.1038/nature26000. PMID: 29539639. PMCID: PMC6093218.
71. Klingmüller U. Ethics, bio-politics and regulation of precision medicine. 2018-03-28. Presented at: International Cluster Symposium: Precision Medicine in Chronic Inflammation; March 26-27, 2018; Hamburg, Germany.
72. Schueller SM, Washburn JJ, Price M. Exploring mental health providers' interest in using web and mobile-based tools in their practices. 2016;4(2):145-51. doi: 10.1016/j.invent.2016.06.004. PMID: 28090438. PMCID: PMC5231655.
73. Beauchamp TL, Childress JF. Oxford, USA: Oxford University Press; 2012.
74. Rumbold JM, Pierscionek B. The effect of the general data protection regulation on medical research. 2017;19(2):e47. doi: 10.2196/jmir.7108. PMID: 28235748. PMCID: PMC5346164.
75. Thompson B. Analysis: Research and the General Data Protection Regulation. 2016-07. https://wellcome.ac.uk/sites/default/files/new-data-protection-regulation-key-clauses-wellcome-jul16.pdf [accessed 2019-06-10].
76. Dreyer NA, Blackburn S, Hliva V, Mt-Isa S, Richardson J, Jamry-Dziurla A, Bourke A, Johnson R. Balancing the interests of patient data protection and medication safety monitoring in a public-private partnership. 2015;3(2):e18. doi: 10.2196/medinform.3937. PMID: 25881627. PMCID: PMC4414957.
77. Mobile Medical Applications. 2015. https://www.fda.gov/medical-devices/digital-health/mobile-medical-applications [accessed 2019-06-10].
78. GDPR Series: Part 8 - Leeway Granted to Member State National and Supervisory Authorities. 2017-03-30. https://tinyurl.com/y5gny58v [accessed 2019-04-23].
79. Snell E. How Compliance, Data Security Needs Shift with Big Data Push. 2018-01-22. https://healthitsecurity.com/news/how-compliance-data-security-needs-shift-with-big-data-push [accessed 2019-04-23].
80. McMahon A, Buyx A, Prainsack B. Big data governance needs more collective responsibility: the role of harm mitigation in the governance of data use in medicine and beyond. 2019.
81. Sanderson SC, Brothers KB, Mercaldo ND, Clayton EW, Antommaria AH, Aufox SA, Brilliant MH, Campos D, Carrell DS, Connolly J, Conway P, Fullerton SM, Garrison NA, Horowitz CR, Jarvik GP, Kaufman D, Kitchner TE, Li R, Ludman EJ, McCarty CA, McCormick JB, McManus VD, Myers MF, Scrol A, Williams JL, Shrubsole MJ, Schildcrout JS, Smith ME, Holm IA. Public attitudes toward consent and data sharing in biobank research: a large multi-site experimental survey in the US. 2017;100(3):414-27. doi: 10.1016/j.ajhg.2017.01.021. PMID: 28190457. PMCID: PMC5339111.
82. Kayyali R, Hesso I, Ejiko E, Gebara SN. A qualitative study of telehealth patient information leaflets (TILs): are we giving patients enough information? 2017;17(1):362. doi: 10.1186/s12913-017-2257-5. PMID: 28526026. PMCID: PMC5438507.
83. Prainsack B, Buyx A. Cambridge: Cambridge University Press; 2017.
84. Spencer K, Sanders C, Whitley EA, Lund D, Kaye J, Dixon WG. Patient perspectives on sharing anonymized personal health data using a digital system for dynamic consent and research feedback: a qualitative study. 2016;18(4):e66. doi: 10.2196/jmir.5011. PMID: 27083521. PMCID: PMC4851723.
85. Essén A, Scandurra I, Gerrits R, Humphrey G, Johansen MA, Kierkegaard P, Koskinen J, Liaw ST, Odeh S, Ross P, Ancker JS. Patient access to electronic health records: differences across ten countries. 2017;7(1):44-56. doi: 10.1016/j.hlpt.2017.11.003.
86. Wass S, Vimarlund V, Ros A. Exploring patients' perceptions of accessing electronic health records: innovation in healthcare. 2019;25(1):203-15. doi: 10.1177/1460458217704258. PMID: 28457195.
87. Pormeister K. Genetic data and the research exemption: is the GDPR going too far? 2017;7(2):137-46. doi: 10.1093/idpl/ipx006.