Digital information technology can facilitate informed decision making by individuals regarding their personal health care. The digital divide separates those who do and those who do not have access to or otherwise make use of digital information. To close the digital divide, health care communications research must address a fundamental issue, the consumer vocabulary problem: consumers of health care, at least those who are laypersons, are not always familiar with the professional vocabulary and concepts used by providers of health care and by providers of health care information, and, conversely, health care and health care information providers are not always familiar with the vocabulary and concepts used by consumers. One way to address this problem is to develop a consumer entry vocabulary for health care communications.
To evaluate the potential of controlled vocabulary resources for supporting the development of consumer entry vocabulary for diabetes.
We used folk medical terms from the Dictionary of American Regional English project to create extended versions of 3 controlled vocabulary resources: the Unified Medical Language System Metathesaurus, the Eurodicautom of the European Commission's Translation Service, and the European Commission Glossary of popular and technical medical terms. We extracted consumer terms from consumer-authored materials, and physician terms from physician-authored materials. We used our extended versions of the vocabulary resources to link diabetes-related terms used by health care consumers to synonymous, nearly-synonymous, or closely-related terms used by family physicians. We also examined whether retrieval of diabetes-related World Wide Web information sites maintained by nonprofit health care professional organizations, academic organizations, or governmental organizations can be improved by substituting a physician term for its related consumer term in the query.
The Dictionary of American Regional English extension of the Metathesaurus provided coverage, either direct or indirect, of approximately 23% of the natural language consumer-term-physician-term pairs. The Dictionary of American Regional English extension of the Eurodicautom provided coverage for 16% of the term pairs. Both the Metathesaurus and the Eurodicautom indirectly related more terms than they directly related. A high percentage of covered term pairs, with more indirectly covered pairs than directly covered pairs, might be one way to make the most out of expensive controlled vocabulary resources. We compared retrieval of diabetes-related Web information sites using the physician terms to retrieval using related consumer terms. We based the comparison on retrieval of sites maintained by nonprofit health care professional organizations, academic organizations, or governmental organizations. The number of such sites in the first 20 results from a search was increased by substituting a physician term for its related consumer term in the query. This suggests that the Dictionary of American Regional English extensions of the Metathesaurus and Eurodicautom may be used to provide useful links from natural language consumer terms to natural language physician terms.
The Dictionary of American Regional English extensions of the Metathesaurus and Eurodicautom should be investigated further for support of consumer entry vocabulary for diabetes.
Digital information technology can facilitate informed decision making by individuals and can contribute to the realization of 2 important goals of the United States (US) health care establishment. The first of these goals is the US Healthy People 2010 goal of increasing "life expectancy and quality of life . . . by helping individuals gain the knowledge, motivation, and opportunities they need to make informed decisions about their health" [
A promise of health care-communications research is that it can offer solutions to the digital divide. Ensuring universal access to information technology will not, however, be a sufficient solution. To close the digital divide, health care-communications research must address an issue that is more fundamental than the digital divide: the issue that we call the consumer vocabulary problem.
The consumer vocabulary problem is that consumers of health care, at least those who are laypersons, are not always familiar with the professional vocabulary and concepts used by providers of health care and by providers of health care information, and, conversely, health care and information providers are not always familiar with the vocabulary and concepts used by consumers. This bidirectional communication problem is more fundamental than the digital divide because it affects the use of information in all forms, both digital and non-digital. Access to a high speed network and a World Wide Web (Web) browser will serve little purpose for a health care consumer who lacks the vocabulary necessary to: ask questions about his or her health care, search for information to support his or her decisions about it, or understand such information when he or she does manage to access it.
Potential misunderstandings attendant upon vocabulary differences in health care communications may reduce the quality of patient-physician interaction, lead to poor health outcomes and low patient satisfaction, affect consumer access to health care information, and have implications for informed consent.
The effect of vocabulary differences on patient-physician communication has long been recognized and studied by medical anthropologists and practitioners [
Not all problems in patient-physician communication are the result of vocabulary differences. Some communication problems result from differences in values between the patient and the physician [
Vocabulary and other language differences may be particularly significant in cross-cultural contexts [
With regard to health care-information access, in a previous study [
Vocabulary differences affecting information retrieval and conceptual frameworks for reconciling those differences have long been studied in the fields of information science and informatics [
The lack of semantic overlap may have implications for informed consent. Perhaps the best-known US example of this is the infamous Tuskegee syphilis study [
The types of problems just described will differ in degree from case to case; some health care consumers are more familiar than others with professional vocabulary, and likewise some health care providers are more familiar than others with consumer vocabulary. In addition, there may be differences between countries, eg, differences between US physicians and British physicians in their familiarity with consumer language. Nevertheless, in general the kind of bidirectional communication problems we are discussing can make it difficult for individuals to get the information they need to support their decisions about their personal health care, and so these problems run counter to the goals of Healthy People 2010 and the US National Cancer Institute (NCI).
One way to address these bidirectional communication problems is to develop a
A controlled vocabulary may be defined minimally as a concept-based vocabulary in which both the concepts and the terms used to express them are subjected to some level of control; only some subset of concepts is expressed, and only some terms and term variations are allowed as expressions of those concepts. Typically, an entry vocabulary will link natural language terms to terms in a controlled vocabulary. Since, however, health communication begins and ends with the natural language of professional practice and that of common sense, we were interested in using a controlled-vocabulary resource to link 2 natural language domains of vocabulary. (We use
By using a controlled vocabulary resource to link 2 natural language domains of vocabulary, we may provide entry vocabulary from one domain to the other. Specifically, in this pilot study we evaluated controlled vocabulary resources for their capacity to support an entry vocabulary from natural language health care consumer vocabulary to natural language health care professional vocabulary. The conceptual model we used is depicted in
A controlled vocabulary resource providing a bridge between consumer terms and professional terms
As shown in
In
In order to link natural language consumer terms to natural language professional terms we must, according to our model, group the consumer terms and professional terms into semantic neighborhoods (
To keep this initial exploratory study to a manageable size, we have restricted ourselves to consumer vocabulary related to diabetes, and have attempted to relate this consumer vocabulary to diabetes vocabulary used by US family physicians. As such, this study complements a previous study concerning consumer and professional vocabulary for diabetes that focused on the Read Thesaurus [
We addressed 2 research questions:
To what extent can the Metathesaurus [
Can retrieval of diabetes-related Web information sites maintained by nonprofit health care professional organizations, academic organizations, or governmental organizations be improved by substituting a physician term for its paired consumer term in a Web query?
The Metathesaurus is a large database developed by the National Library of Medicine that links together terms from over 50 health care vocabularies. The Metathesaurus links terms together when they express the same, or nearly the same, concepts. When terms express the same concept, they are assigned, in the language of the Metathesaurus, to the same
The Eurodicautom is the multilingual terminological database of the European Commission's Translation Service. The Eurodicautom contains 124,551 entries for medicine, with many of these in English. Altogether, the Eurodicautom contains 990,672 English term entries. The Eurodicautom is organized by terminology collections maintained by different terminology offices, and, like the Metathesaurus, links different terms together when they express the same, or nearly the same, concepts. For example, the Eurodicautom entry for
The European Glossary was developed by the European Commission and is separate from the Eurodicautom. The European Glossary contains pairs of synonymous or nearly-synonymous consumer terms and professional terms. For example, the European Glossary pairs the consumer term holding your breath with the professional term hypoventilation.
DARE is intended to "document the varieties of English that are not found everywhere in the United States--those words, pronunciations, and phrases that vary from one region to another, that we learn at home rather than at school, or that are part of our oral rather than our written culture" [
The responses included
Make plural singular
Put words in alphabetic order;
it may be normalized to the string
Both the consumer term and the physician term matched directly, via string matches, to the vocabulary resource
The vocabulary resource, according to its internally-specified rules and structure, located the contained consumer term and the contained professional term in the same semantic neighborhood.
We considered the second condition satisfied for the Metathesaurus when the terms were either associated with the same metaconcept or associated with distinct metaconcepts that are related according to the Metathesaurus related metaconcepts (MRREL) table. We considered the second condition satisfied for the Eurodicautom when the terms were matched to the same Eurodicautom entry, and similarly for the European Glossary.
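The two conditions above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `normalize` rule is a crude stand-in for the normalization steps (plural to singular, alphabetical word order), the `INVARIANT` exception list is hypothetical, and `CONCEPT_OF` and `RELATED` are toy data rather than actual Metathesaurus content or MRREL rows. The sketch also assumes that the same normalization is applied to the strings of the resource itself.

```python
from typing import Dict, FrozenSet, Optional, Set

# Hypothetical exception list: words that end in "s" but are not plurals.
INVARIANT: Set[str] = {"diabetes"}


def normalize(term: str) -> str:
    """Lowercase a term, crudely singularize its words, and sort them."""
    words = []
    for word in term.lower().split():
        # Crude plural-to-singular rule (assumption): drop one trailing "s".
        if (word not in INVARIANT and len(word) > 3
                and word.endswith("s") and not word.endswith("ss")):
            word = word[:-1]
        words.append(word)
    return " ".join(sorted(words))


# Toy resource (hypothetical): normalized string -> concept identifier.
CONCEPT_OF: Dict[str, str] = {
    normalize("sugar diabetes"): "C1",
    normalize("diabetes"): "C1",
    normalize("diabetic diet"): "C2",
}

# Toy relation table (hypothetical), standing in for related metaconcepts.
RELATED: Set[FrozenSet[str]] = {frozenset({"C1", "C2"})}


def same_neighborhood(a: str, b: str) -> bool:
    """Condition 2: same concept, or distinct concepts marked as related."""
    return a == b or frozenset({a, b}) in RELATED


def directly_covered(consumer: str, physician: str) -> bool:
    """Both terms string-match the resource and share a neighborhood."""
    c: Optional[str] = CONCEPT_OF.get(normalize(consumer))
    p: Optional[str] = CONCEPT_OF.get(normalize(physician))
    if c is None or p is None:
        return False  # Condition 1 fails: a term has no string match.
    return same_neighborhood(c, p)


if __name__ == "__main__":
    print(directly_covered("sugar diabetes", "diabetes"))
    print(directly_covered("diabetes recipes", "diabetic diet"))
```

In this toy setup, a pair fails the check as soon as one of its terms has no string match in the resource, which is the situation in which indirect matching would have to take over.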
A pair was
In order to provide preliminary results for our first research question, we compared the number of consumer and physician terms that directly and indirectly matched to the DARE extensions of the 3 resources, as well as the consumer-term-physician-term pairs that were directly and indirectly covered.
Our raw data consisted of 909 consumer terms (eg,
Example Pairs of Consumer and Physician Terms
Consumer Term | Physician Term |
---|---|
diabetic food | healthy diabetic diet |
diabetic food | diabetic diet plan |
diabetes recipes | healthy diabetic diet |
diabetes recipes | diabetic diet plan |
We directly matched 27 consumer terms against the DARE extension of the Metathesaurus, 27 consumer terms against the DARE extension of the Eurodicautom, and 5 consumer terms against the DARE extension of the European Glossary. We indirectly matched 13 consumer terms against the DARE extension of the Metathesaurus, 11 consumer terms against the DARE extension of the Eurodicautom, and 1 consumer term against the DARE extension of the European Glossary.
We directly matched 39 physician terms against the DARE extension of the Metathesaurus, 23 physician terms against the DARE extension of the Eurodicautom, and 2 physician terms against the DARE extension of the European Glossary. We indirectly matched 29 physician terms against the DARE extension of the Metathesaurus, 7 physician terms against the DARE extension of the Eurodicautom, and no physician terms against the DARE extension of the European Glossary.
The DARE extension of the Metathesaurus directly covered 17 consumer-term-physician-term pairs, provided partial indirect coverage for 22 pairs, and full indirect coverage for 12 pairs. The DARE extension of the Eurodicautom directly covered 8 pairs, provided partial indirect coverage for 19 pairs, and full indirect coverage for 9 pairs. The DARE extension of the European Glossary directly covered 2 pairs, provided partial indirect coverage for 1 pair, and full indirect coverage for no pairs.
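The pair counts just reported can be tallied with a short script. This is only a sketch: the dictionary keys are names chosen for illustration, and it assumes that the direct, partial indirect, and full indirect categories are mutually exclusive, so that they may be summed. Percentages are not recomputed, because the total number of pairs is not restated in this passage.

```python
# Pair counts as reported in the text for each DARE-extended resource.
coverage = {
    "Metathesaurus": {"direct": 17, "partial_indirect": 22, "full_indirect": 12},
    "Eurodicautom": {"direct": 8, "partial_indirect": 19, "full_indirect": 9},
    "European Glossary": {"direct": 2, "partial_indirect": 1, "full_indirect": 0},
}


def summarize(counts):
    """Collapse the three categories into direct, indirect, and covered."""
    indirect = counts["partial_indirect"] + counts["full_indirect"]
    return {
        "direct": counts["direct"],
        "indirect": indirect,
        "covered": counts["direct"] + indirect,
    }


summary = {name: summarize(counts) for name, counts in coverage.items()}

if __name__ == "__main__":
    for name, s in summary.items():
        flag = "more indirect" if s["indirect"] > s["direct"] else "not more indirect"
        print(f"{name}: direct={s['direct']} indirect={s['indirect']} "
              f"covered={s['covered']} ({flag})")
```

Under that assumption, the Metathesaurus and Eurodicautom extensions each cover more pairs indirectly than directly, while the European Glossary extension does not.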
Examples of Directly and Indirectly Covered Pairs of Terms
Consumer Term | Physician Term | Extended Metathesaurus | Extended Eurodicautom | Extended European Glossary |
---|---|---|---|---|
sugar diabetics | diabetes | partial indirect | partial indirect | partial indirect |
sugar diabetes | diabetes | direct | direct | direct |
diabetes | diabetes | direct | direct | direct |
diabetes recipes | diabetic diet | partial indirect | partial indirect | not covered |
diabetic food | healthy diabetic diet | full indirect | partial indirect | not covered |
food to prepare for diabetic | healthy diabetic diet | full indirect | full indirect | not covered |
diabetic diet | diabetic diet | direct | direct | not covered |
Comparison of Results of Consumer Term Searches and Physician Term Searches for Pairs with Partial Indirect Coverage
Consumer Term | Consumer Total Results | Consumer Quality* (out of first 20 results) | Physician Term | Physician Total Results | Physician Quality* (out of first 20 results) | Vocabulary Resource Providing Partial Indirect Coverage |
---|---|---|---|---|---|---|
diabetes recipes | 457 | 1 | diabetic diet | 83,778 | 2 | |
foods for diabetics | 100 | 0 | diabetic diet | 83,778 | 2 | |
diabetes in children | 2,016 | 1 | juvenile diabetes | 18,482 | 6 | Eurodicautom |
diabetic pregnancy | 562 | 0 | gestational diabetes | 13,498 | 2 | Metathesaurus |
sugar diabetics | 92 | 0 | diabetes | 1,038,268 | 6 | |

*Quality: number of results from sites maintained by nonprofit health care professional organizations, academic organizations, or governmental organizations.
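The comparison in the table above can be checked row by row with a short script. This is a sketch only: the tuple layout and variable names are chosen for illustration, and the figures are copied from the table.

```python
# (consumer term, consumer total, consumer quality,
#  physician term, physician total, physician quality)
rows = [
    ("diabetes recipes", 457, 1, "diabetic diet", 83778, 2),
    ("foods for diabetics", 100, 0, "diabetic diet", 83778, 2),
    ("diabetes in children", 2016, 1, "juvenile diabetes", 18482, 6),
    ("diabetic pregnancy", 562, 0, "gestational diabetes", 13498, 2),
    ("sugar diabetics", 92, 0, "diabetes", 1038268, 6),
]

# Gain in quality sites (out of the first 20 results) from substituting
# the physician term for the consumer term.
gains = [p_quality - c_quality
         for (_, _, c_quality, _, _, p_quality) in rows]

# True only if the physician term did better for every pair.
physician_always_better = all(gain > 0 for gain in gains)

if __name__ == "__main__":
    for (consumer, _, _, physician, _, _), gain in zip(rows, gains):
        print(f"{consumer} -> {physician}: +{gain} quality sites")
    print("physician term better in every case:", physician_always_better)
```

With only 5 pairs, the check describes this sample and says nothing about statistical significance.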
The results for our first research question for the DARE extension of the Metathesaurus appear somewhat promising. The DARE extension of the Metathesaurus provided coverage, either direct or indirect, for approximately 23% of the natural language consumer-term-physician-term pairs. The results provided by the Eurodicautom extension are less promising, since it provided coverage for only 16% of the term pairs. (The results for the European Glossary were negligible.) However, what we find somewhat promising overall is that both the Metathesaurus and the Eurodicautom extensions indirectly covered more pairs than they directly covered. A high percentage of covered term pairs, with more indirectly-covered pairs than directly-covered pairs, might constitute an efficient use of expensive controlled vocabulary resources for health care communications.
The results for the second research question are also somewhat promising. Notwithstanding the small sample size, in every case the natural language physician term produced better results than the consumer term for sites maintained by nonprofit health care professional organizations, academic organizations, or governmental organizations. This suggests that the DARE extensions of the Metathesaurus and Eurodicautom may be used to provide useful links from natural language consumer terms to natural language physician terms. It might be argued that these results are not particularly meaningful since
We performed our study using terms rather than actual consumer queries
Maintenance of a site by a nonprofit health care professional, academic, or governmental organization is not a certain indicator of the quality of the site.
In response to the first point, we point out that some of the terms we used in our www.altavista.com queries were extracted from longer consumer e-mail messages, and some were used as actual queries to a Web information site. These terms reflect the actual terms used by consumers, although admittedly only the latter can be said to represent actual consumer Web queries (subject to the limitation noted earlier in
In response to the second point, we agree that maintenance of a site by a professional, academic, or governmental organization does not guarantee that the site is of high quality, but, as stated in
Although our results are somewhat promising, they are limited in several ways. One limitation of our study is that www.altavista.com, like most search engines, does not limit its exact-phrase results to results for true exact phrases, but will also report results where the words in the phrase are included in the document, but are not immediately next to each other. Thus, our results do not strictly discriminate between the retrieval effectiveness of the consumer terms and the physician terms. Our results do, however, reflect the situation actually faced by consumers, and to that extent are indicative of the relative effectiveness of the consumer and physician terms.
Another limitation of the study is that one of our sources of consumer terms, the Web query log, did not lend itself to term co-occurrence as a quantifiable measure of semantic locality, nor to other data that might be used to cluster terms into semantic neighborhoods. But, perhaps more importantly, we did not evaluate the final term pairs with respect to other quantitative evidence derived from consumer and physician surveys, nor did we evaluate them with respect to qualitative evidence derived from focus groups and interviews with consumers and physicians. Thus, even if the term pairs appear useful according to the standard of quality that we used in this project, they might not be more generally useful, because they might not constitute appropriate meaning-preserving (or meaning-warping) steps from the language of consumers to the language of family physicians.
Finally, although this study may serve in part as a pilot for larger studies of access to consumer health care information, a general limitation of the study is that we focused on the Web, and did not address the use of other venues of medical information, such as newspapers, magazines, and television. We recognize that these other sources and formats of health information require similar investigation if consumers are to get the information they need to support their decisions about their personal health care. Indeed, as we said at the outset, the bidirectional communication problems we have been considering are more fundamental than the Web and the digital divide. More work needs to be done with physicians, patient-education professionals, and consumers to further articulate the extent of these problems and to develop methods for their resolution.
This project was supported in part by grant LM07089 from the National Library of Medicine. The
None declared.
AP: Associated Press
CDC: Center for Disease Control
DARE: Dictionary of American Regional English
IRB: Institutional Review Board
NCI: National Cancer Institute
UMLS: Unified Medical Language System
US: United States