Adolescents' access to health information on the Internet is partly a function of their ability to search for and find answers to their health-related questions. Adolescents may have unique health and computer literacy needs. Although many surveys, interviews, and focus groups have been utilized to understand the information-seeking and information-retrieval behavior of adolescents looking for health information online, we were unable to locate observations of individual adolescents that have been conducted in this context.
This study was designed to understand how adolescents search for health information using the Internet and what implications this may have on access to health information.
A convenience sample of 12 students (age 12-17 years) from 1 middle school and 2 high schools in southeast Michigan was given 6 health-related questions and asked to look for answers using the Internet. Researchers recorded 68 specific searches using software that captured screen images as well as synchronized audio recordings. Recordings were reviewed later and specific search techniques and strategies were coded. A qualitative review of the verbal communication was also performed.
Out of 68 observed searches, 47 (69%) were successful in that the adolescent found a correct and useful answer to the health question. The majority of sites that students attempted to access were retrieved directly from search engine results (77%) or a search engine's recommended links (10%); only a small percentage were directly accessed (5%) or linked from another site (7%). The majority (83%) of followed links from search engine results came from the first 10 results. Incorrect spelling (30 of 132 search terms), number of pages visited within a site (ranging from 1 to 15), and overall search strategy (eg, using a search engine versus directly accessing a site) were each important determinants of success. Qualitative analysis revealed that participants used a trial-and-error approach to formulate search strings, scanned pages randomly instead of systematically, and did not consider the source of the content when searching for health information.
This study provides a useful snapshot of current adolescent searching patterns. The results have implications for constructing realistic simulations of adolescent search behavior, improving distribution and usefulness of Web sites with health information relevant to adolescents, and enhancing educators' knowledge of what specific pitfalls students are likely to encounter.
The Internet has become an important tool for many people with health concerns [
Because of the enormous amount of unstructured online content, it is crucial to understand how youth navigate through the Web to find health information. Prior research, primarily from library and information science literature and education literature, has highlighted several search characteristics that are either unique or more pronounced in adolescents. For example, adolescents take more time to complete online tasks than college students [
Searching for online health information involves distinctive challenges including unfamiliar terminology [
Observational research specific to the adolescent age group and online search behavior for health information is also sparse. Several surveys have addressed why adolescents go to the Internet, what they search for, whether they find it, and what they do with it (written communication, 2001 Dec; Generation RX.com Survey printouts; V. Rideout, Henry J. Kaiser Foundation, Menlo Park, CA; and [
The study reported here provides a more in-depth understanding of how adolescents search for health information using the Internet and what implications this may have on access to health information. To capture enough detail, the study recorded specific actions taken by adolescents which were later coded and analyzed. Participants were encouraged to share their thought process out loud as they searched for answers to a list of predetermined health questions. The result was a rich set of both quantitative and qualitative data that was thoroughly analyzed for common themes and events. Specific questions of interest include, but are not limited to: What are the various search strategies used? What factors contribute to finding correct and useful answers? When using a search engine, how many results pages are viewed and utilized? What types of search strings are entered into search engines? Answers to these and related questions should be of interest to a number of parties including educators (eg, health educators, librarians, teachers), Web site and search engine designers, health care practitioners, and researchers (eg, to create a sample of URLs by simulating online searching behavior [
Twelve students from 1 middle school (N = 4) and 2 high schools (N = 4 and N = 4) in southeast Michigan were recruited for this study. Staff at each school were asked to select 4 students who were (a) comfortable using computers, (b) comfortable searching for information on the Internet, and (c) strong students who could afford to miss one class period. Students received a University of Michigan T-shirt, valued at roughly $8, in return for their participation.
The parent or guardian of every student signed an informed consent document that described the purpose and procedure of the study. Students also signed separate assent forms with similar information. The University of Michigan Behavioral Science Institutional Review Board approved this study and the consent and assent documents.
Three methods of data collection were used. First, one of the two members of the research team present during each of the observations coded searching behavior in real time while the second member of the research team interacted with the student. Second, TechSmith Camtasia 3.0.1 commercial tracking software [
All observations of adolescents were conducted during January 2002. Each school provided a room in which to conduct the observations. Students were brought to the observation room one at a time. Two researchers were present at every observation. For each student, one of the researchers first reviewed the assent form to introduce the project and obtain the student's permission to participate. The students were then asked 14 questions about demographics (age, race/ethnicity, and gender) and their prior computer use (eg, how often they use computers or the Internet, what health topics they have searched, which search engines they used, and whether they have a computer and access to the Internet at home).
Once the brief interview had been completed, the observed searches began. To help the students understand the procedure and to reinforce the importance of thinking out loud while doing their searches, each student was first asked to do an easy non-health-related search looking for the next day's local weather forecast. As with the subsequent health-related searches, the local-weather question was first read to the student by a researcher and then a card with the question on it was set next to the computer in case the student needed to read it. As part of the think-aloud protocol, the experimenter asked the student to talk out loud about what they were doing, so that researchers could better understand the reasons behind the searching behavior. If a student stopped talking during the search, he or she was reminded by the observers to "keep talking," but the experimenters did not ask students to elaborate on any specific thing they said. Concurrent verbal reports more accurately reflect a subject's mental state at the time of observed behaviors than do retrospective reflections, and this minimal think-aloud protocol has been shown to slow subjects down, but not to qualitatively change their problem solving behavior [
After the students completed the practice local-weather search, they were given a sequence of up to 6 predetermined health information questions (see
Health-related questions
Your aunt was just told she has diabetes. She isn't sure what kinds of food she can or can't eat. Using the Internet, find some information for your aunt about what foods she should or should not eat. |
A friend recently started taking a drug called Paxil for depression. He seems to be tired all the time, and even falls asleep in class. Use the Internet to find out if the drug might be making him sleepy. |
Your older brother has a problem with drinking too much alcohol. He wants to go to a local Alcoholics Anonymous meeting. Use the Internet to help him find a local meeting. |
You want to get an HIV test, but you don't want anyone to know. You also don't have any money to pay for it. Use the Internet to find a place to get a free and confidential HIV test. |
For class, you need to learn about medicine that can help people stop smoking. Using the Internet, find the names of these medicines. |
You are about to get a tattoo, but a friend warned you that some places spread infections like HIV and hepatitis. Use the Internet to find out if this is true. |
Topics for the health-related questions were chosen based upon responses to a survey of adolescents conducted by the Kaiser Family Foundation (written communication, 2001 Dec; Generation RX.com Survey printouts; V. Rideout, Henry J. Kaiser Foundation, Menlo Park, CA). Certain topics including homosexuality, teen pregnancy, and abortion were purposefully avoided so as not to expose participants to overly controversial information.
After all the observations were completed, 3 researchers including a physician, a health educator, and a human-computer interface specialist met as a group to review the real-time coding results and to clarify or augment the coding scheme before the final coding of the tracking-software records. The final coding scheme was designed to record data on the person searching; the question being asked; the time it took to find an answer; the search strategy used (eg, using a search engine or directly typing in a URL); the search strings used; the number of search engine results pages reviewed; the number of pages viewed within a particular site; and the use of menus, advertisements, and directories. One of the 3 coders was assigned as the primary reviewer for each observation session. The primary reviewer was responsible for a detailed coding of the session, and any coding problems were resolved in a second group discussion.
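As an illustration only, the elements of the coding scheme listed above could be held in one record per observed search. The sketch below is not the instrument the reviewers used: the class name, field names, strategy labels, and the example values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CodedSearch:
    """One observed search, with one field per element of the coding scheme.

    All names here are hypothetical; the study does not publish its codebook.
    """
    participant_id: str
    question: str
    seconds_to_answer: int
    strategy: str  # assumed labels, eg "search_engine", "typed_url", "directory"
    search_strings: List[str] = field(default_factory=list)
    results_pages_reviewed: int = 0
    pages_viewed_per_site: List[int] = field(default_factory=list)
    used_menus: bool = False
    used_advertisements: bool = False
    used_directories: bool = False

    def sites_visited(self) -> int:
        # Each entry in pages_viewed_per_site stands for one visited site.
        return len(self.pages_viewed_per_site)


# Made-up example values, not data from the study.
example = CodedSearch(
    participant_id="S01",
    question="Paxil",
    seconds_to_answer=300,
    strategy="search_engine",
    search_strings=["paxil", "paxil side effects"],
    results_pages_reviewed=1,
    pages_viewed_per_site=[1, 2],
)
print(example.sites_visited())
```

Keeping the number of pages viewed as a list with one entry per site makes it straightforward to tabulate pages viewed per site across all searches.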
The reviewers classified each of the answers found by the students as
Twelve middle school students and high school students in southeast Michigan participated. Students ranged in age from 12 to 17 years old, with a mean of 14 years. Half of the students were female. Of the 12 students, 7 were white, 2 were African American, 1 was Indian American, 1 was Hispanic, and 1 was Asian American. Of the 12 students, only the 6 oldest students had searched for health information on the Internet before. The variation by age is consistent with other findings that youth age 15 to 17 years are significantly more likely to have looked up health information (32%) than youth age 12 to 14 years (18%) [
Eleven students attempted all 6 searches, while the remaining student attempted 3, for a total of 69 searches. One search was not included since the Internet connection was not working properly, making a total of 68 searches that were analyzed. Searches took an average of 5 minutes and 41 seconds, ranging from just under a minute to nearly 24 minutes. This time frame is essentially the same as Eysenbach recorded for adults [
As students thought aloud, the researchers got a sense of what students were looking at on each page. Students tended to skip around and did not scan results pages or specific Web sites in a methodical or thorough way, sometimes missing links or text that contained the answer to a question. This is also consistent with findings from non-health-related searching behavior as summarized in Hsieh-Yee [
Distribution of pages viewed per site
Pages viewed per site | Number of sites | Percent | Cumulative percent |
1 | 143 | 70.4 | 70.4 |
2 | 27 | 13.3 | 83.7 |
3 | 11 | 5.4 | 89.2 |
4 | 8 | 3.9 | 93.1 |
5 | 8 | 3.9 | 97.0 |
6 | 2 | 1.0 | 98.0 |
8 | 1 | 0.5 | 98.5 |
9 | 1 | 0.5 | 99.0 |
15 | 2 | 1.0 | 100.0 |
Total | 203 | 100 |
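The percent and cumulative percent columns in the table above follow directly from the counts. A minimal sketch of that arithmetic, using the counts from the table (the function and variable names are illustrative, not from the study):

```python
from typing import Dict, List, Tuple

# Number of sites on which a given number of pages was viewed (from the table).
PAGES_PER_SITE_COUNTS: Dict[int, int] = {
    1: 143, 2: 27, 3: 11, 4: 8, 5: 8, 6: 2, 8: 1, 9: 1, 15: 2,
}


def distribution(counts: Dict[int, int]) -> List[Tuple[int, int, float, float]]:
    """Return (value, count, percent, cumulative percent) rows, sorted by value."""
    total = sum(counts.values())
    rows = []
    running = 0
    for value in sorted(counts):
        running += counts[value]
        percent = round(100 * counts[value] / total, 1)
        cumulative = round(100 * running / total, 1)
        rows.append((value, counts[value], percent, cumulative))
    return rows


rows = distribution(PAGES_PER_SITE_COUNTS)
for row in rows:
    print(row)
```

The cumulative column is computed from the running count rather than by summing rounded percentages, which avoids accumulating rounding error.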
Students used multiple methods to locate Web sites that they believed contained answers to the 68 questions. In 60 cases, the student started looking for an answer by visiting a search engine and entering in a search term or phrase. In 2 cases, the student started by selecting from directory menus (eg, choosing the topic
Even when students found a Web site that contained the answer to a question, they did not always find the answer. One example is the Alcoholics Anonymous site [
Seven search engines were used, including 2 meta-search engines (Dogpile and Locate.com). The meta-search engine Locate.com offers the user a number of search engines to choose from. Searches performed from the Locate.com Web site that utilized another search engine (eg, Yahoo!) are reported as if the search occurred on the destination search engine (eg, Yahoo!).
Search engine usage
Search engine | Number | Percent |
38 | 48.1 | |
Yahoo! | 13 | 16.5 |
Ask | 12 | 15.2 |
MSN | 7 | 8.9 |
Hotbot | 6 | 7.6 |
Dogpile | 2 | 2.5 |
AltaVista | 1 | 1.3 |
A total of 132 search phrases were entered into the various search engines, of which 104 were unique. The 2 most frequent phrases were "diabetes" and "Paxil," each of which occurred 5 times. Search phrases averaged 3.6 words, and 80% contained 4 or fewer words.
Distribution of search-result links viewed
Search result rank | Links followed | Percent | Cumulative percent |
Results 1-10 | 137 | 82.5 | 82.5 |
Results 11-20 | 8 | 4.8 | 87.3 |
Results 21-30 | 11 | 6.6 | 94.0 |
Results 31-40 | 4 | 2.4 | 96.4 |
Results 41-50 | 4 | 2.4 | 98.8 |
Results 51-60 | 1 | 0.6 | 99.4 |
Results 61 or more | 1 | 0.6 | 100.0 |
Of the 132 search phrases, 30 contained at least 1 word that was misspelled (eg, "tatoo," "Alchoholics," or "smokeing"), despite the fact that students could read the correctly-spelled word on the index card containing the question. Some search engines (eg, Google) offer a feature that recommends an alternate search string with the correct spelling of a word. For example, if a student typed "alchoholics anonymous," the first page of results began with, "Do you mean 'alcoholics anonymous?'" Students were offered a new search string with correct spelling on 15 separate occasions, but only noticed and used it 6 times. The remainder of the times they used the results that were offered for the incorrect spelling. Of the 7 students who were offered corrected spelling suggestions, only 2 ever used them.
Once a search string was entered into a search engine, students varied in the number of results pages that were viewed. Students viewed only the first results page 78% of the time and 4 or fewer results pages 93% of the time. Because search engines report a different number of links per page of search results,
Of the 68 questions that students attempted to answer, 7 searches were abandoned after the student gave up or, in 2 cases, when the class period ended. Of the remaining 61 searches, 47 were successful in finding a complete, correct, and useful answer to the health question and the remaining 14 were unsuccessful. Six of the unsuccessful answers were completely incorrect and not useful, 4 were useful but only partially correct, and 4 were fully correct but not useful.
Several factors contributed to the success of finding a correct, complete, and useful answer. One important factor was the individual who was performing the search. Although every student answered at least 1 question correctly, there was wide variation in the number of correct answers. Two students successfully answered 6 out of 6 questions, 3 students successfully answered 5 questions, 4 students successfully answered 4 questions, and the remaining 3 students successfully answered only 1 or 2 questions. While our sample of students was too small to draw conclusions from, no distinct patterns were observed that would indicate that race, gender, Internet experience, or health searching experience were significant determinants of success. However, the older adolescents (16- to 17-year-olds) were successful 87% of the time (26 of 30) as compared to 68% (21 of 31) for the younger adolescents.
Another important factor was the difficulty level of the questions themselves.
Unsuccessful searches by search topic
Search topic | Unsuccessful searches | Percent |
HIV test | 8 | 38.1 |
Paxil | 4 | 19.0 |
Alcoholics Anonymous | 3 | 14.3 |
tattoo | 3 | 14.3 |
smoking | 2 | 9.5 |
diabetes | 1 | 4.8 |
Total | 21 | 100.0 |
Certain search actions led to sites that contained the answer more often than others. Overall, students found answers on 22% of the sites they accessed (47 of 215). They accessed sites in 5 ways. Although not often taken, the action with the highest probability of success (47%; 7 of 15) was following a link from 1 non-search-engine site (eg, www.aa-intergroup.org) to another site (eg, www.alcoholics-anonymous.org). In most of these cases, the student accessed the first site directly from a search engine. Clicking on search engine results led to a site where students found an answer 21% of the time (35 of 166). Success rates were similar for following a recommended link from a list or menu provided by the search engine (18%; 4 of 22). Directly typing in a URL, bypassing search engines entirely, was successful only 9% of the time (1 of 11). A sponsored link from a search engine was followed only once, and the student found an incorrect answer on that site.
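The success rates quoted above can be recomputed from the reported counts. The sketch below uses only the numbers given in the preceding paragraph; the dictionary keys and function name are illustrative labels, not terms from the coding scheme.

```python
from typing import Dict, Tuple

# (sites where an answer was found, sites accessed) for each way of reaching a site.
ACCESS_OUTCOMES: Dict[str, Tuple[int, int]] = {
    "link from another site": (7, 15),
    "search engine result": (35, 166),
    "search engine recommended link": (4, 22),
    "directly typed URL": (1, 11),
    "sponsored link": (0, 1),
}


def success_rate(found: int, accessed: int) -> int:
    """Percentage of accessed sites on which an answer was found, rounded."""
    return round(100 * found / accessed)


rates = {method: success_rate(*counts) for method, counts in ACCESS_OUTCOMES.items()}
total_found = sum(found for found, _ in ACCESS_OUTCOMES.values())
total_accessed = sum(accessed for _, accessed in ACCESS_OUTCOMES.values())
overall = success_rate(total_found, total_accessed)

for method, rate in rates.items():
    print(f"{method}: {rate}%")
print(f"overall: {overall}% ({total_found} of {total_accessed})")
```

The five access methods account for all 215 sites accessed and all 47 sites on which an answer was found, so the per-method figures are consistent with the overall rate.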
Another contributing factor related to success was misspelling of search terms. Of the 14 completed but unsuccessful searches, 29% (4 searches) had at least 1 misspelling compared to only 15% (7 searches) of the 47 successful searches. Perhaps even more telling, both successful and unsuccessful searches with misspellings took students 1.5 minutes longer on average than searches without misspellings. Observations confirmed that some students were unable to find an answer until they discovered and corrected their misspelling, resulting in higher quality and more-relevant results.
Other search characteristics did not have statistically significant impacts on whether searches were successful, although this may have been due to small sample sizes. For example, the search engines were not significantly different in their percentages of successful searches. Similarly, the average number of words per search string was not significantly related to search success rate. (Data not shown.)
Certain common behaviors of the adolescent searchers were observed which were not apparent from the quantitative analysis.
First, the students were very comfortable and confident while searching online for health information. Most students knew where they wanted to start the search and navigated using quick mouse clicks and shortcut keys. However, this characteristic was likely over-represented in our population due to their strong academic performance and Internet proficiency.
Second, several searchers did not take much time in formulating a search strategy or (when applicable) choosing search terms. Instead, these searchers seemed to type in the first search string that came to mind. If the results were not what was anticipated, another search string was typed in, sometimes without even clicking on any results from the first search string. The overall approach was a trial-and-error method with frequent backtracking. The most common problem with search strings was that they were not specific enough. For example, 2 different students typed in the search string "hiv" when looking for a place that administers free and confidential HIV tests.
Third, most students quickly scanned pages, jumping from place to place within a page, rarely reading an entire paragraph. In some cases the answer to a question was contained on a page, but the student left before finding it. In other cases a link that would have led to the answer was missed. This finding supports prior research on adolescent search behavior related to nonhealth topics [
Fourth, students mentioned that they purposefully avoided sponsored links and advertisements, despite the fact that many of the search engines present these results first. The quantitative data confirmed this practice, as only 1 sponsored link was ever selected.
Finally, little to no attention was paid to the source of the answer. In the vast majority of cases, once an answer was located, it was simply assumed to be correct.
When compared with prior research, the findings of this study show many similarities and a few key differences between the behaviors of adolescents and adults while searching for health information. This study found that adolescents searching for health information utilized search engines nearly every time. This finding was similar to that for adults as described in the Eysenbach study [
The results from this study have implications for anyone who simulates adolescent health searches, for providers of health information, and for educators. There are many reasons to simulate adolescent health searches. For example, an educator preparing a lesson plan may want to informally simulate searches in order to anticipate what students are likely to find if given certain particular search tasks. A researcher may want to simulate adolescent searches more systematically to evaluate the availability and accessibility of information on particular topics, to evaluate which search engines should be recommended to adolescents, or to evaluate whether the installation of filtering software will have a detrimental impact on accessibility of health information [
The results of this study suggest that such simulations can focus on the use of search engines, but that very-broad search terms and, especially for adolescents, common spelling errors should be considered. Ads and other nonresult links can be ignored. Since more than 80% of the links that were followed appeared in the top 10 results, and more than 95% were among the top 40, a search simulation need not consider result links beyond these.
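A minimal sketch of how these recommendations might be applied in a simulation: generate the correctly spelled query plus misspelled variants, and consider only the top-ranked result links. The misspellings are the three examples observed in this study; the function names, the one-substitution-per-variant rule, and the default cutoff of 40 are assumptions made for illustration, and no particular search engine API is implied.

```python
from typing import Dict, List

# Misspellings observed in the study; a real simulation would need a longer list.
OBSERVED_MISSPELLINGS: Dict[str, str] = {
    "tattoo": "tatoo",
    "alcoholics": "alchoholics",
    "smoking": "smokeing",
}


def query_variants(query: str,
                   misspellings: Dict[str, str] = OBSERVED_MISSPELLINGS) -> List[str]:
    """Return the lowercased query plus one variant per word with a known misspelling."""
    words = query.lower().split()
    variants = [" ".join(words)]
    for i, word in enumerate(words):
        if word in misspellings:
            swapped = words[:i] + [misspellings[word]] + words[i + 1:]
            variants.append(" ".join(swapped))
    return variants


def links_to_consider(ranked_links: List[str], max_rank: int = 40) -> List[str]:
    """Keep only the result links ranked at or above max_rank (default: top 40)."""
    return ranked_links[:max_rank]


print(query_variants("Alcoholics Anonymous"))
print(len(links_to_consider([f"result-{n}" for n in range(1, 101)])))
```

Setting `max_rank` to 10 instead would cover the first page of results on most engines, where more than 80% of followed links appeared.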
Given the patterns of adolescent searching behavior found in this study, providers of health content can do several things to increase the probability that adolescents will find their sites. Since adolescents rely primarily on the first few results from search engines and do not tend to look at ads, it is important to ensure that health sites appear near the top of the results for searches on health terms. Choices of keywords in the domain name, page title, meta tags, and the first few sentences, as well as links from other sites, can all affect placement in search results. It may also be useful to include some common misspellings in meta-tag keywords and in the body of the text in order to make a site appear in the results page of searches using those misspellings of related search terms. Because most major English-language search engines no longer use the keyword feature of meta tags, site designers are left with the difficult task of working misspelled words (eg, misspelt) into the text without coming across as poor spellers themselves. It is also important that the site descriptions displayed in search engines be attractive to adolescent searchers: while our study did not analyze the various reasons that adolescents chose to follow one link over another, we did observe that they made choices based upon the link descriptions and did not simply select the first link offered. Books and articles, software, and consulting services are all widely available to improve search engine placement and to influence the short summary text that search engines extract for display in search results [
Another area that Internet content providers should focus on is within-site navigation. Because students tend to skip around from place to place within a page and read little in sequence, it is important that sites with a significant adolescent audience are well organized, concise, and understandable. Long paragraphs, too many links, and difficult vocabulary all decrease the likelihood of adolescents finding health information they are seeking, even if it is contained within a site. Internet content producers should attempt to understand the needs of the site visitors and build hierarchical structures that reflect those needs. For example, if one of the primary needs of individuals visiting the Alcoholics Anonymous site is to find a local meeting, the first page of the site should include an obvious link (eg, "Find an AA Meeting Near You") that leads to another page that returns the nearest meetings after entering in a zip code or city name. While ease of within-site navigation is important for all visitors to health information sites, some information providers may want to develop sites targeted specifically to adolescents. While they might like the targeted information once they found it, we observed that adolescents tend to rely on general-purpose search engines. Thus, developing special youth-targeted versions of information sites may be of somewhat limited utility, unless also accompanied by advertising or education campaigns that make adolescents more likely to find such sites.
Rather than changing Web sites or their presentation in search engines, it may also be useful to undertake education campaigns to improve the search strategies and tactics that adolescents use when seeking health information. It may be helpful to guide them towards youth-oriented directories or search engines, rather than general-purpose search engines. For example, both Yahoo! and Google offer directories with subcategories of sites designed for teens that cover various health topics. This approach may be facilitated by including links to such resources on the Web browser's starting page in schools and libraries. Alternatively, adolescents might be taught techniques for formulating and refining search terms at general-purpose search engines, adding or dropping more-specific words based on the kinds of results returned. They might also be taught to notice potential search term misspellings based on surprising search results. Finally, adolescents might also be taught techniques for systematically exploring within a Web site to find the kind of information they are looking for.
There are several important limitations to the interpretation of these results.
First, this was not a representative or random sample of adolescents. It was a small convenience sample with a selection bias toward adolescents with strong Internet searching skills. While the results cannot be generalized to all adolescents and do not capture the full range of adolescent searching experience, it is reasonable to expect that the average adolescent would have had even more trouble than our study participants in finding health information on the Internet.
Second, the health-related search questions were deliberately constructed to avoid controversial topics such as safe sex, abortion, and homosexuality. Given that adolescents are often faced with health problems related to sexuality, their actual search behavior and success at finding health information related to sexuality may not be reflected in our results. Another concern is that participants may have changed their search behavior because of the presence of observers and because they were aware that their search behaviors were being recorded. For example, students who had trouble finding an answer may have persisted in their search longer than they would have in a nonresearch setting. Alternatively, because students knew they had several search questions to answer during a single class period, they may not have been as persistent as they might have been with a more personally-relevant question and less-restricted search time. Thus, the data here reflect a rough estimate of persistence for an adolescent looking for health-related information. Also, searching was conducted individually, while in practice many searches both at home and at school are conducted with friends, teachers, or family close by. While it is difficult to know how this would affect searching behavior without future research, it is possible that students would act differently (eg, receive help with spelling).
Finally, while components of our classification scheme for successful versus unsuccessful searching have been previously validated, the overall scheme was modified to more accurately code the search results as correct, complete, and useful. A more-systematic validation of coding schemes for health information search results is an important area for future research.
More research is needed to validate the results presented in this article, as well as to determine if results vary for different populations (eg, age, race, and experience with health searching) and different health questions (eg, finding a practitioner versus finding the answer to a question). Additionally, instead of focusing on how adolescents currently search for health information, future studies may also want to explore interventions aimed at improving their searches. For example, should health portal sites designed for adolescents or online directories be used? Or would the current practice of using common search engines, but with adolescents learning improved search tactics, be more effective? Also, which search strategies lead to sites that are the most likely to be accurate and to influence adolescents to change their behavior?
This study provides a useful snapshot of current adolescent searching patterns. The results have implications for constructing realistic simulations of search behavior, and for both information providers and educators. Analyzing search behavior through actual observation should be a cornerstone in any effort to improve adolescents' access to health information.
The study was conducted by the University of Michigan Health Media Research Lab. In addition to the authors, Ed Saunders and Mike Nowak assisted in observation and coding for the study. Suresh Bhavnani provided valuable feedback on our research design.
Funding for this study was provided under a contract from the Kaiser Family Foundation.
None declared.