Published on 21.06.04 in Vol 6, No 2 (2004)
Web Content Accessibility of Consumer Health Information Web Sites for People with Disabilities: A Cross Sectional Evaluation
Background: The World Wide Web (WWW) has become an increasingly essential resource for health information consumers. The ability to obtain accurate medical information online quickly, conveniently and privately provides health consumers with the opportunity to make informed decisions and participate actively in their personal care. Little is known, however, about whether the content of this online health information is equally accessible to people with disabilities who must rely on special devices or technologies to process online information due to their visual, hearing, mobility, or cognitive limitations.
Objective: To construct a framework for an automated Web accessibility evaluation; to evaluate the state of accessibility of consumer health information Web sites; and to investigate the possible relationships between accessibility and other features of the Web sites, including function, popularity and importance.
Methods: We carried out a cross-sectional study of the state of accessibility of health information Web sites to people with disabilities. We selected 108 consumer health information Web sites from the directory service of a Web search engine. A measurement framework was constructed to automatically measure the level of Web Accessibility Barriers (WAB) of Web sites following Web accessibility specifications. We investigated whether there was a difference between WAB scores across various functional categories of the Web sites, and also evaluated the correlation between the WAB and Alexa traffic rank and Google Page Rank of the Web sites.
Results: We found that none of the Web sites we looked at are completely accessible to people with disabilities, i.e., there were no sites that had no violation of Web accessibility rules. However, governmental and educational health information Web sites do exhibit better Web accessibility than the other categories of Web sites (P < 0.001). We also found that the correlation between the WAB score and the popularity of a Web site is statistically significant (r = 0.28, P < 0.05), although there is no correlation between the WAB score and the importance of the Web sites (r = 0.15, P = 0.111).
Conclusions: Evaluation of health information Web sites shows that no Web site scrupulously abides by Web accessibility specifications, even for entities mandated under relevant laws and regulations. Government and education Web sites show better performance than Web sites among other categories. Accessibility of a Web site may have a positive impact on its popularity in general. However, the Web accessibility of a Web site may not have a significant relationship with its importance on the Web.
J Med Internet Res 2004;6(2):e19)
The World Wide Web (WWW) has become an increasingly essential resource for health information consumers. One recent study estimated that 73 million US residents searched for health information online during the year 2002 . The investigators estimated that seventy-odd percent of the online population search for health-related information for their decision-making [ ]. Eysenbach and Kohler [ ] estimated that approximately 4.5% of all search queries submitted to Web search engines are health related, which is equivalent to a global minimum of 6.75 million health-related searches on the Web every day. With the advances of computer and Internet technology, the distribution of the online population is becoming representative of the general population in terms of demographic and socioeconomic status [ ].
The ability to obtain accurate medical information online quickly, conveniently, and privately provides health consumers with the opportunity to make informed decisions and participate actively in their personal care . Little is known, however, about whether this online information is equally accessible to people with disabilities who must rely on special devices or technologies to process online information due to their visual, hearing, mobility, or cognitive limitations.
The latest report on Internet use from the National Telecommunication and Information Administration (NTIA) demonstrated that people of all ages, races, and ethnicities, including people with disabilities, are moving more and more of their activities online . A recent investigation on Internet use by people with disabilities reported that people without disabilities are four times more likely (38.1%) to use the Internet than are people with disabilities (9.9%) [ ]. Similar patterns remain even when factors, such as income, gender and educational attainment, are taken into account [ ]. The large disparity in Internet usage may be attributable to problems with the accessibility of Web content [ ]. Nielsen (2001) reported that the usability of the Web is about three times better for users without disabilities than it is for users with disabilities [ ].
For people with disabilities, the Web is very often the only source of information that they may access without having to depend unduly on others. Equivalent Internet access to health information will open a door to people with disabilities by offering them exciting possibilities for independent living and community participation . People with disabilities can find a wealth of information on the Internet that addresses many issues of special concern to them, including chronic disease information and rehabilitation and assistive technology services [ ]. According to a recent report, people with disabilities tend to seek health related information online more frequently than the able-bodied population [ ]. Nevertheless, for health information Web sites to be of real use to people with disabilities, they must first be accessible to them. Health information Web sites are a classic example of the "inverse information law": access to appropriate information is particularly difficult for those who need it most [ ].
Background and Prior Work
Web content accessibility helps people with disabilities access Web pages directly or use assistive technologies. Many people with disabilities have to rely on specialized software or hardware to access the Web. For example, people who are visually impaired have to install a software package called a screen reader to read all the content on the Web page aloud to them. Some people who are blind also use a talking browser like IBM Home Page Reader to access the Web page aurally. Some people who are blind prefer a hardware-level solution like the computer-controlled Braille embosser to help them perceive content of the Web page haptically. Regardless of the solution favored by the users with disabilities, if the content of the Web page is not available to their remaining sensory channel, then the page is not accessible to them.
The Web inadvertently has become increasingly inaccessible to people with disabilities as it adopts numerous emerging multimedia technologies. The Web at its beginning was designed for sharing and accessing documents across different computer systems and platforms. These documents are primarily text-based and mostly accessible to assistive technology, such as screen readers. With the introduction of appealing multimedia content, however, the Web is becoming an information medium that is not accessible to or not easily interpreted by assistive technology. Graphics, animations, and even video/audio clips, now commonly appear on the Web. The absence of alternative information about multimedia content makes them less accessible to people with disabilities than those with multimodal access to the multimedia content. The rapid expansion of e-commerce also makes the Web even more complicated and less accessible for people with disabilities. As Herbert A. Simon  once stated, "What information consumes is rather obvious: it consumes the attention of its recipients. Hence a wealth of information creates a poverty of attention, and a need to allocate that attention efficiently among the overabundance of information sources that might consume it." Web page developers believe that multimedia content could lure more visitors to the Web site and make them stay longer. However, they may overlook or ignore the accessibility for people with disabilities to that multimedia content because its primary purpose is to draw attention from potential consumers, the majority of whom are not people with disabilities.
Realizing this dilemma, the World Wide Web Consortium (W3C), the international organization that oversees the standardization and operation of the Web, announced the establishment of the Web Accessibility Initiative (WAI) on April 7, 1997 . Supported by all W3C members, including such heavyweight stakeholders as Microsoft and IBM, the WAI plays a central role in promoting and correcting the functionality of the Web for people with disabilities. The first major responsibility of the WAI was to formalize guidelines for Web content developers and designers. WAI introduced Web Content Accessibility Guidelines (WCAG) to the public as a draft in 1998, and developed it into a full recommendation in 1999 [ ]. WAI expanded the guidelines to be applicable in the design of user agents (e.g., Web browsers or assistive technology agents like the screen reader JAWS for Windows), authoring tools (e.g., Microsoft FrontPage or Macromedia Dreamweaver) and related techniques, and a practical checklist [ , ].
There are two basic themes reflected in the WCAG: ensuring graceful transformation of Web pages, and making content understandable and navigable. By providing Web pages that transform gracefully, people with disabilities or users with device limitations will be able to access them without constraints. Keys to graceful transformation include separating structure from presentation, providing text equivalents to non-textual elements, creating documents that work even if the user cannot see and/or hear, and creating device-neutral documents. When the content is understandable and navigable, end users can utilize the page in a more effective, efficient and satisfactory manner. Keys for making content understandable and navigable include providing a navigating context and orienting information, providing a clear navigation mechanism, and ensuring succinct content descriptions.
Another initiative in the development of accessibility standards is Section 508, conducted by the US Access Board . The Access Board issued standards for accessible information technology under the Reauthorized Rehabilitation Act. These amendments strengthen Section 508 of the Rehabilitation Act of 1973. It mandates that when federal agencies develop, procure, maintain, or use electronic and information technology, they shall ensure that the electronic and information technology will allow federal employees with disabilities access to and use of the same information and data as that accessed and used by federal employees who are not individuals with disabilities, unless an undue burden would be imposed on the agency. Section 508 also mandates that agencies ensure equal access to individuals with disabilities who are members of the public seeking information on data that are comparable to that provided to those who are not individuals with disabilities, unless undue burden would be imposed on the agency. Section 508 clearly defines the accessibility for people with disabilities for federal government Web sites. Section 508 took effect on February 20, 2001.
Many software packages have been developed and commercialized to help Web developers evaluate the accessibility of their Web sites to people with disabilities . These packages can scan Web pages, list computer detectable violations of Web accessibility standards, and give warnings for suspicious HTML snippets. Some tools integrate themselves into Web site developing or quality control programs to assist Web developers in quickly eliminating the inaccessible parts. Bobby, one of the earliest and most well known packages for checking Web accessibility, was used in our study.
Researchers from different disciplines have evaluated Web accessibility and usability of Web sites in various domains. The Journal Library Hi Tech published two special issues dedicated to Web content accessibility of Web-based information resources for people with disabilities [, ]. Axel Schmetzke [ ] maintains a Web accessibility survey site that aspires to be a clearinghouse for studies involving the collection of accessibility data pertaining to Web sites and online resources in education. The site listed many Web accessibility evaluation studies on libraries and higher education Web sites. Another related effort is the Web Usability Index (WUI), a free Web usability statistics database provided by UsableNet [ ]. It employs an automatic Web usability evaluation tool for testing Web accessibility and obtains daily statistics of Web usability of sample Web sites from the Internet. According to WUI, only about 43% of current Web sites provide excellent or good Web usability design.
Although the Web is considered a powerful force for reshaping the healthcare infrastructure, the accessibility of Web content to people with disabilities is not a primary consideration for most designers of Web sites providing health related information . Very few research studies have been conducted on the accessibility of health information Web sites for people with disabilities. Research studies on the accessibility of health information Web sites are, for the most part, about the find-ability and search-ability of Internet Web sites by online search engines or about the availability of information technology for the people who need it - ]. Previous guidelines related to the quality of health information Web sites failed to emphasize the accessibility of Web sites by people with disabilities [ ] until the National Cancer Institute (NCI) published research-based guidelines addressing Web usability [ ]. Chapter 3 of the NCI report is specifically dedicated to the issue of Web accessibility for persons with disabilities although the rest of the guidelines can also benefit general Web users.
The only study known to us that covers health information Web sites was the study conducted by Joel Davis in 2002 . Davis explored the extent to which Internet-based health information is accessible to visually impaired individuals who rely on automated screen readers. Davis selected 500 individual Web sites representing 50 common illnesses and conditions for evaluation. The study found that accessibility is currently very low-only 19% of the examined sites' home pages were accessible. It also found that the reason for the inaccessibility of the Web pages was noncompliance with the recommended design and coding changes.
Our study will be different from other studies in several ways: first, the study will check the degree of accessibility not only of home pages (main pages) of health information Web sites, but also of other Web pages within certain levels below the home pages. Second, the majority of other studies report the state of accessibility in terms of the absolute number of violations of accessibility checkpoints. Although absolute numbers of violations of Web content accessibility provide useful information about the state of accessibility, it is not straightforward for direct comparison of general accessibility between Web sites, and it does not include the complexity of the webpage into the evaluation. Third, we will investigate the relationship between Web accessibility and other features of a Web site including function, popularity and importance.
The overall objective of the study was to evaluate the accessibility of consumer health information Web sites for people with disabilities. We were interested in the following specific research questions:
- What is the current level of accessibility for consumer health information Web sites?We were interested in using automated computer programs to evaluate the current state of content accessibility of Web sites providing health information to consumers. The checkpoints used in the program were derived from Web accessibility specifications -- WCAG 1.0 and Section 508.
- What is the relationship between Web accessibility and the functional category of the Web site?We were interested in determining the distribution of the level of accessibility among these Web sites after we categorized them into functional groups. We expected government and education Web sites to provide information that is more accessible to consumers than other types of Web sites because of the existing specifications and initiatives.
- What is the relationship between Web accessibility and the popularity of the Web site?The hypothesis for this research question is that there is a positive correlation between the degree of Web accessibility and the popularity of the Web sites. The variable representing popularity of a Web site was determined by its visiting traffic.
- What is the relationship between Web accessibility and the importance of the Web site?
We wanted to investigate whether there is any correlation between the level of Web accessibility and the importance of the Web site. We expected to find that more important Web sites would be more accessible to people with disabilities. The variable representing the importance of a Web site was determined by the page importance ranking data provided by a Web search engine.
Materials and Methods
The study is a cross-sectional study concentrating on the degree of accessibility of Web sites providing consumer health information. We used established Web accessibility specifications as the sources for constructing the measurement framework. Additionally, we investigated the relationship between Web accessibility and other features including function, popularity, and importance.
An individual Web site providing consumer health information is the unit of analysis in the study. Because the exact number and distribution of Web sites are not pre-determinable due to the tremendous size and rapid growth of the Web, probability based sampling methods, such as random or stratified sampling, are not applicable in the study. An alternative sampling approach widely adopted by researchers conducting studies on Web sites is to use search engines or online Web site directories.
We acquired a list of consumer health information Web sites from the directory service of the Google search engine (). Google's directory service obtained data from the Open Directory Project, the largest, most comprehensive human-edited directory of the Web [ ]. We included all Web sites under the subdirectory "Health/Resources/Consumer" as our candidate Web sites for evaluation. These are health information Web sites for the public, and their content are not necessarily specific to issues related to disability. We excluded ones that had their content changed to non-health related areas or were continuously unavailable during our study period after we reviewed the home page of each Web site.
After selecting the sample Web sites, we needed to establish a limit to the scope of the Web pages to be included within each site. Because WCAG only applies to Web pages, other content formats such as PDF (Portable Digital Format) files were not considered. However, server side scripting such as Active Server Page (ASP), or JavaServer Page (JSP) is able to dynamically produce HTML-based code at the client side, therefore we took these types of pages into consideration. Second, we needed to determine the number of Web pages from each Web site to be included in the analysis. Due to the large number of Web pages in some Web sites, it was not feasible to include all the pages into the study. We selected only the first two layers from the home page within a domain of a Web site in our sample. We hypothesized that the first two layers would be the most visited and would reflect the overall accessibility of the Web site for the study. The other reason for choosing only the first two layers was that Bobby version 4.01 has the ability to only process a limited number of pages on a given Web site because it consumes a large amount of computer memory during the analysis. When we selected three layers from the home page, Bobby encountered an "out of memory" error when analyzing large Web sites using a Pentium 2.4Ghz desktop computer with 1Gb memory.
Web Content Accessibility
One of the objectives of the study is to construct a measurement framework to assess the accessibility of consumer health information Web sites. As we discussed in the background section, two major specifications served as the normative guidelines for Web content accessibility design. The first-the W3C Web Content Accessibility Guideline 1.0 (WCAG)-is a stable international specification developed through a voluntary industry consensus. The US Access Board published the second specification-Electronic and Information Technology Accessibility Standards-in December 2000, pursuant to the US rulemaking process as required by Section 508 of the Rehabilitation Act Amendments of 1998 . Both specifications offer checklists or rules that Web developers should follow with regard to content accessibility for people with disabilities. These two specifications largely overlap; only three of the checkpoints defined in Section 508 are not mentioned in the WCAG guideline 1.0. WCAG is more comprehensive than Section 508 on checkpoints of Web content accessibility, and it provides a priority level to each checkpoint to reflect severity of violations. Therefore, WCAG was used as the foundation for the accessibility metrics we developed.
The number of violations of each checkpoint is a component of our scoring method called the Web Accessibility Barrier (WAB) score. For example, a Web page with fewer accessibility checkpoint violations, e.g., providing an alternative description for an image object, would be considered to present fewer barriers for people with disabilities and will have a lower WAB score.
Because we are interested in automated evaluation of the degree of accessibility of a Web site, the subset of Web accessibility checkpoints demanding manual checking are not included in the calculation of the WAB score. For example, compliance to the rule "If you use color to convey information, make sure the information is also represented another way," cannot be verified until a manual check is done. For a list of Web accessibility rules that need to be manually checked, please see the WAI references .
WCAG attaches a three-point priority level to each checkpoint from its impact on Web accessibility. Priority 1 checkpoints mandate the largest level of compliance while Priority 3 checkpoints are optional for Web content developers. In weighting the calculation of the WAB score, we used the priority level in reverse order. The weighting factor for Priority 1 violations is 3, for Priority 2 violations is 2, and for Priority 3 violations is 1.
Using only the number of violations of Web accessibility checkpoints, however, may bias the results of the measurement. For example, a Web page with five "image without alternative text" violations may have 500 image objects embedded in the page and the Web page with one "image without alternative text" violation may have only one image object in the page. The developer of the first page may have already paid a great deal of attention to and put great effort into complying with the Web accessibility specifications while the developer of the second page may be completely unaware of accessibility. Therefore, the number of true violations of a checkpoint must be normalized against the number of potential violations of the checkpoint. In the last example, true violations are the image objects without alternative text, and the potential violations include all image objects on the page. Whenever a Web developer puts an image element into a Web page, he increases the potential that there could be a violation of the "alternative text" checkpoint.explains the selection of potential violations from HTML code. The average WAB score of all Web pages within a Web site will be the WAB score of the Web site.
summarizes the calculation of the WAB score of a Web site as a formula. A higher score means there are more accessibility barriers on the site, while a lower score indicates fewer barriers. A score of zero denotes that the Web site does not violate any Web accessibility guidelines and should have no automatically detectable accessibility barriers to people with disabilities.
We employed several program tools to examine the true and potential violations of the Web pages. Bobby is a checking program that can examine a Web page and report violations of Web accessibility checkpoints . It is the most widely used accessibility checking software package and has been around longest. Bobby was originally developed by the Center for Applied Special Technology [ ], and is now maintained and distributed by Watchfire Corporation [ ].
Bobby desktop version 4.0.1 was used in this study. The desktop version can check compliance with WCAG of an entire Web site or only certain layers from the main page of the Web site. The version 4.0.1 can check non-compliance issues with both WAI and Section 508 checkpoints. After checking a Web site, Bobby generates a report in eXtensible Markup Language (XML) format that can be further processed to extract data about true violations.
Bobby implements 91 distinct testing rules, each of which maps onto a specific WCAG checkpoint. The Bobby tests are classified into a number of different "checking" categories, as follows: (1) Full: Bobby automatically checks this rule and decides whether there is an error. (2) Partial: Bobby automatically performs some checking of the rule, but cannot decide the existence of violations. Instead, the line number is used as a warning to the testers. (3) Partial Once: Similar to the Partial category, but the warning is not specific to an individual line. (4) Ask Once: Bobby does not have a mechanism to check the rule, so the rule is presented as a reminder to the testers.
For all categories other than Full, a human tester must manually evaluate the Web site further to determine the WCAG compliance, which is not viable for a large scale Web site study like this one. We used only the 25 rules that Bobby implements with Full checking capacity for our evaluation. Even for the rules with "Full" checking capacity, we still could not determine the quality of the compliance with WCAG. For example, the Web page developer could simply put the file name of the image into the "alt" attribute of the <IMG> element to avoid a flag from Bobby. The quality of such compliance is much less acceptable than providing detailed description in the "ALT" attribute.
The data of corresponding potential violations for each checkpoint can be extracted using a Web crawler program, which is an automated program that follows hyperlinks to visit Web pages. We developed a lightweight Java-based Web crawler program to access Web pages at remote Web sites and determine the number of potential violations of Web accessibility checkpoints. We did not use the built-in Web crawler in Bobby because it cannot be customized to check potential violations of checkpoints in a Web page. We also made use of the "homemade" crawler as the basis for future development of tools for Web accessibility evaluation. For a list of rules for extracting data of potential violations, please see. Since the crawler embedded in Bobby and the "homemade" Web crawler may retrieve an unmatched number of pages for the different capacities of both crawlers, we only used the Web pages retrieved by both programs in the study.
Function of the Web Sites
We measured three variables-function, popularity and importance-as other features of the Web sites. We classified the candidate Web sites based on their functions. We used a taxonomy that classifies the Web sites into six functional categories: e-commerce, corporate, portal, community, government, and education. We derived the taxonomy from a similar one from the Web Usability Index database . An e-commerce Web site conducts online transactions of health related products or services. A Corporate Web site represents a health care service corporation online. A Portal Web site provides entrance to various health related information resources. A Community Web site hosts online activities for patients or health information seekers. Government and education Web sites have the postfix ".gov" and ".edu", respectively in their domain names. lists example Web sites from each category.
Two evaluators individually assigned each Web site to one of the aforementioned categories. In case of a disagreement about the assignment, both evaluators discussed it until reaching a consensus. Each Web site fell into only one of the categories. Government (.gov) and education (.edu) Web sites had precedence over other function categories. For example, HealthFinder.gov is a government Web site, but its function is also to provide health information as a portal. We assigned it to the government instead of portal category. The reason for the precedence is that we were especially interested in the degree of Web accessibility of these two functional categories because of the existing specifications and initiatives.
Popularity of the Web Sites
We used daily traffic-ranking data of each Web site that was provided by the search engine Alexa as the measurement variable for the popularity of the Web sites . Alexa calculates statistics about the traffic patterns of a Web site after aggregating visit data from all users who install Alexa's toolbar in their Web browsers during a three-month period. Because the Alexa toolbar is currently only available for Microsoft Windows and Internet Explorer, the accuracy of the traffic ranking of the Web site is limited. However, it may reflect the popularity of the Web site on the Web to a certain extent. We retrieved the ranking data of the entire candidate Web sites from Alexa on February 25, 2003.
Importance of the Web Sites
We measured the degree of importance using the PageRank score of each Web site available from the Google search engine. The PageRank score relies on the uniquely hypertext nature of the Web by using its vast link structure as an indicator of an individual page's value. In essence, Google interprets a link from page A to page B as a vote by page A for page B. Therefore, the PageRank score of a page can be viewed as an indicator of the importance of the page. But Google looks at more than the absolute volume of votes, or links that a page receives; it also analyzes the page that makes the vote. Votes cast by pages that are themselves "important" weigh more heavily and help to make other pages "important."  Because Google does not provide PageRank in a numerical value from its searching interface, we had to rank the sites according to an implicit PageRank score and use the ranking number as the value of the variable of importance. We retrieved the ranking of importance of all candidate Web sites from Google on February 26, 2003.
All statistical analyses were performed with alpha value at 0.05 and power at 0.80. Descriptive statistics (means and standard deviation) were calculated for each variable considered in the study. Univariate statistics of the WAB scores were calculated at the level of each category. Then a one-way ANalysis Of VAriance (ANOVA) test was applied to the WAB scores at the level of the Web site's functional category. If the ANOVA test indicated a large difference in the WAB scores among different categories, the post hoc Bonferroni test of the WAB scores between different categories was conducted. The alpha level was adjusted for multiple comparisons in the Bonferroni test.
Google ranked Web sites with a sub-category from highest to lowest PageRank value. Therefore, we used the ranking sequence as the value of Web page importance for nonparametric Spearman correlation. Nonparametric Spearman correlation statistics were also conducted to measure the level of correlation between the WAB scores and the popularity of the Web sites. All statistical analyses were conducted using the SPSS 11.0 software package.
The Google subdirectory "Health/Consumer/Resources" lists 122 Web sites, 14 of which were excluded because their content are no longer healthcare related or they were not active during the study period. The assessing program retrieved 7,109 Web pages from the remaining 108 sites. Means and standard deviations of WAB scores for the remaining 108 Web sites were calculated. The average WAB score was 9.31 with standard deviation of 6.29. None of the 108 Web sites was absolutely accessible (WAB score = 0). The National Institutes of Health (NIH) Combined Health Information Database (CHID) Web site (http://chid.nih.gov/) achieved the lowest WAB score, i.e., it had the fewest accessibility barriers, of the sites tested (0.97), while a community Web site (http://www.discussyourhealth.com/) received the highest WAB score (24.99). The five most frequently violated WCAG checkpoints of all webpages were: "identify language of the text" (77.0%), "use a public text identifier in a DOCTYPE statement" (65.6%), "provide a summary for tables" (61.6%), "use relative sizing and positioning (% values) rather than absolute (pixels)" (60.0%), and "provide alternative text for all images" (52.2%).
WAB and Categories
Among the six functional categories of Web sites, government Web sites were most accessible and had the lowest WAB scores, and portal Web sites were least accessible to people with disabilities, indicated by higher WAB scores ().
The average scores of Web accessibility were calculated for each of the Web categories and the results indicate possible clustering among the six categories, as shown in.
Statistically significant differences among the category groups were found using the ANOVA test on the WAB scores (F = 9.705, P < 0.001). In addition, the post hoc Bonferroni test found that the mean WAB scores of governmental and educational Web sites were significantly different from the rest of the categories (P < 0.001). There is no statistically significant difference between any two categories within each of the two clusters.
WAB Score vs. Popularity and Importance
Furthermore, the Spearman correlation test indicates a statistically significant, though modest, correlation between the WAB score and the Alexa traffic ranking (r= 0.28, P < 0.01). No statistically significant correlation between the WAB score and the PageRank of Web sites was found (r= 0.15, P = 0.111) using the Spearman correlation test (). The correlation between the Alexa's traffic ranking and Google's PageRank was statistically significant (r= 0.32, P < 0.01) using the Spearman correlation test.
Awareness of accessibility issues is increasing among developers of Web sites due to law enforcement, public initiative, and prospective commercial incentives . Even though many evaluation tools are now available to developers intending to improve the accessibility of their Web sites, the status of Web accessibility, especially among health information Web sites, is largely unknown. Compliance with the specifications of Web content accessibility is necessary to narrow the digital divide between the information affluent and digitally underserved people, in this case, those with disabilities. Ours is the first study to address the issue. It provides a relatively comprehensive evaluation of the Web accessibility of consumer health information Web sites, and proposes a metric evaluation for measuring the accessibility of a Web site, taking into account both accessibility violations and the complexity of the Web site presented as potential violations of accessibility checkpoints. This approach provides a more accurate and impartial measurement about the level of accessibility barriers than using only the absolute number of violations as has been employed by most other evaluations. Additionally, the study investigates the relationship between the level of accessibility and the function, importance and popularity of a Web site.
Current Level of Web Accessibility Across Consumer Health Information Web Sites
No consumer health information Web sites satisfied all of the Web accessibility requirements, which may be attributed to Web site developers knowing little about accessibility standards, the lack of effective and efficient evaluation and repair tools, and the pressure to update information on the Web site quickly. Web accessibility, if ever considered, is often an afterthought once Web content design is finished. This implies that program tools that produce efficient, effective post-hoc repairs of Web content accessibility violations, or an accessible proxy server that transforms and filters inaccessible online content for people with disabilities may be more accepted by both the developers and Web site visitors.
Web Accessibility and Functions of the Web Sites
Of the sites providing health information, government sites followed by education sites are the most accessible. This compliance may be attributed to Section 508, since it is mandatory for all federal agencies . High compliance among sites that fall under this mandate also indicates that legal activities would facilitate the removal of accessibility barriers for people with disabilities.
None of the tested Web sites, including the most accessible government sites, passed the WCAG guideline priority 1 checkpoints, even though the five most frequently violated checkpoints have technically uncomplicated solutions if designers pay attention to them. This may imply that the Web site editor simply overlooked the errors and, for such editors, an automatic Web site monitoring program could be very helpful in identifying and correcting these errors. Other possible reasons for such imperfection are the lack of integrated accessibility tools or functions within Web site editing software. Most Web site editing tools make it optional to strictly follow accessibility rules.
The education Web sites are the second most accessible category. Section 508 is not strictly mandatory for the information technology available on educational Web sites, but high awareness of WCAG rules and legal requirements on most campuses may contribute to better accessibility among the education Web sites. Furthermore, although Section 508 does not mandate all education Web sites, it does apply to educational programs and projects that receive federal funding, as many do, which may explain the high compliance to WCAG rules among education sites.
Web Accessibility and Popularity of the Web Sites
The accessibility of a Web site also correlates with its popularity, possibly implying that people with disabilities are more likely to visit sites that contain fewer or no barriers to them. A more accessible Web site may be more usable for the general population because it can also improve the efficiency, effectiveness, and ease of using the Web site . Meanwhile, accessible Web pages will have better opportunities for indexing by Web search engines, which use programs called crawlers to access Web pages on the Internet and store Web page indexes in a database for fast Web information retrieval. Web crawlers work similarly to Web users who are blind and using screen reader programs. Therefore accessible Web pages will have more chances to be indexed by a Web crawler [ ]. Subsequently the overall popularity of the Web sites increase since they attract a group of visitors who have difficulties accessing other sites containing more Web accessibility barriers. Other reasons for the correlation between accessibility and popularity include the possibility that people may take notice that a Web site is accessible and tend to visit it often, or Web developers of accessible Web sites spend more time ensuring their Web sites are appropriate in following other usability rules that make visiting easier for the public.
Web Accessibility and Importance of the Web Sites
The correlation between Web accessibility and a Web site's importance was not statistically significant in our study, although the correlation between its importance and popularity was statistically significant. The measurement of the importance of a Web site was derived from comprehensive link analysis on the Web. It revealed the value of the Web site by measuring how many and what kind of other Web sites link into it. It does not necessarily reflect the value of other HTML elements, especially those Web accessibility related elements. A Web site can be very important in terms of PageRank because many other Web sites have links to it, even though it is not accessible to persons with disabilities when they directly visit it.
Please note that there are several limitations to this study. First, although this study attempts to comprehensively assess the accessibility of a Web site, it is not practical for some Web sites, especially those with large numbers of archived documents. The Bobby program often freezes when checking all layers of a Web site, and this resulted in the decision to check only a manageable two layers of Web pages in this study. A more robust tool needs to be adopted or developed for future studies.
Second, only the checkpoints of Web accessibility that can be examined automatically by a computer program were studied. Many other checkpoints require a manual check of pages to ensure the compliance of the content with the guidelines of Web accessibility. WAI proposed a comprehensive framework for evaluating Web content accessibility which requires multiple steps involving several evaluation tools to ensure the accuracy of the evaluation results. Although this type of evaluation is important for quality assurance of individual Web sites, the cost of such a large operation makes it impractical for an evaluation study involving many Web sites. This study assumes that the checkpoints that can be automatically evaluated will strongly correlate to the manual checkpoints and can be used as a surrogate assessment for accessibility of a Web site. Future studies might explore the agreement between these two groups of checkpoints.
Furthermore, the traffic ranking information provided from Alexa may skew towards users of Internet Explorer on a Windows operating system, underestimating the traffic to sites that are disproportionately accessed by people using other browsers or operating systems. The site most likely to suffer from this bias is AOL (America Online), since their members commonly use AOL browsers to access the site.
The WAB score in the study can be used to measure the degree of accessibility of a site. However, it should not be used as the only indicator for Web accessibility, which includes other checkpoints that can not be automatically assessed by computer programs. An experienced Web developer can fine-tune a Web site to produce a perfect WAB score. However, this does not necessarily mean that the Web site is entirely accessible to people with disabilities when they visit it.
This study evaluates the current state-of-accessibility of consumer health information Web sites for people with disabilities. Accessibility barriers are present in all site categories, especially commercial Web sites. Government and education Web sites show better performance than those in other categories. Accessibility may have an impact on its popularity because people with disabilities will feel more comfortable visiting those sites with fewer accessibility barriers. This study attempts to increase the awareness of Web accessibility among the designers of consumer health information Web sites.
This work was supported, in part, by grants number 42-60-I02013 from the National Telecommunications and Information Administration (NTIA) and H133A021916 from the National Institute on Disability and Rehabilitation Research (NIDRR). The authors also thank Ms. Yueh-Mei Yang for acquiring and formatting the data.
Conflicts of Interest
Appendix AConsumer Health Web Sites Selected from Google for Accessibility Evaluation
- Horrigan JB, Rainie L. Counting on the Internet. Pew Internet & American Life Project. 2002 Dec 29. URL: http://www.pewinternet.org/pdfs/PIP_Expectations.pdf [accessed 9 June 2004] [WebCite Cache]
- Eysenbach G, Kohler Ch. What is the prevalence of health-related searches on the World Wide Web? Qualitative and quantitative analysis of search engine queries on the internet. AMIA Annu Symp Proc 2003:225-229. [Medline]
- . A Nation Online. Government Report. 2002. URL: http://www.ntia.doc.gov/ntiahome/dn/index.html [accessed 9 June 2004] [WebCite Cache]
- Eysenbach G. Consumer health informatics. BMJ 2000 Jun 24;320(7251):1713-1716 [FREE Full text] [Medline] [CrossRef]
- Kaye HS. Computer and Internet use among people with disabilities (SuDoc ED 1.310/2:439579). Washington, DC: U.S. Dept. of Education, National Institute on Disability and Rehabilitation Research Office of Educational Research and Improvement, Educational Resources Information Center; 2000. URL: http://dsc.ucsf.edu/view_pdf.php?pdf_id=23
- . A Framework for the Evaluation of Internet-based Diabetes Management. URL: http://www.nielsen-netratings.com/hot_off_the_net.htm [accessed 30 Nov 2003] [WebCite Cache]
- Paciello MG. Web Accessibility for People With Disabilities (R & D Developer Series). Lawrence, KS: CMP Books; 2000. URL: http://dsc.ucsf.edu/view_pdf.php?pdf_id=23
- Hayes MG. Individuals with disabilities using the internet: a tool for information and communication. Technology and Disability 1998;8(3):153-158.
- Fox S, Fallows D. Internet Health Resources: Health searches and email have become more commonplace, but there is room for improvement in searches and overall Internet access. Pew Internet & American Life Project. 2003 Jul 16. URL: http://www.pewinternet.org/pdfs/PIP_Health_Report_July_2003.pdf [accessed 9 June 2004] [WebCite Cache]
- Varian HR. The Information Economy. How much will two bits be worth in the digital marketplace? Scientific American 1995 Sep;273:161-162.
- . Web Content Accessibility 1.0 Conformance Logos. URL: http://www.w3.org/WAI/WCAG1-Conformance [accessed 7 Dec 2003] [WebCite Cache]
- Balas JL. Library consortia in the brave new online world. Computers in Libraries 1998 Apr;18(4):42-44 [FREE Full text] [WebCite Cache]
- . Authoring Tool Accessibility Guidelines. URL: http://www.w3.org/TR/ATAG10/ [accessed 1 Aug 2003] [WebCite Cache]
- . User Agent Accessibility Guidelines 1.0. URL: http://www.w3.org/TR/UAAG10/ [accessed 1 Aug 2003] [WebCite Cache]
- . Section 508 of Rehabilitation Act. URL: http://www.section508.gov [accessed 11 Mar 2003] [WebCite Cache]
- Abou-zahra S. Evaluation, Repair, and Transformation Tools for Web Content Accessibility. URL: http://www.w3.org/WAI/ER/existingtools.html [accessed 3 Apr 2004] [WebCite Cache]
- Schmetzke A. Editor. Accessibility of Web-based Information Resources for People with Disabilities (Part One). Library Hi Tech 2002;20(2):135-136 [FREE Full text] [WebCite Cache]
- Schmetzke A. Editor. Accessibility of Web-based Information Resources for People with Disabilities (Part Two). Library Hi Tech 2002;20(4):397-398 [FREE Full text] [CrossRef]
- Schmetzke A. Web accessibility at university libraries and library schools. Library Hi Tech 2001;19(1):35-49 [FREE Full text] [WebCite Cache]
- . Web Usability Index. URL: http://www.usablenet.com/wui/wui_index.html [accessed 8 Mar 2003] [WebCite Cache]
- Lazar J, Dudley-sponaugle A, Greenidge KD. Improving web accessibility: a study of webmaster perceptions. Comp Hum Behav 2004;20(2):269-288 [FREE Full text] [CrossRef]
- Lawrence S, Giles CL. Accessibility of information on the web. Nature 1999 Jul 8;400(6740):107-109. [Medline] [CrossRef]
- Mandl KD, Katz SB, Kohane IS. Social equity and access to the World Wide Web and E-mail: implications for design and implementation of medical applications. Proc AMIA Symp 1998:215-219. [Medline]
- Berland GK, Elliott MN, Morales LS, Algazy JI, Kravitz RL, Broder MS, et al. Health information on the Internet: accessibility, quality, and readability in English and Spanish. JAMA 2001 May 23;285(20):2612-2621. [Medline] [CrossRef]
- Griffiths KM, Christensen H. The quality and accessibility of Australian depression sites on the World Wide Web. Med J Aust 2002 May 20;176 Suppl(Suppl):S97-S104 [FREE Full text] [Medline]
- Bykowski JL, Alora MB, Dover JS, Arndt KA. Accessibility and reliability of cutaneous laser surgery information on the World Wide Web. J Am Acad Dermatol 2000 May;42(5 Pt 1):784-786. [Medline] [CrossRef]
- Eysenbach G, Powell J, Kuss O, Sa ER. Empirical studies assessing the quality of health information for consumers on the world wide web: a systematic review. JAMA 2002 May;287(20):2691-2700. [Medline] [CrossRef]
- Koyani SJ, Bailey RW, Nall JR. Research-Based Web Design & Usability Guidelines. Washington, DC: Computer Psychology; Sep 2004. URL: http://usability.gov/guidelines/
- Davis JJ. Disenfranchising the disabled: the inaccessibility of Internet-based health information. J Health Commun 2002;7(4):355-367. [Medline] [CrossRef]
- . About the Open Directory Project. URL: http://dmoz.org/about.html [accessed 13 Mar 2003] [WebCite Cache]
- Astbrink G. Web page design: something for everyone. Link-Up 1996 Dec:7-10 [FREE Full text] [WebCite Cache]
- . Web Content Accessibility Guidelines 1.0. URL: http://www.w3.org/TR/WCAG10/ [accessed 1 Aug 2003] [WebCite Cache]
- Axtell R, Dixon JM. Voyager 2000: a review of accessibility for persons with visual disabilities. Library Hi Tech 2002;20(2):141-147 [FREE Full text] [CrossRef]
- . Home page. URL: http://www.cast.org/ [accessed 4 Aug 2003] [WebCite Cache]
- . Home page. URL: http://www.watchfire.com/ [accessed 3 Aug 2003] [WebCite Cache]
- . Home page. URL: http://www.alexa.com/ [accessed 4 Aug 2003] [WebCite Cache]
- . Home page. URL: http://www.google.com/technology/ [accessed 2 Mar 2004] [WebCite Cache]
- Jaeger PT. Section 508 goes to the library: Complying with federal legal standards to produce accessible electronic and information technology in libraries. Information Technology and Disabilities 2002;8(2) [FREE Full text] [WebCite Cache]
- Kirkpatrick CH. Getting two for the price of one: accessibility and usability. Computers in Libraries 2003;23(1):26-29 [FREE Full text] [WebCite Cache]
- Pemberton S. The Kiss of the Spiderbot. Interactions. Interactions 2003 Jan-Feb;10(1):44 [FREE Full text] [CrossRef]
|WWW: World Wide Web|
|WAB: Web Accessibility Barriers|
|NTIA: National Telecommunications and Information Administration|
|W3C: WWW Consortium|
|WAI: Web Accessibility Initiative|
|WCAG: Web Content Accessibility Guidelines|
|WUI: Web Usability Index|
|NCI: National Cancer Institute|
|PDF: Portable Digital Format|
|ASP: Active Server Page|
|JSP: Java Server Page|
|XML: eXtensible Markup Language|
|ANOVA: ANalysis Of VAriance|
|NIH: National Institutes of Health|
|CHID: Combined Health Information Database|
Edited by G. Eysenbach; submitted 16.03.04; peer-reviewed by R Appleyard, S Smeltzer, D Klein; comments to author 29.03.04; revised version received 14.05.04; accepted 21.05.04; published 21.06.04
© Xiaoming Zeng, Bambang Parmanto. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 21.6.2004. Except where otherwise noted, articles published in the Journal of Medical Internet Research are distributed under the terms of the Creative Commons Attribution License (http://www.creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited, including full bibliographic details and the URL (see "please cite as" above), and this statement is included.