Published on 11.08.99 in Vol 1, No 1 (1999)
Can Examination of WWW Usage Statistics and other Indirect Quality Indicators Help to Distinguish the Relative Quality of Medical websites?
Background: The Internet offers a large number of health-related websites, but concern has been raised about their reliability. Several subjective evaluation criteria and website rating systems have been proposed to help Internet users distinguish among web resources of differing quality, but their efficacy has not been proven.
Objective: To evaluate the agreement among the editorial boards of a subset of Internet rating systems in their evaluations of a sample of pediatric websites, and to evaluate certain website characteristics as possible quality indicators for pediatric websites.
Methods: Comparative survey of the results of systematic evaluations of the content and formal aspects of a sample of pediatric websites against the number of daily visits to those websites, the time since their last update, the impact factor of their authors or editors, and the number of websites linking to them.
Results: A total of 363 websites were compiled from eight rating systems. Only 25 were indexed and evaluated by at least two rating systems. This subset included more recently updated and more frequently linked websites. There was no correlation among the evaluations of these 25 websites by the different rating systems. The number of inbound links to the websites correlated significantly with their updating frequency (p<.001), with the number of daily visits (p=.005), and with their evaluation by the largest rating system, HealthAtoZ (p<.001). The websites' updating frequency also correlated significantly with their evaluation by HealthAtoZ, both for content (p=.001) and for total score (p<.05). The number of daily visits correlated significantly (p<.05) with the evaluations by Medical Matrix.
Conclusions: Some website characteristics, such as the number of daily visits, the updating frequency and, above all, the number of websites linking to them, correlate with their evaluation by some of the largest rating systems on the Internet. This means that certain indexes obtained from the usage analysis of pediatric websites could be used as quality indicators. Moreover, citation analysis on the Web, by quantifying inbound links to medical websites, could be an objective and feasible tool for rating large numbers of websites.
(J Med Internet Res 1999;1(1):e1)
- Health Education
- Information Systems
- Computer Communication Networks
- Web metrics
After the early enthusiasm generated by the potential use of the Internet in medicine, concern has been raised about the quality of the resources available on the Internet compared with more academic media. It is technically very easy to publish on the Internet. The lack of a review process for documents on the Net, together with the power of this medium in transmitting data, carries the risk of misinforming both lay people and health care professionals. However, only a few studies have tried to measure this risk of misinformation. Nothing is yet known about users' ability to discriminate between low and high quality resources.
Several initiatives have been proposed that could be applied at different levels to improve the average quality of medical websites. For instance, certain basic standards could be applied so that websites are correctly designed. In this sense, some academic organizations have proposed a set of basic information that every medical website should provide: the authors and sources of its content, their potential conflicts of interest and funding, and the currency of the information. But many of the available medical websites have been created without any quality control by a third party. How can Internet health care visitors distinguish among such different resources?
Internet users can find health and medical related websites in several ways. World Wide Web search engines (e.g., AltaVista, Excite, Infoseek and many others) provide users with a list of websites that match a given topic, with the results ordered by syntactic similarity to the query. Unfortunately, the quality of the content is not guaranteed.
On the other hand, certain website indexes and review services, such as Medical Matrix (http://www.medmatrix.org/) and HealthAtoZ (http://www.HealthAtoZ.com/), offer systematic evaluations of medical resources on the Web as a post-publication editorial process. These rating systems could be a useful tool for guiding visitors to medical websites. However, authors who have reviewed these Internet resources point out the variability of their evaluation criteria and their doubtful efficacy.
The quality of a given medical article on the Internet could be measured by users' opinion of it, for example by counting the number of times it is retrieved. However, this idea has been criticized because it would replace the scientific peer review process with the opinion of Internet users, whatever their qualifications.
Despite the differences between printed medical information and the Internet, several evaluation tools from the former could be useful if applied to the Net. Like printed medical journals, medical documents on the Internet could be ranked by citation analysis, but no methods have been proven for use with medical websites. When an article is cited in a paper, a certain agreement among the authors may be supposed. Similarly, when a webmaster links from his or her website to another, a certain credibility is conferred on the latter. In fact, the International Committee of Medical Journal Editors recommends caution when a link is made from a peer-reviewed journal site to other sites. If linking on the Web can be considered equivalent to citing in printed medical articles, a citation analysis on the Web could be performed by quantifying the links to a given medical website.
The ideal method for assessing the quality of medical websites should provide a means of rating large numbers of medical web resources while respecting the peculiarities of the World Wide Web, such as its multimedia capabilities and changing content. At the same time, it should be at least as reliable as systematic reviews of those resources by editorial boards. In summary, it should be a method born on the Internet but with the efficacy of those used in printed media.
In this study, we evaluated the reliability of four website characteristics as quality indicators for medical websites: their authors' impact factor, their grade of updating, their daily visits, and their inbound links. The evaluations of a sample of pediatric websites by a number of Internet rating systems were the gold standard against which these characteristics were compared.
During March 1998 a subset of website rating systems was compiled. From these, we selected a sample of websites that were studied during the first week of April 1998.
Eight web rating systems whose evaluations were expressed as scores were compiled from previous studies. Half of the selected rating systems reported their evaluations on graphic analog scales, and the other half on numeric scales. Every website evaluated by these rating systems that provided information about child health, whether for lay people or health professionals, was included in the study. Some of these rating systems (e.g., Lycos Top 5%) provide visitors with a keyword search tool. In these cases, the websites were selected using the keywords "Pediatrics", "Infancy", "Child health", and "Child Care". For the remaining rating systems, the pediatric websites were compiled manually. Websites that were inaccessible on two occasions during the study period were excluded.
Only three rating systems (Medical Matrix, Physician Choice, and Six Senses) gave information about their editorial boards. Most of their members were physicians. Two of the web rating systems gave only a global result for each website (Medical Matrix and Magellan), while the rest (HealthAtoZ, Argus Clearinghouse, Lycos Top 5%, Sympatico Health, Physician Choice, and Six Senses) gave a result for each criterion considered. Content was a criterion common to all eight rating systems. Therefore, the results of the evaluation of each website were divided into two categories: content and non-content (design) aspects. To allow comparisons, the results supplied by each rating system were transformed to a 100-point scale.
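The rescaling step can be illustrated with a short sketch. The text does not state how the transformation was done, so the linear min-max mapping below is an assumption, and the function name and example scales are hypothetical.

```python
def to_percent_scale(score, scale_min, scale_max):
    """Linearly map a score from [scale_min, scale_max] onto 0-100.

    Assumption: a simple min-max rescaling; the study does not
    describe the exact transformation it used.
    """
    if scale_max <= scale_min:
        raise ValueError("scale_max must be greater than scale_min")
    if not scale_min <= score <= scale_max:
        raise ValueError("score lies outside the rating scale")
    return 100.0 * (score - scale_min) / (scale_max - scale_min)


# Hypothetical example: 3 stars on a 0-to-5 star scale.
print(to_percent_scale(3, 0, 5))  # 60.0
```

Once every system is expressed on the same 0 to 100 range, scores from star-based and numeric systems can be compared directly.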
When provided, the daily visits registered by the websites' visit counters were recorded. For some websites the date on which the counter was started was not available. In these cases, the webmasters were asked for this information by electronic mail, and it was included in the statistical analysis if provided before the end of the observation period, 15 April 1998.
The names of the websites' authors and editors were searched in 1997 MEDLINE, and their articles were recorded. The impact factors of the journals in which they were published were obtained from the 1996 Science Citation Index (Institute for Scientific Information, Philadelphia, PA). The impact factor of a given website author was the sum of the impact factors of his or her articles, each article being assigned the impact factor of its journal. For institutional websites only the name of the web editor was considered.
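As an illustration of why the counter start date matters, the sketch below derives an average daily visit figure from a cumulative counter. The function name and the example numbers are hypothetical, and dividing the total by the elapsed days is an assumption about how such a figure would be obtained.

```python
from datetime import date


def average_daily_visits(counter_total, counter_start, observed_on):
    """Average visits per day between the counter start and the observation date."""
    days = (observed_on - counter_start).days
    if days <= 0:
        raise ValueError("observation date must be after the counter start date")
    return counter_total / days


# Hypothetical counter: 36500 visits accumulated over exactly 365 days.
print(average_daily_visits(36500, date(1997, 4, 1), date(1998, 4, 1)))  # 100.0
```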
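The author-level impact factor described above is a plain sum, which a few lines can make concrete. This is a minimal sketch: the journal names and impact factor values are invented placeholders, and treating journals absent from the index as contributing zero is an assumption.

```python
def author_impact_factor(article_journals, journal_impact_factors):
    """Sum the journal impact factor over an author's articles.

    article_journals: one journal name per article (repeats allowed).
    journal_impact_factors: mapping of journal name to impact factor.
    Assumption: journals missing from the mapping contribute 0.
    """
    return sum(journal_impact_factors.get(journal, 0.0) for journal in article_journals)


# Hypothetical data, not taken from the study.
impact_factors = {"Journal A": 1.5, "Journal B": 2.0}
articles = ["Journal A", "Journal A", "Journal B", "Journal C"]
print(author_impact_factor(articles, impact_factors))  # 5.0
```

Because the measure is additive, an author with many articles in modest journals can score as high as one with a single article in a high-impact journal.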
When provided, the time since the last update was also recorded.
Finally, by means of the Web search engine Infoseek, we calculated how many websites on the Internet linked to each website of our sample. The query syntax of this engine makes it possible to identify the websites that link to a given website. As a website may be linked not only from external websites but also from websites of its own organization, we considered only external links. Although other search engines such as AltaVista, Excite or HotBot offer similar search options, we chose Infoseek because it returned the results of the queries grouped by website, which made it easier to exclude internal links.
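The exclusion of internal links can be sketched as follows. This is not the Infoseek query itself: it assumes a list of linking page URLs has already been obtained by some means, and it treats pages on the same host as the target (ignoring a leading "www.") as internal, which is a simplification of "websites of its own organization". All names and URLs are hypothetical.

```python
from urllib.parse import urlparse


def site_key(url):
    """Lower-cased host name with a leading 'www.' removed."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def count_external_linking_sites(target_url, linking_urls):
    """Count distinct sites, other than the target's own, that link to the target."""
    target = site_key(target_url)
    external_sites = set()
    for url in linking_urls:
        site = site_key(url)
        if site and site != target:
            external_sites.add(site)  # several pages on one site count once
    return len(external_sites)


# Hypothetical linking pages for a hypothetical target site.
linking_pages = [
    "http://www.example.org/about.html",      # internal, same site
    "http://example.org/links.html",          # internal, same site without www
    "http://hospital.example.com/peds.html",  # external
    "http://hospital.example.com/other.html", # external, same site as above
    "http://school.example.net/",             # external
]
print(count_external_linking_sites("http://www.example.org/", linking_pages))  # 2
```

Grouping by site rather than by page mirrors the reason given for choosing a search engine that returns results grouped by website.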
Comparisons between groups were performed with the Mann-Whitney U test, and correlation analysis with Spearman's correlation coefficient (rS). P values equal to or less than .05 were considered significant. All computations were made with the SPSS for Windows 7.0 statistical package (SPSS Inc., Chicago, IL).
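For readers unfamiliar with Spearman's coefficient, the sketch below computes rS as the Pearson correlation of the ranks, with tied values given their average rank. It is an illustration in plain Python, not the SPSS procedure used in the study, and it does not compute a P value; the function names and example data are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, assigning tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rs(x, y):
    """Spearman's rank correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: inbound links and rating scores for five sites.
inbound_links = [10, 20, 30, 40, 50]
rating_scores = [1, 3, 2, 5, 4]
print(round(spearman_rs(inbound_links, rating_scores), 3))  # 0.8
```

A rank-based coefficient suits variables such as link counts and visit counts, whose distributions are strongly skewed.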
After excluding 93 non-accessible websites, a total of 363 pediatric websites were compiled.
On average, the websites of our sample received links from 470 other sites on the Internet (range, 0 to 3574). In 48% of the websites, information on the last update was given. On average, they had been updated 47.5 weeks earlier (range, 0 to 395). Only 10% of the websites had a visit counter, and the average number of daily visits was 470 (range, 1.2 to 3145). Seven visit counters did not distinguish among different visitors; that is, they registered every visit to their websites. In 137 websites (38%) the editor's or author's name was given, but only 60 of them had published at least one article since January 1997 in the journals included in the MEDLINE database. Their average impact factor was 2.14.
Only 25 websites of the sample were indexed and evaluated by at least two rating systems, and none by all eight. This subset of websites received significantly better evaluations of their content and design from HealthAtoZ, and showed a higher grade of updating and a higher number of inbound links. When the evaluations of these 25 websites by the different rating systems were compared, no significant correlations were found. No differences in the average impact factor of the websites' authors or in the number of daily visits could be demonstrated in this subset.
Some interesting correlations between the evaluations of the websites and the other study variables were found. The number of links received by the websites correlated significantly with their daily visits and with the time since the last update. The number of inbound links also correlated with the websites' evaluation by HealthAtoZ.
The number of daily visits correlated significantly with the evaluation of the websites by Medical Matrix, and the grade of updating correlated significantly with the evaluation of content and design by HealthAtoZ.
Finally, no correlation was demonstrated between the average impact factor of the websites' authors and the other variables.
The top fifty pediatric websites of the sample were ordered by the number of their inbound links according to the Infoseek indexing engine. More than half of the 25 websites indexed by at least two rating systems were among these top fifty websites.
In this study, certain website characteristics that depend on users' preferences have been compared with evaluations of pediatric resources on the Web by third parties. Although rating systems have previously been criticized because their editorial boards frequently do not employ uniform criteria, we have considered them the standard method because they represent, to some extent, a post-publication review process.
Some aspects of our method are open to discussion. First, the reliability of the data on daily visits and updating frequency depends on the accuracy of the information that the website editors offer on their sites. In this regard, we assessed the grade of updating of the websites by the dates of their last changes. Clearly, these changes could involve very different aspects to very different degrees, and would not necessarily provide more current content. However, we believe that they reflect the editor's efforts to maintain or increase the interest of the website for its visitors.
The results regarding the number of daily visits to the websites must be considered with caution when comparing one web site to another, because some visit counters were set to register every visit, instead of every distinct visitor. Nevertheless, both can be considered usage indexes of a given web site.
On the other hand, the quantification of links to the websites clearly depends on the power of the search engine employed. Our results by no means show the total number of links to the websites in our sample. In fact, a previous article states that it would be necessary to combine the databases of at least five large search engines to cover most of the Web.
Although all bibliometric indexes have limitations, we employed the impact factor as a measure of the webmasters' publishing capacity because it is a classical indicator of the quality of biomedical articles. Recently, it has been suggested that every medical website should be evaluated according to some basic criteria. One of the most widely accepted criteria is that authorship must be clearly stated, as a basic means of assessing the reliability of the website's content. However, we could not demonstrate that the most highly rated, most recently updated, most linked, or most visited pediatric websites had the authors with the highest publishing capacity as measured by their impact factor. In other words, some web quality standards do not correlate with classical quality standards from the printed media, such as the impact factor of a given author's articles.
We could not find statistically significant correlations among the evaluations of the websites by the different rating systems. This is probably due to the small size of the subset of websites indexed and evaluated by more than one system, and to the systems' different evaluation criteria. However, some interesting data were found when we considered the correlations between the four website characteristics and the evaluations. We found that the best websites according to HealthAtoZ, the largest rating system analyzed, were the most recently updated and the most linked ones. On the other hand, the most highly rated websites according to Medical Matrix, the second largest rating system, were the most visited ones. In any case, both the number of daily visits and the time since the last update correlated strongly with the number of inbound links. The lack of correlation between the four variables and the evaluations by the other rating systems could be due to their small contribution to our sample.
Many efforts to establish quality criteria will have limited efficacy because of the dynamic behaviour of the Internet as a publishing medium. In fact, a recent article demonstrates the lack of consensus among the editorial boards of a large sample of evaluation and rating systems regarding the evaluation criteria they employ. The same authors pointed out that "... it may be difficult or even inappropriate to develop a static tool or system for assessing health related websites." Therefore, the task may be to provide context for this issue, that is, to know how good a given medical website is in comparison with the rest of medical websites. A democratic and feasible method for reaching this objective could be to let the Internet community say which medical websites are the best ones, that is, which ones they usually visit or recommend by linking to them. Moreover, we believe that the fact that these usage indexes correlate with the evaluations by third parties qualifies them as quality markers.
Eysenbach and Diepgen have recently proposed that an ideal quality control system for medical resources on the Internet should take into account users' opinions, and not only evaluation by a third party; that is, a "downstream filtering" and not only an "upstream filtering" approach. More interestingly, our study demonstrates a certain agreement between the two approaches in identifying high quality resources.
LaPorte et al proposed an electronic publishing system in which the impact of a given resource on the Internet could be measured by counting how many times the document was retrieved or cited. The introduction of citation analysis of medical resources on the Web as a method to assess their quality has recently been proposed. In addition, a very promising software system is being developed by Kleinberg. This system would provide users with a way of finding the very best of the Web on a given topic faster and more completely than commercial human-compiled directories. It is based on the identification of two subsets of websites when a query on a given topic is made: those containing a lot of information about the topic (authoritative websites) and those containing large numbers of links to the former (hub sites). Our work demonstrates that those authoritative websites, that is, the most linked ones, are indeed the best ones according to the evaluation of their content and design by the editorial boards of some large web rating systems.
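The hub and authority idea can be made concrete with a small iterative scoring sketch. The text describes Kleinberg's system only in outline, so the code below is an assumption-laden illustration of mutually reinforcing hub and authority scores on a fixed link graph. It omits the step of selecting the pages relevant to a query, and the graph, names, and iteration count are hypothetical.

```python
from math import sqrt


def normalize(scores):
    """Scale scores so that their squares sum to 1 (unchanged if all are 0)."""
    norm = sqrt(sum(value * value for value in scores.values()))
    if norm == 0:
        return dict(scores)
    return {page: value / norm for page, value in scores.items()}


def hub_authority_scores(links, iterations=50):
    """links maps each page to the list of pages it links to.

    Returns (hub, authority) dictionaries. A page is a good authority if
    good hubs link to it, and a good hub if it links to good authorities.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    hub = {page: 1.0 for page in pages}
    authority = {page: 1.0 for page in pages}
    for _ in range(iterations):
        # Authority score: sum of the hub scores of the pages linking in.
        new_authority = {page: 0.0 for page in pages}
        for source, targets in links.items():
            for target in set(targets):
                new_authority[target] += hub[source]
        # Hub score: sum of the authority scores of the pages linked to.
        new_hub = {
            page: sum(new_authority[target] for target in set(links.get(page, [])))
            for page in pages
        }
        authority = normalize(new_authority)
        hub = normalize(new_hub)
    return hub, authority


# Hypothetical graph: three directory pages linking to two content sites.
graph = {
    "directory1": ["siteA", "siteB"],
    "directory2": ["siteA", "siteB"],
    "directory3": ["siteA"],
}
hub, authority = hub_authority_scores(graph)
print(max(authority, key=authority.get))  # siteA
```

In this toy graph the site with the most inbound links receives the highest authority score, which matches the simplification made in the text that authoritative sites are the most linked ones.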
The citation analysis of biomedical journals has been a classic tool for assessing their relative quality. Similarly, medical web resources could be ranked by a "webcite index", which is not yet defined. Linking on the World Wide Web could be equivalent to citing in printed publications, and its quantification could be useful for measuring the relative quality of medical websites. Some indexes could be created to allow fairer comparisons among websites of different sizes. For example, just as the calculation of the impact factor of a given medical journal takes into account the number of articles published by that journal yearly, the size of a given domain could be taken into account to obtain indexes that express more accurately the degree of linkage of a medical website. Moreover, the Platform for Internet Content Selection (PICS), an infrastructure that could be applied as a filtering system for medical information on the Net, could incorporate these indexes among the metadata assigned to every medical document as electronic labels. These electronic labels could then be checked automatically by a user's browser, bypassing those documents whose "webcite index" is not high enough. One problem would be how to avoid false "self-labelling" by dishonest webmasters. In any case, more work is needed to answer these and other technical questions in the emerging field of Webometrics.
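One way to picture a size-adjusted linkage index is a simple ratio of inbound links to the number of pages in the domain. Since the text states that the "webcite index" is not yet defined, the formula, function name, and numbers below are purely hypothetical and serve only to show how normalizing by size can change a ranking.

```python
def links_per_page(inbound_links, page_count):
    """Hypothetical size-adjusted index: inbound links divided by pages in the domain."""
    if page_count <= 0:
        raise ValueError("page_count must be positive")
    return inbound_links / page_count


# Hypothetical sites: the larger site has more links in total,
# but the smaller site attracts more links per page.
print(links_per_page(400, 200))  # 2.0
print(links_per_page(300, 50))   # 6.0
```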
An evaluation system based on this quantification would bring advantages and risks. Rankings could be generated very quickly and objectively, because the Internet community itself would evaluate large numbers of medical websites. However, this evaluation would be made a posteriori, so the potentially harmful effects of the diffusion of documents of insufficient quality could not be avoided. Therefore, this method could not replace the prior editorial effort that guarantees a minimal quality for each resource.
Our work demonstrates that the visitors of pediatric websites and the editors of websites on the Net, the so-called webmasters, show a certain maturity in identifying high quality pediatric resources. We believe that the key point is how to increase the proportion of these resources. An important step could be to establish a citation style not only for articles from peer-reviewed electronic journals, but also for any medical document on the Net. The prestige that citation in a printed journal represents will stimulate high quality publishing on the Internet, and website editors will employ adequate review processes to obtain the necessary quality. A website ranking system based on citation analysis on the Web, by the quantification of links, would be an additional incentive. The more valuable resources will attract Internet users' visits and webmasters' links, and very likely the best funding and financial support.
In summary, although the Internet provides a very different publishing medium, traditional means borrowed from printed journals could also be used with this electronic medium to achieve minimal levels of quality. These include certain peer review processes that enhance the rigor of the documents submitted for publication while taking into account the peculiarities of this medium, and link analysis as a measure of citation on the World Wide Web.
The authors are sincerely grateful to the members of the Research Unit of the Hospital Universitario de Canarias, and especially to Dr Armando Torres and Dr Eduardo Salido for their help and support in the editing and translation of the manuscript. We are also especially grateful to Mr. Trevor P. Doble for his help in the translation of this manuscript.
Conflicts of Interest
- Goldwein JW, Benjamin I. Internet-based medical information: time to take charge. Ann Intern Med 1995 Jul 15;123(2):152-153.
- Chi-lum BI, Lundberg GD, Silberg WM. Physicians accessing the Internet, the PAI Project. An educational initiative. JAMA 1996 May 1;275(17):1361-1362.
- LaPorte RE. On behalf of the Global Health Network. Internet server with targeted access would cure information deficiency in developing countries [letter]. BMJ 1997;314:980.
- Kassirer JP, Angell M. The Internet and the Journal. N Engl J Med 1995 Jun 22;332(25):1709-1710.
- Bower H. Internet sees growth of unverified health claims [news]. BMJ 1996;313:381.
- Dyer C. Internet sales of prescription drugs investigated. BMJ 1996 Sep 14;313(7058):645.
- Editorial. The web of information inequality. Lancet 1997;349:1781.
- Shapira Y, Talmor D, Artru AA, Lam AM. Resident education and unreviewed material [letter]. Anesth Analg 1996;83:886-891.
- Impicciatore P, Pandolfini C, Casella N, Bonati M. Reliability of health information for the public on the World Wide Web: systematic survey of advice on managing fever in children at home. BMJ 1997 Jun 28;314(7098):1875-1879.
- Hernández-Borges AA, Pareras LG, Jiménez A. Comparative analysis of pediatric mailing lists on the Internet. Pediatrics 1997;100(2).
- Hernández-Borges AA, Macías-Cervi P, Gaspar-Guardado MA, Torres-Álvarez de Arcaya ML, Ruiz-Rabaza A, Ormazábal-Ramos C. Assessing the Relative Quality of Internet Mailing Lists on Anesthesiology and Critical Care Medicine. Anesth Analg 1999 [in press].
- Silberg WM, Lundberg GD, Musacchio RA. Assessing, controlling, and assuring the quality of medical information on the Internet: Caveant lector et viewor--Let the reader and viewer beware. JAMA 1997 Apr 16;277(15):1244-1245.
- Sullivan D. A Webmaster's Guide To Search Engines. URL: http://www.searchenginewatch.com/webmasters/index.html [accessed 1999 Apr 15]
- Jadad AR, Gagliardi A. Rating health information on the Internet: navigating to knowledge or to Babel? JAMA 1998 Feb 25;279(8):611-614.
- Laporte RE, Marler E, Akazawa S, Sauer F, Gamboa C, Shenton C, et al. The death of biomedical journals. BMJ 1995 May 27;310(6991):1387-1390.
- Eysenbach G, Diepgen TL. Towards quality management of medical information on the internet: evaluation, labelling, and filtering of information. BMJ 1998 Nov 28;317(7171):1496-1500.
- Policies for posting biomedical journal information on the Internet. JAMA 1997;277(22):1808.
- PubMed. National Library of Medicine. URL: http://www.ncbi.nlm.nih.gov/PubMed/medline.html [accessed 1999 Apr 15]
- Home page. URL: http://infoseek.go.com [accessed 1999 Apr 15]
- Quick Reference to Syntax. URL: http://www.infoseek.com/Help?pg=SyntaxHelp.html [accessed 1999 Apr 15]
- Lawrence S, Giles CL. Searching the World Wide Web. Science 1998 Apr 3;280(5360):98-100.
- Tsafrir JS, Reis T. Using the citation index to assess performance. BMJ 1990 Dec 8;301(6764):1333-1334.
- Hansson S. Impact factor as a misleading tool in evaluation of medical journals [letter]. Lancet 1995;346:906.
- Criteria for Assessing the Quality of Health Information on the Internet. URL: http://www.mitretek.org/hiti/showcase/documents/criteria.html [accessed 1999 Apr 15]
- Kim P, Eng TR, Deering MJ, Maxfield A. Published criteria for evaluating health related web sites: review. BMJ 1999 Mar 6;318(7184):647-649.
- Kleinberg JM. Authoritative sources in a hyperlinked environment. URL: http://simon.cs.cornell.edu/home/kleinber/auth.pdf [accessed 1999 Apr 15]
- Arunachalam S. Assuring quality and relevance of Internet information in the real world. BMJ 1998;317:1501-1502.
- Garfield E. Citation analysis as a tool in journal evaluation. Science 1972 Nov 3;178(60):471-479.
- Platform for Internet Content Selection (PICS). URL: http://www.w3.org/PICS [accessed 1999 Jun 8]
- Almind TC, Ingwersen P. Informetric analyses on the World Wide Web: methodological approaches to "Webometrics". Journal of Documentation 1997;53(4):404-426.
- Uniform requirements for manuscripts submitted to biomedical journals. JAMA 1997;277(11):927-934.
Edited by G. Eysenbach; submitted 27.04.99; peer-reviewed by R Appleyard, M Hogarth; comments to author 28.05.99; revised version received 11.06.99; accepted 05.07.99; published 11.08.99
© Angel A Hernández-Borges, Pablo Macías-Cervi, María Asunción Gaspar-Guardado, María Luisa Torres-Álvarez DeArcaya, Ana Ruiz-Rabaza, Alejandro Jiménez-Sosa. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 11.8.1999. Except where otherwise noted, articles published in the Journal of Medical Internet Research are distributed under the terms of the Creative Commons Attribution License (http://www.creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited, including full bibliographic details and the URL (see "please cite as" above), and this statement is included.