The Internet offers a large number of health-related websites, but concern has been raised about their reliability. Several subjective evaluation criteria and website rating systems have been proposed to help Internet users distinguish among web resources of differing quality, but their efficacy has not been proven.
To evaluate the agreement among the editorial boards of a subset of Internet rating systems in their evaluations of a sample of pediatric websites, and to evaluate certain website characteristics as possible quality indicators for pediatric websites.
Comparative survey of the results of systematic evaluations of the contents and formal aspects of a sample of pediatric websites, with the number of daily visits to those websites, the time since their last update, the impact factor of their authors or editors, and the number of websites linked to them.
A total of 363 websites were compiled from eight rating systems. Only 25 were indexed and evaluated by at least two rating systems. This subset included more recently updated and more frequently linked websites. There was no correlation among the rating systems' evaluations of these 25 websites. The number of inbound links to the websites correlated significantly with their updating frequency (p<.001), with the number of daily visits (p=.005), and with the results of their evaluation by the largest rating system, HealthAtoZ (p<.001). The websites' updating frequency also correlated significantly with the results of their evaluation by HealthAtoZ, both for their contents (p=.001) and for their total scores (p<.05). The number of daily visits correlated significantly (p<.05) with the results of the evaluations by Medical Matrix.
Some website characteristics, such as the number of daily visits, the updating frequency and, above all, the number of websites linking to them, correlate with their evaluation by some of the largest rating systems on the Internet. This means that certain indexes obtained from the usage analysis of pediatric websites could be used as quality indicators. In addition, citation analysis on the Web, by quantifying inbound links to medical websites, could be an objective and feasible tool for rating large numbers of websites.
After the early enthusiasm generated by the potential use of the Internet in Medicine [
Several initiatives have been proposed which could be applied at different levels to improve the average quality of medical websites. For instance, certain basic methods could be applied to ensure that websites are correctly designed. In this sense, some academic organizations have proposed a set of basic information that every medical website should provide about the author and sources of its contents, potential conflicts of interest and funding, and the currency of the information [
Internet users can find health and medical related websites in several ways. World Wide Web search engines (e.g., AltaVista, Excite, Infoseek and many others) provide the users with a list of websites that match a given topic, with the results ordered by syntactic similarity with the query [
On the other hand, certain websites indexes and review services, such as Medical Matrix (
The quality of a given medical article on the Internet could be measured by users' opinion of it, for example by counting the number of times it is retrieved [
Despite the differences between the printed medical information and the Internet, several evaluation tools from the former could be useful if applied on the "Net." Similarly to printed medical journals, medical documents on the Internet could be ranked by their citation analysis [
The ideal method for assessing the quality of medical websites should provide a means of rating large numbers of medical web resources while respecting the peculiarities of the World Wide Web, such as its multimedia capabilities and changing contents. At the same time, it should be at least as reliable as systematic reviews of those resources by editorial boards. In summary, it should be a method born on the Internet but with the efficacy of those used in the printed media.
In this study, we evaluated the reliability of four website characteristics as quality indicators for medical websites. The four characteristics were the impact factor of the websites' authors, their grade of updating, their daily visits, and their inbound links. The evaluations of a sample of pediatric websites by a number of Internet rating systems were the gold standard against which these characteristics were compared.
During March 1998 a subset of website rating systems was compiled. From these, we selected a sample of websites that were studied during the first week of April 1998.
Eight web rating systems, whose evaluations were offered as figures, were compiled from previous studies [
Only three rating systems (Medical Matrix, Physician Choice, and Six Senses) gave information about their editorial boards, most of whose members were physicians. Two of the web rating systems (Medical Matrix and Magellan) gave only a global result for each website evaluation, while the rest (HealthAtoZ, Argus Clearinghouse, Lycos Top 5%, Sympatico Health, Physician Choice, and Six Senses) gave a result for each criterion considered. Content was a criterion common to all eight rating systems. Therefore, the results of the evaluation of each website were divided into two categories, content and non-content (design) aspects. To allow comparisons, the results of the evaluations supplied by each rating system were transformed to a 100-point scale.
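The text does not state how the ratings were transformed to a common scale. As a minimal sketch, assuming a simple linear (min-max) rescaling, the conversion could look like the following. The function name `to_hundred_scale` and the example scale ranges are hypothetical and are not taken from the study.

```python
def to_hundred_scale(score, scale_min, scale_max):
    """Linearly map a score from [scale_min, scale_max] onto 0-100.

    Assumption: a linear mapping. The study does not describe the
    transformation it actually used.
    """
    if scale_max <= scale_min:
        raise ValueError("scale_max must be greater than scale_min")
    if not scale_min <= score <= scale_max:
        raise ValueError("score lies outside the declared scale")
    return 100.0 * (score - scale_min) / (scale_max - scale_min)


# Hypothetical examples: a 1-5 star rating and a 0-10 numeric rating.
print(to_hundred_scale(4, 1, 5))   # 75.0
print(to_hundred_scale(7, 0, 10))  # 70.0
```

A plain proportion (score divided by the maximum) would be an equally plausible reading; the two differ only when the lowest possible rating is not zero.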
Compiled website rating systems. The results of evaluations are shown on one of two types of scale, graphic analog (A) or numeric (N)
| Rating system | Scale |
|---|---|
| Argus Clearinghouse Seal of Approval (16/1) | A |
| HealthAtoZ (241/66) | |
| Lycos Top 5% (8/3) | N |
| Magellan Internet Guide (40/11) | A |
| Medical Matrix (75/11) | A |
| Physician's choice (4/0) | N |
| Six Senses Seal of Approval (4/0) | N |
| Sympatico Health (8/1) | A |
* Graphic analog scale converted to numeric values
When provided, the daily visits registered by the websites' visit counters were recorded. For some websites the date on which the counter was started was not available. In those cases, the webmasters were asked for this information by electronic mail, and it was included in the statistical study if provided before the end of the observation period, 15th April 1998.
The names of the websites' authors and editors were searched in 1997 MEDLINE [
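The counter start date matters because a cumulative counter only becomes a daily rate once it is divided by the number of days it has been running. A minimal sketch of that arithmetic, with a hypothetical function name and invented figures, might be:

```python
from datetime import date


def visits_per_day(counter_total, counter_start, observed_on):
    """Average daily visits from a cumulative counter.

    Assumption: the counter has run without resets from counter_start
    until observed_on.
    """
    days = (observed_on - counter_start).days
    if days <= 0:
        raise ValueError("observed_on must be after counter_start")
    return counter_total / days


# Hypothetical website: 36,500 visits counted over exactly one year.
rate = visits_per_day(36500, date(1997, 4, 7), date(1998, 4, 7))
print(round(rate, 1))  # 100.0
```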
When provided, the time since the last update was also recorded.
Finally, by means of the Web search engine Infoseek [
Comparison of means was performed by Mann-Whitney U test, and correlation analysis by means of Spearman's correlation coefficient (
After excluding 93 non-accessible websites, a total of 363 pediatric websites were compiled.
Correlations among the number of daily visits to the websites, the impact factor of their authors or editors, the grade of update, and the number of links they receive. NS means not significant
| | Inbound links | Visits/day | Author impact factor |
|---|---|---|---|
| Visits/day | .46 (p=.005) | | |
| Author impact factor | NS | NS | |
| Weeks since the last update | -.36 (p<.001) | NS | NS |
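For readers unfamiliar with rank correlation, the following is a minimal sketch of how a Spearman coefficient such as those in the table above can be computed: each variable is converted to ranks (tied values share the mean of their ranks) and a Pearson correlation is taken on the ranks. The data are invented for illustration, the function names are hypothetical, and the sketch does not compute the p value.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Invented data for five websites (not from the study).
inbound_links = [10, 200, 35, 900, 60]
daily_visits = [5, 80, 20, 300, 15]
print(round(spearman(inbound_links, daily_visits), 2))  # 0.9
```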
Correlations between the number of links and visits to the websites, the impact factor of their authors, and the time since the last update, and the results of their evaluation by HealthAtoZ and Medical Matrix. No significant correlations were found with the other systems. Medical Matrix provides only total results and does not give separate results for content and non-content aspects
| Rating system | Result | Inbound links | Visits/day | Author impact factor | Weeks since the last update |
|---|---|---|---|---|---|
| HealthAtoZ | Total | | NS | NS | |
| Medical Matrix | Total | NS | .79 (p=.03) | NS | NS |
On average, the websites of our sample received links from 470 other sites on the Internet (range, 0 to 3574). In 48% of the websites, information on their last update was given. On average, they had been updated 47.5 weeks before (range, 0 to 395). Only 10% of the websites had a visit counter, and the average daily visits were 470 (range, 1.2 to 3145). Seven visit counters did not distinguish among different visitors, that is, they registered any visit to their websites. In 137 websites (38%) the editor/author's name was given, but only 60 of them had published at least one article since January 1997 in the journals included in MEDLINE database. Their average impact factor was 2.14.
Weeks since the last update for the whole sample, n=363, and for the websites evaluated by at least two rating systems, n=25 (median, 25th and 75th percentiles)
Number of inbound links to websites for the whole sample, n=363, and for the websites evaluated by at least two rating systems, n=25 (median, 25th and 75th percentiles)
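Comparisons between two groups like those above rest on the Mann-Whitney U test named in the statistical methods. A minimal sketch of the U statistic itself, using pooled ranks with ties averaged, is shown here. The data are invented, the function names are hypothetical, and the significance level is not computed.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) for two independent samples.

    U_a counts the pairs in which a value from sample_a exceeds a value
    from sample_b, with tied pairs counted as one half.
    """
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    n_a, n_b = len(sample_a), len(sample_b)
    ranks = average_ranks(list(sample_a) + list(sample_b))
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b


# Invented weeks-since-last-update values for two groups of websites.
multi_rated = [1, 2, 3]
others = [4, 5, 6]
print(mann_whitney_u(multi_rated, others))  # (0.0, 9.0)
```

A U value of 0 for one group means every value in that group is lower than every value in the other, which is the most extreme separation possible.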
Top 50 pediatric websites of the sample (N=363) by number of inbound links. The number of daily visits to the websites, their editor/author's impact factor, and the weeks since the last update are also provided. In parentheses, the position that each website would obtain if ranked by daily visits or by impact factor. In italics, those websites indexed by at least two rating systems. Missing values (-) are due to the lack of a visit counter, editor's name, or information about the last update for many websites
| Rank | Website | Inbound links | Visits/day (rank) | Impact factor (rank) | Weeks since last update |
|---|---|---|---|---|---|
| 1 | | 3574 | - | - | 13 |
| 2 | | 2355 | 1620 (3) | 0 (360) | - |
| 3 | | 1109 | - | - | - |
| 4 | | 927 | - | - | 3 |
| 5 | | 896 | - | - | 1 |
| 6 | | 785 | - | - | 4 |
| 7 | | 767 | - | - | - |
| 8 | | 714 | - | - | - |
| 9 | | 677 | - | - | - |
| 10 | | 612 | - | - | 4 |
| 11 | | 572 | - | 0 (360) | 1 |
| 12 | | 534 | - | - | - |
| 13 | | 502 | - | 0 (360) | - |
| 14 | | 487 | - | 10.1 (8) | 9 |
| 15 | | 428 | - | - | - |
| 16 | | 423 | 1412 (5) | - | - |
| 17 | | 365 | 940 (6) | - | 1 |
| 18 | | 365 | - | - | 1 |
| 19 | | 357 | - | - | - |
| 20 | | 330 | - | - | - |
| 21 | | 322 | 253 (10) | 9.3 (13) | 2 |
| 22 | | 317 | - | - | - |
| 23 | | 312 | - | 0 (360) | 52 |
| 24 | | 297 | - | 0 (360) | - |
| 25 | | 297 | 94 (20) | - | - |
| 26 | | 287 | - | - | 6 |
| 27 | | 284 | - | - | - |
| 28 | | 255 | - | - | - |
| 29 | | 254 | - | 0 (360) | 1 |
| 30 | | 251 | - | - | - |
| 31 | | 249 | 70 (25) | - | 8 |
| 32 | | 238 | - | 0 (360) | 12 |
| 33 | | 235 | - | 0 (360) | 2 |
| 34 | | 232 | 117 (15) | - | 1 |
| 35 | | 225 | - | 0.4 (51) | 3 |
| 36 | | 220 | - | - | 3 |
| 37 | | 214 | 3145 (1) | 0.3 (54) | 1 |
| 38 | | 212 | - | - | - |
| 39 | | 208 | 3145 (1) | 0.3 (55) | 1 |
| 40 | | 205 | - | - | - |
| 41 | | 204 | - | - | 13 |
| 42 | | 197 | - | - | 7 |
| 43 | | 197 | 81 (23) | 11.7 (6) | - |
| 44 | | 188 | 2441 (2) | 0 (360) | - |
| 45 | | 179 | 145 (13) | 1.0 (39) | 2 |
| 46 | | 179 | - | - | - |
| 47 | | 162 | - | - | - |
| 48 | | 150 | - | - | - |
| 49 | | 144 | 68 (26) | 0.2 (56) | 1 |
| 50 | | 141 | - | - | 1 |
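A listing of this kind, sorted by inbound links with a secondary position shown in parentheses, can be produced with a few lines of code. The sketch below is only an illustration: the site names and figures are invented, the helper `rank_positions` is hypothetical, and websites with a missing value are simply left out of the secondary ranking.

```python
# Invented records; None marks a missing visit counter.
sites = [
    {"name": "site-a", "links": 120, "visits": 40.0},
    {"name": "site-b", "links": 950, "visits": None},
    {"name": "site-c", "links": 300, "visits": 310.5},
    {"name": "site-d", "links": 300, "visits": 12.0},
]


def rank_positions(records, key):
    """Map each name to its 1-based position when sorted by key, descending.

    Records whose value for key is None receive no position.
    """
    present = [r for r in records if r[key] is not None]
    present.sort(key=lambda r: r[key], reverse=True)
    return {r["name"]: pos for pos, r in enumerate(present, start=1)}


visit_rank = rank_positions(sites, "visits")

# Primary ordering by inbound links; sorted() is stable, so sites with
# equal link counts keep their original relative order.
by_links = sorted(sites, key=lambda r: r["links"], reverse=True)

for pos, r in enumerate(by_links, start=1):
    if r["visits"] is None:
        visits = "-"
    else:
        visits = f'{r["visits"]} ({visit_rank[r["name"]]})'
    print(pos, r["name"], r["links"], visits)
```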
Only 25 websites of the sample were indexed and evaluated by at least two rating systems, and none by all eight. This subset of websites showed significantly better results in the evaluation of their contents and design by HealthAtoZ, and a higher grade of updating (
Some interesting correlations between the results of the evaluations of the websites and the rest of study variables were found. The number of links received by the websites significantly correlated with their daily visits and with the time since the last update (
The number of daily visits significantly correlated with the results of the websites evaluation made by Medical Matrix, and the grade of updating significantly correlated with the results of the contents and designs evaluation made by HealthAtoZ (
Finally, no correlation was demonstrated between the average impact factor of the websites' authors and the other variables.
The top fifty pediatric websites of the sample are shown in
In this study, certain websites characteristics that depend on the users' preferences have been compared with evaluations of pediatric resources on the Web by third parties. Although rating systems have been previously criticized because their editorial boards frequently do not employ uniform criteria [
Some aspects of our method are open to discussion. Firstly, the reliability of the data on daily visits and updating frequency depends on the accuracy of the information that the websites' editors offer on their sites. In this sense, we judged the grade of updating of the websites by the dates of their last changes. Clearly these changes could involve very different aspects, to very different degrees, and do not necessarily provide more current contents. However, we believe that they can demonstrate the editors' efforts to maintain or increase the interest of their websites for visitors.
The results regarding the number of daily visits to the websites must be considered with caution when comparing one web site to another, because some visit counters were set to register every visit, instead of every distinct visitor. Nevertheless, both can be considered usage indexes of a given web site.
On the other hand, quantification of links to the websites clearly depends on the power of the search engine we employ. By no means our results show the
Although all bibliometric indexes have limitations [
We could not find statistically significant correlations among the evaluations of the websites by the different rating systems. This is probably due to the small size of the subset of websites indexed and evaluated by at least two systems, and to their different evaluation criteria. However, some interesting data were found when we considered the correlations between the four website characteristics and the evaluations. We found that the best websites for HealthAtoZ, the largest rating system analyzed, were the most recently updated and the most linked ones. On the other hand, the most valuable websites for Medical Matrix, the second largest rating system, were the most visited ones. In any case, both the number of daily visits and the time since the last update correlated significantly with the number of inbound links. The lack of correlation between the four variables and the evaluations by the other rating systems could be due to their small contribution to our sample.
Many efforts to establish quality criteria will have limited efficacy due to the dynamic behaviour of the Internet as a publishing medium. In fact, a recent article demonstrates the lack of consensus among the editorial boards of a large sample of evaluation and rating systems regarding the evaluation criteria they employ. The same authors pointed out that "... it may be difficult or even inappropriate to develop a static tool or system for assessing health related websites." [
Eysenbach and Diepgen [
LaPorte et al [
The citation analysis of biomedical journals has been a classic tool in assessing their relative quality [
An evaluation system based on this quantification would bring advantages and risks. Rankings could be generated very quickly and in an objective way, because the Internet community by itself would evaluate great amounts of medical websites. However, this evaluation process would be made
Our work demonstrates that the visitors of pediatric websites and the editors of websites on the "Net," the so-called webmasters, show a certain maturity in identifying high-quality pediatric resources. We believe that the key point is how to increase the proportion of these resources. An important step could be to establish a citation style not only for articles from peer reviewed electronic journals [
In summary, although the Internet provides a very different publishing medium, traditional means borrowed from printed journals could also be used with this electronic medium to achieve minimal levels of quality. These include certain peer review processes, which enhance the rigor of the documents submitted for publication while taking into account the peculiarities of this medium, and link analysis as a measure of citation on the World Wide Web.
The authors are sincerely grateful to the members of the Research Unit of the Hospital Universitario de Canarias, and especially to Dr Armando Torres and Dr Eduardo Salido for their help and support in the editing and translation of the manuscript. We are also especially grateful to Mr. Trevor P. Doble for his help in the translation of this manuscript.
None declared.