This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included.
Snowballing involves recursively pursuing relevant references cited in the retrieved literature and adding them to the search results. It is an alternative approach for discovering additional evidence that conventional searching does not retrieve. Its effectiveness makes snowballing best practice in systematic reviews, despite being time-consuming and tedious.
Our goal was to evaluate the capacity of an automatic citation snowballing method to identify and retrieve the full texts and/or abstracts of cited articles.
Using 20 review articles that contained 949 citations to journal or conference articles, we manually searched Microsoft Academic Search (MAS) and found that 78.0% (740/949) of the cited articles were indexed in the database. We compared the performance of the automatic citation snowballing method against the results of this manual search, measuring precision, recall, and F1 score.
The automatic method correctly identified 633 citations with high precision (97.7%) (as a proportion of included citations: recall=66.7%, F1 score=79.3%; as a proportion of citations indexed in MAS: recall=85.5%, F1 score=91.2%), and retrieved the full text or abstract for 490 (recall=82.9%, precision=92.1%, F1 score=87.3%) of the 633 correctly identified citations.
The proposed method for automatic citation snowballing is accurate and is capable of obtaining the full texts or abstracts for a substantial proportion of the scholarly citations in review articles. By automating the process of citation snowballing, it may be possible to reduce the time and effort of common evidence surveillance tasks such as keeping trial registries up to date and conducting systematic reviews.
Evidence retrieval tasks such as literature reviews and decision support, where recall of all relevant evidence is required, cannot rely on search technology alone due to limitations of keyword searching [
Snowballing involves recursively pursuing relevant references cited in already-retrieved literature and adding them to the search results. Thus, snowballing is not limited to citation information found in bibliographical databases. The technical challenges of snowballing include obtaining the full text of retrieved citations, recognizing citation strings in the text, and retrieving new citations from those citation strings. These challenges make snowballing both tedious and time-consuming.
Unlike keyword searching, snowballing does not require specific search terms [
We tested an approach to automatic snowballing that uses citation extraction algorithms and scientific search engines to follow the steps of snowballing: (1) extract citation strings from documents, (2) find each citation in a scientific search engine, (3) fetch the full text of the citation, and (4) repeat the process to recursively retrieve more citations. The goal of this study was to test the feasibility of automating each of these subtasks.
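The four steps above can be sketched as a breadth-first loop over a citation queue. This is an illustrative outline only, not the authors' implementation; the extraction, lookup, and fetching functions are passed in as hypothetical stand-ins for the citation parser (e.g. ParsCit) and the scholarly search engine.

```python
from collections import deque

def snowball(seed_docs, extract_citations, lookup, fetch, max_depth=2):
    """Recursive citation snowballing, breadth-first (illustrative sketch).

    extract_citations(doc) -> iterable of citation strings   (step 1)
    lookup(citation_string) -> citation id, or None          (step 2)
    fetch(citation_id) -> document text, or None             (step 3)
    Retrieved documents are re-queued up to max_depth levels (step 4).
    """
    retrieved = {}  # citation id -> document text (or None if not fetched)
    queue = deque((doc, 0) for doc in seed_docs)
    while queue:
        doc, depth = queue.popleft()
        if depth >= max_depth:
            continue
        for citation_string in extract_citations(doc):
            cid = lookup(citation_string)
            if cid is None or cid in retrieved:
                continue  # unresolvable or already seen
            text = fetch(cid)
            retrieved[cid] = text
            if text is not None:
                queue.append((text, depth + 1))  # recurse into the new document
    return retrieved
```

With a toy corpus where each "document" is just its list of references, `snowball` walks the citation graph to the requested depth while skipping duplicates.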
Starting with an initial set of one or more papers, portable document format (PDF) and hypertext markup language (HTML) documents are converted to plain text. A modified version of ParsCit [
In the evaluation, we used citations from a set of published English language reviews about neuraminidase inhibitors. The dataset consisted of 152 systematic and non-systematic review articles. We randomly selected a subset of 20 review articles that contained 1057 citations. We excluded references to websites, books, book chapters, newspaper articles, and grey literature, leaving 949 included citations. The properties of the 20 review articles are provided in
We evaluated our algorithm using the proportion of references extracted, the proportion of citations retrieved, and the proportion of abstracts and full texts downloaded. We checked the extracted citations manually against the references in each paper. We considered a reference to be correctly extracted only if the citation string contained the entire reference without loss of information. We allowed for minimal extra information, such as white space and the citation number, but not information that should have been part of another citation string, a page footer, or the manuscript text. The retrieved citations and abstracts/full texts were verified manually against the references in the reviews. Correctly retrieved articles were counted as true positives; retrieved articles that were not the ones cited were counted as false positives.
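The manual correctness criterion for a single extracted citation string can be expressed as a small check. This is a simplified sketch of the rule stated above, not the authors' actual procedure: the gold reference must survive intact, and the only extras tolerated are whitespace and a leading citation number.

```python
import re

def is_correct_extraction(extracted: str, gold: str) -> bool:
    """Sketch of the manual correctness criterion for one citation string.

    The gold reference must be present without loss of information; the
    only extras tolerated are whitespace and a leading citation number
    (e.g. "12." or "[12]"). Anything else, such as page-footer text or a
    fragment of a neighbouring reference, makes the extraction incorrect.
    """
    def normalize(s: str) -> str:
        # collapse runs of whitespace and trim the ends
        return re.sub(r"\s+", " ", s).strip()

    e, g = normalize(extracted), normalize(gold)
    # drop an optional leading citation number such as "12." or "[12]"
    e = re.sub(r"^(\[\d+\]|\d+\.)\s*", "", e)
    return e == g
```

Under this rule, an extracted string with a stray page footer appended would count as incorrect, matching the criterion that no extra manuscript text is allowed.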
We used Microsoft Academic Search (MAS) [
We manually searched for missed references to ascertain whether they were indeed indexed in MAS. Articles that were not retrieved but were found by manual search of MAS were counted as false negatives. We calculated precision, recall, and F1 score using the standard formulae:
Precision = True positives / (True positives + False positives)
Recall = True positives / (True positives + False negatives)
F1 score = (2 × Precision × Recall) / (Precision + Recall)
The precision, recall, and F1 scores were computed for retrieval of citations, abstract (only abstracts or abstracts with full texts), and full text against all citations (1057 references), included citations (949 references), and included citations indexed in MAS (740 references).
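Applying the formulae above to the citation-retrieval counts reported in this study reproduces the headline figures. Note that the false-positive count of 15 used below is inferred from the reported 97.7% precision; it is not stated explicitly in the text.

```python
def precision_recall_f1(tp: int, fp: int, fn: int):
    """Standard precision, recall, and F1 from confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Citation retrieval against the 740 MAS-indexed citations:
# 633 correctly identified (true positives), 740 - 633 missed
# (false negatives); 15 false positives is an inferred figure
# consistent with the reported 97.7% precision.
p, r, f1 = precision_recall_f1(tp=633, fp=15, fn=740 - 633)
```

Rounded to three decimals this yields precision 0.977, recall 0.855, and F1 0.912, matching the "citations in MAS" column of the results table.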
All experiments were conducted on computers with Internet protocol (IP) addresses allocated to the University of New South Wales. Journals that automatically recognize subscriptions by IP address, and to which the University of New South Wales library subscribes, therefore granted access. No other subscription activation or authentication methods were used. However, because most abstracts are freely accessible, downloading abstracts is not normally affected by journal subscriptions.
Microsoft Academic Search (MAS).
The summary of the evaluation is shown in
For the reference strings indexed in MAS, 66.2% (490/740) of abstracts were correctly downloaded either on their own or as part of the full text. These represent 51.6% of 949 included citations and 46.4% of all 1057 references included in the study.
Out of the 633 correctly identified citations, we retrieved the full text or abstract for 490 (recall=82.9%, precision=92.1%, F1 score=87.3%). We examined the specific reasons why 143 (22.6%) of the articles were not downloaded automatically. MAS had incorrect links for 39 articles (6.2%) and no link to the full text for 6 articles (0.9%); 56 links (8.8%) led to pages that use JavaScript to create the link to the full text dynamically. For the 15 citations where only the abstract was downloaded, the full text was inaccessible because of journal subscription restrictions.
Results of citations, abstract, and full text retrieval (precision, recall, and F1 score of database results fetch, and full text and abstract retrieval, comparing all reference strings, only included citations, and only included citations indexed in MAS).
 | As proportion of all citations (n=1057) | As proportion of included citations (n=949) | As proportion of citations in MAS (n=740) |
Citation retrieval | | | |
Precision | 0.977 | 0.977 | 0.977 |
Recall | 0.600 | 0.667 | 0.855 |
F1 score | 0.743 | 0.793 | 0.912 |
Abstract retrieval | | | |
Precision | 0.921 | 0.921 | 0.921 |
Recall | 0.483 | 0.540 | 0.702 |
F1 score | 0.634 | 0.681 | 0.797 |
Full-text retrieval | | | |
Precision | 0.919 | 0.919 | 0.919 |
Recall | 0.475 | 0.533 | 0.696 |
F1 score | 0.626 | 0.674 | 0.792 |
Summary of the evaluation results (from 20 reviews with 949 scholarly citations; 740 citations were indexed in MAS, 633 citations were found, and 490 full texts or abstracts were downloaded automatically).
Snowballing is tedious and resource demanding but has been shown to improve retrieval. This evaluation shows that it is feasible to perform snowballing automatically with our method, extracting citation strings and downloading the cited documents. Systems designed to perform many of the systematic review tasks are already in use, in development, or in research [
Automatic citation extraction is a difficult task [
A limitation of this study is that full-text fetching was tested using journal subscriptions that are recognized by IP address and to which the University of New South Wales library subscribes. While this means that results may vary at other institutions, they also represent an exemplar that may guide expectations. With the growth of open access and other means of obtaining full text [
In this evaluation, the algorithm was limited to MAS. This is a constraint of the testing system, not of the method. From the limited testing we have conducted, the algorithm performs equivalently on Google Scholar, but computer-access restrictions prevented a robust comparison.
Some existing databases, such as Scopus [
Snowballing is automatable and can reduce the time and effort of evidence retrieval. It is possible to reliably extract reference lists from the text of scientific papers, find the extracted citations in scientific search engines, and fetch the full text and/or abstract.
Source code is available at http://www2.chi.unsw.edu.au/~miewkeen/ESuRFr.html.
Properties of the 20 review articles included in the study.
application programming interface
digital object identifier
Internet protocol
Microsoft Academic Search
This work was supported by a National Health & Medical Research Council Centre for Research Excellence in eHealth Grant APP1032664.
None declared.