Building a Transnational Biosurveillance Network Using Semantic Web Technologies: Requirements, Design, and Preliminary Evaluation
Douglas Teodoro1,2, BEng; Emilie Pasche1,2, MSc; Julien Gobeill2,3, MSc; Stéphane Emonet1,2, MD; Patrick Ruch3, PhD; Christian Lovis1,2, MPH, MD
1University Hospitals of Geneva, Geneva, Switzerland
2University of Geneva, Geneva, Switzerland
3University of Applied Sciences Western Switzerland, Geneva, Switzerland
University Hospitals of Geneva
Rue Gabrielle-Perret-Gentil 4
Phone: 41 (0)22 372 6203
Fax: 41 (0)22 372 6255
Background: Antimicrobial resistance has reached globally alarming levels and is becoming a major public health threat. Lack of efficacious antimicrobial resistance surveillance systems was identified as one of the causes of increasing resistance, due to the lag time between new resistances and alerts to care providers. Several initiatives to track drug resistance evolution have been developed. However, no effective real-time and source-independent antimicrobial resistance monitoring system is available publicly.
Objective: To design and implement an architecture that can provide real-time and source-independent antimicrobial resistance monitoring to support transnational resistance surveillance. In particular, we investigated the use of a Semantic Web-based model to foster integration and interoperability of interinstitutional and cross-border microbiology laboratory databases.
Methods: Following the agile software development methodology, we derived the main requirements needed for effective antimicrobial resistance monitoring, from which we proposed a decentralized monitoring architecture based on the Semantic Web stack. The architecture uses an ontology-driven approach to promote the integration of a network of sentinel hospitals or laboratories. Local databases are wrapped into semantic data repositories that automatically expose local computing-formalized laboratory information in the Web. A central source mediator, based on local reasoning, coordinates the access to the semantic end points. On the user side, a user-friendly Web interface provides access and graphical visualization to the integrated views.
Results: We designed and implemented the online Antimicrobial Resistance Trend Monitoring System (ARTEMIS) in a pilot network of seven European health care institutions sharing 70+ million triples of information about drug resistance and consumption. Evaluation of the computing performance of the mediator demonstrated that, on average, query response time was a few seconds (mean 4.3, SD 0.1×10^2 seconds). Clinical pertinence assessment showed that resistance trends automatically calculated by ARTEMIS had a strong positive correlation with the European Antimicrobial Resistance Surveillance Network (EARS-Net) (ρ = .86, P < .001) and the Sentinel Surveillance of Antibiotic Resistance in Switzerland (SEARCH) (ρ = .84, P < .001) systems. Furthermore, mean resistance rates extracted by ARTEMIS were not significantly different from those of either EARS-Net (∆ = ±0.130; 95% confidence interval 0 to 0.030; P < .001) or SEARCH (∆ = ±0.042; 95% confidence interval –0.004 to 0.028; P = .004).
Conclusions: We introduce a distributed monitoring architecture that can be used to build transnational antimicrobial resistance surveillance networks. Results indicated that the Semantic Web-based approach provided an efficient and reliable solution for development of eHealth architectures that enable online antimicrobial resistance monitoring from heterogeneous data sources. In future, we expect that more health care institutions can join the ARTEMIS network so that it can provide a large European and wider biosurveillance network that can be used to detect emerging bacterial resistance in a multinational context and support public health actions.
(J Med Internet Res 2012;14(3):e73)
Antimicrobial drug resistance; heterogeneous databases; online information services; surveillance
Since their discovery, antibiotics have proved powerful for the control of bacterial infections. However, because of multifactorial causes, especially the widespread use of antibiotics in medicine, animal husbandry, and agriculture, pathogens have developed increasing resistance to many effective drugs [1,2]. The problem of antimicrobial resistance has reached an alarming level, and urgent efforts are needed to avoid regressing to the preantibiotic era [3,4].
In addition to well-known drug resistance cases such as Pneumococcus species to penicillin [5-7], outbreaks of new resistant pathogens have become ever more common and have caused many deaths worldwide. Aware of the risks that antimicrobial resistance poses to global public health, the World Health Organization (WHO), among other measures, chose combating antimicrobial resistance as the theme of World Health Day 2011. Lack of effective monitoring systems was identified as an underlying cause of resistance increase, and its improvement is one of the policies the WHO adopted to tackle the problem.
Over a decade ago, Monnet et al described and compared the most relevant antimicrobial resistance surveillance systems in Europe. Since then, no new public transnational surveillance initiatives have been developed. Consequently, most projects in use are based either on reporting and manual data acquisition or on outdated information technologies, especially concerning data integration and semantics. Furthermore, no cross-country surveillance system that provides online, direct, and real-time access to antimicrobial resistance information is available. All the systems implemented so far are dependent on delayed data warehouses, usually compiled yearly, which, among other weaknesses, fail to capture antimicrobial resistance outbreaks [10,11]. Finally, these systems do not provide easy ways to export data. Participating institutes have to comply with the surveillance system standards, a labor-intensive task, especially for newcomer institutions or newly discovered resistant pathogens.
The primary aim of this study was to develop a framework for transnational antimicrobial resistance monitoring, featuring real-time access to laboratory information and being generic with respect to data sources, in order to support multinational resistance surveillance. The secondary aim of the study was to investigate the use of Semantic Web-based architecture in the integration and interoperability of interinstitutional and cross-border databases to support such a framework. To fulfill these aims, we designed the Antimicrobial Resistance Trend Monitoring System (ARTEMIS). ARTEMIS architecture illustrates how Semantic Web technologies can support online monitoring of antimicrobial resistance trends in heterogeneous networks of health care institutions. It demonstrates how semantically interoperable end points can provide on-demand information on resistance evolution. Furthermore, it describes ways to automate the monitoring process through a state-of-the-art clinical data integration system, which provides mechanisms to adapt to existing electronic health records and laboratory information systems. The architecture is validated according to performance and clinical pertinence.
This paper addresses a large audience, from engineers who have Semantic Web techniques in mind to public health authorities, by showing the results of applying Semantic Web technologies to one of the most crucial current public health challenges: building a global surveillance system for antimicrobial resistances. Here we discuss the technical framework of the project, a technical evaluation, and the quality of the system compared with existing surveillance networks.
Previous European Antimicrobial Resistance Monitoring and Surveillance Initiatives
Several projects have been implemented to provide monitoring and surveillance of antimicrobial resistance evolution in a European context. WHONET was one of the first initiatives to standardize and aggregate results from laboratories in a cross-country environment. Since 1995, the WHO has been developing the WHONET software, in which participating microbiology laboratories present their tests using a specific susceptibility testing terminology defined by the WHO.
The most successful European surveillance project is the European Antimicrobial Resistance Surveillance System developed by the European Centre for Disease Prevention and Control. According to the agency, 900 public health laboratories serving over 1400 hospitals in Europe participate in the network, providing results on a yearly basis. To improve data quality, external control is applied to the susceptibility testing methods used by the participating laboratories. The project has recently evolved into the European Antimicrobial Resistance Surveillance Network (EARS-Net) and will serve as a reference to assess the sampling effectiveness of ARTEMIS.
A few other public initiatives were introduced in parallel. In 1998, the European Society of Biomodulation and Chemotherapy created the European Surveillance of Antibiotic Resistance project. The goal was to establish a representative network of sentinel diagnostic laboratories across Europe to provide antimicrobial resistance monitoring and early detection of new resistant pathogens. In the same year, the US Centers for Disease Control and Prevention launched the International Network for the Study and Prevention of Emerging Antimicrobial Resistance, in which 79% of the 40 participating countries were European. The main objective of the project was to serve as an early warning system for emerging resistant pathogens. Finally, in 1999, the Antimicrobial Resistance Information Bank was derived from the WHONET informal network. Results were reported to the WHO, and an additional external audit quality control was performed on the data. All of these projects have been discontinued, and some were characterized more as a survey than as a surveillance system.
In contrast to the previous initiatives, The Surveillance Network is a corporate-funded surveillance project. It started in 1992 in the United States and later enrolled European laboratories as well. The data extraction and aggregation processes are done by Focus Technologies Inc. (Herndon, VA, USA), the company responsible for the project. Unfortunately, despite having probably the biggest antimicrobial resistance database worldwide, this network provides no antimicrobial resistance information free to the public.
The DebugIT Project
ARTEMIS was developed as part of the Detecting and Eliminating Bacteria Using Information Technology (DebugIT) project, which is funded by the European Union Seventh Framework Programme. DebugIT is a consortium composed of 14 industrial, research, and clinical institutions from nine countries that are collaborating to build a framework for sharing antimicrobial resistance information from clinical information systems in a Europe-wide context. The project aims to reuse existing clinical data for generating new knowledge to be incorporated in decision support and monitoring engines at the point of care and for developing prevention strategies at policy levels.
The DebugIT architecture (Figure 1) is based on distributed services that exchange information using Semantic Web technologies. The Semantic Web stack provides methods that can contribute to solving technical, syntactic, and semantic differences between disparate data sources [21-24], bringing formal and meaningful representation to data models and sources. First, it presents a standard format to encode information called Resource Description Framework (RDF), which models Web resources in a subject–predicate–object form, a so-called triple. This generic model, in contrast to the entity-relationship model used in traditional databases, facilitates the representation of clinical facts to an unconstrained dimension. Second, it has defined the Simple Protocol and RDF Query Language (SPARQL) standard, which provides a uniform way to access resources available anywhere on the Web. Finally, computer-interpretable ontologies written in the Web Ontology Language bring formal conceptualization to RDF resources, improving the quality of data and fostering interoperability between heterogeneous systems.
|Figure 1. Architecture of the Detecting and Eliminating Bacteria Using Information Technology (DebugIT) framework. Components of the architecture, such as the clinical data repository (CDR), knowledge repository (KR), decision support system (DSS), and monitoring system (MS), are interconnected using the HTTP/SPARQL protocol through the Internet bus. Messages are transferred in the RDF format, and ontologies formalize the data model and content.|
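To make the subject–predicate–object model concrete, the following is a minimal sketch of a triple store with wildcard pattern matching, which is the basic operation a SPARQL query performs. It is only an illustration: the identifiers (`test:1`, `hasResult`, and so on) are hypothetical and are not taken from the DebugIT ontologies, and a real deployment would use full IRIs and an RDF store rather than a Python list.

```python
from typing import Iterable, Iterator, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical microbiology facts; names are illustrative, not DebugIT terms.
TRIPLES = [
    ("test:1", "hasBacteria", "Escherichia coli"),
    ("test:1", "hasAntibiotic", "ciprofloxacin"),
    ("test:1", "hasResult", "resistant"),
    ("test:2", "hasBacteria", "Escherichia coli"),
    ("test:2", "hasAntibiotic", "ciprofloxacin"),
    ("test:2", "hasResult", "susceptible"),
]


def match(
    triples: Iterable[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield every triple that fits the pattern; None acts as a wildcard."""
    for triple in triples:
        if (
            (s is None or triple[0] == s)
            and (p is None or triple[1] == p)
            and (o is None or triple[2] == o)
        ):
            yield triple


# Which tests reported a resistant result?
resistant_tests = [s for s, _, _ in match(TRIPLES, p="hasResult", o="resistant")]
```

Because every fact has the same three-part shape, adding a new kind of fact requires no schema change, only new triples.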
Experts from the DebugIT project with different backgrounds, including infectiologists, epidemiologists, computer scientists, knowledge engineers, and eHealth service providers, were involved in the design of ARTEMIS. Over the course of 2 years, we held weekly meetings with these experts to discuss the status of the tasks involved in the system development. In the process, we reviewed the existing distributed integration and interoperable eHealth systems and European antimicrobial resistance monitoring programs. Thereafter, we elaborated the requirements and designed the system model.
To provide a monitoring system that can be effectively used in the fight against antimicrobial resistance, we derived the following six main requirements based on the published literature and on the expertise of the DebugIT consortium.
The System Shall Provide Online Information
All public European supranational monitoring systems provide resistance information in batch mode—that is, data are collected into batches of laboratory tests and processed periodically, usually on a yearly frequency. While online resistance information is useful on a daily basis at local levels, recent infectious pandemic threats have shown how important this information would be at a multinational level for decision makers. Thus, changing this paradigm to online trends is crucial for antimicrobial resistance surveillance, especially for early warning of emerging resistance trends [10,11].
The System Shall Provide Aggregated Information From Numerous National Sources
Increasing antibiotic resistance is a worldwide public health concern, and for its effective combat, a successful surveillance system has to offer multinational resistance information.
The System Shall Not Store Data Centrally
Sharing biomedical data raises several ethical concerns. To comply with international standards on sharing biomedical information, increase the trust of data providers, and encourage collaboration in the surveillance network, central aggregation must be avoided.
The System Shall Implement a Formal and Semantic-Aware Data Model
Most of the available systems do not use formalized biomedical data models, nor computable terminologies and ontologies. As a result, the process of extracting resistance information and data analysis in a heterogeneous environment is done manually or semiautomatically. In addition to the overhead work, the lack of formal conceptualization of the raw laboratory data can have a negative influence on the quality of the data.
The System Shall be High Performing
To be operatively used by health care professionals, whose working environment is recognized to be very time constrained, eHealth systems must provide a fast response time.
The System Shall Provide Reliable Results
Automatic extraction of antimicrobial resistance trends from heterogeneous data sources poses several challenges to accurate data analysis, including concept ambiguity and the common denominator, which can degrade the quality of the examination. However, especially if the system is used by clinicians at the point of care, the accuracy of the results must be equivalent to those obtained by semiautomatic processes, where data cleansing and audit are performed prior to integration and interpretation.
To fulfill the ARTEMIS desiderata, we envisaged the system according to the Semantic Web-complying architecture presented in Figure 2. The system’s semantic interoperability schema is based on an ontology-driven data integration approach, where multiple semantically flat local data definition ontologies are mapped to a common domain ontology, the DebugIT Core Ontology. Semantic mappings at local and global levels align concepts from the local ontologies with the domain knowledge.
In the architecture’s data model layer, local laboratory databases are connected online to semantic-aware end points, the local clinical data repositories (lCDRs) [34,35]. The lCDRs formalize the local sources and provide a query interface to the controller layer. The semantic mediator, implemented at the controller layer, represents antimicrobial resistance clinical questions as query templates for each end point and coordinates the access to the different sites. It performs the query’s data aggregation operations locally to improve query performance and the site’s data integration on the fly to avoid central storage. Finally, in the view layer, query templates with parameters extracted from the domain ontology are used to represent antimicrobial resistance clinical questions. As a proof of concept, three initial query templates were proposed by clinicians to be implemented in the system. (1) What is the evolution of resistance to :antibiotic of :bacteria cultured from :sample extracted from :gender patients at :clinical_setting during period :begin_date - :end_date? (2) What is the prevalence of :antibiotic :susceptibility :bacteria in :sample extracted from :gender patients at :clinical_setting during period :begin_date - :end_date? (3) What is the rate of :gender patients that get :antibiotic to treat :bacteria infection found in :sample at :clinical_setting during period :date_begin - :date_end?
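The parameterized query templates above can be illustrated with a small substitution routine. This is a sketch under assumptions: the function name `fill_template` and the example parameter values are hypothetical, and the actual system presumably binds parameters to ontology concepts in SPARQL rather than to plain strings.

```python
import re

# Query template 1, copied from the text; ":name" marks a parameter.
TEMPLATE_1 = (
    "What is the evolution of resistance to :antibiotic of :bacteria "
    "cultured from :sample extracted from :gender patients at "
    ":clinical_setting during period :begin_date - :end_date?"
)


def fill_template(template: str, params: dict) -> str:
    """Replace each ':name' placeholder with params['name'].

    Raises KeyError if the template needs a parameter that was not supplied.
    """

    def substitute(found: re.Match) -> str:
        name = found.group(1)
        if name not in params:
            raise KeyError(f"missing template parameter: {name}")
        return str(params[name])

    return re.sub(r":(\w+)", substitute, template)


# Hypothetical parameter values chosen for illustration only.
question = fill_template(
    TEMPLATE_1,
    {
        "antibiotic": "ciprofloxacin",
        "bacteria": "Escherichia coli",
        "sample": "blood",
        "gender": "female",
        "clinical_setting": "intensive care",
        "begin_date": "2008-01-01",
        "end_date": "2008-12-31",
    },
)
```

Failing loudly on a missing parameter keeps an incomplete question from being sent to the end points.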
A more detailed description of the system model is given in Multimedia Appendix 1.
|Figure 2. Antimicrobial Resistance Trend Monitoring System (ARTEMIS) architecture. (a) Ontology components. Models: data definition ontology (DDO), DebugIT Core Ontology (DCO), and interface ontology (IO). Mappings: local-terminology-to-DCO (LT2DCO) and global-terminology-to-DCO (T2DCO). (b) Run-time business components. (1) Data layer components are deployed within the demilitarized zone of the health care institution. (2) Controller and view layers contain central services, which are deployed in the Internet. lCDR = local clinical data repository.|
To assess ARTEMIS, we connected a network of seven data providers: National Heart Hospital, Sofia, Bulgaria; Les Hôpitaux Universitaires de Genève, Geneva, Switzerland; Georges Pompidou European Hospital, Paris, France; Internetový Pristup Ke Zdravotním Informacím Pacienta, Prague, Czech Republic; Swedish Intensive Care Registry, Sweden; Athens Chest Hospital “Sotiria”, Athens, Greece; and Universitätsklinikum Freiburg, Freiburg, Germany. Table 1 summarizes antimicrobial resistance-related data shared by these institutions.
We obtained permission to use de-identified data from the ethics committees of the respective participant hospitals. Privacy-sensitive information accessible through the local end points was pseudonymized to conform to the European legal and ethical patient data-sharing framework. Data values such as date of birth were truncated to the year, and concepts such as episode of care (or encounter) and patient identifiers were encrypted. Furthermore, query templates are pathogen and population centric; that is, the information collected concerns the resistance and treatment of a pathogen population for a given antibiotic in a set of microbiology results. It is therefore not related to a specific patient.
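The two de-identification steps described above (truncating dates to the year and obscuring identifiers) might look like the following sketch. It rests on assumptions: the field names are hypothetical, and a keyed hash (HMAC-SHA256) stands in for the encryption step, whose actual algorithm is not specified in the text.

```python
import hashlib
import hmac

# Hypothetical site-specific secret; in practice it would never leave the site.
SITE_SECRET = b"replace-with-a-site-specific-secret"


def truncate_birth_date(iso_date: str) -> str:
    """Keep only the year of an ISO 8601 date such as '1957-03-14'."""
    return iso_date[:4]


def pseudonymize_id(identifier: str, secret: bytes = SITE_SECRET) -> str:
    """Map an identifier to a stable pseudonym using a keyed hash.

    Assumption: a keyed hash is used here in place of the unspecified
    encryption scheme mentioned in the text.
    """
    return hmac.new(secret, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def pseudonymize_record(record: dict) -> dict:
    """Return a copy of the record with privacy-sensitive fields transformed."""
    out = dict(record)
    out["birth_year"] = truncate_birth_date(out.pop("birth_date"))
    for field in ("patient_id", "episode_id"):
        out[field] = pseudonymize_id(out[field])
    return out
```

The same input always yields the same pseudonym, so results belonging to one patient or episode can still be grouped without exposing the original identifier.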
|Table 1. Data used in the Antimicrobial Resistance Trend Monitoring System (ARTEMIS).|
The implementation of the functional features defined in the first four design requirements is described at the technical component level using design pattern [37-39] examples. In contrast, for the last two requirements, which can be quantitatively measured, results are presented using efficiency and effectiveness metrics.
Methods for Data Acquisition and Analysis
We measured efficiency using the mediator’s query retrieval time for the three aforementioned query templates. Combinations of pathogens, antibiotics, and sample types were applied to vary the queries and thus avoid database caching effects. Results of the local aggregation mode applied in the query mediator were compared with a central aggregation strategy (baseline).
To assess effectiveness, resistance trends extracted using query template 1 were compared with data from two publicly available surveillance systems: EARS-Net and the Sentinel Surveillance of Antibiotic Resistance in Switzerland (SEARCH). We extracted yearly resistance trends for seven key pathogenic bacteria—Enterococcus faecalis, Enterococcus faecium, Escherichia coli, Klebsiella pneumoniae, Pseudomonas aeruginosa, Staphylococcus aureus, and Streptococcus pneumoniae—based on their presence in the three systems. Antibiotics were selected if they were present on both ARTEMIS and the reference system. Resistance rates of the last 4 years (2006 to 2009) available in EARS-Net were used, whereas all years (2008 to 2010) available in SEARCH were taken into account. ARTEMIS data sources that did not contain either more than 1 million triples or data elements to answer the queries were excluded from the analysis, resulting in four sites: Georges Pompidou European Hospital, Les Hôpitaux Universitaires de Genève, Swedish Intensive Care Registry, and Universitätsklinikum Freiburg.
We compared results from Georges Pompidou European Hospital, Swedish Intensive Care Registry, and Universitätsklinikum Freiburg with the resistance rates of their respective EARS-Net countries (France, Sweden, and Germany) and results from Les Hôpitaux Universitaires de Genève with SEARCH. We report correlation and equivalence results using the Spearman rank correlation and the two one-sided convolution [40,41] tests, respectively (see Multimedia Appendix 2).
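For readers unfamiliar with the Spearman rank correlation used here, a minimal pure-Python sketch follows. It computes the Pearson correlation of the ranks, with tied values sharing their average rank. The function names are illustrative, and this is not the analysis code used in the study.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; undefined (division by zero) for constant input."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation between two equally long samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length with at least 2 points")
    return pearson(average_ranks(x), average_ranks(y))
```

Because only ranks matter, the coefficient measures whether two systems order the resistance rates the same way, not whether the rates are numerically equal, which is why a separate equivalence test is also reported.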
ARTEMIS was implemented and deployed in a pilot network of seven European health care institutions sharing 70+ million triples of antimicrobial resistance information. As Figure 3 shows, near real-time resistance trends can be extracted from the distributed network using the system’s Web interface. The tool can be accessed at http://babar.unige.ch:8080/artemis.
|Figure 3. Antimicrobial Resistance Trend Monitoring System (ARTEMIS) interface. The menu on the left displays the interface ontology concepts, which are used to fill in the template parameters. Each of the view tabs represents a different query template. The data visualization interface displays several graphical representations to provide a comprehensive view of the data.|
In this section, we present design patterns describing the main functional features of the online distributed monitoring system.
Online Information Provider
The system shall provide online information.
In the architecture presented in Figure 2, local semantic-aware end points, realized by RDF stores, are plugged into the laboratory databases. Thus, microbiology tests are accessible as soon as they are available in the production databases. These end points are formalized by local ontologies and exposed to the Web so that data are reachable by other parts of the system. In cases where local laboratory databases communicate in the SPARQL protocol, they can be directly connected to the network.
In ARTEMIS, the technical interoperability with the different data sources is provided by D2R engines complemented by site-specific extract, transform, and load processes (Figure 4, part a), which can exploit autocoding methods [43,44]. Alternatively (Figure 4, part b), for cases where there is an accessible production laboratory database, D2R can be plugged directly into the existing system to transform the local data source into a semantic end point (lCDR).
|Figure 4. Local clinical data repository (lCDR) deployment and population model. (a) Production data are extracted daily to a local mirror database, which is “sparqlized” by an SQL-to-RDF engine. (b) RDF view is created directly on top of the legacy system. Data are anonymized on the fly.|
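The idea of exposing relational rows as RDF can be sketched as follows. This is a conceptual illustration only: it does not use D2R or its mapping language, and the table name, column names, and identifier scheme are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]


def row_to_triples(
    table: str, primary_key: str, row: Dict[str, Optional[object]]
) -> List[Triple]:
    """Turn one relational row into subject-predicate-object triples.

    The subject is built from the table name and primary key value; each
    remaining non-null column becomes one predicate/object pair.
    """
    subject = f"{table}/{row[primary_key]}"
    return [
        (subject, f"{table}#{column}", str(value))
        for column, value in row.items()
        if column != primary_key and value is not None
    ]


# Hypothetical laboratory row; NULL columns (None) produce no triple.
lab_row = {
    "id": 42,
    "bacteria": "Escherichia coli",
    "antibiotic": "ciprofloxacin",
    "result": "R",
    "comment": None,
}
triples = row_to_triples("culture", "id", lab_row)
```

A production mapping would additionally assign ontology classes and typed literals, which is the role of the data definition ontologies.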
The system shall provide aggregated information from numerous international sources.
The technical and semantic heterogeneity within models and concepts from different clinical data sources poses an important barrier for data aggregation and analysis. ARTEMIS architecture relies on a layer of semantically formalized end points, the lCDRs, to solve part of the integration problem. These end points provide a first level of interoperability, modeling the local systems and the data content and providing a common protocol to access data, the SPARQL protocol. The semantic mediator designed in the controller layer builds on top of the lCDR layer and allows the creation of homogeneous aggregated views over the distributed data sources. Thus, the system becomes a grid of semantic-aware sentinels that provide antimicrobial resistance information from heterogeneous supranational data sources.
In ARTEMIS, the lCDRs are provided by RDF-like stores to create the first semantic layer on top of the local databases. The data definition ontologies formalize the local end points and expose linkable data on the Web. The Jena Framework is used for querying the remote lCDRs and for reasoning over the RDF models.
The system shall not store data centrally.
ARTEMIS changes the centralized integration paradigm used in antimicrobial resistance surveillance. Unlike other systems [9-11], its distributed architecture does not require centralization of microbiology test results. At query time, a global aggregated view on the local end points is created by the semantic mediator, solving the problem of interoperability while avoiding a central repository, which would violate the project’s legal requirements. Additionally, since there is no need to move data across the health care border, this design gives full control to participating sites, allowing them to stop sharing data at any moment. Further, no historical information for the respective site is kept on the system.
In the model–view–controller pattern presented in Figure 2, persistent data stores are deployed only within the demilitarized zone of the data providers. The central mediator processes and aggregates query constraints locally. In this configuration, there is no need to move datasets with information at the patient level out of the institutional borders. Only aggregated population data are retrieved at query time. Furthermore, institutions can stop sharing data at any moment by shutting down the lCDR server. This change is automatically reflected in ARTEMIS, which will not be able to retrieve any data from the respective data source; other sources remain seamlessly reachable.
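The behavior described above (only aggregated counts leave each site, and an unreachable site is simply skipped) can be sketched as follows. The sketch makes several assumptions: each end point is modeled as a plain function returning per-year counts, the site names and numbers are made up, and the pooled rate is computed by summing counts, which may differ from how ARTEMIS combines results.

```python
from typing import Callable, Dict, Tuple

# year -> (resistant isolates, tested isolates); no patient-level rows.
Counts = Dict[int, Tuple[int, int]]


def site_a() -> Counts:
    return {2008: (12, 100), 2009: (15, 120)}


def site_b() -> Counts:
    return {2008: (8, 50)}


def site_down() -> Counts:
    # Simulates a provider that has shut down its lCDR server.
    raise ConnectionError("end point unreachable")


def mediate(endpoints: Dict[str, Callable[[], Counts]]) -> Dict[int, float]:
    """Merge locally aggregated counts into pooled yearly resistance rates."""
    totals: Dict[int, Tuple[int, int]] = {}
    for name, query in endpoints.items():
        try:
            counts = query()
        except ConnectionError:
            continue  # other sources remain reachable
        for year, (resistant, tested) in counts.items():
            prev_resistant, prev_tested = totals.get(year, (0, 0))
            totals[year] = (prev_resistant + resistant, prev_tested + tested)
    return {
        year: resistant / tested
        for year, (resistant, tested) in sorted(totals.items())
        if tested > 0
    }


rates = mediate({"a": site_a, "b": site_b, "down": site_down})
```

Nothing is cached between calls, so a site that stops responding disappears from the next result without any cleanup step.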
The system shall implement a formal and semantic-aware data model.
In a multinational environment, the contents of electronic health records and laboratory information systems are expressed in several languages and different terminologies. Additionally, spelling mistakes and abbreviations are common in concept definitions. These ambiguities reduce the quality of the statistical analysis. To have unified semantics across the different data sources, in ARTEMIS’s knowledge model (Figure 5), concepts are represented using a formal language (RDF/OWL). Further, they are aligned into common syntaxes defined by biomedical terminologies. Finally, to have a common meaning across the whole system, these formally represented terminologies are mapped to a shared domain ontology.
|Figure 5. The hybrid ontology-driven interoperability mapping model. White elements represent local-level concepts and blue elements represent shared knowledge. (a) Local entity-relationship schemata are formalized by the data definition ontologies (DDOs). Mappings between DDO data elements and DebugIT Core Ontology (DCO) link local concepts to the global knowledge. (b) Example of a semantic mapping: concept map diagram (left) and RDF/Notation3 representation (right).|
In ARTEMIS, standard terminologies such as the Systematized Nomenclature of Medicine-Clinical Terms (SNOMED-CT), the WHO’s Anatomical Therapeutic Chemical (WHO-ATC) classification system, and the Universal Protein Resource (UniProt/NEWT) are mapped to DebugIT Core Ontology using the Simple Knowledge Organization System (SKOS) ontology and Notation3 rules (Figure 5b). If local concepts are not already defined using these terminologies, they are normalized against them using automatic classification tools [43,44]. Alternatively, local concepts represented in the SKOS notation can be directly mapped to DebugIT Core Ontology.
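The two-step alignment described above (local term to standard terminology, then terminology to the shared domain ontology) can be pictured with simple lookup tables. This is a sketch under assumptions: the local labels and the `dco:` identifier are hypothetical, dictionaries stand in for the SKOS mappings and Notation3 rules, and only the WHO-ATC code J01MA02 (ciprofloxacin) is a real terminology code.

```python
from typing import Optional

# Step 1: local labels (different languages, abbreviations) -> terminology code.
LOCAL_TO_TERMINOLOGY = {
    "ciprofloxacin": "atc:J01MA02",
    "ciprofloxacine": "atc:J01MA02",  # French spelling
    "cipro": "atc:J01MA02",           # local abbreviation
}

# Step 2: terminology code -> shared domain ontology concept (hypothetical name).
TERMINOLOGY_TO_DCO = {
    "atc:J01MA02": "dco:Ciprofloxacin",
}


def to_core_concept(local_label: str) -> Optional[str]:
    """Resolve a local label to a domain ontology concept, or None if unmapped."""
    code = LOCAL_TO_TERMINOLOGY.get(local_label.strip().lower())
    if code is None:
        return None
    return TERMINOLOGY_TO_DCO.get(code)
```

Keeping the two steps separate means a new site only has to supply the first table, since the second is shared by the whole network.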
We assessed the mediator’s SPARQL query performance for query templates 1, 2, and 3. A query mix composed of 225 unique queries, spanning 4 years in daily, monthly, and yearly periods, was used. Each query mix was submitted 10 times against the seven end points. Table 2 summarizes the results. The mean query response time was 4.3 (SD 0.1×10^2) seconds. With a different aggregation strategy, based on central reasoning, the average retrieval time increased about 30-fold (mean 130.5, SD 0.1×10^3 seconds).
Figure 6 shows how the response time of ARTEMIS queries varied with the number of rows retrieved for different query templates and aggregation periods. Indeed, the response time is highly correlated with the number of rows retrieved (ρ = .81, P < .001).
|Table 2. Arithmetic (ta) and geometric (tg) mean (SD) execution times for the two query mediation strategies: local (Antimicrobial Resistance Trend Monitoring System [ARTEMIS]) versus central (baseline) reasoning.|
|Figure 6. Query performance. Response time and rows retrieved by template (1-3) and aggregation period. As the number of rows retrieved increases, the response time tends also to increase.|
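Table 2 reports both arithmetic and geometric means of the execution times. The sketch below shows how the two differ on skewed timings and reproduces the ratio between the two reported mean response times (130.5 and 4.3 seconds). The sample timings are made up for illustration and are not measurements from the study.

```python
import math
from typing import Sequence


def arithmetic_mean(values: Sequence[float]) -> float:
    return sum(values) / len(values)


def geometric_mean(values: Sequence[float]) -> float:
    """Exponential of the mean log; requires strictly positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Made-up response times in seconds, skewed by one slow query.
timings = [1.0, 2.0, 4.0, 8.0]
mean_a = arithmetic_mean(timings)
mean_g = geometric_mean(timings)

# Ratio of the mean response times reported in the text.
slowdown = 130.5 / 4.3
```

The geometric mean is less sensitive to a few slow queries, which is presumably why both summaries are given when response time grows with the number of rows retrieved.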
Result Reliability Requirement
Following the data selection criterion, we created 221 queries for EARS-Net and 153 for SEARCH based on template 1. Table 3 shows the geometric mean resistance rates extracted from the three systems. The results yielded a strong positive correlation coefficient between ARTEMIS and both EARS-Net (ρ = .86, P < .001) and SEARCH (ρ = .84, P < .001) reference systems.
The within-country geometric standard deviation of EARS-Net was σears = 0.130. This value was used to define the similarity region Δ (Δ = σears) of the two one-sided convolution test. Figure 7 (part a, all results and part b, without outliers) presents the correlation between the two systems, and Figure 7c shows the regions of similarity. The confidence interval (CI) lies in the region of similarity (95% CI 0–0.030; P < .001), confirming the equivalence between the ARTEMIS and EARS-Net resistance rates. Similarly, for SEARCH, the Swiss region’s geometric standard deviation was σsearch = 0.042, indicating a small susceptibility rate variation in the different regions. In this scenario, the results of ARTEMIS (Figure 8, part a) cannot be considered equivalent to SEARCH (95% CI 0–0.052; P = .18). However, removing outliers (that is, results whose difference in resistance rate exceeds 3σsearch; Figure 8, part b) also leads to an equivalent outcome (95% CI –0.004 to 0.028; P = .004).
|Table 3. Resistance rate geometric mean (SD) and correlation results.|
|Figure 7. Antimicrobial Resistance Trend Monitoring System (ARTEMIS) vs European Antimicrobial Resistance Surveillance Network (EARS-Net). (a) Resistance rates (n = 221). Black line indicates an exact match (100% equivalence). Gray line indicates best fit. Gray dashed lines indicate Δ = ±0.130. (b) Resistance rates without outliers (n = 213). (c) Gray vertical dashed lines indicate similarity region Δ. Gray horizontal bars indicate two one-sided convolution confidence interval (CI). 95% CIa 0–0.030 (P < .001); 95% CIb 0.002–0.026 (P < .001).|
|Figure 8. Antimicrobial Resistance Trend Monitoring System (ARTEMIS) vs Sentinel Surveillance of Antibiotic Resistance in Switzerland (SEARCH). (a) Resistance rates (n = 153). Black line indicates exact match (100% equivalence). Gray line indicates best fit. Gray dashed lines indicate Δ = ±0.042. (b) Resistance rates without outliers (n = 143). (c) Gray vertical dashed lines indicate similarity region Δ. Gray horizontal bars indicate two one-sided convolution confidence interval (CI). 95% CIa 0–0.052 (P = .17); 95% CIb –0.004 to 0.028 (P = .004).|
In this paper, we present an online and source-independent architecture that enables monitoring of multinational microbiology databases. The system was implemented and deployed in a pilot surveillance network distributed across Europe. The results show that Semantic Web-based architectures such as that of ARTEMIS are suitable for automating the integration and interoperability of distributed microbiology laboratory data sources. Therefore, they can be used to enable automatic access to antimicrobial resistance information in a transnational context and to foster real-time multinational biosurveillance. The architecture is able to interoperate heterogeneous networks via the use of semantic maps that account for local specificity. The data integration process is performed on the fly using standard end points powered with RDF/SPARQL communication, which are mediated by a central engine. The local end points are directly connected to the laboratories' databases and as such are able to provide (near) real-time resistance information, while avoiding centralization of the data.
The data integration architecture proposed in ARTEMIS is distinct from existing antimicrobial resistance surveillance systems [14,17,18], as it implements a loosely coupled data federation design, which is realized by formalization of the data sources and data semantics. Thus, the data layer is detached from the central system, which avoids central storage and guarantees to care providers full control over the local information. Moreover, online semantic data repositories automate access to local antimicrobial resistance databases, allowing the system to retrieve near real-time antimicrobial resistance trends. Therefore, emerging and outbreak resistances can be easily monitored on a multinational scale. Finally, instead of predetermined and statically monitored bacteria–antibiotic pairs, the architecture introduced here facilitates the expansion of the concept coverage, making the process of tracking resistance of new antibiotics and bacteria trivial. Since concepts are fully formalized by ontologies through the whole architecture, to add a new item to be monitored it is only necessary to create the respective class in the domain ontology and represent it in the semantic mappings (global, local, or both). Thus, it will be automatically reflected in the user interface, including past occurrences of the given class in microbiology tests.
ARTEMIS uses open Semantic Web technologies to provide technical and semantic interoperability. Semantic data sources create a common technical layer over the local microbiology databases, which can be accessed through a standard query protocol (SPARQL). Since local end points are fully formalized and accessible through the Web, they can be linked to external Web resources, such as the Linked Life Data, or reused in other clinical research projects to leverage knowledge on infectious diseases by combining different sources of information. Another benefit of using ontologies to represent data is the hierarchical structure, which allows higher-level representation of concepts. Therefore, the system can handle complex queries expressed at group levels, allowing, for example, automatic clustering of antibiotic classes such as third-generation cephalosporins or bacteria families such as Enterobacteriaceae.
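To make the group-level querying idea concrete, the following Python sketch expands a higher-level concept into the individual concepts it subsumes before filtering test results. It is only an illustration: in an ontology-based system this expansion would normally be derived from subclass axioms (for example, rdfs:subClassOf) by a reasoner, and the dictionary, function name, and concept labels below are hypothetical stand-ins rather than the ARTEMIS domain ontology.

```python
# Hypothetical child -> parent relations standing in for subclass axioms.
SUBCLASS_OF = {
    "ceftriaxone": "third-generation cephalosporin",
    "cefotaxime": "third-generation cephalosporin",
    "ceftazidime": "third-generation cephalosporin",
    "third-generation cephalosporin": "cephalosporin",
    "Escherichia coli": "Enterobacteriaceae",
    "Klebsiella pneumoniae": "Enterobacteriaceae",
}


def expand(concept, subclass_of=SUBCLASS_OF):
    """Return the leaf concepts subsumed by `concept` (itself if it has no children)."""
    children = [child for child, parent in subclass_of.items() if parent == concept]
    if not children:
        return {concept}
    leaves = set()
    for child in children:
        leaves |= expand(child, subclass_of)
    return leaves


# A query expressed at class level is rewritten into its member concepts.
antibiotics = expand("third-generation cephalosporin")
bacteria = expand("Enterobacteriaceae")
print(sorted(antibiotics))
print(sorted(bacteria))
```

Because the expansion is driven by the hierarchy rather than by a fixed list, adding a new antibiotic under an existing class would make it appear in class-level queries without changing the query itself.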
Finally, the powerful query interface combined with the availability of near real-time results makes ARTEMIS not only useful to bodies concerned with supranational resistance but also potentially beneficial to local needs, especially if connected to online prescribing systems for empirical treatments. In addition, this local application might facilitate the maintenance of the system by health care institutions. As Goble and Stevens discussed, data integration systems tend to become “data mortuaries” once the research funding ends. Local appeal can possibly help to change this pattern.
All SPARQL performance benchmarks presented in the literature focus on local single-source servers. Thus, they are not adequate to assess the performance of data integration systems. Hence, the ARTEMIS semantic mediator was compared with a standard approach of retrieving all data and aggregating centrally. As Table 2 shows, the push-down procedure reduced retrieval time 30-fold (19-fold when considering the geometric mean). Indeed, as Figure 6 shows, in a distributed system, response time is nearly linearly correlated (ρ = .81, P < .001) with the amount of data retrieved. Thus, local reasoning is crucial for systems that require fast response time.
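The difference between the two strategies can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the ARTEMIS mediator: end points are represented as in-memory lists, and the names `Test`, `local_aggregate`, `rate_push_down`, and `rate_central` are hypothetical. The point it illustrates is that with push-down each end point returns only a pair of counts, whereas the central approach transfers every row before aggregating.

```python
from collections import namedtuple

# Hypothetical susceptibility test record held at a local end point.
Test = namedtuple("Test", "bacterium antibiotic resistant")


def local_aggregate(tests, bacterium, antibiotic):
    """Aggregate at the source: return (resistant_count, total_count)."""
    matching = [t for t in tests if t.bacterium == bacterium and t.antibiotic == antibiotic]
    return sum(t.resistant for t in matching), len(matching)


def rate_push_down(endpoints, bacterium, antibiotic):
    """Mediator combines one small (resistant, total) pair per end point."""
    partials = [local_aggregate(tests, bacterium, antibiotic) for tests in endpoints.values()]
    resistant = sum(r for r, _ in partials)
    total = sum(n for _, n in partials)
    return resistant / total if total else None


def rate_central(endpoints, bacterium, antibiotic):
    """Baseline: ship every row to the mediator, then aggregate there."""
    all_rows = [t for tests in endpoints.values() for t in tests]
    resistant, total = local_aggregate(all_rows, bacterium, antibiotic)
    return resistant / total if total else None


endpoints = {
    "site_a": [Test("E. coli", "ciprofloxacin", True),
               Test("E. coli", "ciprofloxacin", True),
               Test("E. coli", "ciprofloxacin", False)],
    "site_b": [Test("E. coli", "ciprofloxacin", True),
               Test("E. coli", "ciprofloxacin", False)],
}

print(rate_push_down(endpoints, "E. coli", "ciprofloxacin"))
print(rate_central(endpoints, "E. coli", "ciprofloxacin"))
```

Both functions return the same rate; what changes is the volume of data crossing the network, which is consistent with the near-linear relation between response time and amount of data retrieved reported above.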
The preference for an SQL-to-RDF engine instead of a native RDF triple store to formalize local data sources was due to scalability issues. As Schmidt et al noted, native RDF triple stores can hardly be scaled to answer queries when their size is bigger than a few million triples. At the mediation level, the use of a push-down approach while performing aggregation has proved efficient. The average query response was in the order of a few seconds (mean 4.3, SD 0.1×10² seconds), which could contribute to the adoption of the system by clinicians, who consider a good response time an important requirement in the system design.
Comparison with Existing Systems
Existing surveillance systems normally use semiautomatic methods to extract antimicrobial resistance rates. Validation and cleansing steps are taken by experts before statistical analysis. In ARTEMIS, this process is fully automated and, as such, errors can be introduced. To validate ARTEMIS resistance trends, we compared antimicrobial resistance rates with European and national reference systems. The results indicated a strong positive correlation between the susceptibility test outcomes. We carried out a second evaluation based on equivalence tests to confirm the trustworthiness of the results. The tests showed that, at the limit of 3σ, ARTEMIS trends are deemed equivalent to both EARS-Net and SEARCH.
A difference in concept definition between ARTEMIS and the reference systems negatively affected the results. The majority of outliers (18 out of 33) presented in Figure 7a and Figure 8a were caused by semantic ambiguities between concepts. For example, in ARTEMIS, antibiotic definition follows the WHO-ATC classification system terminology, which does not define a single antibiotic concept for penicillin but rather classes including several antibiotics based on penicillin. In SEARCH, this concept is defined as an antibiotic agent. Analogously, the gentamicin definition, which is not related to concentration in ARTEMIS, is defined as Gentamicin HLAR in SEARCH and High level gentamicin in EARS-Net. These issues were not accentuated in the comparison with EARS-Net because, as expected, the region of similarity was wider than that of SEARCH, which considers only within-country variations. Adoption of standard and formalized terminologies in the eHealth care field and a more dynamic evolution of terminological resources so that they can cover operational needs are part of the semantic solution.
Finally, in statistical analysis, care should be taken with duplicate tests. If all apparent duplicates are ignored indiscriminately, information may be omitted, such as nosocomial infection, whereas inclusion of all tests may skew the results, usually toward augmentation of resistance. In the reference systems, duplicate tests are manually removed. In ARTEMIS, biases were automatically minimized by considering only the unique tests within an episode of care.
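One way to picture the automatic handling of duplicates is the Python sketch below, which keeps a single test per episode of care for each bacterium-antibiotic pair. The uniqueness key, the choice to keep the first occurrence, and the field names are assumptions made for illustration; the text above does not specify how ARTEMIS defines a unique test.

```python
def unique_tests(tests):
    """Keep the first test per (episode, bacterium, antibiotic).

    The key and the keep-first rule are assumptions for this sketch.
    """
    seen = set()
    kept = []
    for test in tests:
        key = (test["episode"], test["bacterium"], test["antibiotic"])
        if key not in seen:
            seen.add(key)
            kept.append(test)
    return kept


# Hypothetical records: episode e1 contains a repeated test of the same pair.
records = [
    {"episode": "e1", "bacterium": "S. aureus", "antibiotic": "oxacillin", "resistant": True},
    {"episode": "e1", "bacterium": "S. aureus", "antibiotic": "oxacillin", "resistant": True},
    {"episode": "e2", "bacterium": "S. aureus", "antibiotic": "oxacillin", "resistant": False},
]

deduplicated = unique_tests(records)
rate_all = sum(r["resistant"] for r in records) / len(records)
rate_unique = sum(r["resistant"] for r in deduplicated) / len(deduplicated)
print(len(deduplicated), rate_all, rate_unique)
```

In this toy data the repeated resistant isolate raises the apparent rate from one half to two thirds, which illustrates the skew toward resistance mentioned above; a test in a later episode of care is retained, so a new infection in the same patient is not discarded.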
In an ontology-based integration system, automatic mapping from global to local ontologies using first-order logic reasoners creates logical inconsistencies because knowledge from the various local ontologies cannot be completely reconciled in the global model. For example, if vancomycin-resistant Enterococcus is prevalent at site 1, this fact is not necessarily true for all other sites. A solution, as implemented in ARTEMIS, is to create query templates over the local ontologies. However, as the system expands to a large number of clinical providers, this approach may prove difficult to maintain, since query templates must be defined centrally for each new data source. Nevertheless, this limitation could be easily overcome if local sources provided a datamart with a common data model as proposed in Figure 4a.
Aligning multinational microbiology laboratory results presents several issues. For example, it has been shown that, for a given sample test, independent laboratories will present different outcomes. Differences in susceptibility breakpoints across countries are also a complex issue involving standardization of antibiogram methodologies. Additionally, results of second-line antibiotics tend to present bias toward resistance, since they are normally tested when isolates show resistance to first-line drugs. The methodology proposed here cannot solve most of the intrinsic divergence between different laboratory procedures. Regardless, ARTEMIS does not aim to tackle these issues but rather to promote access to distributed antimicrobial resistance information as soon as data are available in a formalized and semantically defined way.
We designed, implemented, and deployed the ARTEMIS architecture in a small-scale biosurveillance network of European hospitals. Results indicate that the distributed monitoring architecture introduced here can potentially be used to build transnational antimicrobial resistance surveillance networks. The architecture proved to be efficient and reliable, while complying with local legal and regulatory frameworks. The Semantic Web-based approach proved to be an effective solution for development of eHealth architectures that enable online antimicrobial resistance monitoring from heterogeneous data sources. In the future, we plan to investigate local model mediation, paving the way to a more easily maintainable system. We expect that new health care institutions can join the network so that it can provide clinicians and decision makers with a missing tool to tackle the growing threat of rising emergent infectious diseases and antibiotic resistance patterns.
This work is funded by the DebugIT project of the European Union Seventh Framework Programme grant agreement ICT-2007.5.2-217139.
Conflicts of Interest
Multimedia Appendix 1
Architecture design.[PDF File (Adobe PDF File), 355KB]
Multimedia Appendix 2
Equivalence test.[PDF File (Adobe PDF File), 56KB]
- Adam D. Global antibiotic resistance in Streptococcus pneumoniae. J Antimicrob Chemother 2002 Jul;50 Suppl:1-5.
- Sarmah AK, Meyer MT, Boxall AB. A global perspective on the use, sales, exposure pathways, occurrence, fate and effects of veterinary antibiotics (VAs) in the environment. Chemosphere 2006 Oct;65(5):725-759.
- World Health Organization. 2001. WHO Global Strategy for Containment of Antimicrobial Resistance URL: http://whqlibdoc.who.int/hq/2001/WHO_CDS_CSR_DRS_2001.2.pdf [accessed 2012-01-02]
- Boucher HW, Talbot GH, Bradley JS, Edwards JE, Gilbert D, Rice LB, et al. Bad bugs, no drugs: no ESKAPE! An update from the Infectious Diseases Society of America. Clin Infect Dis 2009 Jan 1;48(1):1-12.
- Pittet D, Safran E, Harbarth S, Borst F, Copin P, Rohner P, et al. Automatic alerts for methicillin-resistant Staphylococcus aureus surveillance and control: role of a hospital information system. Infect Control Hosp Epidemiol 1996 Aug;17(8):496-502.
- Jacobs MR. Worldwide trends in antimicrobial resistance among common respiratory tract pathogens in children. Pediatr Infect Dis J 2003 Aug;22(8 Suppl):S109-S119.
- Devaux I, Manissero D, Fernandez de la Hoz K, Kremer K, van Soolingen D, EuroTB network. Surveillance of extensively drug-resistant tuberculosis in Europe, 2003-2007. Euro Surveill 2010 Mar 18;15(11).
- World Health Organization. 2011 Feb. Antimicrobial Resistance Fact Sheet no 194 URL: http://www.who.int/mediacentre/factsheets/fs194/en/ [accessed 2012-01-02]
- Monnet DL. Toward multinational antimicrobial resistance surveillance systems in Europe. Int J Antimicrob Agents 2000 Jul;15(2):91-101.
- Giske CG, Cornaglia G, ESCMID Study Group on Antimicrobial Resistance Surveillance (ESGARS). Supranational surveillance of antimicrobial resistance: The legacy of the last decade and proposals for the future. Drug Resist Updat 2010 Oct;13(4-5):93-98.
- O'Brien TF, Stelling J. Integrated Multilevel Surveillance of the World's Infecting Microbes and Their Resistance to Antimicrobial Agents. Clin Microbiol Rev 2011 Apr;24(2):281-295.
- Stelling JM, O'Brien TF. Surveillance of antimicrobial resistance: the WHONET program. Clin Infect Dis 1997 Jan;24 Suppl 1:S157-S168.
- Bronzwaer SL, Goettsch W, Olsson-Liljequist B, Wale MC, Vatopoulos AC, Sprenger MJ. European Antimicrobial Resistance Surveillance System (EARSS): objectives and organisation. Euro Surveill 1999 Apr;4(4):41-44.
- European Centre for Disease Prevention and Control. 2011. European Antimicrobial Resistance Surveillance Network (EARS-Net) URL: http://www.ecdc.europa.eu/en/activities/surveillance/EARS-Net/Pages/index.aspx [accessed 2012-01-02]
- European Society of Biomodulation and Chemotherapy Datacenter Munich. 1999. European Surveillance of Antibiotic Resistance (ESAR) URL: http://www.esbic.de/esbic/ind_esar.htm [accessed 2012-01-02]
- Richet HM, Mohammed J, McDonald LC, Jarvis WR. Building communication networks: international network for the study and prevention of emerging antimicrobial resistance. Emerg Infect Dis 2001 Apr;7(2):319-322.
- Anonymous. The WHO Antimicrobial Resistance Information Bank. WHO Drug Inf 1999;13(4).
- Karlowsky JA, Kelly LJ, Thornsberry C, Jones ME, Evangelista AT, Critchley IA, et al. Susceptibility to fluoroquinolones among commonly isolated Gram-negative bacilli in 2000: TRUST and TSN data for the United States. Tracking Resistance in the United States Today. The Surveillance Network. Int J Antimicrob Agents 2002 Jan;19(1):21-31.
- Lovis C, Colaert D, Stroetmann VN. DebugIT for patient safety - improving the treatment with antibiotics through multimedia data mining of heterogeneous clinical data. Stud Health Technol Inform 2008;136:641-646.
- Berners-Lee T, Hendler J, Lassila O. The semantic web. Sci Am 2001;May:29-37.
- Sahoo SS, Bodenreider O, Rutter JL, Skinner KJ, Sheth AP. An ontology-driven semantic mashup of gene and biological pathway information: application to the domain of nicotine dependence. J Biomed Inform 2008 Oct;41(5):752-765.
- Belleau F, Nolin MA, Tourigny N, Rigault P, Morissette J. Bio2RDF: towards a mashup to build bioinformatics knowledge systems. J Biomed Inform 2008 Oct;41(5):706-716.
- Chen B, Dong X, Jiao D, Wang H, Zhu Q, Ding Y, et al. Chem2Bio2RDF: a semantic framework for linking and data mining chemogenomic and systems chemical biology data. BMC Bioinformatics 2010;11:255.
- Miñarro-Gimenez JA, Egaña Aranguren M, Martínez Béjar R, Fernández-Breis JT, Madrid M. Semantic integration of information about orthologs and diseases: the OGO system. J Biomed Inform 2011 Dec;44(6):1020-1031.
- Manola F, Miller E, RDF Core Working Group. World Wide Web Consortium. 2004 Feb 10. RDF Primer URL: http://www.w3.org/TR/2004/REC-rdf-primer-20040210/ [accessed 2012-02-05]
- Corwin J, Silberschatz A, Miller PL, Marenco L. Dynamic tables: an architecture for managing evolving, heterogeneous biomedical data in relational database management systems. J Am Med Inform Assoc 2007 Feb;14(1):86-93.
- Prud'hommeaux E, Seaborne A. World Wide Web Consortium. 2008 Jan 15. SPARQL Query Language for RDF URL: http://www.w3.org/TR/rdf-sparql-query/ [accessed 2012-01-02]
- McGuinness DL, van Harmelen F, OWL Working Group. World Wide Web Consortium. 2004 Feb 10. OWL Web Ontology Language URL: http://www.w3.org/TR/owl-features/ [accessed 2012-02-05]
- Schwaber K, Beedle M. Agile Software Development With Scrum. Upper Saddle River, NJ: Prentice Hall; 2002.
- Millar MR. Tackling antibiotic resistance. International action required. BMJ 2010;340:c2978.
- Piwowar HA, Becich MJ, Bilofsky H, Crowley RS, caBIG Data Sharing and Intellectual Capital Workspace. Towards a data sharing culture: recommendations for leadership from academic health centers. PLoS Med 2008 Sep 30;5(9):e183.
- Cruz IF, Xiao H. Ontology driven data integration in heterogeneous networks. Stud Comput Intell 2009(168):75-98.
- Schober D, Boeker M, Bullenkamp J, Huszka C, Depraetere K, Teodoro D, et al. The DebugIT core ontology: semantic integration of antibiotics resistance patterns. Stud Health Technol Inform 2010;160(Pt 2):1060-1064.
- Teodoro D, Choquet R, Pasche E, Gobeill J, Daniel C, Ruch P, et al. Biomedical data management: a proposal framework. Stud Health Technol Inform 2009;150:175-179.
- Teodoro D, Choquet R, Schober D, Mels G, Pasche E, Ruch P, et al. Interoperability driven integration of biomedical data sources. Stud Health Technol Inform 2011;169:185-189.
- European Commission. 2010 Nov 4. A Comprehensive Approach on Personal Data Protection in the European Union URL: http://ec.europa.eu/justice/news/consulting_public/0006/com_2010_609_en.pdf [accessed 2012-01-03]
- Gamma E, Helm R, Johnson R, Vlissides J. Design Patterns: Elements of Reusable Object-Oriented Software. 1st edition. Reading, MA: Addison-Wesley; 1995.
- Timpka T, Eriksson H, Gursky EA, Strömgren M, Holm E, Ekberg J, et al. Requirements and design of the PROSPER protocol for implementation of information infrastructures supporting pandemic response: a Nominal Group study. PLoS One 2011;6(3):e17941.
- Cheng CK, Ip DK, Cowling BJ, Ho LM, Leung GM, Lau EH. Digital dashboard design using multiple data streams for disease surveillance with influenza surveillance as an example. J Med Internet Res 2011;13(4):e85.
- Lung KR, Gorko MA, Llewelyn J, Wiggins N. Statistical method for the determination of equivalence of automated test procedures. J Autom Methods Manag Chem 2003;25(6):123-127.
- Johnston RJ, Duke JM. Benefit transfer equivalence tests with non-normal distributions. Environ Resour Econ 2007;41(1):1-23.
- Bizer C. D2R MAP-A DB to RDF Mapping Language. In: Proceedings. 2003 Presented at: 12th International World Wide Web Conference; May 24-24, 2003; Budapest, Hungary.
- Ruch P, Gobeill J, Lovis C, Geissbühler A. Automatic medical encoding with SNOMED categories. BMC Med Inform Decis Mak 2008;8 Suppl 1:S6.
- Ruch P. Automatic assignment of biomedical categories: toward a generic approach. Bioinformatics 2006 Mar 15;22(6):658-664.
- W3C Semantic Web Deployment Working Group. World Wide Web Consortium. 2012. SKOS Simple Knowledge Organization System URL: http://www.w3.org/2004/02/skos/ [accessed 2012-02-05]
- Goble C, Stevens R. State of the nation in data integration for bioinformatics. J Biomed Inform 2008 Oct;41(5):687-693.
- Momtchev V, Peychev D, Primov T, Georgiev G. International Semantic Web Challenge. 2009. Expanding the Pathway and Interaction Knowledge in Linked Life Data URL: http://www.cs.vu.nl/~pmika/swc/documents/Linked%2520Life%2520Data-LLD%2520semantic%2520web%2520challenge%25202009.pdf [accessed 2012-05-09]
- Morsey M, Lehmann J, Auer S, Ngomo A. DBpedia SPARQL benchmark: performance assessment with real queries on real data. In: Proceedings. 2011 Presented at: 10th International Semantic Web Conference; Oct 23-27, 2011; Bonn, Germany p. 454-469.
- Schmidt M, Hornung T, Lausen G, Pinkel C. SP^2Bench: a SPARQL performance benchmark. In: Proceedings. 2009 Presented at: IEEE 25th International Conference on Data Engineering (ICDE '09); Mar 29-Apr 2, 2009; Shanghai, China.
- Lee F, Teich JM, Spurr CD, Bates DW. Implementation of physician order entry: user satisfaction and self-reported usage patterns. J Am Med Inform Assoc 1996 Feb;3(1):42-55.
- Reynolds R, Hope R, Williams L, BSAC Working Parties on Resistance Surveillance. Survey, laboratory and statistical methods for the BSAC Resistance Surveillance Programmes. J Antimicrob Chemother 2008 Nov;62 Suppl 2:ii15-ii28.
- Calvanese D, Giacomo G, Lenzerini M. Ontology of integration and integration of ontologies. In: Proceedings. 2001 Presented at: 2001 International Workshop on Description Logic; Aug 1-3, 2001; Stanford, CA, USA.
|ARTEMIS: Antimicrobial Resistance Trend Monitoring System|
|CI: confidence interval|
|DebugIT: Detecting and Eliminating Bacteria Using Information Technology|
|EARS-Net: European Antimicrobial Resistance Surveillance Network|
|lCDR: local clinical data repository|
|RDF: Resource Description Framework|
|SEARCH: Sentinel Surveillance of Antibiotic Resistance in Switzerland|
|SKOS: Simple Knowledge Organization System|
|SNOMED-CT: Systematized Nomenclature of Medicine-Clinical Terms|
|SPARQL: SPARQL Protocol and RDF Query Language|
|WHO: World Health Organization|
|WHO-ATC: World Health Organization-Anatomical Therapeutic Chemical|
|Edited by G Eysenbach; submitted 07.01.12; peer-reviewed by L Balkanyi, T Timpka; comments to author 29.01.12; revised version received 05.03.12; accepted 29.04.12; published 29.05.12|
Please cite as:
Teodoro D, Pasche E, Gobeill J, Emonet S, Ruch P, Lovis C
Building a Transnational Biosurveillance Network Using Semantic Web Technologies: Requirements, Design, and Preliminary Evaluation
J Med Internet Res 2012;14(3):e73
Copyright©Douglas Teodoro, Emilie Pasche, Julien Gobeill, Stéphane Emonet, Patrick Ruch, Christian Lovis. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 29.05.2012.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included.