Building a Transnational Biosurveillance Network Using Semantic Web Technologies: Requirements, Design, and Preliminary Evaluation

Background
Antimicrobial resistance has reached globally alarming levels and is becoming a major public health threat. Lack of efficacious antimicrobial resistance surveillance systems was identified as one of the causes of increasing resistance, due to the lag time between new resistances and alerts to care providers. Several initiatives to track drug resistance evolution have been developed. However, no effective real-time and source-independent antimicrobial resistance monitoring system is available publicly.

Objective
To design and implement an architecture that can provide real-time and source-independent antimicrobial resistance monitoring to support transnational resistance surveillance. In particular, we investigated the use of a Semantic Web-based model to foster integration and interoperability of interinstitutional and cross-border microbiology laboratory databases.

Methods
Following the agile software development methodology, we derived the main requirements needed for effective antimicrobial resistance monitoring, from which we proposed a decentralized monitoring architecture based on the Semantic Web stack. The architecture uses an ontology-driven approach to promote the integration of a network of sentinel hospitals or laboratories. Local databases are wrapped into semantic data repositories that automatically expose local computing-formalized laboratory information in the Web. A central source mediator, based on local reasoning, coordinates the access to the semantic end points. On the user side, a user-friendly Web interface provides access and graphical visualization to the integrated views.

Results
We designed and implemented the online Antimicrobial Resistance Trend Monitoring System (ARTEMIS) in a pilot network of seven European health care institutions sharing 70+ million triples of information about drug resistance and consumption. Evaluation of the computing performance of the mediator demonstrated that, on average, query response time was a few seconds (mean 4.3, SD 0.1×10² seconds). Clinical pertinence assessment showed that resistance trends automatically calculated by ARTEMIS had a strong positive correlation with the European Antimicrobial Resistance Surveillance Network (EARS-Net) (ρ = .86, P < .001) and the Sentinel Surveillance of Antibiotic Resistance in Switzerland (SEARCH) (ρ = .84, P < .001) systems. Furthermore, mean resistance rates extracted by ARTEMIS were not significantly different from those of either EARS-Net (∆ = ±0.130; 95% confidence interval –0 to 0.030; P < .001) or SEARCH (∆ = ±0.042; 95% confidence interval –0.004 to 0.028; P = .004).

Conclusions
We introduce a distributed monitoring architecture that can be used to build transnational antimicrobial resistance surveillance networks. Results indicated that the Semantic Web-based approach provided an efficient and reliable solution for development of eHealth architectures that enable online antimicrobial resistance monitoring from heterogeneous data sources. In future, we expect that more health care institutions can join the ARTEMIS network so that it can provide a large European and wider biosurveillance network that can be used to detect emerging bacterial resistance in a multinational context and support public health actions.


Mapping Ontologies
To align biomedical terminologies and locally defined concepts from the legacy systems of the participating sites with the domain ontology, semantic mappings were created using the SKOS ontology (Figure 5-b). At the global level, mappings were designed for the SNOMED-CT, WHO-ATC, and UniProt terminologies. When a site uses other biomedical terminologies, or its own local terminologies, that site needs to provide the corresponding local mappings. This step is important because the mappings can easily be adapted to support local needs and to follow changes in the local systems.
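As an illustration of how such mappings can be used, the sketch below stores a few SKOS mapping statements as plain triples and looks them up in both directions. It is a minimal sketch, not the ARTEMIS implementation: the `local:` and `dco:` identifiers are hypothetical, and the choice of `skos:exactMatch` and `skos:closeMatch` is an assumption, since the text does not state which SKOS mapping properties were used.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"

# SKOS mapping properties accepted by this sketch (an assumption).
MAPPING_PROPERTIES = {SKOS + "exactMatch", SKOS + "closeMatch"}

# (local term, mapping property, global concept); all identifiers are hypothetical.
MAPPINGS = [
    ("local:CEFEP", SKOS + "exactMatch", "dco:Cefepime"),
    ("local:ECOLI", SKOS + "exactMatch", "dco:EscherichiaColi"),
    ("local:HEMOC", SKOS + "closeMatch", "dco:BloodSample"),
]


def to_global(local_term, mappings=MAPPINGS):
    """Return the global concepts that a local term is mapped to."""
    return sorted(
        obj for subj, pred, obj in mappings
        if subj == local_term and pred in MAPPING_PROPERTIES
    )


def to_local(global_concept, mappings=MAPPINGS):
    """Return the local terms that are mapped to a global concept."""
    return sorted(
        subj for subj, pred, obj in mappings
        if obj == global_concept and pred in MAPPING_PROPERTIES
    )
```

A site that adopts another terminology would, under this sketch, only need to extend its own list of triples; the lookup functions stay unchanged.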

Data Model Layer
In the architecture's data layer, local microbiology laboratory databases are converted into semantic endpoints. This is achieved through a fully semantics-compliant clinical data repository, the local Clinical Data Repository (lCDR), as further described in [34,35]. The lCDRs are set up within the demilitarized zone of each site participating in the DebugIT network. The lCDR, like the other DebugIT services, exposes data in the RDF format and communicates using the SPARQL protocol. It serves as the interface between the DebugIT services and the data providers.
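To make the communication pattern concrete, the following sketch builds a SPARQL protocol query request (HTTP GET with a `query` parameter, asking for JSON results) using only the Python standard library. The endpoint URL is a placeholder and the function name is hypothetical; the request is constructed but not sent, and authentication or transport security that a real lCDR would require is left out.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Media type for SPARQL SELECT results serialized as JSON.
SPARQL_JSON = "application/sparql-results+json"


def build_sparql_request(endpoint_url, query):
    """Build (without sending) a SPARQL protocol GET request.

    endpoint_url is assumed to carry no query string of its own.
    """
    url = endpoint_url + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": SPARQL_JSON})


# Hypothetical endpoint; a real lCDR address would differ.
request = build_sparql_request(
    "https://lcdr.example.org/sparql",
    "SELECT * WHERE { ?s ?p ?o } LIMIT 1",
)
```

Passing the returned object to `urllib.request.urlopen` would execute the query against a reachable endpoint.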

Controller Layer
The semantic mediation process is performed in the controller layer. For each lCDR, the query mediator defines SPARQL representations of a limited set of AMR clinical questions presented in the view layer. These clinical question queries are built as templates, that is, SPARQL queries parameterized with DCO concepts. Assigning values to the parameters of a template yields a concrete SPARQL query. For example, the template "What is the antimicrobial resistance evolution to :antibiotic of :pathogen cultured from :sample_origin from :begin_date to :end_date?" might be instantiated as "What is the antimicrobial resistance evolution to cefepime of Escherichia coli cultured from blood sample from 2011-01-01 to 2011-12-31?". Thus, a single template stands for every query that can be formed from valid combinations of its parameter values.
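Template instantiation of this kind can be sketched in a few lines. The function below replaces each `:name` placeholder with a supplied value; the regular expression and the function name are illustrative assumptions, not the mediator's actual code. The lookbehind keeps the colon inside a prefixed name such as `ex:antibiotic` from being treated as a placeholder.

```python
import re

# A placeholder is ":" followed by an identifier, not preceded by a word
# character or another colon (so "ex:antibiotic" is left untouched).
PLACEHOLDER = re.compile(r"(?<![\w:]):([A-Za-z_]\w*)")


def instantiate(template, values):
    """Replace every :name placeholder in template with values[name]."""
    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for parameter :{name}")
        return values[name]
    return PLACEHOLDER.sub(substitute, template)


QUESTION = (
    "What is the antimicrobial resistance evolution to :antibiotic "
    "of :pathogen cultured from :sample_origin "
    "from :begin_date to :end_date?"
)

VALUES = {
    "antibiotic": "cefepime",
    "pathogen": "Escherichia coli",
    "sample_origin": "blood sample",
    "begin_date": "2011-01-01",
    "end_date": "2011-12-31",
}

print(instantiate(QUESTION, VALUES))
```

With the values shown, the call reproduces the instantiated question quoted above. When the same mechanism is applied to a SPARQL template, the values should be restricted to known ontology terms or properly escaped, because plain string substitution offers no protection against malformed or malicious input.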
At query run time, templates expressed through global concepts are translated into local SPARQL queries that use terms from the local ontologies. The query parameters are expanded using the hierarchical information modeled in the domain ontology and are then translated to local terms using the semantic mappings. For example, the DCO concept "3rd generation cephalosporin" shown in Figure 9 is expanded to its DCO subclasses, which are further mapped to local DDO terms. To optimize network performance and reinforce patient confidentiality, aggregation operations are pushed down to the lCDRs: the SPARQL COUNT aggregate and GROUP BY clause are used to aggregate results locally. Results are fetched according to the query filter constraints, which apply a logical disjunction over the expanded parameters. The inverse process is applied to the retrieved results: local terms are translated to global terms, which are then aggregated under the root concept, i.e., "3rd generation cephalosporin" in the example.
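The expansion, translation, push-down, and roll-up steps can be sketched as follows. This is a simplified illustration under stated assumptions: the ontology fragment, the `dco:` and `local:` identifiers, and the predicates `local:testedDrug` and `local:outcome` are all hypothetical, prefix declarations are omitted from the generated query, and each global concept is assumed to map to exactly one local term.

```python
# Hypothetical fragment of the domain ontology: concept -> direct subclasses.
SUBCLASSES = {
    "dco:ThirdGenerationCephalosporin": ["dco:Ceftriaxone", "dco:Ceftazidime"],
}

# Hypothetical semantic mappings for one site: global concept -> local term.
GLOBAL_TO_LOCAL = {
    "dco:Ceftriaxone": "local:CRO",
    "dco:Ceftazidime": "local:CAZ",
}


def expand(concept, subclasses=SUBCLASSES):
    """Return the concept followed by all of its transitive subclasses."""
    result = [concept]
    for child in subclasses.get(concept, []):
        result.extend(expand(child, subclasses))
    return result


def to_local_terms(concepts, mapping=GLOBAL_TO_LOCAL):
    """Translate global concepts to local terms, skipping unmapped ones."""
    return [mapping[c] for c in concepts if c in mapping]


def local_filter(variable, local_terms):
    """Build a FILTER that is a logical disjunction over the local terms."""
    return "FILTER (" + " || ".join(f"{variable} = {t}" for t in local_terms) + ")"


def build_local_query(local_terms):
    """Build a local query whose aggregation runs at the lCDR itself."""
    return (
        "SELECT ?drug ?outcome (COUNT(?test) AS ?n)\n"
        "WHERE {\n"
        "  ?test local:testedDrug ?drug ;\n"
        "        local:outcome ?outcome .\n"
        f"  {local_filter('?drug', local_terms)}\n"
        "}\n"
        "GROUP BY ?drug ?outcome"
    )


def roll_up(local_counts, root, mapping=GLOBAL_TO_LOCAL, subclasses=SUBCLASSES):
    """Translate local terms back to global ones and sum them under root."""
    local_to_global = {local: glob for glob, local in mapping.items()}
    descendants = set(expand(root, subclasses))
    total = sum(
        n for term, n in local_counts.items()
        if local_to_global.get(term) in descendants
    )
    return {root: total}
```

Because only counts grouped by drug and outcome leave the lCDR in this sketch, no patient-level rows cross the network, which is the confidentiality benefit described above.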

View Layer
Finally, the view layer provides the means for users to interact with the system. It implements two main modules: query input and data visualization. The query input interface presents a set of clinical question templates together with the Interface Ontology input menu, which is used to fill in the template parameters. To improve usability, query templates are expressed in natural language, as in the template "What is the prevalence of :antibiotic :susceptibility :pathogen in :sample extracted from :gender patients at :clinical_setting during period :begin_date - :end_date?". The visualization module provides functions to extract trends, cumulative sums, and other statistics from the retrieved data. It also implements a set of charts intended to support a comprehensive interpretation of the data.
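The kind of statistics mentioned above can be illustrated with a short sketch. The text does not specify how trends or cumulative sums are computed, so the choices here are assumptions: the trend is taken as the ordinary least-squares slope of the per-period resistance rate, and the cumulative sum is a plain running total. Function names and the input layout are hypothetical.

```python
from itertools import accumulate


def resistance_rates(counts):
    """Map {period: (resistant, tested)} to {period: resistant / tested}.

    Periods with no tested isolates are skipped.
    """
    return {
        period: resistant / tested
        for period, (resistant, tested) in sorted(counts.items())
        if tested > 0
    }


def linear_trend(values):
    """Least-squares slope of values against their index 0, 1, ..., n-1."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two points are needed to estimate a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    return numerator / denominator


def cumulative_sum(values):
    """Running total of a sequence of values."""
    return list(accumulate(values))
```

For example, yearly counts of 10, 20, and 30 resistant isolates out of 100 tested give rates of 0.1, 0.2, and 0.3, a slope of about 0.1 per year, and a cumulative resistant count of 10, 30, and 60.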