Automating the generation of antimicrobial resistance surveillance reports: a proof-of-concept study in seven hospitals in seven countries

Background: Reporting cumulative antimicrobial susceptibility testing data on a regular basis is crucial to inform antimicrobial resistance (AMR) action plans at local, national and global levels. However, analysing data and generating a report are time-consuming and often require trained personnel. We illustrate the development and utility of an offline, open-access and automated tool that can support the generation of AMR surveillance reports promptly at the local level.
Method: An offline application to generate standardized AMR surveillance reports from routinely available microbiology and hospital data files was written in the R programming language. The application can be run by a double-click on the application file without any further user input. The data analysis procedure and report content were developed based on the recommendations of the World Health Organization Global Antimicrobial Resistance Surveillance System (WHO GLASS). The application was tested in Microsoft Windows 10 and 7 using open-access example data sets. We then independently tested the application in seven hospitals in Cambodia, Lao People's Democratic Republic (PDR), Myanmar, Nepal, Thailand, the United Kingdom, and Vietnam.
Findings: We developed the AutoMated tool for Antimicrobial resistance Surveillance System (AMASS), which can support clinical microbiology laboratories to analyse their microbiology and hospital data files (in CSV or Excel format) onsite and promptly generate AMR surveillance reports (in PDF and Excel formats). The data files could be those exported from WHONET and/or other laboratory information systems. The automatically generated reports contain only summary data without patient identifiers. The AMASS application is downloadable from www.amass.website. The participating hospitals tested the application and deposited their AMR surveillance reports in an open-access data repository.
Interpretation: The AMASS application can be a useful tool to support the generation and sharing of AMR surveillance reports.


INTRODUCTION
Generating and sharing antimicrobial resistance (AMR) surveillance reports are fundamental elements of actions against AMR infections at local, national and international levels.
Information on patterns of antimicrobial susceptibility is important to guide empiric choice of [...] systems with restricted access. Even in high-income countries, many hospitals lack well-trained clinical microbiologists, epidemiologists or data experts with adequate skills in [...] Surveillance System (AMASS), which can support a local hospital to independently analyse routinely collected electronic data and rapidly generate AMR surveillance reports. We tested [...] The tool operates by reading and processing the raw data files to automatically produce AMR surveillance reports. To ensure the tool is fully open access, we built the application in R (version 3.6.2), which is a free software environment. We then placed the application within a user-friendly interface, which only requires a double-click on the application file to run the automation without the need to understand the R program. We included both the R

This version posted May 1, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.


reports and anonymous summary data contain no patient identifiers. Therefore, users may share the reports and anonymous summary data with national and international organizations or make the reports and anonymous summary data open access.

Features of AMASS
application icon to perform data processing and analysis (i.e. user-friendly); 3) reads CSV and Excel files that can be exported from WHONET, LIS, or electronic health record (eHR) systems (i.e. highly compatible); and 4) can run on stand-alone and offline computers in the hospitals under their local data protection standards. All data processing and report generation are done locally, no raw data are shared, and final reports and output summary data files contain no patient identifier (i.e. high data security).

Moreover, AMASS supports the rapid use of AMR surveillance data at the local level. The readily printable report in PDF format can be reviewed and validated by non-statisticians. When errors are found in the raw data files (e.g. incomplete data) or the data dictionary files (e.g. typos), the application can be promptly re-run after fixing the errors. The summary statistics shown in the PDF report are also saved in the aggregate summary data file (in comma-separated value format) ready for re-use.

There are two data dictionary files provided for the users to accommodate different ways of naming data variables and data values (Figure 2). The first data dictionary file (dictionary_for_microbiology_data.xls) is for the microbiology data file. For example, AMASS uses the variable name "hospital_number" as a patient identifier (Row 3, Column A of the data dictionary file). In cases where a raw microbiology data file uses a different name for the patient identifier (e.g. hn), users would need to fill "hn" in the data dictionary file (Row [...] variable "hn" of the raw microbiology data file is the patient identifier (i.e. "hospital_number").

The second data dictionary file (dictionary_for_hospital_admission_data.xls) is for the hospital admission data file, which is to be used likewise. Supplementary video 2 is a step-by-step tutorial on how to use and configure the data dictionaries.
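The role of the data dictionaries can be illustrated with a minimal Python sketch. This is not the AMASS implementation (which is written in R and reads the dictionaries from Excel files); it only shows the idea of translating local variable names and data values into standard ones. The pairs "hospital_number"/"hn" and "blood_specimen"/"blood" come from the text; the column names "spec" and "org", the variable names "specimen_type" and "organism", and the standard value "organism_escherichia_coli" are hypothetical and used for illustration only.

```python
import csv
import io

# Standard variable name -> variable name used in the raw file.
# "hospital_number" -> "hn" is taken from the text; the rest are hypothetical.
VARIABLE_MAP = {
    "hospital_number": "hn",
    "specimen_type": "spec",   # hypothetical
    "organism": "org",         # hypothetical
}

# Standard data value -> every spelling in the raw file that means the same.
VALUE_MAP = {
    "blood_specimen": ["blood"],
    "organism_escherichia_coli": [          # hypothetical standard name
        "Escherichia coli",
        "E. coli (ESBLs producing strain)",
    ],
}


def standardise(rows, variable_map, value_map):
    """Rename columns and recode values of raw rows to the standard names."""
    raw_to_std_col = {raw: std for std, raw in variable_map.items()}
    raw_to_std_val = {
        raw: std for std, raws in value_map.items() for raw in raws
    }
    cleaned = []
    for row in rows:
        new_row = {}
        for col, value in row.items():
            # Unknown columns and values are passed through unchanged.
            std_col = raw_to_std_col.get(col, col)
            new_row[std_col] = raw_to_std_val.get(value, value)
        cleaned.append(new_row)
    return cleaned


RAW_CSV = """hn,spec,org
001,blood,Escherichia coli
002,blood,E. coli (ESBLs producing strain)
"""

rows = list(csv.DictReader(io.StringIO(RAW_CSV)))
clean = standardise(rows, VARIABLE_MAP, VALUE_MAP)
```

In this sketch, both spellings of E. coli in the raw file end up under one standard value, which mirrors the way several local names can be mapped to a single entry in a dictionary file.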

Outputs generated by AMASS

We illustrate the AMR surveillance reports generated from AMASS using the two open-

AMASS also generates two log files. The first log file (generated in PDF format) is for the users to validate the input data used by AMASS to generate the AMR surveillance report. It contains information such as the total number of records analysed, age distribution, number of missing values and total number of isolates per organism in the raw microbiology data file.


AMASS was tested in seven hospitals in seven countries (Figure 4). The hospitals varied in (Supplementary Table 2

microbiology data was exported from WHONET 5.6). This saved the time and effort needed to complete the data dictionaries for the hospital.

We found that AMASS took about 1 to 3 minutes to run and automatically generate an AMR surveillance report using the local data and local hospital computers. AMASS works in [...] characters) language. For example, "blood_specimen" was recorded as "Cấy máu

AMASS has a number of limitations. Firstly, AMASS is not applicable for hospitals that only store data on paper forms. Secondly, AMASS cannot work with raw microbiology data files that are not in wide format or that combine many files in multiple formats; for example, the raw microbiology data file where each row contains data of each antibiotic susceptibility result for

Step 2 (obtain data), step 4 (run AMASS), step 5 (review report), and step 6 (share report) are ongoing steps that users could repeat regularly (e.g. monthly, quarterly). *Two data dictionary files (in Excel format) are provided to allow the application to

Footnote of Figure 2. For a first-time user, the user may need to complete a data dictionary file, by filling in variable names used in their raw data files into the data dictionary files (e.g. arrow A). This is to allow AMASS to understand that the variable "hospital_number" used by AMASS is named as "hn" in the user's raw microbiology data file. Then, users need to enter how data values are named in their raw data files (e.g. arrow B). This is to allow AMASS to understand that the data value named "blood_specimen" is named as "blood" in the user's raw microbiology data file. Please note that the contents in the first column of the data dictionary file must remain unchanged. Users can add new rows but the content in the cell in the first column must not be changed. For example, users can define that both "E. coli (ESBLs producing strain)" and "Escherichia coli" in their raw microbiology data file means

Figure 3A represents the overall proportion of non-susceptible (intermediate and resistant) isolates in an isolate-based report (section two in the report). Figure 3B represents the proportion of non-susceptible isolates stratified by origin of infection (section three in the report). Figure 3C represents the frequency of bloodstream infections per 100,000 tested patients (section four in the report). Figure 3D represents mortality involving antimicrobial-resistant and antimicrobial-susceptible bloodstream infections (section six in the report).
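Two of the quantities named in the Figure 3 footnote can be written out as a short Python sketch. It assumes that susceptibility results are coded "S", "I" and "R" (a common convention, not confirmed here for AMASS) and that untested or missing results are excluded from the denominator; the function names are hypothetical, and the exact definitions used in the AMASS reports may differ.

```python
def proportion_non_susceptible(results):
    """Fraction of tested isolates that are intermediate or resistant.

    `results` is an iterable of susceptibility codes. Only "S", "I" and "R"
    count as tested (an assumption); anything else is ignored.
    Returns None when no isolate was tested.
    """
    tested = [r for r in results if r in ("S", "I", "R")]
    if not tested:
        return None
    non_susceptible = sum(1 for r in tested if r in ("I", "R"))
    return non_susceptible / len(tested)


def frequency_per_100k(n_infections, n_tested_patients):
    """Number of infections per 100,000 tested patients."""
    if n_tested_patients == 0:
        return None
    return 100_000 * n_infections / n_tested_patients


# Four tested isolates (one missing result), two of them non-susceptible.
example_proportion = proportion_non_susceptible(["S", "R", "I", "S", ""])

# Five bloodstream infections among 2,000 tested patients.
example_frequency = frequency_per_100k(5, 2000)
```

Returning None for an empty denominator keeps a "not tested" organism-antibiotic pair distinct from a genuine proportion of zero.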

Data cleaning, de-duplication, and analysis are performed rapidly (it took about 1-3 minutes to automatically produce an AMR surveillance report using example data sets provided in the AMASS package).

No additional program or software is needed. All the essential software is stored in the AMASS package and will operate automatically after a double-click on the application file (AMASS.bat). Users do not need to understand the R program or write any code to run the AMASS application.

Highly compatible
AMASS works with raw data files in either CSV or Excel format, which can be commonly exported from WHONET and other software, programs or data management systems used for microbiology data and hospital admission data. AMASS uses data dictionary files (in Excel format) to accommodate data exported from different software, programs or systems that may have different ways to name data variables and data values (Figure 2). AMASS dictionary files can be re-used by users in the future (e.g. monthly, quarterly and yearly) if the structures of the new raw microbiology data file and hospital admission data file remain unchanged. AMASS uses a tier-based approach based on availability of raw data files to generate reports. Users with limited data availability (e.g. microbiology data with only culture-positive results) can still utilize the de-duplication and report generation functions of AMASS. Users with additional data (e.g. microbiology data with culture-negative results and hospital admission date data) will receive additional reports (e.g. sample-based surveillance report with stratification by infection origin).

High data security
AMASS does not require the Internet to operate. Users do not have to transfer raw individual data (which may contain identifiable information) to any institution outside of the hospital to analyse the data and generate the reports. AMASS can be run on a standalone computer within the local hospital under the local data security standards. Hence, AMASS does not increase the risk of breaching individual patient data confidentiality.

Easy-to-use outputs
The automatically generated AMR surveillance report is in PDF format, which is easy to print, read and share within and outside the hospitals.

Easy-to-share outputs
The report (in PDF format) and aggregated summary data files (in CSV format) contain no individual-level patient data, and could be readily shared with national and international organizations.
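The de-duplication step listed among the features is not described in detail in this section. As an illustration only, the Python sketch below assumes a first-isolate rule: for each patient, specimen type and organism, only the earliest isolate in the reporting period is kept. This rule and the field names "specimen_type", "organism" and "specimen_date" are assumptions made for the example; only "hospital_number" appears in the text, and the rule applied by AMASS may differ.

```python
from datetime import date


def deduplicate(isolates):
    """Keep the earliest isolate per (patient, specimen type, organism).

    `isolates` is a list of dicts with the keys "hospital_number",
    "specimen_type", "organism" and "specimen_date" (hypothetical names).
    """
    first_seen = {}
    # Sort by date so the first record met for each key is the earliest one.
    for isolate in sorted(isolates, key=lambda i: i["specimen_date"]):
        key = (
            isolate["hospital_number"],
            isolate["specimen_type"],
            isolate["organism"],
        )
        first_seen.setdefault(key, isolate)
    return list(first_seen.values())


isolates = [
    {"hospital_number": "001", "specimen_type": "blood_specimen",
     "organism": "Escherichia coli", "specimen_date": date(2020, 1, 5)},
    {"hospital_number": "001", "specimen_type": "blood_specimen",
     "organism": "Escherichia coli", "specimen_date": date(2020, 1, 2)},
    {"hospital_number": "002", "specimen_type": "blood_specimen",
     "organism": "Escherichia coli", "specimen_date": date(2020, 1, 3)},
]

unique_isolates = deduplicate(isolates)
```

Here the two isolates from patient "001" collapse into the one collected on 2 January, so each patient contributes once to the counts for that organism and specimen type.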
medRxiv preprint doi: https://doi.org/10.1101/2020.04.27.20072025