Discovering Clinical Information Models Online to Promote Interoperability of Electronic Health Records: A Feasibility Study of OpenEHR

doi:10.2196/13504

Original Paper

Institute of Medical Information / Medical Library, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China

Corresponding Author:

Jiao Li, PhD

Institute of Medical Information / Medical Library

Chinese Academy of Medical Sciences & Peking Union Medical College

No 3 Yabao Road, Chaoyang District

Beijing, 100020

China

Phone: 86 18618461596

Email: li.jiao@imicams.ac.cn

Background: Clinical information models (CIMs) enabling semantic interoperability are crucial for electronic health record (EHR) data use and reuse. Dual model methodology, which distinguishes the CIMs from the technical domain, could help enable the interoperability of EHRs at the knowledge level. How to help clinicians and domain experts discover CIMs from an open repository online to represent EHR data in a standard manner becomes important.

Objective: This study aimed to develop a retrieval method to identify CIMs online to represent EHR data.

Methods: We proposed a graphical retrieval method and validated its feasibility using an online CIM repository: openEHR Clinical Knowledge Manager (CKM). First, we represented CIMs (archetypes) using an extended Bayesian network. Then, an inference process was run in the network to discover relevant archetypes. In the evaluation, we defined three retrieval tasks (medication, laboratory test, and diagnosis) and compared our method with three typical retrieval methods (BM25F, simple Bayesian network, and CKM), using mean average precision (MAP), average precision (AP), and precision at 10 (P@10) as evaluation metrics.

Results: We downloaded all available archetypes from the CKM. Then, the graphical model was applied to represent the archetypes as a four-level clinical resources network. The network consisted of 5513 nodes, including 3982 data element nodes, 504 concept nodes, 504 duplicated concept nodes, and 523 archetype nodes, as well as 9867 edges. The results showed that our method achieved the best MAP (MAP=0.32), and the AP was almost equal across different retrieval tasks (AP=0.35, 0.31, and 0.30, respectively). In the diagnosis retrieval task, our method could successfully identify the models covering “diagnostic reports,” “problem list,” “patients background,” “clinical decision,” etc, as well as models that other retrieval methods could not find, such as “problems and diagnoses.”

Conclusions: The graphical retrieval method we propose is an effective approach to meet the uncertainty of finding CIMs. Our method can help clinicians and domain experts identify CIMs to represent EHR data in a standard manner, enabling EHR data to be exchangeable and interoperable.

J Med Internet Res 2019;21(5):e13504

Keywords

openEHR; clinical information model; health information interoperability; information retrieval; probabilistic graphical model

Electronic health record (EHR) data can be used and reused for many purposes, including managing an individual patient’s care, medical and health services research, and management of health care facilities. More recently, EHR data has been defined as a part of real-world data [1] and is increasingly seen as a viable source of data for regulatory decisions [2]. However, bias can occur in different steps of the data chain, which might lead to incomparable or invalid analysis results [3].

Semantic interoperability is essential for accurate and advanced health-related computing, shared EHRs, and coordination of clinical care across clinical systems [4,5]. According to ISO/TS 18308 (a standard published by the International Organization for Standardization defining the set of requirements for EHR architecture), it is the ability for data shared by systems to be understood at the level of fully defined domain concepts [6]. To achieve this, a two-level clinical modeling methodology is proposed to separate clinical knowledge from information models [7]. It distinguishes two models: the reference model (RM), which contains the basic and stable properties of health record information, and the clinical information model (CIM), which formally defines clinical concepts (or domain content models) in a standardized and reusable manner, such as blood pressure [8,9]. In this scenario, CIMs in agreement at an organizational, regional, national, or international level will provide a firm basis for establishing semantic interoperability [9].

This two-level modeling approach is used in the ISO/CEN EN13606 (a standard designed to achieve semantic interoperability in EHR communication) [10] and openEHR (described subsequently) [11], as well as Health Level Seven (HL7) version 3 Clinical Document Architecture (HL7's primary standard for representing structured clinical documentation on patients) and Care Provision messages (information structures used to communicate information between providers of care) [12]. For openEHR and ISO/CEN EN13606, CIMs are defined in the form of archetypes, whereas those of HL7 are in the form of HL7 templates. According to the systematic review done by Moreno-Conde et al [13], archetypes are the preferred type of technical artifacts, and openEHR is most frequently mentioned. Therefore, CIMs in our study specifically refer to openEHR archetypes.

OpenEHR is an open-source EHR standard ensuring universal interoperability among all forms of electronic data [14-21]. It is well known for its two-level design paradigm, consisting of an RM, archetypes, and templates. Archetypes are computable clinical content specifications that formalize the patterns and requirements for the representation of health information content [9]. To achieve common, coherent, and clinician-approved archetypes, the openEHR community provides a Web-based controlled authoring environment for a wide range of domain experts, especially clinicians, to participate in the creation of archetypes. All contributions are open access and freely available under a Creative Commons license. Archetypes are general purpose, reusable, and composable; therefore, searching for reusable archetypes from archetype repositories is essential throughout the development process [22,23]. Documents with complete archetype design specifications are the input; lists of existing reusable archetypes, either complete or needing modifications, and new archetypes to be developed from scratch are the output [23]. The crucial problem is how to find the relevant ones from open repositories to help identify reusable archetypes.

The openEHR community provides the Clinical Knowledge Manager (CKM) [24] as a library of openEHR archetypes. It supports their retrieval based on clinical concepts in different sections of archetypes. When the end user enters a term, the CKM returns the archetypes that contain the word in the metadata, definition, or ontology section, which can help find reusable archetypes [25]. However, domain experts are mainly concerned about whether the concept name and core data items are covered [17,26,27], and they, especially clinicians, may not be familiar with openEHR archetypes. For better results, end users usually need to do a large amount of preparatory work, which may include classifying and rearranging data [27], abstracting clinical concepts from data schemas [17], and identifying archetype-friendly concepts from clinical statements [26]. This is an iterative and time-consuming process.

We aimed to develop a retrieval method to identify archetypes online to represent EHR data and optimize existing retrieval results of the CKM. Archetypes usually have their own hierarchical structures, and semantic relationships occur between different archetypes; therefore, we considered that the graphical representation of this potential knowledge might support the retrieval of CIMs. Previous studies show that graphs could efficiently represent clinical knowledge [28-30], and the Bayesian network, as a probabilistic graphical model, is an effective methodology to meet the uncertainty of information needs. Rotmensch et al [30] used a naive Bayes classifier and a Bayesian network to automatically construct a health knowledge graph from electronic medical records. However, in retrieval tasks, differences between Bayesian network-based information retrieval methods mainly lie in the structure of the network, and this structure depends on dependencies between the variables involved in the problem. The basic Bayesian network consists of two different sets of variables, a set of indexing terms and a set of documents in the collection, and the relationships between them [31]. Related research has been conducted to extend a simple Bayesian network for better results. Some methods focus on the structure of the term subnetwork using a polytree [32,33] or two term layers [34,35] to represent term relationships. Some focus on the structure of the document subnetwork using two document layers [36] to represent document relationships. Compared with the previous studies, we focused on the probabilistic graphical representation of openEHR archetype sets, which depends on relationships between the variables involved in finding relevant archetypes, and how the inference process is carried out, aiming for better retrieval performance.

Information Need Analysis

To find relevant archetypes from the open repository, we first had to understand which kinds of terms end users tended to enter. As archetype modeling methodology [23] shows, domain experts identify core clinical concepts and related data elements involved in a particular scenario and organize them into mind maps or design tables. These archetype design specifications are the main source of search keywords. We considered that the input of end users was mainly the names of clinical concepts or related data elements.

Ideally, the user enters the clinical concept and the system returns the archetype defining the concept, or the user enters data elements related to a concept and the system returns the archetype that covers all the data elements. However, it is difficult to distinguish clinical concepts from data elements in the end user’s input unless the system forces users to enter them separately. More importantly, a data element defined by the end user may be a concept in the archetype repository, or a user-defined concept may be a data element of an archetype. If we match concepts and data elements separately, users may miss some important relevant archetypes.

Based on these considerations, we tried to translate the problem into identifying potentially relevant clinical concepts from the input. We proposed to reorganize the archetype collection with the dependencies between clinical concepts, data elements, and archetypes and used a probabilistic approach to meet the uncertainty of user information needs.

Graphical Retrieval Method Based on an Extended Bayesian Network

Archetype Feature Identification and Extraction

Based on information need analysis, we attempted to use clinical concepts and data elements to represent each archetype. An archetype is expressed in Archetype Definition Language (ADL) and mainly consists of three sections (Figure 1). The header contains a unique identifier for the archetype and includes some descriptive information, such as concept name and keywords; the definition contains the main formal definition of the archetype, including all possible data elements that could be relevant for the clinical concept; and the ontology contains the code that represents the meaning of nodes. We considered that clinical concepts were the topics of archetypes, whereas keywords and data elements explained the meaning of topics from different perspectives. Thus, we extracted archetype ID, concepts, keywords, and data elements based on ADL files parsing as features (Figure 1).

There are also relationships between archetypes, including specialization and aggregation. An archetype is a specialization of another if it mentions that archetype as its parent and only makes changes to its definition. Aggregation enables any subset of archetypes to be stated as the allowed set for use in a compositional parent archetype. In general, archetypes tend to provide highly reusable models of real-world content with local constraining left to templates, which may result in matching as many archetypes as possible when defining archetype slots. For example, “openEHR-EHR-CLUSTER.device_details.v1” allows the inclusion of 199 archetypes. We thought that such cases might blur the semantic relationship between archetypes. In addition, version control is an integral part of the openEHR architecture. When an archetype is updated, the old version can no longer be found in the archetype library. Therefore, we only added the parent archetype ID as a feature (Figure 1).

Furthermore, there are four main categories of archetypes, COMPOSITION, SECTION, ENTRY, and CLUSTER, each defined as part of the openEHR RM. A COMPOSITION is a container class, whereas a SECTION is an organizing class, each containing ENTRY objects [16]. The ENTRY class is further specialized into ADMIN_ENTRY, OBSERVATION, EVALUATION, INSTRUCTION, and ACTION subclasses, of which the latter four are kinds of CARE_ENTRY. CLUSTER archetypes are reusable archetypes for use within any ENTRY or other CLUSTER. In addition, openEHR defines DEMOGRAPHIC archetypes for demographic information. Thus, archetypes can be mainly divided into COMPOSITION, SECTION, ENTRY, CLUSTER, and DEMOGRAPHIC. However, these archetype categories do not obscure the clinical content, and we did not use them as features.

Figure 1. An example of archetype feature identification and extraction.

Clinical Resources Network Modeling

We attempted to use a three-level Bayesian network to represent the dependencies among data elements, concepts, and archetypes (Figure 2). The first is the data element layer. It contains the set of indexing data elements T={T_i, i=1...M}, M being the number of data elements from a given archetype collection. Each data element node is linked to its corresponding concept node in the clinical concept layer. The second is the clinical concept layer. It contains the set of indexing concepts C={C_j, j=1...N}, N being the number of concepts. The third is the archetype layer. It contains the set of archetypes A={A_k, k=1...K}, K being the total number of archetypes in the collection. Each concept node is linked to the archetype node that defines it. In addition, if A_k is a specialization of another archetype A_p that defines C_j, there is a link joining concept node C_j and archetype node A_k.

However, data elements are unevenly distributed across different types of archetypes, especially for container classes. When two archetypes have few data elements and the terms used are totally different, such as “openEHR-EHR-COMPOSITION.medication_list.v0” and “openEHR-EHR-SECTION.medication_order_list.v0,” it is difficult to find a correlation between them.

Therefore, we tried to include relationships between concepts in the model to extend the similarity between archetypes. Relationships between concepts were measured by estimating the conditional probability of relevance of every concept given that another concept was considered relevant [36]. Let e(C_i) be an event representing some type of evidence about the relevance of a concept C_i. In openEHR, the evidence could be “keywords,” “purpose,” “use,” or other semantic information. In this case, we took e(C_i) to be the event [KW_l = kw_l, ∀ KW_l ∈ C_i], KW being the keywords used to describe the concept. Given a concept C_j, we calculated the probabilities p(c_j|e(C_i)) ∀ C_i ∈ C using equation (a) in Figure 3, where the weight was computed by equation (d) in Figure 3 and M_k was the number of keywords. After sorting the values of p(c_j|e(C_i)) in decreasing order, the top n concepts R_n(C_j) were the ones most related to C_j. Then, we included in the network explicit dependence relationships between C_j and each concept C_i ∈ R_n(C_j).
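The selection of R_n(C_j) can be illustrated with a minimal Python sketch. It assumes that "top n" is read as a percentage of the ranked values and that concepts tied with the cutoff value are kept together; the function name `top_percent_related` and the toy probabilities are hypothetical and are not taken from the study.

```python
import math

def top_percent_related(scores, percent):
    """Rank candidate concepts by p(c_j | e(C_i)) and keep the top `percent`.

    `scores` maps a candidate concept name to its conditional probability.
    Concepts tied with the cutoff value are kept, so the result may be
    slightly larger than the nominal share.
    """
    if not scores:
        return []
    # Sort by decreasing probability, then by name for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    keep = max(1, math.ceil(len(ranked) * percent / 100.0))
    cutoff = ranked[keep - 1][1]
    return [name for name, p in ranked if p >= cutoff]

# Toy values for illustration only (hypothetical probabilities).
toy_scores = {
    "dosage": 1.0,
    "medication order": 0.5,
    "therapeutic direction": 0.25,
    "medication": 0.125,
    "medication authorization": 0.125,
    "body weight": 0.0,
}
related = top_percent_related(toy_scores, 50)
```

Keeping ties together is one possible way to handle concepts that share the same conditional probability; other tie-breaking rules would give slightly different neighbor sets.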

To determine the topology of the Bayesian network, we used a concept subnetwork with two layers instead of the original concept layer. We duplicated each concept node C_j to obtain another concept node C^ʹ_j, thus forming a new concept layer, and the arcs connecting the two layers went from C_i∈R_n(C_j) to C^ʹ_j. Thus, this directed acyclic graph had the set of variables V=T∪C∪C^ʹ∪A. The new topology avoids connections between nodes in the same layer and facilitates the inference process.

The overall modeling procedure is summarized in Figure 4. First, we extracted archetype ID, clinical concept, and data elements from the ADL files (detailed in section “archetype feature identification and extraction”). Second, we learned the dependencies between concepts (detailed previously). Third, we graphically represented the dependencies between the variables.
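The third step can be sketched in Python as the construction of three edge sets. This is an illustration under stated assumptions rather than the study's implementation: the record layout (`id`, `concept`, `data_elements`, `parent`), the function name `build_network`, and the toy concept names and data elements are hypothetical, and each archetype is assumed to be linked to the concept it defines.

```python
def build_network(archetypes, related_concepts):
    """Build the layered edge sets T->C, C->C', and C'->A.

    archetypes: list of dicts with keys 'id', 'concept', 'data_elements',
                and an optional 'parent' (ID of the specialized archetype).
    related_concepts: dict mapping a concept C_j to its neighbor set R_n(C_j).
    """
    by_id = {a["id"]: a for a in archetypes}
    t_to_c = set()      # data element -> concept
    c_to_cdup = set()   # concept -> duplicated concept
    cdup_to_a = set()   # duplicated concept -> archetype

    for a in archetypes:
        for element in a["data_elements"]:
            t_to_c.add((element, a["concept"]))
        # Assumption: every archetype is linked to the concept it defines.
        cdup_to_a.add((a["concept"], a["id"]))
        # Specialization: link the parent's concept to the child archetype.
        parent = by_id.get(a.get("parent"))
        if parent is not None:  # parents missing from the collection are skipped
            cdup_to_a.add((parent["concept"], a["id"]))

    for c_j, neighbors in related_concepts.items():
        for c_i in neighbors:
            c_to_cdup.add((c_i, c_j))  # arc from C_i in R_n(C_j) to C'_j
    return t_to_c, c_to_cdup, cdup_to_a

# Toy input: concept names and data elements are hypothetical.
toy_archetypes = [
    {"id": "openEHR-EHR-COMPOSITION.report.v1", "concept": "report",
     "data_elements": ["report id", "status"]},
    {"id": "openEHR-EHR-COMPOSITION.report-result.v1", "concept": "result report",
     "data_elements": ["report id"], "parent": "openEHR-EHR-COMPOSITION.report.v1"},
]
toy_related = {"report": {"report", "result report"},
               "result report": {"result report"}}
t_to_c, c_to_cdup, cdup_to_a = build_network(toy_archetypes, toy_related)
```

Storing arcs as sets of pairs keeps the layers separate, so no arc can join two nodes of the same layer.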

Parameters Estimation in the Clinical Resources Network

In this section, we will discuss how to estimate the probability distributions of each node in the network.

Data Element Nodes

A data element node has no parents; therefore, we had to store the probability of relevance p (t_i) and the probability of being nonrelevant. We used the estimator (Figure 3, equation b), where M is the number of terms used to index the concept collection.

Figure 2. Topology of three-level clinical resources network. A: archetype; C: clinical concept; T: data element.

Figure 4. Clinical resources network modeling pipeline. A: archetype; C: clinical concept; Cʹ: duplicated clinical concept; T: data element.

Concept Nodes

For each concept node C_j in the concept subnetwork, we needed to estimate a set of conditional probability distributions p(c_j|pa(C_j)). Pa(C_j) represents the parent node set of concept C_j, containing all the data elements belonging to concept C_j, and pa(C_j) is a possible configuration of values associated with the parent set Pa(C_j). We used the estimator (Figure 3, equations c and d) proposed by De Campos et al [33], where α is a normalizing constant (ensuring ∑_{T_i∈Pa(C_j)} w_ij ≤ 1 ∀ C_j∈C), tf_ij is the term frequency of data element T_i in concept C_j, and idf_i is the inverse concept frequency of T_i in the whole concept collection; idf_i = 1 + log(N / n_i), N being the total number of concepts and n_i being the number of concepts containing T_i.
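The inverse concept frequency defined above can be computed directly, as in the following Python sketch. The weighting function is only an assumption: equations (c) and (d) appear in Figure 3 and are not reproduced in the text, so `normalized_weights` simply scales tf × idf so that the weights of one concept sum to 1, which satisfies the stated constraint. The natural logarithm is also assumed, and both function names are hypothetical.

```python
import math

def idf(n_concepts, n_containing):
    """Inverse concept frequency from the text: 1 + log(N / n_i).

    The base of the logarithm is not stated in the text; natural log is assumed.
    """
    return 1.0 + math.log(n_concepts / n_containing)

def normalized_weights(tf, idf_by_term):
    """Assumed weighting: tf * idf, scaled so one concept's weights sum to 1."""
    raw = {term: tf[term] * idf_by_term[term] for term in tf}
    total = sum(raw.values())
    return {term: value / total for term, value in raw.items()} if total else {}

# A data element that occurs in every one of the 504 concepts gets idf = 1.
idf_everywhere = idf(504, 504)
# A data element that occurs in a single concept gets a larger idf.
idf_rare = idf(504, 1)

# Toy term frequencies (hypothetical) for one concept.
weights = normalized_weights(
    {"dose": 2, "route": 1},
    {"dose": idf_rare, "route": idf_everywhere},
)
```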

For each concept node C^ʹ_j, we needed to estimate a set of conditional probability distributions p(c^ʹ_j|pa(C^ʹ_j)). We used the estimator (Figure 3, equation e) proposed by Acid et al [36], where S_j = ∑_{C_k∈Pa(C^ʹ_j)} p(c_j|e(C_k)) and the values p(c_j|e(C_k)) are obtained when modeling the network.

Archetype Nodes

For each archetype node A_k, we needed to estimate a set of conditional probability distributions p(a_k|pa(A_k)). Pa(A_k) represents the parent node set of archetype A_k, containing all the concepts belonging to archetype A_k, and pa(A_k) is a possible configuration of values associated with the parent set Pa(A_k). v_jk is a constant representing the weight of a concept for an archetype. The estimator is shown in Figure 3, equations (f) and (g), where R(Pa(A_k), A_k) represents two different relationships between the concept and archetype, n₁ is the number of “nonspecialized” archetypes of one concept, and n₂ is the number of “specialized” archetypes, whereas α and β are coefficients for the weight.

Relevant Archetype Discovering: Inference in the Clinical Resources Network

To find relevant archetypes is to estimate the probability of relevance p (a_k|Q) for each archetype, Q being an end user query.

Given a query Q, the set of terms used to formulate the query will be a new piece of evidence. The retrieval process starts by placing the evidence in the data element subnetwork. Then, the inference process is run in the clinical resources network. This allows us to obtain the probability of relevance of each archetype, given that the terms in the query are relevant, p (a_k|Q). Finally, the archetypes will be sorted in decreasing order of probability to carry out the evaluation process. The inference process is composed of four stages.

1. Terms in the data element layer are marginally independent; therefore, the probability of relevance p(t_i|Q) is calculated by equation (h) in Figure 3.
2. Based on the propagation process, the conditional probability of concept C_j in the concept subnetwork for the query Q is calculated by equation (i) in Figure 3.
3. The conditional probability of concept C^ʹ_j in the concept subnetwork for the query Q is computed using equation (j) in Figure 3.
4. The conditional probability of archetype A_k for the query Q, p(a_k|Q), is computed from the information obtained in the previous step using equation (k) in Figure 3.

Therefore, the propagation with this topology is to evaluate equations (h), (i), (j), and (k) in Figure 3.
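The layer-by-layer evaluation can be illustrated with the following Python sketch. The exact forms of the equations are given in Figure 3 and are not reproduced in the text, so the sketch rests on two assumptions: a query term has probability 1 and every other term keeps a prior of 1/M, and each node's probability is a weighted sum of its parents' probabilities. The function name `propagate`, the weight dictionaries, and all toy values are hypothetical.

```python
def propagate(query_terms, all_terms, w_tc, w_cc, w_ca):
    """Evaluate the layers in order: T -> C -> C' -> A.

    w_tc[c][t], w_cc[c_dup][c], and w_ca[a][c_dup] are edge weights. Each
    inner dict is assumed to sum to at most 1 so results stay in [0, 1].
    Returns archetypes sorted by decreasing probability of relevance.
    """
    prior = 1.0 / len(all_terms)  # assumed prior p(t_i) = 1/M
    p_t = {t: 1.0 if t in query_terms else prior for t in all_terms}
    p_c = {c: sum(w * p_t[t] for t, w in ws.items()) for c, ws in w_tc.items()}
    p_cdup = {c: sum(w * p_c[k] for k, w in ws.items()) for c, ws in w_cc.items()}
    p_a = {a: sum(w * p_cdup[c] for c, w in ws.items()) for a, ws in w_ca.items()}
    return sorted(p_a.items(), key=lambda kv: -kv[1])

# Toy network with two concepts and two archetypes (hypothetical values).
all_terms = ["test name", "test result", "body site", "dose"]
w_tc = {"lab test": {"test name": 0.5, "test result": 0.5},
        "medication": {"dose": 1.0}}
w_cc = {"lab test'": {"lab test": 1.0}, "medication'": {"medication": 1.0}}
w_ca = {"A_lab": {"lab test'": 1.0}, "A_med": {"medication'": 1.0}}
ranking = propagate({"test name", "test result"}, all_terms, w_tc, w_cc, w_ca)
```

Dropping the `w_cc` stage and feeding `p_c` directly into the archetype layer would correspond to evaluating only three stages, as in a network without the duplicated concept layer.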

Experiment Setup

Test Queries

We defined test queries with the following considerations: first, clinical concepts to be retrieved should be essential components of the EHR; second, there should be needs to reuse these clinical contents [37], such as medical events prediction [38], clinical research [39], and disease research [40]; third, queries should allow us to test the performance of retrieval methods in related archetypes identification, including specialized archetypes and compositional parent archetypes. Based on these criteria, we selected medication, laboratory test, and diagnosis as retrieval tasks and formulated three queries (Table 1).

Data Source

We downloaded all available archetypes from the CKM [24] for a total of 526 on August 30, 2018. All files were in ADL format. We used the ADL parser [41] to extract features. Among these CIMs, three archetypes did not use English as the description language, so the total number changed to 523.

Relevance Assessment

To evaluate retrieval results, we first had to identify relevant archetypes in three retrieval tasks as the gold standard. We manually annotated all 523 archetypes, according to their relevance to each query, to formulate three benchmark datasets. Given a query and an archetype, three annotators were asked to judge if the archetype was relevant. The labeling instructions were as follows: a label was relevant when the archetype could cover the potential clinical concept inferred from the given query; a label was nonrelevant otherwise. We took the majority vote to decide the relevance of an archetype. These three benchmark datasets were used as ground truth for the medication, laboratory test, and diagnosis retrieval tasks.
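The majority vote over the three annotators' labels can be written as a short Python sketch; the function name `majority_relevant` and the example labels are hypothetical.

```python
def majority_relevant(labels):
    """Return True when more than half of the annotators judged 'relevant'."""
    return sum(bool(label) for label in labels) * 2 > len(labels)

# Two of three annotators marked the archetype as relevant.
decision = majority_relevant([True, True, False])
```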

Baseline Methods

To validate the performance of our method, three typical retrieval methods were selected as baselines: CKM, BM25F, and simple Bayesian network.

Table 1. Test queries.

Query	Retrieval task	Input terms
1	Medication	Medicine name, total daily amount, allowed period, and order start date/time
2	Laboratory test	Report, test name, and test results
3	Diagnosis	Problem/diagnosis, test diagnosis, date/time of onset, and body site

BM25F is an extension of the BM25 ranking function, which is applicable to structured documents consisting of multiple fields. It combines the term frequencies (weighted accordingly to their field importance) and uses the resulting pseudofrequency in the BM25 ranking function. In this study, we supposed that an archetype was decomposed into two fields, concept and data elements, and used the function (Figure 3, equations l and m) proposed by Zaragoza et al [42], where w_ti is the RSJ relevance weight for term t_i, x_{ak, f, ti} is the term frequency of term t_i in the field type f of archetype a_k, l_{ak, f} is the length of that field, l_f is the average field length for that field type, and B_f is a field-dependent parameter.
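The two-field scoring can be sketched in Python as follows. This is a minimal version of the commonly used "simple BM25F" form, not necessarily identical to equations (l) and (m) in Figure 3: the field boosts `field_weight`, the saturation parameter `k1`, the function name `bm25f_score`, and all toy values are assumptions, and the term weights are passed in rather than derived from relevance data.

```python
def bm25f_score(query_terms, fields, avg_len, field_weight, b, term_weight, k1=1.2):
    """Score one archetype against a query.

    fields: {field_name: list of tokens} for this archetype.
    avg_len, field_weight, b: per-field average length, boost, and B_f.
    term_weight: {term: w_t}, for example an RSJ or idf-style weight.
    """
    score = 0.0
    for term in query_terms:
        pseudo_tf = 0.0
        for name, tokens in fields.items():
            if not tokens:
                continue  # an empty field contributes nothing
            # Length normalization for this field, controlled by B_f.
            norm = 1.0 + b[name] * (len(tokens) / avg_len[name] - 1.0)
            pseudo_tf += field_weight[name] * tokens.count(term) / norm
        if pseudo_tf > 0.0:
            # Saturate the combined pseudofrequency, then apply the term weight.
            score += pseudo_tf / (k1 + pseudo_tf) * term_weight.get(term, 0.0)
    return score

# Toy archetype with the two fields used in this study (values are hypothetical).
toy_fields = {"concept": ["medication", "order"],
              "data_elements": ["medication", "dose", "route", "dose"]}
toy_avg_len = {"concept": 2.0, "data_elements": 4.0}
toy_boost = {"concept": 2.0, "data_elements": 1.0}
toy_b = {"concept": 0.75, "data_elements": 0.75}
toy_w = {"medication": 1.0, "dose": 1.0}
score = bm25f_score(["medication"], toy_fields, toy_avg_len, toy_boost, toy_b, toy_w)
```

Combining field frequencies before saturation, rather than summing per-field BM25 scores, is what distinguishes BM25F from running BM25 on each field separately.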

For the Bayesian network, the structure is illustrated in Figure 2. The propagation with this topology is to evaluate equations (h), (i), and (k) in Figure 3.

Overview of Clinical Resources Network

Table 2 shows the distribution of archetypes across different clinical domains. The clinical domain classification follows the concept schema proposed by Hruby et al [39].

Table 3 shows the distribution of archetypes, concepts, and data elements across different types of archetypes in the collection. In addition, there were 31 specialized archetypes, 11 of whose parent archetypes are no longer in the CKM.

Then, we learned the dependencies between concepts. Table 4 shows the top relevant concepts suggested by four different percentages of values of p(c_j|e(C_i)) for “dosage” and “examination of a lung,” respectively.

After that, we constructed four clinical resource networks, G₁, G₂, G₃, and G₄, according to the top 3%, 5%, 8%, and 10% of values, respectively. Each graph consisted of 5513 nodes, which were 3982 data element nodes, 504 concept nodes, 504 duplicated concept nodes, and 523 archetype nodes, with 6366 edges from T to C and 543 edges from Cʹ to A. For edges C to Cʹ, G₁ had 1590 arcs, G₂ had 2485 arcs, G₃ had 2958 arcs, and G₄ had 3263 arcs.
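The reported node and edge counts can be checked with simple arithmetic, as in the Python sketch below; all numbers are taken from the text, and the variable names are hypothetical. The total for G₃ equals the 9867 edges reported in the abstract. The 543 edges from Cʹ to A are also consistent with one link per archetype (523) plus one link per specialized archetype whose parent is still in the CKM (31 − 11 = 20), although the text does not state this breakdown explicitly.

```python
nodes = {"data element": 3982, "concept": 504,
         "duplicated concept": 504, "archetype": 523}
edges_t_to_c = 6366
edges_cdup_to_a = 543
edges_c_to_cdup = {"G1": 1590, "G2": 2485, "G3": 2958, "G4": 3263}

total_nodes = sum(nodes.values())
total_edges = {graph: edges_t_to_c + n + edges_cdup_to_a
               for graph, n in edges_c_to_cdup.items()}

print(total_nodes)        # 5513
print(total_edges["G3"])  # 9867
```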

Evaluation of the Performance

To compare the performance of different graphs in supporting retrieval, we calculated the average precision (AP) values for the 11 standard recall points of each graph for the test queries and then computed the mean average precision (MAP) values. The results (Table 5) showed that the retrieval method based on G₃ achieved the best MAP (MAP=0.32), with an AP of 0.35, 0.31, and 0.30 for queries 1, 2, and 3, respectively.
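The two metrics can be sketched in Python as follows. The interpolation rule is an assumption, since the text does not spell it out: the precision at each of the 11 recall levels is taken as the highest precision observed at any recall at or above that level, which is the usual convention. The function names and the toy ranking are hypothetical; the AP values in the last line are those reported for G₃.

```python
def eleven_point_ap(ranked_ids, relevant_ids):
    """Average of interpolated precision at recall 0.0, 0.1, ..., 1.0."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    points = []  # (recall, precision) measured at each relevant hit
    hits = 0
    for rank, doc in enumerate(ranked_ids, start=1):
        if doc in relevant:
            hits += 1
            points.append((hits / len(relevant), hits / rank))
    total = 0.0
    for i in range(11):
        level = i / 10
        # Small tolerance guards against floating-point rounding of recall.
        candidates = [p for r, p in points if r >= level - 1e-12]
        total += max(candidates) if candidates else 0.0
    return total / 11

def mean_average_precision(ap_values):
    """Mean of the per-query AP values."""
    return sum(ap_values) / len(ap_values)

# Toy ranking: relevant items at ranks 1 and 3.
toy_ap = eleven_point_ap(["a", "b", "c", "d"], {"a", "c"})
# AP values reported for G3 reproduce its MAP of 0.32.
map_g3 = mean_average_precision([0.35, 0.31, 0.30])
```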

Table 2. Distribution of archetypes across different clinical domains.

Clinical domain and subdomains		Archetypes, n
Patient
	Demographic	42
	Health characteristic	32
	Patient	6
Pretreatment diagnosis
	Clinical assessment	73
	Pretreatment diagnosis	26
	Procedure	6
	Intent	1
Treatment
	Treatment	39
	Prescribed	12
	Surgery	9
Detection/Treatment results		184
Organizational/Provider characteristics		26
Outcomes		24
Patient environment factors		6
Other		37
Total		523

Table 3. Distribution of archetypes, concepts, and data elements.

Archetype type subtypes		Archetypes, n	Concepts, n	Elements, n	Data elements per concept, mean
Cluster		198	198	1567	7.9
Composition		25	25	45	1.8
Entry
	Action	15	15	252	16.8
	Evaluation	51	51	432	8.5
	Observation	164	163	1511	9.3
	Instruction	8	8	124	15.5
	Admin	4	4	69	17.3
Section		26	26	88	3.4
Demographic		32	29	169	5.8
Total		523	504	3982	7.9

Table 4. Top edge suggestions for “dosage” and “examination of lung.”

Clinical concept	Different threshold of p(c_j|e(C_i))^a
	Top 3%	Top 5%	Top 8%	Top 10%
Dosage	Dosage	Dosage	Dosage	Dosage
	Medication order	Medication order	Medication order	Medication order
		Therapeutic direction	Therapeutic direction	Therapeutic direction
			Medication	Medication
			Medication authorization	Medication authorization
Examination of lung	Examination of a lung	Examination of a lung	Examination of a lung	Examination of a lung
	Auscultation of lung	Auscultation of lung	Auscultation of lung	Auscultation of lung
	Pulmonary function test	Pulmonary function test	Pulmonary function test	Pulmonary function test
	Macroscopic findings-lung cancer	Macroscopic findings-lung cancer	Macroscopic findings-lung cancer	Macroscopic findings-lung cancer
	Macroscopic findings-lung cancer	Macroscopic findings-lung cancer	Macroscopic findings-lung cancer	Examination findings-posterior chamber of eye
				Examination of a breast
				Examination of a burn

^ac_j=“dosage” and “examination of lung,” respectively.

Next, we compared the results of our method based on G₃ with the baseline methods. To comprehensively validate the performance, we selected the MAP, AP, and precision at 10 (P@10) as evaluation metrics. Archetypes in the CKM are updated regularly, so it is difficult to compare the results on exactly the same collection. We searched for relevant archetypes in the CKM for the three given queries on December 12, 2018, and evaluated its performance against the ground truth. The results (Table 6) show that our method achieved the best MAP and the highest or tied-highest AP and P@10 for every test query. For instance, for query 1, our method, CKM, Bayesian network, and BM25F achieved a P@10 of 0.50, 0.40, 0.20, and 0.20, respectively. Furthermore, the MAP of BM25F (MAP=0.177) and of the Bayesian network (MAP=0.127) was lower than that of CKM (MAP=0.227), which suggests that there are limitations in using only clinical concepts and data elements to represent each archetype. Our approach takes into account the semantic associations between concepts and effectively compensates for this deficiency.
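P@10 and the MAP values in Table 6 can be illustrated with a short Python sketch. The function name `precision_at_k` and the toy ranking are hypothetical; the AP values are those reported in Table 6, and their means reproduce the reported MAP of each method after rounding to three decimals.

```python
def precision_at_k(ranked_ids, relevant_ids, k=10):
    """Fraction of the top k ranked archetypes that are relevant."""
    relevant = set(relevant_ids)
    return sum(1 for doc in ranked_ids[:k] if doc in relevant) / k

# Toy ranking of 10 archetypes, 5 of which are relevant.
toy_ranked = ["a1", "a2", "a3", "a4", "a5", "a6", "a7", "a8", "a9", "a10"]
toy_relevant = {"a1", "a3", "a4", "a8", "a10"}
toy_p10 = precision_at_k(toy_ranked, toy_relevant)

# AP per query (medication, laboratory test, diagnosis) from Table 6.
table6_ap = {
    "CKM": [0.26, 0.31, 0.11],
    "BM25F": [0.08, 0.18, 0.27],
    "Bayesian network": [0.11, 0.22, 0.05],
    "Our method": [0.35, 0.31, 0.30],
}
table6_map = {method: round(sum(aps) / len(aps), 3)
              for method, aps in table6_ap.items()}
```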

The same trend is observed when evaluating precision-recall graphs across all test queries. Figure 5 shows the precision-recall curves evaluated against the ground truth. Here, BM25F falls short in performance. For instance, for a recall of 0.3, our method, CKM, Bayesian network, and BM25F achieved a precision of 0.38, 0.30, 0.05, and 0, respectively. Additionally, the 11-point precision-recall curve of the Bayesian network has a shape similar to that of our approach, but its precision is much lower. Meanwhile, compared with the curve of the CKM, our curve is smoother and has higher precision when the recall is below 0.6. These results may be explained by the fact that dependencies between concepts could help identify relevant archetypes.

Table 5. Average precision performance of graphs with different similarity thresholds.

Graphs with different similarity thresholds^a	Mean average precision	Average precision
		Query 1 (medication)	Query 2 (laboratory test)	Query 3 (diagnosis)
G₁ (top 3%)	0.253	0.36	0.10	0.30
G₂ (top 5%)	0.277	0.27	0.26	0.30
G₃ (top 8%)	0.320	0.35	0.31	0.30
G₄ (top 10%)	0.313	0.33	0.31	0.30

^aGraphs with percentages of values of p(c_j|e(C_i)).

Table 6. Retrieval performance comparison.

Method	MAP^a	Query 1 (medication)		Query 2 (laboratory test)		Query 3 (diagnosis)
		AP^b	P@10^c	AP	P@10	AP	P@10
CKM	0.227	0.26	0.40	0.31	0.30	0.11	0.10
BM25F	0.177	0.08	0.20	0.18	0.30	0.27	0.30
Bayesian network	0.127	0.11	0.20	0.22	0.30	0.05	0.10
Our method	0.320	0.35	0.50	0.31	0.50	0.30	0.30

^aMAP: mean average precision.

^bAP: average precision.

^cP@10: precision at 10.

Figure 5. Precision-recall curves of the four retrieval methods. BM25F: an extension of the BM25 ranking function; BN: Bayesian network; CKM: Clinical Knowledge Manager.

Principal Findings

The dual model methodology used by openEHR distinguishes the clinical content domain from the technical domain, which enables reusable CIMs (archetypes) [9]. We were interested in identifying relevant CIMs online to standardize clinical concept representation within EHRs, so we developed a graphical retrieval method based on an extended Bayesian network and validated its feasibility using an online clinical information knowledge source: the openEHR CKM. We combined a qualitative representation of the retrieval task, using a graphical representation of relationships among data elements, concepts, and archetypes, with a quantitative representation of the uncertainty of information needs, using a probabilistic approach. Compared with three typical retrieval methods (BM25F, Bayesian network, and CKM) in the medication, laboratory test, and diagnosis retrieval tasks, our method achieved the best MAP (MAP=0.32). In the diagnosis retrieval task, CKM and BM25F could not find the relevant archetype “openEHR-EHR-SECTION.problems_and_diagnoses.v1.” Our method could successfully identify the models covering “diagnostic reports,” “problem list,” “patients background,” “clinical decision,” etc, as well as “problems and diagnoses.”

Although end users were mainly concerned about whether an archetype covered the concept name and core data items, we could not obtain satisfactory performance without considering the potential knowledge that might be mined from the collection. Here, BM25F and the Bayesian network used only clinical concepts and data elements as the main features to represent each archetype and performed worse than the other methods. In the laboratory test retrieval task, the recall of BM25F was 0.158, whereas ours was 1.0 and that of CKM was 0.895. In the diagnosis retrieval task, the precision at 3 of the Bayesian network was 0, whereas ours was 1.0 and that of CKM was 0.333. A possible reason was that we used exact matching instead of fuzzy matching. The most important reason was that these methods only encoded the dependence relationships between variables and did not take into account the semantic associations between them. Previous studies showed that using the structure of existing knowledge resources and distributional statistics drawn from text corpora could help estimate semantic similarity and relatedness between medical concepts [43]. In the openEHR framework, archetypes should map to clinical terminologies (such as SNOMED CT). However, most archetypes currently in the CKM lacked this kind of mapping, which could have limited the calculation of semantic relatedness. In this study, we learned relationships between concepts by a probabilistic approach and constructed a concept subnetwork with two layers. The results showed that the performance improved substantially, which supports the effectiveness of using prior knowledge to improve retrieval results.

Accordingly, how to find the top n concepts relevant to each concept became crucial. We used e(C_i) as an event representing some type of evidence about the relevance of a concept C_i, and keywords were used as evidence in the experiment. With their help, we could find that the concepts “medication list” and “medication order list” were related, even though their concept names and data elements were totally different. Other semantic information, such as “purpose” and “use,” could also be used as evidence; how to use it to better support retrieval needs to be further clarified. However, this method could also bring some less relevant concepts into the network, as shown in the column “Top 10%” in Table 4. To obtain better results, we used AP and MAP as evaluation metrics to help select relevant concepts; meanwhile, we noticed that many concepts had the same conditional probability values. This was a consequence of the probabilistic approach we applied, and it meant that we could not simply select the top n concepts as the relevant ones. Instead, we adopted the concepts whose conditional probability values were in the top n percent.
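The difference between a fixed top n cut and a percentage-based cut in the presence of tied probabilities can be sketched as follows. This is only one possible reading of the selection rule: the function name `select_top_percent`, the tie-handling policy (every concept tied with the cutoff value is kept), and the scores are assumptions made for illustration, not values or code from the study.

```python
import math
from typing import Dict, List


def select_top_percent(scores: Dict[str, float], percent: float) -> List[str]:
    """Keep concepts whose score reaches the top `percent` of the ranking.

    Concepts tied with the cutoff score are all kept, so the result does not
    depend on an arbitrary ordering among equal conditional probabilities.
    """
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    if not scores:
        return []
    ordered = sorted(scores.values(), reverse=True)
    n_keep = max(1, math.ceil(len(ordered) * percent / 100))
    cutoff = ordered[n_keep - 1]
    kept = [concept for concept, score in scores.items() if score >= cutoff]
    # Sort by descending score, then by name, for a deterministic result.
    return sorted(kept, key=lambda concept: (-scores[concept], concept))


# Hypothetical conditional probabilities with a three-way tie at the top.
toy_scores = {
    "medication order list": 0.40,
    "medication list": 0.40,
    "dosage": 0.40,
    "comment": 0.10,
    "address": 0.05,
}
selected = select_top_percent(toy_scores, 20)
```

With these toy scores, a plain top 1 cut would keep a single concept chosen arbitrarily among the three tied ones, whereas the cutoff-based selection keeps all three.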

When modeling the clinical resources network, we took the specialization relationship between archetypes into consideration. This helped us find “openEHR-EHR-COMPOSITION.report-result.v1,” a specialized archetype of “openEHR-EHR-COMPOSITION.report.v1,” which BM25F could not find. In addition, we could also find relevant compositional parent archetypes successfully, even though we did not explicitly use the aggregation relationship. For example, in the diagnosis retrieval task, our method could find “openEHR-EHR-SECTION.clinical_decision.v0,” which defined an archetype slot to allow “openEHR-EHR-EVALUATION.problem_diagnosis.v1.” This was because the compositional archetype used the clinical concept of the allowed archetype as its data element. When we linked the data element node to its corresponding concept node, we in fact modeled the aggregation relationship.
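The specialization example above is visible in the archetype identifiers themselves: the specialized concept name “report-result” extends the parent concept name “report” with a hyphenated suffix. A minimal sketch of deriving a candidate parent identifier from this naming pattern is shown below. The helper `specialization_parent` is hypothetical, and the sketch assumes that the parent has the same version as the child, which need not hold; the parent declared inside the archetype definition itself remains the authoritative source.

```python
import re
from typing import Optional

# Assumed identifier layout: <rm part>.<concept name>.<version>,
# for example "openEHR-EHR-COMPOSITION.report-result.v1".
_ARCHETYPE_ID = re.compile(r"^(?P<rm>[^.]+)\.(?P<concept>[^.]+)\.(?P<version>v\d+)$")


def specialization_parent(archetype_id: str) -> Optional[str]:
    """Return a candidate parent identifier, or None if not specialized."""
    match = _ARCHETYPE_ID.match(archetype_id)
    if match is None:
        raise ValueError(f"unrecognized archetype identifier: {archetype_id}")
    concept = match.group("concept")
    if "-" not in concept:
        return None  # no hyphenated suffix, so no specialization is implied
    # Drop the last hyphen-separated segment to get the parent concept name.
    parent_concept = concept.rsplit("-", 1)[0]
    # Assumption: the parent shares the child's version suffix.
    return f"{match.group('rm')}.{parent_concept}.{match.group('version')}"


parent = specialization_parent("openEHR-EHR-COMPOSITION.report-result.v1")
no_parent = specialization_parent("openEHR-EHR-SECTION.clinical_decision.v0")
```

An edge from each specialized archetype node to its parent node could then be added to the network, so that evidence for “report” also reaches “report-result.”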

The key idea of our approach lay in identifying potentially relevant clinical concepts from the input. In a two-level model methodology, clinicians are usually the end users. In most scenarios, they are not familiar with openEHR archetypes and do not know which concepts are archetype-friendly. This requires the retrieval method to be as insensitive to the input as possible. Take the medication retrieval task as an example. If the user input “medication item, order start date/time, dosage, dose unit, comment,” using words that are frequent in the archetype library, the CKM performed better than our method: the AP of CKM was 0.82 (P@10=0.7, recall=1), whereas ours was 0.45 (P@10=0.6, recall=1). However, when the user used uncommon words, such as “medicine name” (Table 1), our method, CKM, Bayesian network, and BM25F achieved an AP of 0.35, 0.26, 0.11, and 0.08, respectively. In addition, as Table 6 shows, our AP was almost equal across the different retrieval tasks (0.35, 0.31, and 0.30, respectively), whereas the AP of the other retrieval methods was not. From the clinical domain perspective, queries 2 and 3 mainly belonged to the topic of detection/treatment results, whereas query 1 belonged to treatment, which indicated that our performance was relatively stable across different clinical domains. Together, these results suggested that our method was more robust than the others.
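The relationship between the per-task AP values and the overall MAP can be checked with a short sketch. It uses the common definition of average precision, normalized by the number of relevant items; the exact normalization used in the study is not restated here, so this is an assumption, as are the function names and the toy ranking. Only the three AP values are taken from the text.

```python
from typing import List, Sequence, Set


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Mean of the precision values at each rank where a relevant item occurs."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank
    # Assumption: normalize by all relevant items, not only the retrieved ones.
    return precision_sum / len(relevant)


def mean_average_precision(ap_values: Sequence[float]) -> float:
    """Arithmetic mean of the per-query average precision values."""
    if not ap_values:
        raise ValueError("at least one AP value is required")
    return sum(ap_values) / len(ap_values)


# Hypothetical toy ranking: relevant items at ranks 1 and 3.
toy_ap = average_precision(["a", "b", "c", "d"], {"a", "c"})

# AP values reported in the text for the three retrieval tasks.
reported_ap = [0.35, 0.31, 0.30]
map_value = round(mean_average_precision(reported_ap), 2)
```

Averaging the three reported AP values gives 0.32, which agrees with the reported MAP.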

Additionally, better retrieval results could help users identify reusable archetypes quickly, promote the reuse of archetypes, and improve the standardization of CIMs, thereby enhancing the interoperability of EHRs. The archetype modeling methodology [15,23] showed that clinicians and domain experts should compare archetype design specifications with retrieved archetypes to decide whether new archetypes need to be developed or whether an existing one could be adapted. Our method could successfully identify relevant archetypes that the CKM could not find, such as “openEHR-EHR-SECTION.problems_and_diagnoses.v1” in the diagnosis retrieval task. If this archetype were the one needed, domain experts relying on the CKM might create a new one because they would think it did not exist. Our method achieved the best recall (recall=1) in the different retrieval tasks, which could help reuse archetypes and promote the semantic interoperability of EHRs.

Limitations

Our study has important limitations. First, it is a feasibility study based on openEHR archetypes. Whether our method can be applied to other CIMs, such as HL7 templates, and to what extent it needs to be localized still need to be clarified and validated. The key features used in our method are data elements, clinical concepts, CIMs (archetypes), and their relationships, which indicates that our method is potentially feasible for other CIMs if these features are available. The results that can be achieved for other CIMs will be examined in future work.

Second, the method presented in this study does not calculate the semantic relatedness of synonyms or homonyms, either for queries or for network modeling. However, relevant semantic computing methods [43] can be applied to our retrieval method. With their help, we may be able to identify that “medication item” and “medicine item” refer to the same term, which would improve the results. In the future, we will validate the feasibility and effectiveness of this extension.

Third, we did not validate the impact of our method on interoperability. In fact, the basic problem of semantic interoperability in EHRs must be solved from the perspective of the business domains the concepts originally belong to. Our approach only addresses specific technical issues in the CIM modeling process.

Furthermore, there are other limitations. First, the relevant archetypes in the three retrieval tasks were manually annotated by us, and other experts may disagree with these annotations. Second, we compared our performance with that of the CKM on different archetype collections, which may lead to inaccurate results.

Conclusions

In this paper, we proposed an extended Bayesian network retrieval method for finding relevant CIMs. We graphically represented openEHR archetypes using an extended Bayesian network with two concept layers. The results show that it is an effective approach to handling the uncertainty of retrieval tasks, and the key step in modeling this network is to learn the dependencies between concepts. Better retrieval results could encourage clinicians and domain experts to reuse existing CIMs to represent EHR data in a standard manner, thereby enhancing the interoperability of EHRs. Furthermore, our study described how the inference process was carried out. Compared with the baseline methods, our method achieved the best performance. To optimize the method, further research should focus on its potential feasibility for other CIMs and on the calculation of the semantic relevance of synonyms or homonyms.

Acknowledgments

This research was supported by the Chinese Academy of Medical Sciences (grants #2017PT63010 and #2018PT33024) and the National Key R&D Program of China (grants #2016YFC0901901 and #2017YFC0907503).

Conflicts of Interest

None declared.

References

1. Sherman RE, Anderson SA, Dal Pan GJ, Gray GW, Gross T, Hunter NL, et al. Real-world evidence-what is it and what can it tell us? N Engl J Med 2016 Dec 08;375(23):2293-2297.
2. Framework for FDA’s Real-World Evidence Program. Silver Spring, MD: US Food and Drug Administration; 2018 Dec. URL: https://www.fda.gov/media/120060/download [accessed 2019-05-15]
3. Verheij RA, Curcin V, Delaney BC, McGilchrist MM. Possible sources of bias in primary care electronic health record data use and reuse. J Med Internet Res 2018 May 29;20(5):e185.
4. European Commission. 2012 Dec 07. eHealth Action Plan 2012-2020: innovative healthcare for the 21st century URL: https://ec.europa.eu/digital-single-market/en/news/ehealth-action-plan-2012-2020-innovative-healthcare-21st-century [accessed 2019-05-18]
5. He J, Baxter SL, Xu J, Xu J, Zhou X, Zhang K. The practical implementation of artificial intelligence technologies in medicine. Nat Med 2019 Jan;25(1):30-36.
6. International Organization for Standardization. 2011 Apr. ISO 18308:2011 health informatics-requirements for an electronic health record architecture URL: https://www.iso.org/standard/52823.html [accessed 2019-05-15]
7. Rector AL, Nowlan WA, Kay S, Goble CA, Howkins TJ. A framework for modelling the electronic medical record. Methods Inf Med 1993 Apr;32(2):109-119.
8. Goossen W, Goossen-Baremans A, van der Zel M. Detailed clinical models: a review. Healthc Inform Res 2010 Dec;16(4):201-214.
9. Leslie H. ResearchGate. 2014 Jul. The openEHR approach URL: https://www.researchgate.net/publication/277667443_The_openEHR_approach [accessed 2019-05-15]
10. International Organization for Standardization. ISO 13606 Standard URL: https://www.iso.org/home.html [accessed 2019-05-15]
11. OpenEHR. URL: https://www.openehr.org/ [accessed 2019-05-15]
12. Health Level Seven. URL: http://www.hl7.org/ [accessed 2019-05-15]
13. Moreno-Conde A, Moner D, Cruz WD, Santos MR, Maldonado JA, Robles M, et al. Clinical information modeling processes for semantic interoperability of electronic health records: systematic review and inductive analysis. J Am Med Inform Assoc 2015 Jul;22(4):925-934.
14. Wang L, Min L, Wang R, Lu X, Duan H. Archetype relational mapping-a practical openEHR persistence solution. BMC Med Inform Decis Mak 2015 Nov 05;15:88.
15. Min L, Tian Q, Lu X, An J, Duan H. An openEHR based approach to improve the semantic interoperability of clinical data registry. BMC Med Inform Decis Mak 2018 Mar 22;18(Suppl 1):15.
16. Cardoso de Moraes JL, de Souza WL, Pires LF, do Prado AF. A methodology based on openEHR archetypes and software agents for developing e-health applications reusing legacy systems. Comput Methods Programs Biomed 2016 Oct;134:267-287.
17. Min L, Liu J, Lu X, Duan H, Qiao Q. An implementation of clinical data repository with openehr approach: from data modeling to architecture. Stud Health Technol Inform 2016;227:100-105.
18. Marco-Ruiz L, Moner D, Maldonado JA, Kolstrup N, Bellika JG. Archetype-based data warehouse environment to enable the reuse of electronic health record data. Int J Med Inform 2015 Sep;84(9):702-714.
19. Wulff A, Haarbrandt B, Tute E, Marschollek M, Beerbaum P, Jack T. An interoperable clinical decision-support system for early detection of SIRS in pediatric intensive care using openEHR. Artif Intell Med 2018 Jul;89:10-23.
20. Chen R, Klein GO, Sundvall E, Karlsson D, Ahlfeldt H. Archetype-based conversion of EHR content models: pilot experience with a regional EHR system. BMC Med Inform Decis Mak 2009 Jul 01;9:33.
21. Saalfeld B, Tute E, Wolf K, Marschollek M. Introducing a method for transformation of paper-based research data into concept-based representation with openEHR. Stud Health Technol Inform 2017;235:151-155.
22. Mar M, Begoña M. Towards the interoperability of computerised guidelines and electronic health records: an experiment with openEHR archetypes and a chronic heart failure guideline. In: Riaño D, ten Teije A, Miksch S, Peleg M, editors. Knowledge Representation for Health-Care. KR4HC 2010. Lecture Notes in Computer Science. Berlin: Springer; 2011:101-113.
23. Moner D, Maldonado JA, Robles M. Archetype modeling methodology. J Biomed Inform 2018 Dec;79:71-81.
24. openEHR. Clinical Knowledge Manager URL: https://www.openehr.org/ckm/ [accessed 2019-05-18]
25. Teodoro D, Sundvall E, João Junior M, Ruch P, Miranda Freire S. ORBDA: an openEHR benchmark dataset for performance assessment of electronic health record servers. PLoS One 2018;13(1):e0190028.
26. Maranhão PA, Bacelar-Silva GM, Ferreira DN, Calhau C, Vieira-Marques P, Cruz-Correia RJ. Nutrigenomic information in the openEHR data set. Appl Clin Inform 2018 Jan;9(1):221-231.
27. Pahl C, Zare M, Nilashi M, de Faria Borges MA, Weingaertner D, Detschew V, et al. Role of OpenEHR as an open source solution for the regional modelling of patient data in obstetrics. J Biomed Inform 2015 Jun;55:174-187.
28. Finlayson SG, LePendu P, Shah NH. Building the graph of medicine from millions of clinical narratives. Sci Data 2014;1:140032.
29. Goodwin T, Harabagiu SM. Automatic generation of a qualified medical knowledge graph and its usage for retrieving patient cohorts from electronic medical records. 2013 Sep 16 Presented at: IEEE Seventh International Conference on Semantic Computing; Sep 16-18, 2013; Irvine, CA p. 978 URL: http://www.hlt.utdallas.edu/~travis/papers/icsc_2013.pdf
30. Rotmensch M, Halpern Y, Tlimat A, Horng S, Sontag D. Learning a health knowledge graph from electronic medical records. Sci Rep 2017 Jul 20;7(1):5994.
31. Turtle HR, Croft WB. Efficient probabilistic inference for text retrieval. In: Proceedings RIAO '91 Intelligent Text and Image Handling. 1991 Presented at: RIAO '91 Intelligent Text and Image Handling; Apr 2-5, 1991; Barcelona, Spain p. 644-661 URL: https://dl.acm.org/citation.cfm?id=3171012
32. de Campos LM, Fernandez-Luna J, Huete J. The BNR model: foundations and performance of a bayesian network-based retrieval model. Int J Approx Reason 2003 Nov;34(2-3):265-285.
33. de Campos LM, Fernandez-Luna J, Huete J. Clustering terms in the bayesian network retrieval model: a new approach with two term-layers. Appl Soft Comput 2004 May;4(2):149-158.
34. Garrouch K, Omri M. Bayesian network based information retrieval model. 2017 Presented at: International Conference on High Performance Computing & Simulation; July 17, 2017; Genoa, Italy URL: https://www.researchgate.net/publication/317185437_Bayesian_Network_Based_Information_Retrieval_Model
35. Xu JM, Tang WS. A word similarity based belief network IR model with two term layers. 2009 Presented at: WRI Global Congress on Intelligent Systems; May 19-21, 2009; Xiamen, China p. 19-21 URL: https://ieeexplore.ieee.org/document/5209386
36. Acid S, de Campos LM, Fernandez-Luna J. An information retrieval model based on simple Bayesian networks. Int J Intell Syst 2003;18(2):251-265.
37. Meystre SM, Lovis C, Bürkle T, Tognola G, Budrionis A, Lehmann CU. Clinical data reuse or secondary use: current status and potential future progress. Yearb Med Inform 2017 Aug;26(1):38-52.
38. Rajkomar A, Oren E, Chen K. Scalable and accurate deep learning with electronic health records. NPJ Digital Medicine 2018 May;1(1):18.
39. Hruby GW, Hoxha J, Ravichandran PC, Mendonça EA, Hanauer DA, Weng C. A data-driven concept schema for defining clinical research data needs. Int J Med Inform 2016 Jul;91:1-9.
40. Denaxas SC, Morley KI. Big biomedical data and cardiovascular disease research: opportunities and challenges. Eur Heart J Qual Care Clin Outcomes 2015 Jul 01;1(1):9-16.
41. GitHub. Adl-parser URL: https://github.com/openEHR/java-libs/tree/master/adl-parser [accessed 2019-05-15]
42. Zaragoza H, Craswell N, Taylor M. Microsoft Cambridge at TREC 2004: Web and HARD track. 2004 Presented at: TREC 2004; Nov 16, 2004; Gaithersburg, MD URL: https://trec.nist.gov/pubs/trec13/papers/microsoft-cambridge.web.hard.pdf
43. Yu Z, Wallace BC, Johnson T, Cohen T. Retrofitting concept vector representations of medical concepts to improve estimates of semantic similarity and relatedness. Stud Health Technol Inform 2017;245:657-661.

Abbreviations

ADL: Archetype Definition Language

AP: average precision

CIM: clinical information model

CKM: Clinical Knowledge Manager

EHR: electronic health record

HL7: Health Level Seven

MAP: mean average precision

P@10: precision at 10

RM: reference model

Edited by G Eysenbach; submitted 27.01.19; peer-reviewed by X Lu, D Moner, G Tognola, J Lee, R Correia, W Goossen; comments to author 21.02.19; revised version received 18.04.19; accepted 02.05.19; published 28.05.19

©Lin Yang, Xiaoshuo Huang, Jiao Li. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 28.05.2019.

This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included.
