This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included.
Electronic data capture (EDC) tools provide automated support for data collection, reporting, query resolution, randomization, and validation, among other features, for clinical trials. There is a trend toward greater adoption of EDC tools in clinical trials, but there is also uncertainty about how many trials are actually using this technology in practice. A systematic review of EDC adoption surveys conducted up to 2007 concluded that only 20% of trials are using EDC systems, but previous surveys had weaknesses.
Our primary objective was to estimate the proportion of phase II/III/IV Canadian clinical trials that used an EDC system in 2006 and 2007. The secondary objectives were to investigate the factors that can have an impact on adoption and to develop a scale to assess the extent of sophistication of EDC systems.
We conducted a Web survey to estimate the proportion of trials that were using an EDC system. The survey was sent to the Canadian site coordinators for 331 trials. We also developed and validated a scale using Guttman scaling to assess the extent of sophistication of EDC systems. Trials using EDC were compared by the level of sophistication of their systems.
We had a 78.2% response rate (259/331) for the survey. It is estimated that 41% (95% CI 37.5%-44%) of clinical trials were using an EDC system. Trials funded by academic institutions, government, and foundations were less likely to use an EDC system compared to those sponsored by industry. Also, larger trials tended to be more likely to adopt EDC. The EDC sophistication scale had six levels and a coefficient of reproducibility of 0.901 (
The adoption of EDC systems in clinical trials in Canada is higher than the literature indicated: a large proportion of clinical trials in Canada use some form of automated data capture system. To inform future adoption, research should gather stronger evidence on the costs and benefits of using different EDC systems.
Electronic data capture (EDC) systems are used in all phases of clinical trials to collect, manage, and report clinical and laboratory data [
Such systems have been discussed in the literature for more than a decade [
The number of published trials that use an EDC system has been rising [
The primary objective of this study was to estimate the proportion of phase II/III/IV Canadian clinical trials that used an EDC system in 2006 and 2007. The secondary objective was to investigate three factors that can have an impact on adoption: trial size, source of funding, and type of participants.
Trial size was measured in terms of the target number of patients recruited and the number of sites. We expected that larger trials would be more likely to use an EDC system. The total cost of a trial is partially driven by the number of patients recruited. Therefore, if a technology reduces the effort spent per patient (eg, on data entry and query resolution), then larger trials will likely benefit more from EDC technology than smaller trials, making it more likely that EDC would be adopted in the larger trials.
Source of funding indicated whether the trial was commercially or academically/foundation funded. Controlling for size differences, we expected commercially sponsored trials to be more likely to use an EDC system. A main reason is that academic/foundation trials are less likely to have the funding to license and implement an enterprise-level computerized system.
Type of participant indicates whether the participants were adult or pediatric. We had no a priori expectations about the direction of impact of this factor and included it for exploratory purposes.
The contributions of this work are as follows: (1) we developed a scale to assess whether an EDC system is being used and to determine its level of sophistication, (2) we performed a content validation and unidimensional (Guttman) scaling of the EDC sophistication scale, (3) we provided an updated estimate of EDC adoption in Canadian clinical trials, and (4) we identified which trial factors have an impact on EDC adoption.
Previous studies of EDC adoption did not have a clear definition of what precisely an EDC system is (see the review in
The use of an EDC system in a clinical trial does not preclude the parallel use of paper case report forms (CRFs). Because of uncertainty about whether regulatory authorities will accept electronic documents as source documents (e-source), many sites still maintain source documents on paper [
To ensure consistent interpretation of what an EDC system is in our study, we asked questions about the features of the systems that were used in the clinical trial. If an electronic system was used for data capture and management and it had at least a minimum set of features, then it was considered to be an EDC system. We define a minimum set of features as allowing trial sites to submit data electronically into the central database and to be able to query that central database for reports and aggregate statistics.
All trials have to enter/transfer their data at some point into an electronic database or file for analysis. If trial sites send paper CRFs or fax them to a central coordinating site and the on-paper data are transcribed into a central database, that database would not be considered an EDC system by our definition because data are not submitted electronically.
The feature set we used was obtained from comparative product reviews [
EDC systems naturally vary in the features they implement, so we can divide them into those offering “basic” features and those offering “advanced” features. The more features an EDC system implements, the more “advanced” it is considered.
If an EDC system implements the “advanced” features, then it would by definition also implement the “basic” features; that is, the former include the latter. This type of cumulative relationship can be modeled through a Guttman scalogram [
The original intention of Guttman scaling was that such a scale would measure a single underlying dimension of a phenomenon (eg, job satisfaction or symptoms of fear during battle [
Previous applications of the Guttman scaling approach include the study of the evolution, progression, or growth of various objects. For example, anthropologists utilize scalogram techniques for studying the evolution of cultures [
The Guttman scale is suitable for defining cumulative functionality levels for an EDC system such that if a system implements, say, feature 5, then it is likely to have also implemented features 1, 2, 3, and 4. If features can be ordered, then the higher features signify more EDC sophistication.
We therefore used Guttman scalogram analysis to create an ordered scale of EDC sophistication, with lower scores indicating an EDC system that is more basic with fewer features, and higher scores indicating an EDC system that is more advanced. The coefficient of reproducibility [
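As a rough illustration of how a coefficient of reproducibility can be computed, the sketch below counts deviations from the ideal cumulative pattern implied by each system's total score. This is one common error-counting convention and is an assumption here; the function name and the small data set are hypothetical and are not taken from the survey. Values of about 0.9 or higher are conventionally read as an acceptable Guttman scale.

```python
from typing import List


def coefficient_of_reproducibility(responses: List[List[int]]) -> float:
    """Return 1 - errors / (respondents * items) for rows of 0/1 answers.

    Columns are assumed to be ordered from the most commonly present
    feature to the least commonly present one.
    """
    n_items = len(responses[0])
    errors = 0
    for row in responses:
        score = sum(row)
        # Ideal cumulative pattern for this score: the first `score` items are 1.
        ideal = [1] * score + [0] * (n_items - score)
        errors += sum(1 for seen, expected in zip(row, ideal) if seen != expected)
    return 1 - errors / (len(responses) * n_items)


# Hypothetical data: 5 systems x 4 features (1 = feature present).
systems = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],  # deviates from the cumulative pattern
    [1, 0, 0, 0],
]
cr = coefficient_of_reproducibility(systems)
```

In this toy data set only the fourth row departs from its ideal pattern, so the coefficient stays high while still falling below 1.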
Clinical trials with Canadian sites were identified through two main international clinical trials registries: ClinicalTrials.gov and Current Controlled Trials. Such registries have been used in the past to perform descriptive analysis, such as on the global growth of clinical trials [
Since the 1997 FDA Modernization Act, FDA-regulated efficacy drug trials for serious or life-threatening diseases or conditions have to be registered with ClinicalTrials.gov [
The Canadian Institutes of Health Research (CIHR), which is the main public funding agency for health research in Canada, has a requirement that the randomized controlled trials it funds be registered with an International Standard Randomised Controlled Trial Number (ISRCTN) and that basic information about each trial be posted on the ISRCTN registry (Current Controlled Trials) [
Our sampling frame consists of registered trials that were running in Canada from January 1, 2006, to December 31, 2007 inclusive. This means trials were included that were started or terminated during that period, as well as ongoing trials that started before 2006 and those that were still running at the end of 2007.
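The inclusion rule above amounts to an interval-overlap check. The following is a minimal sketch, assuming each registry record exposes a start date and an optional end date; the function and field names are hypothetical and do not come from either registry.

```python
from datetime import date
from typing import Optional

WINDOW_START = date(2006, 1, 1)
WINDOW_END = date(2007, 12, 31)


def in_sampling_frame(start: date, end: Optional[date]) -> bool:
    """True if the trial was running on at least one day of the window.

    `end=None` stands for a trial that had not yet finished.
    """
    started_in_time = start <= WINDOW_END
    not_finished_early = end is None or end >= WINDOW_START
    return started_in_time and not_finished_early
```

A trial that started in 2004 and ended on January 1, 2006 is included, while one that ended on December 31, 2005 is not.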
Based on our systematic review (see
To analyze the factors affecting adoption, we constructed a logistic regression model [
Funding source and size were available/discernable from the two trial registries. We performed a log transformation on the target patient recruitment variable to ameliorate the heavy tail (the transformed variable does not deviate from normality according to the Kolmogorov-Smirnov test).
For the impact of size and whether a trial was academic or industry, our initial hypotheses in the introduction were directional. Therefore, we used one-tailed tests on the parameters for these two variables in our logistic model. For the adult versus pediatric impact on adoption, our initial hypothesis was nondirectional. Therefore, we adopted a two-tailed test for that analysis.
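The practical difference between the two kinds of test can be sketched as follows, assuming a Wald z statistic for a single logistic regression coefficient. The coefficient and standard error are made-up numbers for illustration, not estimates from this study.

```python
from statistics import NormalDist


def wald_p_values(beta: float, se: float):
    """Return (z, one-tailed p for H1: beta > 0, two-tailed p)."""
    z = beta / se
    standard_normal = NormalDist()
    one_tailed = 1 - standard_normal.cdf(z)
    two_tailed = 2 * (1 - standard_normal.cdf(abs(z)))
    return z, one_tailed, two_tailed


# Hypothetical coefficient and standard error.
z, p_one, p_two = wald_p_values(beta=0.45, se=0.25)
```

For a positive estimate, the one-tailed P value is half the two-tailed one, which is why a directional hypothesis has to be stated before the data are examined.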
At 80% power and a baseline adoption probability of 0.2, a sample size of 246 for the multivariate logistic regression model can detect an odds ratio (OR) of 1.57 at a one-tailed alpha level of 0.05 for a one standard deviation increase in the log target recruitment variable [
The commercial SurveyMonkey system was used to run and manage the survey.
It has been noted that contact information in online clinical trials registries has created a burden on principal investigators (PIs) through excessive emails from patients, other clinicians, and direct marketers [
The registries did not always provide detailed contact information for the site coordinators. In such cases, we had to determine the contact information for the Canadian site coordinators ourselves. Two approaches were followed. Initially, an email was sent to the main contact of the clinical trial listed with ClinicalTrials.gov or Current Controlled Trials asking him or her to send us the contact information for the Canadian sites. If the above did not work (eg, often trials do not have contact information if the trial has stopped recruiting, the trial may provide a generic sponsor address as a contact, or a PI contact may not respond), we contacted the administrative person responsible for clinical research at the sponsor or for the Canadian sites listed in the registries asking for assistance in locating the coordinator.
Each study coordinator was contacted by email inviting him or her to participate in the survey. Three reminders were sent out at one-week intervals. Respondents were also entered into a raffle for three iPod Shuffles. A summary of the Web survey details according to the CHERRIES guidelines [
The adoption rates are presented descriptively as a proportion with 95% confidence intervals [
The overall logistic regression model significance test is performed using the G statistic [
In total, there were 947 registered trials with sites in Canada that were running at some point in time during 2006 and 2007. This excludes five trials for which the central coordinating site was our home institution.
The median target number of participants to recruit was 226, the median number of sites was 5, and the median percentage of sites that were Canadian was 100%. The distributions of the number of patients and the number of sites are skewed, with some trials having a much larger recruitment target: the largest trial had 782 sites and a target recruitment of 35,000 participants. There were 498/947 trials (52.6%) funded by academic institutions, government funding agencies, or foundations (henceforth “academic” trials), and the remaining 449/947 trials (47.4%) were funded by industry (henceforth “industry” trials). Therefore, there was a roughly equal split of trials in terms of funding source.
As can be seen in
Differences between academic and industry trials (two-tailed tests)
Variable | Academic (median) | Industry (median) | P value |
Number of participants | 130 | 400 | < .001 |
Total sites | 1 | 39 | < .001 |
Canadian sites | 100% | 11% | < .001 |
There were 84/947 pediatric-only trials (approximately 9%), and 863/947 adult trials (approximately 91%). In this classification, trials that included adults and youth in their recruitment criteria were classified as adult since they did not focus specifically on a pediatric population. Adult trials were equally likely to be academic as industry (433 vs 430), whereas pediatric trials were much more likely to be academic (chi-square test:
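The association between participant type and funding source can be checked from the counts reported in the text. The sketch below derives the pediatric cell counts by subtraction (498 academic and 449 industry trials overall) and computes a Pearson chi-square without continuity correction, which is an assumption; the exact variant of the test used in the study is not stated here.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


adult_academic, adult_industry = 433, 430
pediatric_academic = 498 - adult_academic  # derived from the academic total
pediatric_industry = 449 - adult_industry  # derived from the industry total

chi2, p = chi_square_2x2(
    adult_academic, adult_industry, pediatric_academic, pediatric_industry
)
```

The derived pediatric counts sum to the 84 pediatric-only trials reported above, which is a useful consistency check on the table.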
As can be seen in
Differences between adult and pediatric trials (two-tailed tests)
Variable | Adult (median) | Pediatric (median) | P value |
Number of participants | 236 | 141 | < .001 |
Total sites | 6 | 1 | .003 |
Canadian sites | 57% | 100% | .001 |
As shown in
Trials for which we did not get contact information tended to be larger industry trials. For some, no contact information was available at all. For others, we had a sponsor or PI contact, whom we followed up with to get Canadian site coordinator contact information. In
Responses to the survey
Reasons given by sponsors or PIs for refusing to provide contact information (node L in
In all of our subsequent analyses, weights were used to ensure that our responding sample adequately represented the population of Canadian trials [
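One simple way to build such weights is post-stratification, sketched below. This is an assumed technique for illustration only: the weighting variables actually used in the study are not specified here. The population counts by funding source come from the registry totals reported earlier, while the respondent and adopter counts per stratum are hypothetical.

```python
# Registry totals by funding source (reported in the text).
population = {"academic": 498, "industry": 449}

# Hypothetical split of the 259 responding trials and of EDC users among them.
respondents = {"academic": 160, "industry": 99}
adopters = {"academic": 40, "industry": 50}

# Each responding trial stands in for population / respondents trials.
weights = {s: population[s] / respondents[s] for s in population}

weighted_adopters = sum(weights[s] * adopters[s] for s in population)
weighted_rate = weighted_adopters / sum(population.values())
unweighted_rate = sum(adopters.values()) / sum(respondents.values())
```

With these made-up counts, industry trials are under-represented among respondents, so weighting pulls the estimated adoption rate toward the industry stratum.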
Out of the 331 trials for which we obtained coordinator contacts, 72 did not respond (78% response rate to the survey). We compared those nonrespondents to respondents on the same set of variables. There was no statistically significant difference in the response rates for industry and for academic trials by chi-square criteria. Neither was there a statistically significant difference in response rate for adult trials and for pediatric trials. Furthermore, we did not find any significant differences between survey respondents and nonrespondents on the other three variables (number of patients, number of sites, and proportion of Canadian sites) at a Bonferroni adjusted alpha level of 0.05.
Of the 331 trial coordinators to whom we sent the survey, we wanted to determine if there was a nonresponse bias in terms of their adoption of EDC. A common way to evaluate this is to compare early versus late respondents, where late respondents are a proxy for nonrespondents [
Trials that did not select any of the features were clearly not EDC system users. There was considerable variation in the features of the electronic systems that the remaining trials used. System features can be grouped into six cumulative levels of sophistication (see
Based on our definition, systems at a sophistication level of 1 would not be considered an EDC system. For example, if a coordinating center used a password-protected stand-alone database to manually enter paper CRFs that were sent in by courier from other sites, then it would have a system at the first level of sophistication.
Therefore, we only considered systems with a sophistication level of 2 and above as an EDC system. It is estimated that 41% of all trials (95% CI 37.5%-44%) are using an EDC system with a sophistication level of 2 or above.
The grouping of features into a six-level cumulative scale of EDC sophistication as determined through a Guttman scalogram analysis: higher levels signify more sophistication
Sophistication Level | Feature | Description |
1 | f1 | There is a unique account and password for each user to access the online system. |
2 | f2 | Subject visit data are entered by sites through a Web interface into electronic case report forms (eCRFs). |
2 | f3 | The completion status of each eCRF for each subject can be tracked automatically online; for example, you can see which visits have complete data and which still have incomplete eCRFs for each subject. |
2 | f4 | The system provides an audit trail for all data entry and data modification. |
3 | f5 | Data validation happens automatically when data are entered into the eCRF (either right away or when the user presses the SUBMIT button), for example, to check for out-of-range values. |
3 | f6 | The system will automatically log the user off after a period of inactivity. |
4 | f7 | Subjects are randomized automatically, either through an automated telephone response system or through a Web interface. |
5 | f8 | Subject recruitment can be tracked online for each site; for example, the user can see a graph of recruited and not withdrawn subjects over time. |
6 | f9 | The system allows tracking of medication inventory at the sites. |
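The level assignment in the table above can be sketched in code. The scoring rule used here is an assumption: a system is placed at the highest level for which it has every feature at that level and at all lower levels. How the study scored systems that deviate from the cumulative pattern is not stated, and the function names are hypothetical.

```python
from typing import Set

# Feature groups per sophistication level, as listed in the table.
LEVEL_FEATURES = {
    1: {"f1"},
    2: {"f2", "f3", "f4"},
    3: {"f5", "f6"},
    4: {"f7"},
    5: {"f8"},
    6: {"f9"},
}


def sophistication_level(features: Set[str]) -> int:
    """Highest level whose features, and all lower-level features, are present."""
    level = 0
    for candidate in sorted(LEVEL_FEATURES):
        if LEVEL_FEATURES[candidate] <= features:
            level = candidate
        else:
            break  # the cumulative chain is broken at this level
    return level


def is_edc(features: Set[str]) -> bool:
    """Apply the study's cutoff: level 2 or above counts as an EDC system."""
    return sophistication_level(features) >= 2
```

Under this rule, a password-protected stand-alone database with only f1 sits at level 1 and is not counted as an EDC system, which matches the example given in the text.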
The most basic EDC systems in use today have Web-based data entry forms, form completion tracking, and audit trails. Automated randomization is a feature of relatively sophisticated EDC systems. Few trials are able to track subject recruitment over time, and tracking medication inventory is quite uncommon. The median EDC sophistication level was 4 for both academic and industry trials. The median EDC sophistication level for adult trials was 4, and for pediatric trials it was 5. This difference was statistically significant (Mann-Whitney U two-tailed test,
The logistic regression model to predict EDC adoption had a Nagelkerke
The clinical trials landscape in Canada is evenly split between academic and industry trials. However, industry trials tended to be larger, with more patients and sites. More than 90% of trials enrolled adults, and these tended to be larger than pediatric trials. Our results reveal that the 41% adoption rate of EDC systems in Canadian clinical trials is about twice the commonly cited value of 20%. Larger trials and those sponsored by industry are more likely to use an EDC system. We found that the type of participants did not have an impact on adoption, but this may be because the sample was underpowered to detect this effect, given that the distribution of adult/pediatric trials was quite skewed.
While there is no difference in the level of sophistication of EDC systems used between academic and industry trials, pediatric trials tended to have more sophisticated EDC use than those with predominantly adult participants.
It is not surprising that industry-funded trials included in the sample were larger than academic ones. Pharmaceutical companies in Canada invested between $1.1425 billion and $1.67 billion on R&D in 2003 [
To the extent that the need for heavy investments in information technology (IT) can act as a barrier to use, cost would have been a deterrent for academically funded trials to use IT to the same extent as industry trials during the 2006-2007 period that we studied. This is consistent with the observation that the median number of sites for academic trials was one; it may be more difficult to justify an investment in EDC for single-site trials. However, more EDC systems have recently adopted the Software as a Service (SaaS) model, where sites access the EDC through their Web browser. Such systems demand less IT capacity at each site to get started and do not require a large capital expenditure at the outset of the study to purchase equipment and software licences. Therefore, it is plausible that over time the adoption rate for academic trials will catch up to that of industry trials.
Despite academic trials having a lower adoption rate, there were no differences in terms of the sophistication of the EDC systems that were used by industry-sponsored and academic trials. Therefore, when academic trials do adopt an EDC system, they do not opt for systems with fewer features.
A commonly accepted descriptive model of the diffusion of innovations is an S-shaped curve, as shown in
To the extent that this model applies to EDC adoption, we are currently at the steepest point of adoption among the early majority of Canadian trials. Consequently, it would be reasonable to expect increased use of EDC systems in trials in the immediate future. This trend is consistent with other evidence showing rising adoption of health IT in general, and specifically, electronic health records [
The S-shaped diffusion of technology curve
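An S-shaped diffusion curve is often approximated by a logistic function. The sketch below assumes that functional form, which the text does not specify, and uses hypothetical parameter names; it shows why adoption near the midpoint of the curve implies the fastest growth.

```python
import math


def adoption_share(t: float, midpoint: float, rate: float) -> float:
    """Cumulative share of adopters at time t under a logistic curve."""
    return 1 / (1 + math.exp(-rate * (t - midpoint)))


def adoption_speed(t: float, midpoint: float, rate: float) -> float:
    """Slope of the logistic curve at time t: rate * share * (1 - share)."""
    share = adoption_share(t, midpoint, rate)
    return rate * share * (1 - share)


# Speed at a 41% share relative to the maximum speed, reached at a 50% share.
relative_speed_at_41_percent = (0.41 * 0.59) / (0.5 * 0.5)
```

Under this assumed form, the slope at a 41% adoption share is within a few percent of its maximum, which fits the description of being near the steepest part of the curve.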
High adoption rates of EDC systems have a number of practical and research implications. First, the characteristics of the adopters change over time and so does the nature of suitable evidence to inform their adoption decisions [
Second, EDC systems make it much more practical to make the frequent design changes that are required in adaptive clinical trials [
Third, for commercial trials, electronic submissions to regulatory authorities would become more practical with the increased use of EDC systems.
Finally, to the extent that EDC improves the data quality and efficiency of trials, higher EDC adoption would be expected to enable such benefits to materialize in the future.
In terms of the EDC systems themselves, the median sophistication level indicates that many trials are not able to track recruitment in real time. This suggests that online recruitment tracking is an important feature for EDC developers to add to their systems.
Our systematic review of the literature (see
It is most likely that reality is a mixture of the above five reasons.
It would be of value to track the adoption of EDC over time using regular surveys similar to the current one. This will provide evidence as to whether the adoption is actually following the S-shaped adoption curve in
Additional comparisons with the United States and Europe would be informative. If there are significant regional differences in adoption rates, then there may be policy or structural choices that explain the differential. For example, if one region has adopted a certain set of policies or incentives, or has an existing health informatics infrastructure that supports the use of EDC, then other regions may consider duplicating those drivers to accelerate their EDC adoption rates.
There are other factors that could have an impact on the adoption of EDC that would be useful to investigate in future research. For example, for academically funded trials, one would consider the age of the PI, his or her technical skill/knowledge, the existence of a senior informatics person to provide support, whether there is an existing research systems infrastructure in place with programming or database resources available for investigators to use, and whether or not the academic institution already has a sophisticated EDC system available for use by any investigators. For industry-funded trials, one could consider the size of the organization running the trial (whether it is the industry sponsor or a CRO), the size of trials usually conducted, and the number of trials conducted per year in the geographical region of study (say, Canada or the United States).
Since we have developed an EDC sophistication measure, it would now be easier to evaluate the relationship between EDC sophistication and the benefits of EDC. One can hypothesize that more sophisticated EDC use will be associated with greater benefits, such as faster trial completion and fewer data errors.
One limitation of our results is that individuals conducting clinical trials may not have registered their trials [
Our results are limited to Canada, and the adoption rates may be different in other jurisdictions.
We wish to thank the reviewers for their valuable feedback on an earlier version of this paper.
None declared.
Systematic Review of EDC Adoption Surveys
Questionnaire Development and Validation
CHERRIES Summary
CIHR: Canadian Institutes of Health Research
CRF: case report form
CRO: contract research organization
eCRF: electronic case report form
EDC: electronic data capture
FDA: Food and Drug Administration
ISRCTN: International Standard Randomised Controlled Trial Number
IT: information technology
PI: principal investigator