This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included.
The 36-Item Short Form Health Survey (SF-36) is a popular questionnaire for measuring the self-perception of quality of life in a given population of interest. Processing the answers of a participant comprises the calculation of 10 scores corresponding to 8 scales measuring several aspects of perceived health and 2 summary components (physical and mental). Surprisingly, no study has compared score values obtained from a telephone interview with those obtained from self-completion of an internet-based questionnaire.
This study aims to compare the SF-36 score values obtained from a telephone interview with those obtained from self-completion of an internet-based questionnaire.
Patients with an internet connection and returning home after hospital discharge were enrolled in the SENTIPAT multicenter randomized trial on the day of discharge. They were randomized to either self-completing a set of questionnaires using a dedicated website (internet group) or providing answers to the same questionnaires administered during a telephone interview (telephone group). This ancillary study of the trial compared SF-36 data related to the posthospitalization period in these 2 groups. To account for potentially unbalanced characteristics of the responders in the 2 groups, the impact of the mode of administration of the questionnaire on score differences was investigated using a matched sample of individuals originating from the internet and telephone groups (1:1 ratio), in which the matching procedure was based on a propensity score approach. SF-36 scores observed in the internet and telephone groups were compared using the Wilcoxon-Mann-Whitney test, and the score differences between the 2 groups were also examined using the Cohen effect size.
Overall, 29.2% (245/840) and 75% (630/840) of SF-36 questionnaires were completed in the internet and telephone groups, respectively (
The telephone mode of administration of SF-36 involved an interviewer effect, increasing SF-36 scores. Questionnaire self-completion via the internet should be preferred, and surveys combining various administration methods should be avoided.
ClinicalTrials.gov NCT01769261; https://www.clinicaltrials.gov/ct2/show/record/NCT01769261
The 36-Item Short Form Health Survey (SF-36) is a popular questionnaire for measuring the self-perception of quality of life (QoL) in a given population of interest [
Several randomized trials compared the SF-36 scores issued from different administration modes, such as paper versus the internet [
The SENTIPAT trial [
This research was an ancillary study of the multicenter, randomized SENTIPAT trial [
Briefly, as previously reported [
Inpatients were enrolled on the day of hospital discharge by a clinical research technician of the trial. At that time, the patients were informed about the study. Eligible patients not opposed to participating in the study were randomized in a 1:1 ratio into two parallel groups, internet or telephone follow-up, which inherently resulted in an open-label trial. Randomization was centralized: a website allocated each eligible patient to either the internet or the telephone group, using permuted-block randomization stratified by hospital unit, and the underlying computer-generated list of permutations was established by a statistician independent of the study. At the time of patient inclusion, the technician also collected baseline variables (length of stay, sex, age, relationship status, level of education, activity, and type of insurance). The patient was then informed and discharged with documents explaining the corresponding questionnaire administration. A total of 1680 eligible patients (840 randomized to each of the internet and telephone groups) were enrolled in the SENTIPAT trial between February 25, 2013, and September 8, 2014.
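As an illustration only, the following Python sketch shows one common way to build a permuted-block allocation list stratified by hospital unit. It is not the trial's randomization software: the block size of 4, the seeds, the unit names, and the function names `permuted_block_list` and `stratified_lists` are all assumptions made for this example.

```python
import random


def permuted_block_list(n_patients, block_size=4,
                        arms=("internet", "telephone"), seed=0):
    """Return an allocation list made of randomly permuted, balanced blocks.

    The block size and seed are hypothetical; the article does not report them.
    """
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)
    allocations = []
    while len(allocations) < n_patients:
        # Each block holds the same number of slots for every arm.
        block = list(arms) * (block_size // len(arms))
        rng.shuffle(block)
        allocations.extend(block)
    return allocations[:n_patients]


def stratified_lists(stratum_sizes, block_size=4, seed=0):
    """Build one independent allocation list per stratum (hospital unit)."""
    return {
        unit: permuted_block_list(n, block_size, seed=seed + i)
        for i, (unit, n) in enumerate(sorted(stratum_sizes.items()))
    }


# Hypothetical strata sizes, chosen only to keep the example small.
lists = stratified_lists({"unit_A": 8, "unit_B": 12})
```

Because every block is balanced, the two arms stay close to a 1:1 ratio within each hospital unit at any point during enrollment.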
The SENTIPAT study was approved by the Comité de Protection des Personnes Île de France IX (decision CPP-IDF IX 12-014; June 12, 2012), the Comité Consultatif sur le Traitement de l’Information en matière de Recherche dans le domaine de la Santé (decision 12.365; June 20, 2012), and the Commission Nationale de l’Informatique et des Libertés (decision DR-2012-582; December 12, 2012). According to the French law in force at the time of the study, the formal consent of participants was waived and replaced by the following: patients received full information on their participation in the study, and the nonopposition of each participant in the study was notified (including date of nonopposition declaration) in the SENTIPAT study register.
Patients in the internet group had access to the French version of the SF-36 questionnaire 40 days after discharge on a website dedicated to SENTIPAT. Oral and written instructions for a personal connection to the SENTIPAT website had been delivered to these patients, and they received 1 reminder email per week for 3 weeks in case of nonresponse. Patients in the telephone group were interviewed by telephone approximately 42 days after discharge, and the data were simultaneously entered into the system by the interviewer using a website interface identical to that used in the internet group. The appointments for the telephone interviews were scheduled at the time of patient inclusion, and up to 3 calls were attempted whenever the first call did not reach the patient.
The 8 scale scores and the 2 summary scores of SF-36 were calculated according to the Medical Outcomes Study SF-36 French scoring manual [
Bivariate analyses were performed using Fisher exact test or chi-square test of independence for categorical variables and the Wilcoxon-Mann-Whitney test for quantitative variables. The latter test was notably used to compare the SF-36 score differences between the internet and telephone groups. Several authors have discussed the task of interpreting observed differences in terms of
The difference between the SF-36 score estimates observed in responders of the internet group and those observed in responders of the telephone group may be mainly due to two features: (1) the difference in the mode of administration of the questionnaire, strictly speaking (self-completion by the patient via the internet vs completion by a research technician during a telephone interview with the patient), and (2) unbalanced characteristics of the individuals in the 2 groups arising from a selection bias of the responders (an unavoidable situation inherent to the modes of administration of the questionnaire). Assessing the respective impact of these 2 features on the observed score differences is of primary importance. To gain more insight into this issue, we developed a procedure in which responders of the internet group were matched to similar responders of the telephone group according to their baseline characteristics. We then examined how the score differences between the 2 groups changed in this matched sample compared with the score differences observed in the initial unmatched populations.
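To make the idea of 1:1 matching concrete, here is a minimal Python sketch of greedy nearest-neighbor matching without replacement on propensity scores that are assumed to have been estimated already. The study itself used the R package MatchIt, and this sketch does not claim to reproduce its settings; the function name `match_nearest`, the processing order, and the toy scores are hypothetical.

```python
def match_nearest(treated_scores, control_scores):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    Returns a dict mapping each treated index to its matched control index.
    """
    if len(treated_scores) > len(control_scores):
        raise ValueError("need at least as many controls as treated units")
    available = set(range(len(control_scores)))
    pairs = {}
    # Assumed ordering: handle treated units from highest to lowest score.
    order = sorted(range(len(treated_scores)),
                   key=lambda i: -treated_scores[i])
    for i in order:
        # Closest remaining control; ties broken by the smaller index.
        j = min(available,
                key=lambda k: (abs(control_scores[k] - treated_scores[i]), k))
        pairs[i] = j
        available.remove(j)
    return pairs


# Toy propensity scores (not study data).
internet = [0.62, 0.35, 0.48]
telephone = [0.30, 0.50, 0.65, 0.10, 0.42]
pairs = match_nearest(internet, telephone)
```

Each internet responder is paired with one telephone responder, and unmatched telephone responders are simply left out of the matched sample.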
Internet responders were matched to telephone responders according to a propensity score–based procedure, and the R package MatchIt [
Flow of participants through the study. SF-36: 36-Item Short Form Health Survey.
Demographic characteristics of responders and nonresponders in the internet and telephone groups (N=1680).
Feature | Internet responders (n=245) | Internet nonresponders (n=595) | Telephone responders (n=630) | Telephone nonresponders (n=210)
Sex, n (%) | | | |
Female | 109 (44.5) | 269 (45.2) | 254 (40.3) | 103 (49)
Male | 136 (55.5) | 326 (54.8) | 376 (59.7) | 107 (51)
Age (years) | | | |
Values, mean | 49.5 | 46.6 | 47.2 | 43.8
Values, median (IQR) | 50 (37-61) | 47 (33-59) | 47 (34-58) | 41 (30-54)
Length of stay (days) | | | |
Values, mean | 4.0 | 4.0 | 4.0 | 4.1
Values, median (IQR) | 1 (1-5) | 1 (1-5) | 1 (1-5) | 1 (1-6)
Type of hospitalization, n (%) | | | |
Conventional | 102 (41.6) | 256 (43) | 269 (42.7) | 91 (43.3)
1-day stay | 120 (49) | 285 (47.9) | 297 (47.1) | 103 (49.1)
Week stay | 23 (9.4) | 54 (9.1) | 64 (10.2) | 16 (7.6)
Hospital unit, n (%) | | | |
General and digestive surgery | 67 (27.3) | 138 (23.2) | 161 (25.6) | 44 (21)
Gastroenterology and nutrition | 65 (26.5) | 139 (23.4) | 147 (23.3) | 59 (28.1)
Hepatogastroenterology | 28 (11.4) | 75 (12.6) | 75 (11.9) | 29 (13.8)
Infectious and tropical diseases | 55 (22.4) | 150 (25.2) | 160 (25.4) | 45 (21.4)
Internal medicine | 30 (12.2) | 93 (15.6) | 87 (13.8) | 33 (15.7)
Activity, n (%) | | | |
Currently employed | 158 (65) | 353 (59.3) | 375 (59.5) | 132 (63.2)
Job seeker | 17 (7) | 43 (7.2) | 47 (7.5) | 15 (7.2)
Retired | 47 (19.3) | 98 (16.5) | 101 (16) | 29 (13.9)
Student | 6 (2.5) | 38 (6.4) | 48 (7.6) | 17 (8.1)
Does not work because of health | 11 (4.5) | 48 (8.1) | 49 (7.8) | 11 (5.3)
Without work | 2 (0.8) | 9 (1.5) | 8 (1.3) | 4 (1.9)
Other | 2 (0.8) | 6 (1) | 2 (0.3) | 1 (0.5)
Occupation, n (%) | | | |
Farmer | 0 (0) | 1 (0) | 0 (0) | 0 (0)
Self-employed or trader | 4 (1.6) | 25 (4.2) | 27 (4.3) | 11 (5.3)
Manager | 80 (32.7) | 135 (22.7) | 159 (25.2) | 49 (23.4)
Intermediate profession | 39 (15.9) | 91 (15.3) | 105 (16.7) | 31 (14.8)
Middle-class occupation | 52 (21.2) | 135 (22.7) | 123 (19.5) | 55 (26.3)
Employee | 5 (2) | 20 (3.4) | 25 (4) | 8 (3.8)
Worker | 42 (17.1) | 83 (13.9) | 92 (14.6) | 22 (10.5)
No work | 23 (9.4) | 105 (17.6) | 99 (15.7) | 33 (15.8)
Level of education, n (%) | | | |
Primary or less | 18 (7.3) | 58 (9.7) | 47 (7.5) | 31 (14.8)
High school | 75 (30.6) | 193 (32.4) | 178 (28.3) | 60 (28.7)
Superior short time | 37 (15.1) | 95 (16) | 94 (14.9) | 33 (15.8)
Graduate or postgraduate | 115 (46.9) | 249 (41.8) | 311 (49.4) | 85 (40.7)
Relationship status, n (%) | | | |
Living alone^a | 103 (42) | 291 (48.9) | 293 (46.5) | 121 (57.9)
Living as a couple^b | 142 (58) | 304 (51.1) | 337 (53.5) | 88 (42.1)
Income (€)^c, n (%) | | | |
<450 | 6 (2.4) | 28 (4.7) | 31 (4.9) | 10 (4.8)
450-1000 | 3 (1.2) | 37 (6.2) | 31 (4.9) | 11 (5.3)
1000-1500 | 17 (6.9) | 61 (10.3) | 51 (8.1) | 17 (8.1)
1500-2100 | 34 (13.9) | 75 (12.6) | 78 (12.4) | 27 (12.9)
2100-2800 | 26 (10.6) | 70 (11.8) | 66 (10.5) | 25 (12)
2800-4200 | 44 (18) | 79 (13.3) | 108 (17.1) | 28 (13.4)
≥4200 | 43 (17.6) | 64 (10.8) | 82 (13) | 16 (7.7)
No response | 72 (29.4) | 181 (30.4) | 183 (29) | 75 (35.9)
Type of insurance, n (%) | | | |
State medical help or universal health insurance | 2 (0.8) | 26 (4.4) | 24 (3.8) | 8 (3.8)
Compulsory health insurance | 15 (6.1) | 43 (7.2) | 43 (6.8) | 26 (12.4)
Compulsory health insurance plus complementary private health insurance | 228 (93.1) | 526 (88.4) | 563 (89.4) | 175 (83.7)
^a Single, widowed, divorced, or separated.
^b Married, living together under a civil solidarity pact, or simply living together without legal ties.
^c €1 (in 2013)=US $0.71 (in 2022).
The response rate observed in the internet group (245/840, 29.2%) was significantly lower (
In terms of internal validity of questionnaire completion, Cronbach's α values calculated for each of the 8 scales comprising the SF-36 form in the internet and telephone groups (
The matching procedure matched the 245 responders in the internet group (no individual was dropped) with 245 individuals in the telephone group. The standardized mean difference of the global distance between internet and telephone groups was 0.4167 and 0.0215 before and after matching, respectively, with a corresponding balance improvement of 95%.
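The balance improvement quoted above can be checked with simple arithmetic. The short Python sketch below assumes the usual definition of percent balance improvement, namely the relative reduction in the absolute standardized mean difference; the function name `percent_balance_improvement` is introduced here for illustration only.

```python
def percent_balance_improvement(before, after):
    """Relative reduction (in %) of the absolute imbalance after matching."""
    return 100.0 * (abs(before) - abs(after)) / abs(before)


# Standardized mean differences of the distance reported in the text.
improvement = percent_balance_improvement(0.4167, 0.0215)
print(round(improvement))  # 95
```

With the reported values, 100 × (0.4167 − 0.0215) / 0.4167 ≈ 94.8, which rounds to the 95% stated in the text.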
Differences in baseline variables between the internet and telephone responders before and after the matching procedure.
Observed mean score differences (telephone–internet) of SF-36 scales and summary components before and after matching. SF-36: 36-Item Short Form Health Survey.
The 36-Item Short Form Health Survey scores in the internet and telephone groups after matching (N=245 each).
Scale or component summary and group | Score, median (IQR) | Score, mean (SD) | Score, mean (95% CI) | P value | Mean difference (telephone−internet) | Effect size
Physical functioning | | | | .02 | 4.57 | 0.18
Internet | 85 (65-95) | 76.08 (24.56) | 76.08 (72.92-79.08) | | |
Telephone | 90 (70-100) | 80.65 (24.93) | 80.65 (77.47-83.71) | | |
Role-physical | | | | .002 | 9.39 | 0.22
Internet | 50 (0-100) | 51.53 (41.67) | 51.53 (46.22-56.73) | | |
Telephone | 100 (0-100) | 60.92 (44.59) | 60.92 (55.31-66.43) | | |
Bodily pain | | | | .045 | 4.56 | 0.16
Internet | 72 (41-100) | 66.84 (26.12) | 66.84 (63.55-70.11) | | |
Telephone | 84 (41-100) | 71.40 (32.23) | 71.40 (67.42-75.40) | | |
General health | | | | .99 | 0.04 | 0.00
Internet | 57 (42-72) | 55.10 (20.47) | 55.10 (52.57-57.65) | | |
Telephone | 57 (37-77) | 55.15 (25.90) | 55.15 (51.96-58.34) | | |
Vitality | | | | .57 | 0.82 | 0.04
Internet | 50 (35-65) | 48.29 (20.16) | 48.29 (45.78-50.80) | | |
Telephone | 50 (35-65) | 49.10 (21.30) | 49.10 (46.41-51.78) | | |
Social functioning | | | | <.001 | 7.96 | 0.29
Internet | 75 (50-100) | 71.17 (24.27) | 71.17 (68.16-74.18) | | |
Telephone | 100 (62.5-100) | 79.13 (31.24) | 79.13 (75.15-82.96) | | |
Role-emotional | | | | .002 | 9.93 | 0.25
Internet | 100 (33.33-100) | 67.89 (39.04) | 67.89 (63.13-72.65) | | |
Telephone | 100 (66.66-100) | 77.82 (39.84) | 77.82 (72.79-82.59) | | |
Mental health | | | | .002 | 5.01 | 0.26
Internet | 64 (52-80) | 63.56 (18.77) | 63.56 (61.21-65.91) | | |
Telephone | 72 (56-84) | 68.57 (20.20) | 68.57 (65.94-71.10) | | |
Physical component summary | | | | .18 | 0.99 | 0.09
Internet | 44.95 (37.27-53.30) | 44.48 (10.04) | 44.48 (43.20-45.75) | | |
Telephone | 48.33 (37.81-54.43) | 45.47 (11.05) | 45.47 (44.09-46.82) | | |
Mental component summary | | | | .02 | 2.72 | 0.25
Internet | 47.49 (35.37-52.60) | 44.68 (10.62) | 44.68 (43.34-46.01) | | |
Telephone | 50.86 (41.81-55.50) | 47.40 (11.05) | 47.40 (46.01-48.76) | | |
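As a worked example of the effect sizes in the table above, the Python sketch below computes a Cohen effect size for the first scale listed (internet: mean 76.08, SD 24.56; telephone: mean 80.65, SD 24.93). The exact formula used by the authors is not stated in this section; the sketch assumes the common equal-group-size form in which the mean difference is divided by the root mean square of the 2 SDs, and the function name `cohen_d` is hypothetical.

```python
import math


def cohen_d(mean_a, sd_a, mean_b, sd_b):
    """Cohen d with a pooled SD for 2 groups of equal size (assumed formula)."""
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2.0)
    return (mean_b - mean_a) / pooled_sd


# First scale of the table: telephone minus internet.
d = cohen_d(76.08, 24.56, 80.65, 24.93)
print(round(d, 2))  # 0.18
```

Under this assumption, the mean difference of 4.57 points corresponds to about 0.18 pooled SDs, matching the tabulated effect size for that scale. Other rows may differ in the last digit because the published means and SDs are rounded.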
To our knowledge, this is the first study to compare SF-36 questionnaire data collected via telephone interviews with data collected via self-completion on a dedicated website. The availability of SF-36 data collected in the SENTIPAT trial provided an opportunity to investigate precisely the influence of the mode of administration of the questionnaire on SF-36 scores. This investigation was the aim of the ancillary study of the SENTIPAT trial reported here and constitutes the major contribution of our report. It has 3 main strengths. First, the study is based on a randomized trial with a substantial number of patients included in both arms. Second, the population under study had a large case-mix variability because patients originated from 5 very different hospital wards. Third, a matched subsample of comparable responders in the 2 arms was constructed according to baseline variables to mitigate, as much as possible, the impact of an unavoidable selection bias of responders.
For all but 2 out of 8 scales, the mean difference in scores between the groups was statistically significant and >4.5 points (
The main limitation of the study concerns the selection bias related to responder status in both arms; however, such a bias is inherent to the 2 corresponding modes of administration and likely differs from one mode to the other. In this study, selection biases were mitigated as much as possible by conducting part of the analyses in a matched subpopulation of responders. A detailed analysis comparing the scores observed in the whole set of responders (before matching) and in a subpopulation enhancing the similarity of the compared individuals (after matching) constitutes an important strength of the study. Our results provide evidence of an interviewer effect, which artificially increased SF-36 scores when the questionnaire was administered through a telephone interview. Therefore, the telephone interview as a mode of administration of SF-36 combines two types of bias: the unavoidable selection bias of responders and the interviewer effect, which is discussed in more detail in the following sections. In general, several methods can be used to mitigate the selection bias of responders: one takes advantage of the distribution of baseline values observed in the responders and nonresponders to correct initial responder estimates to estimates more representative of the whole population under study [
Beyond this, some of the estimates reported here raise concerns in terms of generalizability and should be viewed only as minor side results required to reach the main goal of the study, which was to investigate the impact of the mode of administration of the SF-36 questionnaire on the collected scores. For example, the response rates reported here should not be considered emblematic of the corresponding modes of administration of the questionnaires. As detailed below in the
To our knowledge, this study is the only one to date that compared modes of administration of SF-36 on a matched sample of responders to mitigate—as much as possible—the inherent lack of initial comparability of responders according to the mode of administration of the questionnaire. Nevertheless, our results are in agreement with previous studies that reported higher SF-36 scores when administered by telephone than those issued from a mailed paper mode of administration [
Despite the reminders sent to the patients, the internet group response rate (245/840, 29.2%) to the survey was dramatically lower than that of the telephone group (630/840, 75%). Blumenberg and Barros [
In our view, the numeric value of the difference between the response rates observed in the 2 modes of administration of the present survey should be considered as a minor side result. Indeed, the heterogeneity of the comparisons reported in reviews [
Compared with administration by telephone interview, the response rate of volunteer patients communicating their SF-36 data via the internet was much lower; however, our study indicates that a substantial proportion of hospitalized patients volunteered to actively document their health data via the internet. Most importantly, the study indicates that the telephone interviewer may act as a subjective intermediary in the collection of patient data, resulting in a nonnegligible increase in SF-36 scores. Therefore, self-administration of SF-36 should be preferred, including via the internet, which is likely a low-cost method. The results of this study also strongly argue against conducting surveys that combine self-reported and interviewer-administered SF-36 questionnaires.
Internal reliability of the 36-Item Short Form Health Survey in the internet and telephone groups.
CONSORT-eHEALTH checklist (V 1.6.1).
quality of life
role-physical
36-Item Short Form Health Survey
The Assistance Publique-Hôpitaux de Paris (Département de la Recherche Clinique et du Développement) was the trial sponsor.
The SENTIPAT study was funded by grant AOM09213 K081204 from Programme Hospitalier de Recherche Clinique 2009 (Ministère de la Santé).
The sponsor and the funders had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.
The authors would like to thank the SENTIPAT study group (scientific committee: Fabrice Carrat, Bérengère Couturier, Gilles Hejblum, Morgane Le Bail, Alain-Jacques Valleron, and the below-mentioned heads and physicians within the poles and units in which patients were recruited), personnel of the study group within the poles and units concerned (heads: Marc Beaussier, Jean-Paul Cabane, Olivier Chazouillères, Jacques Cosnes, Jean-Claude Dussaule, Pierre-Marie Girard, Emmanuel Tiret, and Dominique Pateron), additional corresponding physicians (Laurent Beaugerie, Laurence Fardet, Laurent Fonquernie, François Paye, and Laure Surgers), and nursing supervisors (Françoise Cuiller, Catherine Esnouf, Hélène Haure, Valérie Garnier, Josselin Mehal-Birba, Nelly Sallée, and Sylvie Wagener).
The authors are indebted to the excellent technical team of the study: the clinical research technicians (Élodie Belladame, Azéline Chevance, Magali Girard, and Laurence Nicole, who included patients, collected baseline data, and interviewed the patients followed up by telephone) and the software staff (especially Pauline Raballand but also Frédéric Chau and Frédéric Fotré, who created and maintained the trial’s dedicated website).
The authors would like to thank all the medical, nursing, and administrative staff of the General and Digestive Surgery (including Ambulatory Surgery), Gastroenterology, Hepatology, Infectious Diseases, and Internal Medicine departments of Hôpital Saint-Antoine.
The authors would like to thank all patients who participated in the study.
GH had full access to all the raw data in the study and can take responsibility for the integrity of the data and the accuracy of the data analysis. GH was involved in the study conception and design and data acquisition. AA and GH were involved in the analysis and writing of the first draft of the manuscript. AA, FC, and GH were involved in the interpretation of data. All authors approved the final version of the manuscript.
None declared.