This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included.
With the growing scientific appeal of e-epidemiology, concerns arise regarding validity and reliability of Web-based self-reported data.
The objectives of the present study were to assess the validity of Web-based self-reported weight, height, and resulting body mass index (BMI) compared with standardized clinical measurements and to evaluate the concordance between Web-based self-reported anthropometrics and face-to-face declarations.
A total of 2513 participants of the NutriNet-Santé study in France completed a Web-based anthropometric questionnaire 3 days before a clinical examination (validation sample), of whom 815 also responded to a face-to-face anthropometric interview (concordance sample). Several indicators were computed to compare the data: paired t test of the difference, intraclass correlation coefficient (ICC), and Bland–Altman limits of agreement for weight, height, and BMI as continuous variables; and kappa statistics, percent agreement, sensitivity, and specificity for BMI categories (normal, overweight, obese).
Compared with clinical data, validity was high with ICC ranging from 0.94 for height to 0.99 for weight. BMI classification was correct in 93% of cases; kappa was 0.89. Of 2513 participants, 23.5% were classified overweight (BMI≥25) with Web-based self-report vs 25.7% with measured data, leading to a sensitivity of 88% and a specificity of 99%. For obesity, 9.1% vs 10.7% were classified obese (BMI≥30), respectively, leading to sensitivity and specificity of 83% and 100%. However, the Web-based self-report exhibited slight underreporting of weight and overreporting of height leading to significant underreporting of BMI (
Web-based self-reported weight and height data from the NutriNet-Santé study can be considered valid enough to be used when studying associations of nutritional factors with anthropometrics and health outcomes. Although self-reported anthropometrics are inherently prone to biases, the magnitude of such biases can be considered comparable to that of face-to-face interviews. Web-based self-reported data appear to be an accurate and useful tool to assess anthropometric data.
Overweight and obesity have reached pandemic proportions and are considered among the major public health issues by the World Health Organization (WHO) [
Body mass index (BMI), defined as weight (kg) divided by squared height (m2), is highly correlated to excess fat mass. It is commonly used to classify overweight and obesity in adults: overweight excluding obesity (BMI 25-29 kg/m2) and obesity (BMI ≥30 kg/m2) [
However, it is acknowledged that self-reported height and weight are biased proxies of the true measures. Indeed, bias between self-reported and measured anthropometrics has been widely described in the scientific literature, in many American and European studies [
A novel approach for large-scale epidemiologic studies lies in the use of the Internet to administer Web-based questionnaires [
In the NutriNet-Santé study, comparison of self-reported weight and height in a Web-based anthropometric questionnaire with the traditional paper form of the same questionnaire showed satisfying results, which were published elsewhere [
To date, only 1 study has focused on assessing the validity of Web-based self-reported weight compared with direct measurement [
The objectives of the present study were to (1) assess the validity of Web-based self-reported weight, height, and resulting BMI compared with measured data in a subsample of the NutriNet-Santé study, and (2) evaluate the concordance (ie, agreement) between Web-based self-reported anthropometrics and face-to-face declaration. We hypothesized that (1) we would observe underreporting of BMI with the Web-based questionnaire compared with the gold standard (ie, clinical measurement), and (2) taking the face-to-face interview as the comparator, social desirability bias would weigh less at the computer than has been described for telephone interviews.
The present analyses were carried out on a subsample of the NutriNet-Santé study, an ongoing Web-based prospective cohort study launched in France in May 2009 [
Briefly, at inception, participants complete a set of Web-based questionnaires assessing socioeconomic and sociodemographic conditions, dietary intake, physical activity, anthropometrics, lifestyle, and health status [
Moreover, participants are invited to attend one of the specific health centers involved in the study, located in various French cities. During the visit, they undergo blood and urine sampling and a clinical examination including anthropometric measurements. Height is measured by a trained technician with a wall-mounted stadiometer without shoes to the nearest 0.5 cm [
This study was approved by the International Research Board of the French Institute for Health and Medical Research (IRB Inserm no: 0000388FWA00005831) and the French National Information and Citizen Freedom Committee (CNIL no: 908450 and no: 909216). The collection of biological samples and clinical data was approved by the Consultation Committee for the Protection of Participants in Biomedical Research (C09-42 on May 5, 2010) and the French National Information and Citizen Freedom Committee (CNIL no: 1460707).
To validate the self-reported anthropometrics, a random subsample of the participants with a scheduled clinical examination were invited to fill in a Web-based anthropometric questionnaire 3 days before their appointment at the health center. This short interval limits actual weight changes between the reported and the measured weight. The validation study started in November 2011 and ended in July 2012. All participants with a scheduled visit in this time range were invited to fill in the anthropometric questionnaire. A total of 2513 participants completed the questionnaire 3 days before the visit and attended the subsequent clinical examination. They constitute the validation sample.
Among them, some randomly assigned participants were asked by the trained technicians to declare their height and weight on the day of the examination, before being measured. The concordance study started in February 2012. By July 2012, a total of 815 participants had provided Web-based weight and height 3 days before and in a face-to-face interview, constituting the concordance sample. We chose to stop inclusions and start the analyses in July 2012 because it provided a good balance between an acceptable sample size as reviewed [
Socioeconomic variables were collected at study baseline. Education referred to the highest achieved level (primary school, secondary school, high school diploma, university bachelor degree or less, university graduates with higher than bachelor degree) and was further regrouped into 3 categories (up to high school diploma, university bachelor degree or less, university graduates with higher than bachelor degree); occupational category was defined according to the current job or the last job held for unemployed or retired individuals (never employed, self-employed, farmers, manual workers, intermediate professions, managerial/professional staff). Monthly household income and household composition (marital status, number and age of children) were also reported, which allowed calculating monthly income per household unit (in euros) by using a standardized algorithm [
Leisure time physical activity (LTPA) was assessed by the International Physical Activity Questionnaire (IPAQ) [
For comparison to self-declared data, measured weight was rounded to the nearest kilogram and height to the nearest centimeter. Log-transformation was applied to height, weight, and resulting BMI to improve normality. BMI was categorized as normal (BMI<25 kg/m2), overweight excluding obesity (BMI 25-29 kg/m2), and obese (BMI ≥30 kg/m2). Throughout this paper,
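The rounding and categorization steps described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the half-up rounding rule is an assumption (the text only says "nearest"), and the overweight band is taken as 25 ≤ BMI < 30 so that no value falls between categories.

```python
import math


def round_half_up(value: float) -> int:
    """Round to the nearest integer, halves rounded up (assumed rule)."""
    return math.floor(value + 0.5)


def bmi(weight_kg: float, height_cm: float) -> float:
    """BMI = weight (kg) divided by squared height (m)."""
    height_m = height_cm / 100.0
    return weight_kg / (height_m ** 2)


def bmi_category(bmi_value: float) -> str:
    """Three-level classification used in this paper."""
    if bmi_value < 25.0:
        return "normal"
    if bmi_value < 30.0:
        return "overweight"
    return "obese"


# Hypothetical measured values: 67.4 kg and 165.5 cm.
weight = round_half_up(67.4)   # rounded to the nearest kilogram
height = round_half_up(165.5)  # rounded to the nearest centimeter
print(weight, height, round(bmi(weight, height), 2), bmi_category(bmi(weight, height)))
```

Applying the same rounding to the measured values puts them on the same resolution as the self-declared values before any difference is computed.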
Population characteristics (sex, age, socioeconomic status, tobacco use, LTPA, and anthropometrics) were compared between the validation and concordance samples and with the entire NutriNet-Santé cohort by
A summary of the indicators used for validation and concordance analyses is provided in
Several statistical procedures were used to assess the validity of Web-based self-reported anthropometrics by comparing them to the reference values measured by the technician. The differences between self-reported and measured weight, height, and resulting BMI were calculated.
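For readers who want to reproduce the continuous-variable indicators named in this paper, the sketch below computes Bland–Altman agreement on log-transformed values (back-transformed to a percent ratio) and a two-way random-effects, absolute-agreement, single-measure ICC, usually written ICC(2,1). It is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the 1.96 multiplier is the conventional choice for 95% limits of agreement, and the sample values are invented.

```python
import math
import statistics


def bland_altman_percent(self_reported, measured):
    """Return (mean agreement %, lower LOA %, upper LOA %) on the ratio scale."""
    # A difference of logs equals the log of the ratio self-reported / measured.
    log_ratios = [math.log(s) - math.log(m) for s, m in zip(self_reported, measured)]
    mean_diff = statistics.mean(log_ratios)
    sd_diff = statistics.stdev(log_ratios)  # sample SD of the paired differences
    lower = mean_diff - 1.96 * sd_diff
    upper = mean_diff + 1.96 * sd_diff
    # Antilog, then express as a percent of the measured value.
    return tuple(100.0 * math.exp(v) for v in (mean_diff, lower, upper))


def icc_2_1(x, y):
    """ICC(2,1) for two measurements per participant, from two-way ANOVA mean squares."""
    n, k = len(x), 2
    grand = (sum(x) + sum(y)) / (n * k)
    row_means = [(a + b) / k for a, b in zip(x, y)]
    col_means = [sum(x) / n, sum(y) / n]
    ss_rows = k * sum((rm - grand) ** 2 for rm in row_means)
    ss_cols = n * sum((cm - grand) ** 2 for cm in col_means)
    ss_total = sum((v - grand) ** 2 for v in list(x) + list(y))
    ss_error = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)               # between-participant mean square
    msc = ss_cols / (k - 1)               # between-method mean square
    mse = ss_error / ((n - 1) * (k - 1))  # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Invented weights (kg) for five participants.
web = [61.0, 72.0, 85.0, 54.0, 98.0]
clinic = [61.5, 72.0, 86.5, 54.5, 100.0]
print(bland_altman_percent(web, clinic))
print(icc_2_1([math.log(v) for v in web], [math.log(v) for v in clinic]))
```

Because ICC(2,1) measures absolute agreement, a constant shift between the two methods lowers it even when the ranking of participants is unchanged, which is why it suits a comparison against a clinical reference.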
The percentage of agreement between self-reported and measured categories of BMI was calculated, and the degree of misclassification was assessed through the weighted kappa coefficient. McNemar tests were carried out for the binary variables (1) overweight including obesity and (2) obese. Sensitivity and specificity for overweight and obese were also calculated as true positives/(true positives + false negatives) and true negatives/(true negatives + false positives), with the true measure being the clinical data.
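The categorical indicators above can be sketched in a few lines. The code is illustrative only: function names and sample data are hypothetical, BMI categories are coded 0 (normal), 1 (overweight), and 2 (obese), the kappa weights are the linear weights 1 – |i – j|/2 given in the table footnotes, and the McNemar test shown is the asymptotic chi-square version without continuity correction, which is an assumption because the paper does not say which variant was used.

```python
import math


def percent_agreement(reported, measured):
    """Share of participants placed in the same category by both methods."""
    same = sum(1 for r, m in zip(reported, measured) if r == m)
    return 100.0 * same / len(reported)


def weighted_kappa(reported, measured, n_categories=3):
    """Weighted kappa with linear weights w_ij = 1 - |i - j| / (k - 1)."""
    n, k = len(reported), n_categories
    table = [[0] * k for _ in range(k)]
    for r, m in zip(reported, measured):
        table[r][m] += 1
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]
    observed = expected = 0.0
    for i in range(k):
        for j in range(k):
            w = 1.0 - abs(i - j) / (k - 1)
            observed += w * table[i][j] / n
            expected += w * row_tot[i] * col_tot[j] / (n * n)
    return (observed - expected) / (1.0 - expected)


def sensitivity_specificity(reported_pos, measured_pos):
    """Clinical measurement is the truth; returns (sensitivity, specificity)."""
    pairs = list(zip(reported_pos, measured_pos))
    tp = sum(1 for r, m in pairs if r and m)
    fn = sum(1 for r, m in pairs if not r and m)
    tn = sum(1 for r, m in pairs if not r and not m)
    fp = sum(1 for r, m in pairs if r and not m)
    return tp / (tp + fn), tn / (tn + fp)


def mcnemar(reported_pos, measured_pos):
    """Asymptotic McNemar chi-square (1 df) on discordant pairs; returns (stat, P)."""
    pairs = list(zip(reported_pos, measured_pos))
    b = sum(1 for r, m in pairs if r and not m)
    c = sum(1 for r, m in pairs if not r and m)
    if b + c == 0:
        return 0.0, 1.0
    stat = (b - c) ** 2 / (b + c)
    return stat, math.erfc(math.sqrt(stat / 2.0))  # chi-square survival, 1 df


# Invented categories for eight participants.
web_cat = [0, 0, 1, 1, 2, 0, 1, 2]
clinic_cat = [0, 0, 1, 2, 2, 1, 1, 2]
obese_web = [c == 2 for c in web_cat]
obese_clinic = [c == 2 for c in clinic_cat]
print(percent_agreement(web_cat, clinic_cat), weighted_kappa(web_cat, clinic_cat))
print(sensitivity_specificity(obese_web, obese_clinic), mcnemar(obese_web, obese_clinic))
```

With linear weights, a participant who slips one category (for example, obese reported as overweight) still earns half credit, whereas a two-category error earns none.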
The same procedures were used for the concordance study between self-reported Web-based questionnaire and face-to-face interview, namely paired
Because participants who answered the Web-based anthropometric questionnaire 3 days before attending the visit knew that they would be measured, this could lead to overagreement between self-reported and measured data. To address this potential bias, we performed the following sensitivity analyses: a second validity sample included participants who filled in the regular Web-based anthropometric questionnaire (available every 6 months) within 2 months before attending the visit. The visit was not necessarily scheduled at the time of completion; hence, participants were unaware of an upcoming measurement. A time lag of a maximum of 2 months was chosen to limit actual weight variations. The second validity sample consisted of 2078 participants. Among them, a second concordance sample of 233 participants with available data from the face-to-face declaration was drawn.
All statistical tests were 2-sided and
The characteristics of the entire NutriNet-Santé cohort and of the validity and concordance samples are presented in
Men and women underreported their weight by –0.40 kg (SD 1.45) and –0.52 kg (SD 1.42), respectively, and overreported their height by 0.61 cm (SD 1.40) and 0.55 cm (SD 2.66), leading to an underreporting of BMI of –0.32 kg/m2 (SD 0.66) for men and –0.34 kg/m2 (SD 1.67) for women (all
Validity of continuous variables is presented in
To investigate determinants of differential bias, we regressed the difference between self-reported and measured BMI values on covariates. BMI category showed a significant effect (crude and adjusted for covariates: sex, age, LTPA, occupation, education, and smoking). BMI underreporting was –0.16, –0.36, and –0.63 kg/m2 among normal, overweight, and obese participants, respectively, in the adjusted model. Weight underreporting was significantly associated with BMI category (more underreporting among obese and overweight vs normal) and sex (women underreported more than men). Height overreporting was positively associated with BMI category (more overreporting among obese and overweight vs normal) and age. Crude differences by sex, across BMI and age categories are reported in
As shown in
As presented in
Characteristics of the validation study sample (N=2513) and the concordance study sample (n=815) from the NutriNet-Santé Study, 2012, France.
Participants’ characteristics | Validity sample (V) | Concordance sample (C)a | NutriNet-Santé cohort (CO) | P, V vs CO | P, C vs CO
Age (years), mean (SD) | 53.8 (13.3) | 53.6 (13.0) | 45.1 (14.5) | <.001 | <.001
Weight (kg) | | | | |
Mean (SD) | 66.8 (13.2) | 66.5 (13.4) | 67.3 (15.1) | .06 | .11
Median (IQR) | 65 (57-75) | 64 (57-74) | 64 (57-75) | |
Height (cm) | | | | |
Mean (SD) | 166.3 (8.3) | 165.7 (8.5) | 166.8 (8.5) | .003 | .001
Median (IQR) | 165 (160-172) | 165 (160-170) | 166 (161-172) | |
BMI (kg/m2) | | | | |
Mean (SD) | 24.1 (4.3) | 24.2 (4.4) | 24.2 (5.2) | .49 | .95
Median (IQR) | 23.3 (21.1-26) | 23.5 (21.2-26.2) | 23.1 (20.8-26.2) | |
Female, n (%) | 1835 (73.0) | 606 (74.4) | 90,382 (78.1) | <.001 | .01
Living with a partner, n (%) | 1860 (74.0) | 607 (74.5) | 82,480 (71.2) | .001 | .04
BMI category, n (%) | | | | .26 | .61
Normal (<25 kg/m2) | 1604 (63.8) | 513 (62.9) | 76,879 (67.2) | |
Overweight (25-29 kg/m2) | 643 (25.6) | 210 (25.8) | 25,396 (22.2) | |
Obese (≥30 kg/m2) | 266 (10.6) | 92 (11.3) | 12,125 (10.6) | |
Education, n (%) | | | | .44 | .96
Primary school | 78 (3.2) | 22 (2.8) | 3854 (3.4) | |
Secondary school | 491 (20.1) | 156 (19.8) | 19,971 (17.6) | |
High school diploma | 374 (15.3) | 113 (14.3) | 20,557 (18.1) | |
University < bachelor degree | 746 (30.5) | 264 (33.4) | 33,362 (29.5) | |
University ≥ bachelor degree | 757 (31.0) | 235 (29.8) | 35,552 (31.4) | |
Occupational category, n (%) | | | | .39 | .94
Never employed | 55 (2.2) | 18 (2.2) | 6646 (5.7) | |
Self-employed, farmers | 101 (4.0) | 33 (4.1) | 3951 (3.4) | |
Manual workers | 53 (2.1) | 21 (2.6) | 3509 (3.0) | |
Intermediate professions | 1372 (54.6) | 436 (53.5) | 65,223 (56.3) | |
Managerial/professional | 932 (37.1) | 307 (37.7) | 36,455 (31.5) | |
Smoking status, n (%) | | | | <.001 | .001
Current smoker | 241 (9.6) | 86 (10.5) | 2079 (18.0) | |
Former smoker | 999 (39.7) | 320 (39.3) | 38,324 (33.1) | |
Never smoker | 1273 (50.7) | 409 (50.2) | 5667 (48.9) | |
Physical activity level, n (%) | | | | <.001 | .79
Low | 498 (20.3) | 176 (22.1) | 27,212 (25.6) | |
Medium | 1002 (40.8) | 300 (37.6) | 44,239 (41.7) | |
High | 954 (38.9) | 322 (40.3) | 34,695 (32.7) | |
Monthly income per household unit (euros), n (%) | | | | <.001 | <.001
Don’t want to answer | 261 (10.4) | 91 (11.2) | 14,929 (13.5) | |
<1257 | 302 (12.0) | 112 (13.7) | 23,511 (21.3) | |
1257-1835 | 508 (20.2) | 166 (20.4) | 23,606 (21.4) | |
1835-2700 | 674 (26.8) | 225 (27.6) | 24,329 (22.1) | |
>2700 | 768 (30.6) | 221 (27.1) | 23,849 (21.6) | |
aNo significant difference was observed between the validity and concordance samples (all
b
c
dReduced sample size because of missing values; validity sample: n=2454 for physical activity level; concordance sample: n=798 for physical activity level; cohort: n=114,400 for BMI, n=113,296 for education, n=106,146 for physical activity level.
Validity indicators of weight, height, and body mass index (BMI) including intraclass correlation coefficient (ICC) between the Web-based self-report and measurement at the clinical examination, Bland–Altman mean agreement, and limits of agreement (LOA) from the NutriNet-Santé Study, 2012, France (N=2513).
Anthropometric variable | Web-based, mean | Web-based, SD | Measured, mean | Measured, SD | Difference, mean | Difference, SD | Pa | ICCb | ICC 95% CI | % mean agreementc | % mean agreement 95% CI | % LOAd, lower limit | % LOAd, upper limit
Weight (kg) | 66.84 | 13.60 | 67.33 | 13.74 | –0.49 | 1.43 | <.001 | 0.99 | 0.99, 0.99 | 99.28 | 99.20, 99.37 | 95.11 | 103.64 |
Height (cm) | 166.30 | 8.48 | 165.73 | 8.32 | 0.56 | 2.39 | <.001 | 0.94 | 0.94, 0.95 | 100.33 | 100.27, 100.40 | 97.06 | 103.72 |
BMI (kg/m2) | 24.12 | 4.44 | 24.46 | 4.41 | –0.34 | 1.47 | <.001 | 0.97 | 0.97, 0.97 | 98.61 | 98.47, 98.77 | 91.12 | 106.74 |
a
bICC(2,1) calculated on log-transformed variables.
cBland–Altman mean agreement (average of difference self-reported – measured). A mean agreement of 100% represents exact agreement between the 2 methods.
dLOA: limits of agreement of self-reported value expressed as a percent of the measured value. Because results were antilogged after analysis, the LOA are given as ratio Web:measured.
Validity indicators for categorical variables including percent of similar classification and weighted kappa coefficient for overweight and obesity classification between the Web-based declaration and reference measurement at clinical examination from the NutriNet-Santé Study, 2012, France (N=2513).
Categorical anthropometric variable | Web-based, n | Web-based, % | Measured, n | Measured, % | Agreement, % | Agreement 95% CI | Weighted kappaa, κ | κ 95% CI | Pb | Sensitivityc,d, % | Sensitivity 95% CI | Specificityc,e, % | Specificity 95% CI
BMI classification | | | | | 93.2 | 92.2, 94.1 | 0.89 | 0.88, 0.91 | | | | |
Normal (BMI<25) | 1695 | 67.45 | 1598 | 63.59 | | | | | | | | |
Overweight (BMI 25-29.9) | 590 | 23.48 | 645 | 25.67 | | | | | <.001 | 87.9 | 0.86, 0.90 | 99.1 | 98.7, 99.6
Obese (BMI≥30) | 228 | 9.07 | 270 | 10.74 | | | | | <.001 | 83.3 | 78.9, 87.8 | 99.9 | 99.7, 100
aCicchetti–Allison weight. For a given cell in row i, column j, wij=1–(|i–j|/2).
b
cSensitivity and specificity for binary variables: overweight including obesity (BMI≥25) and obese (BMI≥30).
dSensitivity=true positives/(true positives + false negatives).
eSpecificity=true negatives/(true negatives + false positives). True = clinical data.
Bland–Altman plot of self-reported versus measured values of BMI, NutriNet-Santé study, 2012, France. Horizontal lines represent the % mean difference and 95% limits of agreement.
Concordance indicators for continuous variables including intraclass correlation coefficient (ICC) between Web-based and face-to-face reported data, Bland–Altman mean agreement, and limits of agreement (LOA) from the NutriNet-Santé Study, 2012, France (n=815).
Anthropometric variable | Web-based, mean | Web-based, SD | Face-to-face, mean | Face-to-face, SD | Difference, mean | Difference, SD | Pa | ICCb | ICC 95% CI | % mean agreementc | % mean agreement 95% CI | % LOAd, lower limit | % LOAd, upper limit
Weight (kg) | 66.60 | 13.45 | 66.60 | 13.49 | 0.00 | 1.14 | .31 | 0.996 | 0.995, 0.996 | 100.01 | 99.89, 100.14 | 96.46 | 103.69 |
Height (cm) | 165.75 | 8.50 | 165.71 | 8.24 | 0.04 | 2.21 | .77 | 0.958 | 0.951, 0.963 | 100.02 | 99.91, 100.12 | 97.13 | 102.98 |
BMI (kg/m2) | 24.20 | 4.40 | 24.19 | 4.28 | 0.01 | 1.20 | .78 | 0.979 | 0.976, 0.982 | 100.00 | 99.80, 100.27 | 93.43 | 107.10 |
a
bICC: intraclass correlation (2,1) calculated on log-transformed variables.
cBland and Altman mean agreement (average of differences “Web-based minus face-to-face”). A mean agreement of 100% represents exact agreement between the 2 questionnaires.
dLOA: limits of agreement of Web-based self-reported value expressed as a percent of the face-to-face reported value. Because results were antilogged after analysis, the LOA are given as ratio Web-based/face-to-face.
Concordance indicators for categorical variables: percent of similar classification and weighted Kappa coefficient for overweight and obesity classification between Web-based and face-to-face reported data from the NutriNet-Santé Study, 2012, France (n=815).
Categorical anthropometric variable | Web-based, n | Web-based, % | Face-to-face, n | Face-to-face, % | Agreement, % | Agreement 95% CI | Weighted kappaa, κ | κ 95% CI | Pb
BMI classification | | | | | 98.5 | 97.7, 99.4 | 0.97 | 0.96, 0.99 |
Normal (BMI<25) | 547 | 67.1 | 546 | 67.0 | | | | |
Overweight (BMI 25-29.9) | 193 | 23.7 | 188 | 23.1 | | | | | 1.00
Obese (BMI≥30) | 75 | 9.2 | 81 | 9.9 | | | | | .01
aCicchetti–Allison weight. For a given cell in row i, column j, wij=1–(|i–j|/2)
b
Sensitivity analyses in the second validity sample (n=2078) showed results similar to those of the validity sample; the validity indicators (ICC, kappa, percent agreement) were even slightly higher (
In the present study, we observed that Web-based self-report of anthropometrics in the NutriNet-Santé study is equivalent to a face-to-face interview. Although, as hypothesized, it is subject to bias as compared with direct measures, the bias is reasonably small and the validity indicators show good reliability of these data.
Overall, our results showed high validity of self-reported anthropometric data compared with measured values. However, we observed a small although significant underreporting of weight and BMI and an overreporting of height, which was expected and is consistent with previous research [
No difference in misreporting was observed between men and women for height, whereas it has been previously suggested that men tended to overreport their height more strongly than women [
Although underreporting of BMI and weight and overreporting of height was observed in every BMI category, their magnitude differed and we found that objective overweight and obesity were the strongest predictors for underreporting of weight and BMI and overreporting of height, similar to many studies [
Method of data collection can influence responses to surveys [
In contrast, and as hypothesized, our study showed almost perfect agreement between Web-based reporting and the face-to-face interview, suggesting that participants behind a computer screen are no more prone to social desirability bias. This may be explained by the greater feeling of anonymity on the Web than on the telephone [
We were aware that the Web-based reporting might be partly biased because participants theoretically knew they would be weighed a few days later, thus limiting prevarication bias. However, the sensitivity analysis provided similar results, with even higher values of Web-based weight vs face-to-face, closer to the true measure. This shows an advantage of Web-based self-report compared with telephone interview as we previously demonstrated concerning dietary data [
The first limitation pertains to a potential underestimation of the difference between Web-based reports and measures because participants in our study knew they would attend the visit 3 days after filling in the Web-based questionnaire. However, the sensitivity analyses with data collected within 2 months before the visit showed similar results—even slightly higher validity—indicating that the difference seems not to be reduced by awareness of the upcoming examination.
Second, caution is also advised regarding the generalizability of our results. Indeed, the participants of the NutriNet-Santé study were recruited on a voluntary basis, implying that they might be particularly likely to engage in healthy behaviors; thus, a self-selection bias could have occurred in our population as in most prospective cohort studies. In particular, participants were invited to answer an anthropometric questionnaire twice a year, so they were likely to be more aware of their true weight. Further, the present validation study is subject to an additional selection bias related to participation in the visit because some characteristics, such as age, smoking status, or LTPA, were significantly different between the validation sample and the entire cohort. However, even if some socioeconomic characteristics were different, educational level, occupation, and the main outcomes, anthropometric values, were not significantly different from those of the entire cohort. Also, among the participants who attended the clinical examination, those participating in the face-to-face interview were randomly allocated.
A major strength of this validation study is its originality. This is the second study assessing validity of anthropometric data collected through a Web-based tool, but we used a wider range of statistical tools that allowed analyzing the validity in more depth on a wider sample than in the recently published study [
In conclusion, this study indicates that Web-based weight and height data from the NutriNet-Santé study can be considered valid enough to be used when studying associations of nutritional factors with anthropometric and health outcomes. However, underreporting of weight and BMI and overreporting of height were stronger among overweight and obese participants, and we showed misclassification of overweight (sensitivity 87.8%) and obesity (sensitivity 83.3%), which leads us to advise caution when overweight and obesity are the main outcomes. Although Web-based self-report is subject to the biases inherent to self-reported anthropometric measurements, the magnitude of such biases can be considered comparable to that of face-to-face interviews. Therefore, Web-based self-reported data appear to be an accurate and useful tool to assess anthropometric data.
Statistical analyses for validity and concordance of anthropometrics, NutriNet-Santé Study, France, 2012.
Difference between Web-based self-report and measured anthropometrics according to BMI classification and age class, by sex, NutriNet-Santé study, France 2012.
Sensitivity analyses among subsample with a time lag between Web-based self-report and measurement < 2 months, NutriNet-Santé study, France, 2012.
BMI: body mass index
ICC: intraclass correlation coefficient
LOA: limits of agreement
LTPA: leisure time physical activity
This work was supported by grants from the Région Ile de France (CORDDIM) and Fondation Coeur et Artères.
The NutriNet-Santé study is supported and has received grants by the following institutions: Ministère de la Santé (DGS), Institut de Veille Sanitaire (InVS), Institut National de la Prévention et de l’Education pour la Santé (INPES), Fondation pour la Recherche Médicale (FRM), Institut de Recherche en Santé Publique (IRESP), Institut National de la Santé et de la Recherche Médicale (INSERM), Institut National de la Recherche Agronomique (INRA), Conservatoire National des Arts et Métiers (CNAM), and Université Paris 13.
We thank all the scientists who carried out the NutriNet-Santé study and their teams, technicians, and assistants, who assisted their work. We especially thank Laurent Bourhis, data manager; Yasmina Chelghoum, Paul Flanzy, and Mohand Ait Oufella, computer scientists; and Karine Prevost and Nabil Cherifi, technicians.
None declared.