This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included.
Our study aimed to conduct an external validation of the C-Score in the US population and expand the original score to improve its predictive capabilities in the US population. The C-Score is intended for mobile health apps on wearable devices.
We conducted a literature review to identify relevant variables that were missing in the original C-Score. Subsequently, we used data from the 2005 to 2014 US National Health and Nutrition Examination Survey (NHANES; N=21,015) to test the capacity of the model to predict all-cause mortality. We used NHANES III data from 1988 to 1994 (N=1440) to conduct an external validation of the test. Only participants with complete data were included in this study. Discrimination and calibration of the adapted C-Score were assessed using receiver operating characteristic curves and a design-based goodness-of-fit test.
Higher C-Scores were associated with reduced odds of all-cause mortality (odds ratio 0.96,
Our study shows that this digital biomarker, the C-Score, has good capabilities to predict all-cause mortality in the general US population. An expanded health score reached an area under the curve of 0.87 for all-cause mortality in the US population. This model can be used as an instrument to assess individual mortality risk and as a counseling tool to motivate behavior changes and lifestyle modifications.
In the United States, 60% of all adults have at least one chronic condition, and 42% have >1 [
In recent years, a number of risk-scoring algorithms and models have demonstrated the capacity to predict adverse health outcomes such as the risk of developing cardiovascular disease [
The C-Score, derived from metrics that are easily reported by a person and augmented by measures derivable from most smartphones, is designed as a tool for individualized health risk prediction and can be used as a basis for directing targeted lifestyle modifications to reduce the risk of future adverse outcomes. Clift et al [
In this study, we conducted an external validation of the C-Score in the US population and expanded the original score to improve its predictive capabilities in the US population [
For the external validation, we assessed the discrimination and calibration of the original C-Score in the US population using the US National Health and Nutrition Examination Survey (NHANES). For the expansion and adaptation of the model, we reviewed the literature and tested additional predictors of all-cause mortality in the US population to improve the predictive capacity of the model.
The risk models were developed following an extensive literature review that identified key risk factors for all-cause mortality [
Points-based score assigned to each explanatory variable for the original C-Score model.a
C-Score input | Points assigned, range |
Resting heart rate (beats per minute) | 0-7.83 |
Average hours of sleep per night | 0-10.26 |
Waist to height ratio | 0-10.8 |
Self-rated health (ordinal scale: excellent, good, fair, and poor) | 0-31.32 |
Cigarette smoking (status and cigarettes per day) | 0-12.96 |
Alcohol consumption (units per week) | 0-19.44 |
Reaction time | 0-6.75 |
aThe reaction time variable is not present in the main National Health and Nutrition Examination Survey sample. Therefore, we did not include this in the main analysis. For the sensitivity analysis, we did not include alcohol consumption or sleep duration as these variables were not present in the National Health and Nutrition Examination Survey III.
The NHANES is a large cross-sectional population-based survey that combines interviews with physical examinations, thereby serving as a rich source of both self-reported and directly measured biometric data. Each survey round includes a nationally representative sample of approximately 5000 individuals and is conducted regularly. The NHANES questionnaire elicits information pertaining to sociodemographic, dietary, physical, and health-related characteristics. Details of the NHANES study design have been described in previous studies [
As mortality data are not collected as part of the NHANES, the National Center for Health Statistics has matched 1999 to 2014 NHANES data with death certificate records from the National Death Index (NDI) and made the linked files available for public use. Mortality ascertainment was based on a probabilistic match between the NHANES and NDI death certificate records using participants’ social security number, first name, middle initial, last name or father’s surname, month of birth, day of birth, year of birth, state of birth, state of residence, race, and sex, yielding a sample of 28,033 participants with complete information on mortality. The methodology for the data linkage has been described in detail by the National Center for Health Statistics [
We linked the anonymized NHANES survey data with the anonymized NDI mortality data, which included mortality follow-up through December 31, 2015. The matching yielded a sample of 28,033 participants, which served as the basis both for the external validation of the C-Score and for the adaptation and expansion of the model to improve its performance in the US population.
Following the development of the adapted model, we conducted another round of validation as a sensitivity analysis, using data obtained from NHANES III, a survey conducted from 1988 to 1994, which included the mortality data of 6591 participants. The NHANES III data lacked 2 of the 7 variables included in the risk model (sleep duration and alcohol consumption); therefore, the C-Score was calculated in the absence of these risk factors.
The explanatory variables in this study were extracted from the questionnaire data and examination data from the 5 NHANES waves. The questionnaire data included age (in years), cigarette consumption (average number of cigarettes per day), alcohol consumption (average number of alcoholic drinks per week), and sleep duration (hours per day). Self-rated health was transformed from a 5-point scale (from poor to excellent) into a 4-point scale in which
We conducted a subsequent literature review of predictors of all-cause mortality in the United States and identified a set of clinical factors and sociodemographic variables for which there is evidence of an association with mortality. As we wanted to ensure the usability of the smartphone app, we sought to create the most parsimonious model with maximal performance based on the combination of the Akaike Information Criterion, AUC, and goodness of fit. In addition to the variables used to construct the original C-Score, we investigated the predictive value of including sociodemographic characteristics such as gender, race or ethnicity, marital status, and educational attainment, as well as simple medical history variables shown to be associated with mortality, such as binary variables
To validate the original C-Score, we tested the model using the pooled NHANES data. However, as NHANES lacks the reaction time variable, which is one of the variables used to compute the C-Score, we conducted a sensitivity analysis using data from NHANES III, a smaller survey that collected data on reaction time, to measure the marginal effect of the reaction time variable. Following the validation and sensitivity analysis, we incorporated additional variables into the model and investigated their internal and external validity.
For all models, we used a complete case approach, whereby the only participants included were those for whom a risk score based on all risk factors could be computed (ie, for whom there were no missing data on any of the included variables). We pooled NHANES data from 2005 to 2014, which contained 6 of the 7 variables in the original C-Score model (all except reaction time). As the NHANES did not collect the reaction time variable, all individuals were assumed to have the maximum score for that variable in this validation exercise.
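The complete case rule and the handling of the missing reaction time item can be sketched as follows. This is a minimal illustration and not the authors' Stata code: the maximum points per input are taken from the points table in this article, whereas the function name, the record layout, and the example point values are hypothetical, and the rules that assign points to each input are not reproduced.

```python
from typing import Optional

# Maximum points per C-Score input, taken from the points table in this article.
MAX_POINTS = {
    "resting_heart_rate": 7.83,
    "sleep": 10.26,
    "waist_to_height": 10.8,
    "self_rated_health": 31.32,
    "smoking": 12.96,
    "alcohol": 19.44,
    "reaction_time": 6.75,
}

# Inputs available in the pooled 2005-2014 NHANES data (all except reaction time).
NHANES_INPUTS = [name for name in MAX_POINTS if name != "reaction_time"]


def c_score_nhanes(points: dict) -> Optional[float]:
    """Sum per-input points; return None unless all 6 NHANES inputs are present.

    `points` maps input name -> points already assigned to a participant
    (hypothetical layout). Reaction time is unavailable in the main sample,
    so it is set to its maximum, as described in the text.
    """
    if any(points.get(name) is None for name in NHANES_INPUTS):
        return None  # complete case approach: exclude participants with missing data
    for name in NHANES_INPUTS:
        if not 0 <= points[name] <= MAX_POINTS[name]:
            raise ValueError(f"{name} points outside 0-{MAX_POINTS[name]}")
    return sum(points[name] for name in NHANES_INPUTS) + MAX_POINTS["reaction_time"]


# Hypothetical participants: one complete record and one missing sleep data.
complete = {"resting_heart_rate": 5.0, "sleep": 8.0, "waist_to_height": 6.0,
            "self_rated_health": 20.0, "smoking": 12.96, "alcohol": 15.0}
incomplete = dict(complete, sleep=None)

print(c_score_nhanes(complete))
print(c_score_nhanes(incomplete))  # None
```

Returning `None` for incomplete records keeps the exclusion explicit, so the analytic sample can be obtained by filtering out participants without a score.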
In the complete case analysis, there were 21,015 participants (aged 18-85 years) with complete information on mortality, age, and all metrics included in the C-Score. This population with a wide age range was selected as one would expect to see greater variability in the exposure variables, thus permitting better exploration of the models. Furthermore, to produce estimates with a population similar to that in the Clift et al [
In all cases, we ran an additional analysis including both the C-Score and the logarithm of age, as performed by Clift et al [
As the main NHANES sample lacks the reaction time variable, one of the variables used in the original C-Score, we conducted a sensitivity analysis using NHANES III, a survey conducted from 1988 to 1994 that contains data, including mortality data, for 33,994 people aged ≥2 months, to ascertain the marginal effect of the reaction time variable. Owing to the limited number of people with neurobehavioral indicators, we did not impose age limits in this sensitivity analysis.
The NHANES III data set contains the reaction time variable but lacks 2 of the 7 variables included in the risk model (sleep duration and alcohol consumption). The lack of these variables should drive the fit and calibration of the model downward, and therefore, any results in this sensitivity analysis would be conservative. In this sensitivity analysis, we tested the sensitivity of the 5-variable model to the inclusion and exclusion of the reaction time variable. The complete case analysis yielded data from 1440 participants.
All data analyses were performed using Stata 15 (StataCorp), using survey weights to specify the survey and sample design characteristics. In addition, a dummy variable for the survey round was included in the models with pooled data. For all models,
We examined the impact of including additional variables on calibration and discrimination [
This study follows the TRIPOD (Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis) guidelines for multivariable prediction models [
The NHANES survey is approved by the National Center for Health Statistics Institutional Ethics Review Board. Written informed consent was obtained from all adult participants. Ethical approval to conduct this analysis was not required as we used publicly available data. This study was approved by the institutional review board of the Johns Hopkins Bloomberg School of Public Health and was deemed nonhuman subject research (13743).
The data sets analyzed in this study are publicly available on the NHANES website. The C-Scores are proprietary information but can be provided as restricted data to the reviewers.
We obtained 28,078 NHANES records for the 2005 to 2014 survey rounds. Of these, 99.84% (28,033/28,078) were matched with mortality data and 74.85% (21,015/28,078) had complete information on all variables. A flowchart of the sample sizes for the main analysis, sensitivity analysis, and adaptation of the model is shown in
Flowchart for sample sizes for National Health and Nutrition Examination Survey (NHANES) study samples.
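The attrition in the main analysis can be checked with a few lines of arithmetic. The sketch below uses only the sample sizes reported in this article; the step labels and the helper name are illustrative, and percentages are rounded to 1 decimal place.

```python
# Sample sizes reported in this article for the 2005-2014 NHANES analysis.
flow = [
    ("NHANES records, 2005-2014", 28_078),
    ("Matched with mortality data", 28_033),
    ("Complete data on all C-Score variables", 21_015),
]


def retention(steps):
    """Return (label, n, percentage of the first step retained) for each step."""
    baseline = steps[0][1]
    return [(label, n, round(100 * n / baseline, 1)) for label, n in steps]


for label, n, pct in retention(flow):
    print(f"{label}: {n:,} ({pct}%)")
```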
Descriptive statistics for the different samples used in the study.a
Variable | Full study sample (N=21,015) | Age-restricted sample (40-69 years; N=9994) | NHANESb III subsample for the sensitivity analysis (N=1440)
Age (years), mean (SD) | 47.43 (17.97) | 53.78 (8.53) | 47.85 (5.80)
Sex, n (%) | | |
Male | 10,094 (48.03) | 4764 (47.67) | 655 (45.49)
Female | 10,921 (51.97) | 5230 (52.33) | 785 (54.51)
Race or ethnicity, n (%) | | |
Mexican American | 3334 (15.86) | 1621 (16.22) | 373 (25.9)
Other Hispanic | 1891 (9) | 997 (9.98) | 36 (2.5)
Non-Hispanic White | 9519 (45.3) | 4215 (42.18) | 615 (42.71)
Non-Hispanic Black | 4413 (21) | 2314 (23.15) | 403 (27.99)
Other race (including multiracial) | 1858 (8.84) | 847 (8.48) | 13 (0.9)
Resting heart rate, mean (SD) | 72.83 (12.11) | 72.42 (11.94) | 69.14 (10.80)
Waist to height ratio, mean (SD) | 0.59 (0.10) | 0.60 (0.09) | 0.58 (0.09)
Weekly alcohol intake, mean (SD) | 3.63 (8.36) | 3.88 (9.07) | N/Ac
Sleep duration, mean (SD) | 6.85 (1.40) | 6.73 (1.38) | N/A
Self-rated health, n (%) | | |
Excellent or very good | 8169 (38.87) | 3515 (35.17) | 558 (38.75)
Good | 8425 (40.09) | 4007 (40.09) | 538 (37.36)
Fair | 3790 (18.03) | 2078 (20.79) | 292 (20.28)
Poor | 631 (3) | 394 (3.94) | 52 (3.61)
Number of cigarettes per day, mean (SD) | 3.28 (7.60) | 3.94 (8.63) | 6.67 (11.71)
Comorbidities, n (%) | 3790 (18.41) | 2217 (22.27) | 183 (12.81)
aSurvey weights are not included in this descriptive analysis.
bNHANES: National Health and Nutrition Examination Survey.
cN/A: not applicable.
There were 21,015 participants in the pooled data with complete information on mortality, age, and other C-Score metrics. The mean age of the sample was 47.43 (SD 17.97) years, the mean resting heart rate was 72.83 (SD 12.11) beats per minute, the mean WtHR was 0.59 (SD 0.10), mean weekly alcohol intake was 3.63 (SD 8.63) drinks per week, and mean sleep duration was 6.85 (SD 1.40) hours. For self-rated health, 38.87% (8169/21,015) were
In the validation subsample (among participants aged 40-69 years), there were 9994 participants with a mean age of 53.78 (SD 8.53) years, mean resting heart rate of 72.42 (SD 11.94) beats per minute, mean WtHR of 0.60 (SD 0.09), mean weekly alcohol intake of 3.88 (SD 9.07) drinks per week, and mean sleep duration of 6.73 (SD 1.38) hours. For self-rated health, 35.17% (3515/9994) were
Performance of the C-Score models for all-cause mortality by subsample.a
Outcome | C-Score model: score ORb (P value) | C-Score model: goodness of fit, F (df) | C-Score model: goodness of fit, P value (fit) | C-Score model: AUCc (95% CI) | C-Score model: AICd | C-Score plus log (age): score OR (P value) | C-Score plus log (age): goodness of fit, F (df) | C-Score plus log (age): goodness of fit, P value (fit) | C-Score plus log (age): AUC (95% CI) | C-Score plus log (age): AIC
Full study sample (N=21,015) | 0.96 (<.001) | 0.52 (9,71) | .86 (good) | 0.72 (0.70-0.73) | 8897.78 | 0.96 (<.001) | 7.25 (9,71) | <.001 (poor) | 0.86 (0.85-0.87) | 7272.50
Age-restricted sample (40-69 years; N=9994) | 0.95 (<.001) | 1.16 (9,71) | .34 (good) | 0.72 (0.70-0.75) | 3458.24 | 0.95 (<.001) | 0.50 (9,71) | .87 (good) | 0.75 (0.73-0.77) | 3366.48
aAll models include dummy variables for the survey rounds. Survey weights were included in all analyses.
bOR: odds ratio.
cAUC: area under the curve.
dAIC: Akaike Information Criterion.
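As a worked illustration of the score odds ratio, suppose that the OR of 0.96 reported for the full study sample refers to a 1-point increase in the C-Score entered as a continuous term in a logistic model. This scaling is an assumption, as it is not stated in the table. Under that assumption, a difference of Δ points corresponds to:

```latex
\[
\mathrm{OR}_{\Delta} = \mathrm{OR}^{\Delta},
\qquad
\mathrm{OR}_{10} = 0.96^{10} \approx 0.66
\]
```

That is, a C-Score that is 10 points higher would correspond to roughly one-third lower odds of all-cause mortality.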
In the sensitivity analysis, we obtained data from NHANES III (1988-1994) on 6591 participants, of whom 21.85% (1440/6591) had complete data to conduct the validation.
Sensitivity analysis on all-cause mortality for the marginal effect of the reaction time variable using NHANESa III (N=1440).b
Outcome | Score ORc (P value) | Goodness of fit, F (df) | Goodness of fit, P value (fit) | AUCd (95% CI) | AICe
C-Score model performance with reaction time | 0.92 (<.001) | 2.97 (9,41) | .01 (poor) | 0.68 (0.65-0.72) | 1556.57
C-Score model performance without reaction time | 0.91 (<.001) | 1.82 (9,41) | .09 (good) | 0.68 (0.65-0.72) | 1555.48
C-Score model plus log age performance with reaction time | 0.92 (<.001) | 0.86 (9,41) | .56 (good) | 0.72 (0.69-0.75) | 1438.43
C-Score model plus log age performance without reaction time | 0.92 (<.001) | 0.97 (9,41) | .48 (good) | 0.72 (0.69-0.75) | 1485.04
aNHANES: National Health and Nutrition Examination Survey.
bAll models included a dummy variable for the survey rounds. Survey weights were included in all analyses. For the models with reaction time, the C-Score was calculated using 5 of the 7 covariates: waist to height ratio, self-rated health, resting heart rate, smoking, and reaction time. For the models without reaction time, the C-Score was calculated using 4 of the 7 covariates.
cOR: odds ratio.
dAUC: area under the curve.
eAIC: Akaike Information Criterion.
Of the 21,015 participants with complete information on the C-Score metrics, 20,626 (98.15%) had information on sociodemographic characteristics, and of those, 16,671 (80.83%) had complete information on medical history variables. Thus, the final analytic sample in which the C-Score was adapted comprised 16,671 participants.
Characteristics of the research sample (N=16,671).
Variable | Analytical sample
Age (years), mean (SD) | 50.43 (17.32)
Sex, n (%) |
Male | 7840 (47.03)
Female | 8831 (52.97)
Race or ethnicity, n (%) |
Mexican American | 2142 (12.85)
Other Hispanic | 1447 (8.68)
Non-Hispanic White | 7944 (47.65)
Non-Hispanic Black | 3543 (21.25)
Other race (including multiracial) | 1595 (9.57)
Resting heart rate (beats per minute), mean (SD) | 72.53 (12.04)
Waist to height ratio, mean (SD) | 0.59 (0.096)
Weekly alcohol intake (drinks per week), mean (SD) | 3.32 (7.37)
Sleep duration (hours per night), mean (SD) | 6.84 (1.39)
Self-rated health, n (%) |
Excellent or very good | 6581 (39.48)
Good | 6617 (39.69)
Fair | 2948 (17.68)
Poor | 525 (3.15)
Number of cigarettes per day, mean (SD) | 2.97 (7.25)
Comorbidities, n (%) | 3497 (21.03)
Deaths, n (%) | 1062 (6.3)
The addition of sociodemographic variables alone (model 2) increased the AUC of the original C-Score model from 0.72 to 0.87 (95% CI 0.85-0.88) but resulted in a poor fit (P=.04). In contrast, the addition of both sociodemographic and medical history variables (model 3) similarly increased the AUC to 0.87 (95% CI 0.86-0.88), although without a loss in the goodness of fit.
Upon inclusion of interaction terms between each of the covariates and the C-Score variable, we did not obtain significant increases in AUC or fit, indicating that this more complex model does not offer much improvement compared with a more parsimonious model. In addition, the C-Score odds ratio was not significant, implying no change in the odds of all-cause mortality associated with the change in the C-Score.
Performance of original C-Score versus expanded models for all-cause mortality.a
Model | Independent variables | Participants, N | Score ORb (P value) | Goodness of fit (P value) | AUCc (95% CI) | AICd
1 | C-Scoree | 21,015 | 0.96 (<.001) | Good fit (.86) | 0.72 (0.70-0.73) | 8897.78
2 | C-Scoree+sociodemographic variablesf | 20,626 | 0.97 (<.001) | Poor fit (.04) | 0.87 (0.85-0.88) | 6977.07
3 | C-Scoree+sociodemographic variablesf+medical historyg | 16,671 | 0.96 (<.001) | Good fit (.06) | 0.87 (0.86-0.88) | 5705.134
4 | C-Scoree+sociodemographic variablesf+medical historyg+interactionsh | 16,671 | 1.0 (.25) | Good fit (.19) | 0.87 (0.86-0.89) | 5693.319
aAll models include dummy variables for the survey rounds. Survey weights were included in all analyses.
bOR: odds ratio.
cAUC: area under the curve.
dAIC: Akaike Information Criterion.
eC-Score included six variables: cigarette consumption, alcohol consumption, sleep duration, self-rated health, waist to height ratio, and resting heart rate.
fSociodemographic variables included age, gender, race or ethnicity, marital status, and educational attainment.
gMedical history variables were
hEach sociodemographic variable and medical history variable interacted with the C-Score.
Receiver operating characteristic curve for original C-Score versus expanded models for all-cause mortality. Model 1: C-Score; model 2: C-Score+sociodemographic variables; model 3: C-Score+sociodemographic variables+medical variables; model 4: C-Score+sociodemographic variables+medical history+interactions.
Calibration plots of predicted versus observed probabilities for original C-Score versus expanded models for all-cause mortality. Model 1: C-Score; model 2: C-Score+sociodemographic variables; model 3: C-Score+sociodemographic variables+medical variables; model 4: C-Score+sociodemographic variables+medical history+interactions.
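The idea behind a calibration plot of predicted versus observed probabilities can be sketched as follows. This is a simplified, unweighted illustration on synthetic data and not the design-based goodness-of-fit test used in this study. The function name, the choice of 10 equal-sized risk groups, and the simulated risks are assumptions made for the example.

```python
import random


def calibration_table(predicted, observed, groups=10):
    """Group by predicted risk; return (mean predicted, observed rate) per group."""
    pairs = sorted(zip(predicted, observed))
    n = len(pairs)
    table = []
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue  # skip empty groups when there are fewer records than groups
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        obs_rate = sum(y for _, y in chunk) / len(chunk)
        table.append((mean_pred, obs_rate))
    return table


# Synthetic data (not NHANES): outcomes are drawn from the predicted risks,
# so the predictions are well calibrated by construction.
rng = random.Random(0)
predicted = [rng.uniform(0.0, 0.3) for _ in range(20_000)]
observed = [1 if rng.random() < p else 0 for p in predicted]

for mean_pred, obs_rate in calibration_table(predicted, observed):
    print(f"predicted {mean_pred:.3f}  observed {obs_rate:.3f}")
```

For a well-calibrated model, the points formed by each pair lie close to the diagonal; systematic departures indicate overprediction or underprediction in that risk range.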
The validity of our final model (model 3) was assessed using k-fold cross-validation. We used 10 random samples to determine the discrimination capability of the model in predicting the future incidence of all-cause mortality. The AUCs for these random samples ranged from 0.85 to 0.87, showing high consistency in the discrimination of the model (
Internal validation using k-fold procedure (folds=10). cvAUC: cross-validation area under the curve; ROC: receiver operating characteristic.
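The k-fold procedure can be illustrated with a small, self-contained sketch. It is not the authors' Stata analysis: it uses synthetic data, a single hypothetical predictor instead of model 3, no survey weights, and a plain gradient-descent logistic fit. All function names and simulation parameters are assumptions made for the example.

```python
import math
import random


def auc(scores, labels):
    """Probability that a random case outranks a random non-case (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def fit_logistic(x, y, steps=200, lr=0.5):
    """Fit intercept and slope of a 1-predictor logistic model by gradient descent."""
    b0 = b1 = 0.0
    n = len(x)
    for _ in range(steps):
        g0 = g1 = 0.0
        for xi, yi in zip(x, y):
            err = 1 / (1 + math.exp(-(b0 + b1 * xi))) - yi
            g0 += err
            g1 += err * xi
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1


def cross_validated_aucs(x, y, folds=10, seed=0):
    """Fit on k-1 folds, score the held-out fold, and return one AUC per fold."""
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)
    aucs = []
    for k in range(folds):
        test = idx[k::folds]
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        b0, b1 = fit_logistic([x[i] for i in train], [y[i] for i in train])
        preds = [1 / (1 + math.exp(-(b0 + b1 * x[i]))) for i in test]
        aucs.append(auc(preds, [y[i] for i in test]))
    return aucs


# Synthetic data (not NHANES): one standardized predictor with a true log-odds slope of 1.5.
rng = random.Random(1)
x = [rng.gauss(0, 1) for _ in range(1000)]
y = [1 if rng.random() < 1 / (1 + math.exp(2 - 1.5 * xi)) else 0 for xi in x]

aucs = cross_validated_aucs(x, y)
print(f"cvAUC range: {min(aucs):.2f}-{max(aucs):.2f}, mean {sum(aucs) / len(aucs):.2f}")
```

A narrow range of fold-specific AUCs, as reported for model 3, suggests that discrimination does not depend strongly on which participants were used to fit the model.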
The best-performing model (model 3) of the main analysis was used for external validation.
Calibration plot of predicted versus observed probabilities of all-cause mortality for model 3 for all-cause mortality. Model 3: C-Score+sociodemographic variables+medical variables. NHANES: National Health and Nutrition Examination Survey.
In this study, we conducted external validation of the C-Score in the US population and expanded the original score to improve its predictive capabilities in the US population.
We found that the C-Score had generally good prediction and calibration capabilities and that it is a promising model that could provide fast and accurate information on all-cause mortality through a digital health app. Our results reveal similar AUCs compared with those found in the United Kingdom by Clift et al [
Given the lack of the reaction time variable in the main NHANES sample, we conducted a sensitivity analysis with another survey (NHANES III), which contains the reaction time variable, to assess its marginal effect in predicting all-cause mortality. The results suggest that the absence of the reaction time variable did not meaningfully change the calibration or the discrimination attributes of the assessed model. We believe that the marginal effect is likely to be low as part of the variance explained by the reaction time variable might be captured by other variables in the C-Score.
In addition, we showed that the incorporation of a set of basic sociodemographic and medical history variables substantially improved the model’s predictive performance in the US general population. The AUC increased from 0.72 (95% CI 0.70-0.73) for the basic C-Score model to 0.87 (95% CI 0.86-0.88) for the expanded model. We further assessed the internal and external validity of the expanded model and found that the model performed equally well in the 10-fold cross-validation sample and the external NHANES III data set.
The incorporation of this model into a user-friendly digital health app can motivate users to predict their current and future health status and take actions to modify their health, thus potentially shaping their future trajectories. Consumer demand for technological innovations that measure health status and predict health outcomes is evidenced by the recent proliferation in the use of commercial wearable technologies, ranging from simple activity or exercise monitors to more sophisticated home-based connected medical devices [
Recent evidence confirms the utility of wearable technology in predicting clinical outcomes with high accuracy [
Our findings should be viewed in light of some limitations. First, we used a cross-sectional survey that did not follow individuals over time. NHANES is the only nationally representative survey of the US general population that contains most of the variables present in the original C-Score model. However, it contains only 6 of the 7 variables included in the original UK population-based model, potentially leading to a C-Score that artificially underperforms when predicting all-cause mortality. Our sensitivity analysis showed that the reaction time variable provided little additional value to the C-Score in this sample. Even if the subsample in which we tested the reaction time variable did not have the external validity to inform the results of the NHANES subsample, the lack of the reaction time variable would likely lead to an underperforming score, implying that the ability of the score to predict all-cause mortality would have been higher if the reaction time variable had been available in the main NHANES data set. Moreover, although the association between death and other covariates has been investigated using Cox proportional hazards models in other publications, including the original C-Score model [
Limitations notwithstanding, the findings of this validation indicate that the performance of the C-Score is fairly good for predicting all-cause mortality in the US population. The adapted risk score had even better prediction capabilities, as evidenced by an AUC of 0.87 for all-cause mortality in the US population.
In conclusion, our study validates and expands a novel risk-scoring algorithm that can predict the risk of all-cause mortality among adults in the general population with high accuracy and that could be incorporated into a digital health application. The use of high-performing risk scores could be instrumental in clinical counseling, choice of care pathways, and even patient-driven interventions aimed at modifying lifestyles. Despite known effective strategies to reduce NCD-related deaths worldwide, chronic and preventable NCDs continue to drive adult mortality. High-performing risk scores that trigger behavior change could be instrumental in stemming this tide of death and decreased global productivity.
AUC: area under the curve
NCD: noncommunicable disease
NDI: National Death Index
NHANES: National Health and Nutrition Examination Survey
TRIPOD: Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis
WtHR: waist to height ratio
This study was supported by Huma, formerly Medopad (award number 133913). The content is the responsibility of the authors and does not necessarily reflect the views of Huma. The funders had no role in the study design, data collection and analysis, decision to publish, or manuscript preparation.
SE conducted the data analysis for both the validation and adaptation phases, contributed to the literature review, contributed to the methodological design, and drafted the manuscript. AIVO contributed to the methodological design, coordinated the project, and drafted the manuscript. DGG contributed to the methodological design and to the drafting of the manuscript. SA contributed to the methodological design and to the drafting of the manuscript. AJT contributed to the methodological design and to the drafting of the manuscript. YZ conducted the data analysis for the validation phase, contributed to the literature review, contributed to the methodological design, and drafted the manuscript. ABL is the principal investigator of the project and contributed to the methodological design and drafting of the manuscript.
None declared.