Published on in Vol 21, No 10 (2019): October

Preprints (earlier versions) of this paper are available at, first published .
Hospital-Based Back Surgery: Geospatial-Temporal, Explanatory, and Predictive Models

Hospital-Based Back Surgery: Geospatial-Temporal, Explanatory, and Predictive Models

Hospital-Based Back Surgery: Geospatial-Temporal, Explanatory, and Predictive Models

Authors of this article:

Lawrence Fulton1 Author Orcid Image ;   Clemens Scott Kruse1 Author Orcid Image

Original Paper

Department of Health Administration, Texas State University, San Marcos, United States

Corresponding Author:

Lawrence Fulton, PhD

Department of Health Administration

Texas State University

Encino Hall

712 N Comanche St

San Marcos, 78666

United States

Phone: 1 (512) 245 3492


Background: Hospital-based back surgery in the United States increased by 60% from January 2012 to December 2017, yet the supply of neurosurgeons remained relatively constant. During this time, adult obesity grew by 5%.

Objective: This study aimed to evaluate the demand and associated costs for hospital-based back surgery by geolocation over time to evaluate provider practice variation. The study then leveraged hierarchical time series to generate tight demand forecasts on an unobserved test set. Finally, explanatory financial, technical, workload, geographical, and temporal factors as well as state-level obesity rates were investigated as predictors for the demand for hospital-based back surgery.

Methods: Hospital data from January 2012 to December 2017 were used to generate geospatial-temporal maps and a video of the Current Procedural Terminology codes beginning with the digit 63 claims. Hierarchical time series modeling provided forecasts for each state, the census regions, and the nation for an unobserved test set and then again for the out-years of 2018 and 2019. Stepwise regression, lasso regression, ridge regression, elastic net, and gradient-boosted random forests were built on a training set and evaluated on a test set to evaluate variables important to explaining the demand for hospital-based back surgery.

Results: Widespread, unexplained practice variation over time was seen using geographical information systems (GIS) multimedia mapping. Hierarchical time series provided accurate forecasts on a blind dataset and suggested a 6.52% (from 497,325 procedures in 2017 to 529,777 in 2018) growth of hospital-based back surgery in 2018 (529,777 and up to 13.00% by 2019 [from 497,325 procedures in 2017 to 563,023 procedures in 2019]). The increase in payments by 2019 are estimated to be US $323.9 million. Extreme gradient-boosted random forests beat constrained and unconstrained regression models on a 20% unobserved test set and suggested that obesity is one of the most important factors in explaining the increase in demand for hospital-based back surgery.

Conclusions: Practice variation and obesity are factors to consider when estimating demand for hospital-based back surgery. Federal, state, and local planners should evaluate demand-side and supply-side interventions for this emerging problem.

J Med Internet Res 2019;21(10):e14609




In 2012, there were 3689 practicing board-certified neurosurgeons in the United States [1]. That number was largely unchanged in 2016 [2]. During these years, demand for back surgery (Current Procedural Terminology [CPT] codes beginning with the digit 63) increased by 49% from 311,028 to 464,391, and by the end of 2017, that increase was 60% [3]. CPT 63 medical codes are a series of spinal procedures including laminectomies, laminotomies, decompressions, and corpectomies. These procedures do not include needle decompression, catheter implantation, and, as of 2019, endoscopic decompression [4]. Given the stable supply and increasing demand, it is not surprising that the average payment procedure increased from US $4166 to US $4859 from 2012 to 2016 and to US $5452 by the middle of 2018, an effective 4.5% inflation rate [3]. Forecasting models that address increasing demand are necessary to evaluate potential supply and demand-side interventions.

Unsurprisingly, there is a marked variation in the treatment of back disorders such as spondylolisthesis [5]. This variation affects costs [6] as well as outcomes [5] associated with back surgery. The implication of this variation is increased demand. By evaluating the current geographic demand, policy makers can prioritize efforts for cost and variation reduction by evaluating those states and counties that exhibit high practice area variation, implementing evidence-based best practice policies and guidelines, educating populations about obesity risks, and implementing interventions for those at risk of obesity (eg, those living in food deserts).

During the same time that back surgeries have increased, adult obesity rates in the United States have also increased. The rate of this increase was 5% from 2012 to 2016 (34.9%-39.6%) [7]. Obesity has been linked to increased costs of medical care [8]. Although obese patients benefit from at least some back surgeries, they do not fare as well as nonobese patients [9]. Although obesity has been linked to back pain [10], no studies were found that directly link obesity to back surgery requirements. This study evaluated that relationship as well.


This study addressed 3 specific aspects of hospital-based CPT 63 surgery. First, a geospatial-temporal analysis by zip code is conducted to describe the previous and current demand for CPT 63 surgery. The significance of this geospatial-temporal analysis is that practice variation is highlighted for evaluation by federal, state, and local policy makers. Second, forecasting models estimate the demand and payments overall, by census region and by state. These models are also designed for state policy makers to assess potential supply- and demand-side intervention requirements. Third, explanatory models are developed to correlate obesity rates and financial, technical, workload, temporal, and geospatial variables with demand for CPT 63 procedures. This analysis does not appear to be previously investigated and is an important but overlooked correlational analysis. The study focused specifically on hospital-based knee surgery with CPT 63 codes (some of which reflect inpatient procedures) and was delimited to knee surgery only.


Definitive Healthcare provided the hospital, zip code, and state-level procedure and cost data from January 2012 to June 2018 through the hospital revenue center analytics query, which includes queries by CPT code. Data in Definitive Healthcare are derived from the Standard Analytical Files by the Centers for Medicare and Medicaid Services (CMS)]. From these data, the organization uses undisclosed algorithms to estimate all-payor claims. Columns with fewer than 11 claims or procedures are not shown because of privacy requirements [3]. For this analysis, only complete annual data from 2012 to 2017 were used, as the CMS datafile and thus the associated estimates from January 2018 through June 2018 were approximately only 93% complete [3].

The Centers for Disease Control and Prevention’s Behavioral Risk Factor Surveillance System (BRFSS) prevalence data provided the information for state-level adult obesity rates by year, from 2012 to 2017 [11]. Guam, Puerto Rico, and the US Virgin Islands were excluded from the analysis because of small sample sizes in both datasets.

Geospatial Analysis

Heat maps are used to plot the zip code unit of analysis procedure data by year. Heat maps provide the intensity of the number of claims by time and geographic region. These types of maps have been used for improving minority health surveillance [12], examining birth outcomes [13], and evaluating a variety of other applications in health care. The value in geospatial-temporal analysis is the graphical depiction of change in demand over time. A video display from 2012 to 2017 with standardized heat intensities provides an animated view of the change in demand by location. An analysis of cost and demand centers is then provided.

Forecasting Analysis

The data in the Definitive Healthcare dataset are nonseasonal as they provide annual-level observations by the hospital unit of analysis. Even so, generating nonseasonal forecasting models that have predictive capability on a blind withhold set at the proper level of aggregation can provide decision support for supply- and demand-side interventions. These types of models have found support in many areas of health care such as radiology [14] and Alzheimer disease [15].

To this end, hierarchical time series (HTS) [16] using R statistical software [17] evaluated the number of claims as a function of time series components. An HTS recognizes that data are aggregated at various levels. In this case, the hierarchy evaluated include the states, the census regions, and the nation. The models are built on a training set of data for the years 2012 to 2015 and forecast on a blind test set, years 2016 and 2017. Although Bayesian hierarchical models have been used for spatially correlated health outcomes and utilization rates [18], there is no readily found use of HTS for prediction in health care.

To understand HTS, one needs to only consider a single medical system that operates in 2 separate states with 3 hospitals per state. There are then 4 basic ways using which one might forecast annual visits as an example:

  1. A forecast might be generated for each hospital, aggregated at the state level and then further aggregated at the system level. A variety of different forecast methods might be used to generate the forecasts. The term for this method is bottoms up.
  2. Forecasts might be generated at the state level and then disaggregated (eg, via historical proportions) to the hospitals and aggregated to the system level. Again, the forecasts might be generated in multiple ways. The term for this method is middle out.
  3. Forecasts might be generated at the system level (via multiple methods) and then disaggregated to all levels below (eg, proportions). The term for this is top down.
  4. One might take some combination of the previous methods to minimize forecast error. This is an ensemble method that might be termed the optimal reconciliation approach, which is optimal if the forecasts are unbiased [19].

To avoid selection bias, all methods were evaluated for performance on the test set. Furthermore, the method for forecasting at these levels of hierarchy was using autoregressive integrated moving average (ARIMA) components and as well as smoothed error and trend components (exponential, trend, seasonality [ETS] without seasonality).

ARIMA models focus on autocorrelation of components for stationary time series data. The AR components are autoregressive terms, offset by time. For example, the number of claims at time t might be forecast by using the number of claims at time t-1. This would be an AR1 model, as there is 1 offset. As ARIMA models assume stationarity of the time series for forecasting, 2 other components are necessary. The first is differencing or integration, the I in ARIMA, which helps stabilize the mean (whereas transformations help stabilize the variance). Although seasonality and trend might make an ARIMA nonstationary, differencing often corrects this. Sometimes, more than 1 difference is required to make the time series stationary, for example, yt-yt-1-(yt-1-yt-2) is a 2d order differencing. The last component, the MA or moving average, corrects for autocorrelated errors as well. This component averages previous observation(s) with the previous forecast(s) [19].

ETS models have 3 components: error, trend, and seasonality. As the data in this study are not seasonal, only the error smoothing (identical to a moving average) and the trend component (a Holt model [20]) are evaluated.

With HTS bottom-up models, a separate ARIMA/ETS is built for each bottom-level component. For middle-out models, all middle-level components have separate forecasts. For top-down models, a single forecast is generated and proportioned down to the lower levels.

Explanatory Analysis

Stepwise linear regression (both forward and backward), lasso regression, robust regression, elastic net regression, and extreme gradient-boosted random forests are built on unaggregated data as well as state-level aggregated data to estimate the number of claims. These models are built on a random 80% training set (10,771 unaggregated, 245 aggregated observations) and evaluated on a 20% withhold set (2693 unaggregated, 61 aggregated) as well. The total number of valid observations were 13,464 unaggregated and 306 aggregated. The primary hypothesis is that the inclusion of obesity rates as an independent variable will yield better explanatory models for both the number of claims and the payment per claim.

Stepwise linear regression based on minimum Akaike Information Criterion was selected over best subset because of the computational complexity. By using forward and backward simultaneously, variables are added in sequence but might be removed if they no longer contribute to the model [21].

Lasso regression is a form of constrained regression that penalizes a model that selects too many variables by using an L1-norm formulation (absolute value), whereas ridge regression is similar but penalizes using an L2-norm formulation (squared coefficient estimates). Elastic net uses a weighted L1 and L2 norm penalty function to reduce the number of coefficients in the model. Formulae for estimating the parameters of the linear model, the lasso regression, the ridge regression, and the elastic net are shown in Figure 1.

Figure 1. Argmin equations for the regression models.
View this figure

Random forests, a machine learning technique, use an ensemble of decorrelated tree models and average the estimates of those trees to build forecasts. A tree model itself classifies counts of observations by splitting variables at points based on some decision criteria. An example of a tree with a depth of 3 (3 branches) is given in Figure 2, which splits observations by obesity rate less than 31.63 and then again by obesity rate less than 25.75 and number of discharges less than 10,558.78 and then finally by net income less than US $35,018,392, cash less than US $25,522,424, and cash less than US $8,122,498 [21]. The graph was produced by the xgboost package of R [22]. Gradient-boosted random forests optimize a cost function based on the (pseudo-)residuals of a function using nonlinear optimization techniques. Essentially, the residuals of each tree in the forest are refitted with the possible independent variables in another tree model to estimate a better fit of the original function. Often, a learning rate (shrinkage) is applied to the residuals to allow for better generalization. A discussion of gradient boosting is available in Chapter 10 of Elements of Statistical Learning [21].

Figure 2. An example of a single tree model with 3 branches. The graph was produced by the xgboost package of R. (NumDischarges indicates the number of discharges).
View this figure


All the considered variables from the Definitive Healthcare dataset are shown in Table 1 with reasons for exclusion/inclusion. Most variables were, by default, included during analysis; however, those variables that were linear combinations of each other or were necessarily unknowable when forecasting CPT 63 codes were omitted.

There is 1 primary dependent variable of interest: number of claims for CPT 63 codes. This variable is measured at the hospital level over time and is also aggregated by zip code/year for geospatial mapping and by state/year for forecasting and additional modeling analysis. The number of claims include third-party invoices provided by the hospital, regardless of the payer. The number of claims provides a measure of the met demand for services.

For the geospatial and temporal analyses, the variables year and zip code (aggregated hospital-level data) are used to describe the intensity of both the number of claims and the payment per claim. Zip code provides a high resolution for geographic claims data. For the HTS forecasting analysis, time components are used without external regressors.

Explanatory stepwise regression, lasso regression, ridge regression, elastic net, and gradient-boosted random forest models investigate financial variables, technical variables, workload variables, geospatial variables, a temporal variable (year), and obesity rates (defined as the proportion of individuals with a body mass index greater than or equal to 30%). A discussion of each of the variable groups and variables follows.

The financial variables investigated include net patient revenue, net income, cash on hand, total assets, total liabilities, and proportion of Medicare/Medicaid reimbursement. The financial variables were carefully selected from the set of available financials such that they are not a linear combination of other variables or nearly a linear combination (see Table 1). Although available, total payments and charges for CPT 63 were not used in the models, as they (1) would not be known in advance and (2) would necessarily be direct functions of the number of claims.

Quantitative workload variables include the number of staffed beds, discharges, emergency room visits, surgeries, affiliated physicians, and employees. Categorical technical variables include ownership type, medical school affiliation, and hospital type. These variables are investigated because of their availability and possible confounding effects. Geographic variables include urban/rural location, state, and zip code. These variables are important in evaluating practice area variation and associated effects.

Obesity rates are of interest to the study. These rates are assigned based on the state, as county and zip code level data are not available. This independent variable is of importance to the study.

Table 1. Variables in the study.
VariableTypeDefinitionScale of measurement
Number of claimsDependentFiled third-party claims{0, 1, 2, ...k}
Obesity rateObesityPercentage obese by state[0%, 40%]
CPTa 63 paymentsFinancialTotal CPT 63 paymentsUS $
CPT 63 chargesFinancialTotal CPT 63 chargesUS $
Net patient revenueFinancialTotal revenue from patientsUS $
Total revenuesFinancialAll revenue, patient-related or otherwiseUS $
Net incomeFinancialRevenues less expensesUS $
Total expensesFinancialTotal dollars attributed to expensesUS $
Cash on handFinancialFunds immediately availableUS $
Total assetsFinancialCurrent and noncurrent assetsUS $
Total liabilitiesFinancialCurrent and long-term debtUS $
Percentage Medicare/MedicaidFinancialPercentage claims from either source%
StateGeospatialHospital\'s state (address)AK, AL, ...
Zip codeGeospatial5-digit hospital zip code78666, ...
Geographic classificationTechnicalRural or urban locationRural, urban
OwnershipTechnicalHospital ownershipNonprofit, profit, government
Medical school affiliationTechnicalLevel of affiliation if anyGraduate, major, limited, none
Hospital typeTechnicalType of hospitalShort-term acute, children’s, etc
YearTemporalYear of report2012, 2018
Number of staffed bedsWorkloadPer Medicare report{0, 1, ...n}
Number of dischargesWorkloadTotal number of inpatient discharges{0, 1, ...n}
Number of Medicare dischargesWorkloadNumber of Medicare discharges{0, 1, ...n}
Estimated number of emergency room visitsWorkloadNumber of emergency room visits{0, 1, ...n}
Total surgeriesWorkloadNumber of surgeries{0, 1, ...n}
Total acute daysWorkloadNumber of acute bed days{0, 1, ...n}
Number of affiliated physiciansWorkloadNumber of affiliated physicians{0, 1, ...n}
Number of employeesWorkloadNumber of employees{0, 1, ...n}

aCPT: Current Procedural Terminology.

Descriptive Statistics: Missing Data

Missing data were present in the Definitive Healthcare dataset. As the percentage of missing data was small, the data were imputed via regression trees (simple imputation). The total number of valid observations at the hospital unit of analysis from January 2012 to December 2017 was 13,769. There were 2244 unique zip codes with data resulting in 13,464 observations from 2012 to 2017, although many of these were true zeros. Aggregated at the state level, there were 306 observations of the 50 states plus the District of Columbia over the 6-year span.

Descriptive Statistics: Quantitative Data

Important descriptive statistics for the Definitive Healthcare data are shown (Table 2). The average number of CPT 63 claims by hospital by year was 182, and the average payment was over US $4045.99, about 50.37% of mean charges (US $8,032.13). On average, hospitals performing these claims were large (227 beds with 1629 employees and 299 affiliated physicians). These hospitals had on average positive net income (US $22 million) and assets exceeding liabilities. Overall, 45% of their patients used Medicare or Medicaid reimbursement.

Table 2. Descriptive statistics for all years and all hospitals (N=13,769). The “K” suffix indicates dollars in thousands, while the “M” suffix indicates dollars in millions.
VariableMean (SD)MedianMinimumMaximum
Claims, n182 (244)89113592
Payments/claim, US $4045.99 (2448.96)3659.12034,975.76
Payments, US $767.9K (1097.8K)366.2K018,966.3K
Charges, US $1517.6K (2467.4K)616.4K035,317.2K
Charges/claim, US $8032.13 (6993.90)6370.080137,058.80
Net patient revenue, US $343.8M (430.4M)217.5M−98.6M5340.9M
Net income, US $22.8M (102.8M)11.3M−1648M1316.0M
Cash, US $30.3M (145.3M)2.8M−1992.7M3597.8M
Total assets, US $443.0M (820.1M)203.5M−231.7M9969.4M
Total liabilities, US $178.2M (465.7M)69.2M−2583.8M6372.4M
Staffed beds, n227.45 (201.36)177.001.002626
Discharges, n11,822.61 (11,395.52)8899.001.00127,600
Emergency room visits, n47,439.59 (39,580.09)39,209.000543,457
Surgeries, n9643.21 (9666.56)7019.000134,638
Affiliated physicians, n298.57 (333.43)198.001.003483
Employees, n1629.46 (2044.45)1027.007.0024,673
Percentage of Medicare/Medicaid, %0.45 (0.14)0.4401
Obesity rate, n29.44 (3.42)29.9220.2038

The number of hospitals reporting CPT 63 claims increased by 91 from 2012 to 2017. Charges increased from US $2.115 million to US $4.75 million, whereas payments increased from US $1.233 million to US $2.467 million. The proportion of charges paid fluctuated between 45% and 58%. The number of claims increased from 320K to 504K, a 60% increase (Table 3).

Variation across states from 2012 to 2017 for CPT 63 is impressive. The maximum average payment per claim was in Delaware (US $5190.62); however, the number of actual claims was small (5569). New York had the second highest payment per claim (US $5043.79) with 72,186 claims. Texas had the largest number of claims (260,208), yet the average payment per claim was only US $4223.22. On average, 60% of charges were paid (Table 4).

Obesity rates by state have increased from 2012 to 2017 (Table 5). In 2012, the mean obesity rate per state was 27.95%. By 2017, this rate was 30.59%; however, this increase is not weighted by population size. As discussed previously, the aggregate increase for the United States from 2012 to 2017 was 5% (34.9% to 39.6%) [7]. The state data include the District of Columbia (51 observations per year) but are not population weighted.

Table 3. Average statistics by year show the growth in both claims and payments.
YearHospitals, nTotal payments, in millions of US $Total charges, in millions of US $Total claims, nPayments/claim, in US $Charges/claim, in US $
Table 4. Payments, charges, number of claims, payment per claim, charge per claim, and percentage of charges paid by state.
StatePayments, in millions of US $Charges, in millions of US $Number of claims, nPayment/claim, in US $Charge/claim, in US $Percentage paid, %
District of Columbia26.3153.9056554651.789531.1648.80
North Carolina503.97886.27126,3443988.877014.7456.90
North Dakota36.8743.3690814060.134774.5885.00
New Hampshire74.81120.3819,0913918.496305.4062.10
New Jersey169.55328.0035,4794778.939244.9051.70
New Mexico31.1768.5285833631.597983.1945.50
New York364.09395.7472,1865043.795482.2192.00
Rhode Island15.9416.3737724225.094339.1097.40
South Carolina293.49492.0360,9774813.148069.0359.60
South Dakota70.42204.7817,8703940.8411,459.5434.40
West Virginia117.51131.2823,8524926.485504.0789.50
Table 5. State statistics for the proportion of the population identified as obese by the Behavioral Risk Factor Surveillance System by year.

Mean (SD) proportions27.95 (3.38)28.65 (3.44)29.23 (3.42)29.28 (3.87)29.78 (3.74)30.59 (3.86)
Median proportions27.6029.4029.6029.8329.9231.30

Descriptive Statistics: Categorical Data

Of the 13,769 hospital observations, 3153 were rural and the remaining 10,616 were urban. Most hospital observations were classified as voluntary nonprofits (8866, 64%), whereas proprietary corporations and government entities constituted 3426 (25%) and 1466 (11%), respectively, (11 hospital observations had no ownership specification). Most of the hospital observations (8311, 60%) had no affiliation with medical schools. The vast majority of the observations were from short-term acute care hospitals (13,040, 95%) with nearly all of the remainder (678, 5%) associated with critical access hospitals.

Descriptive Statistics: Correlational Analysis

Hierarchical clustered correlational analysis revealed strong relationships among many of the quantitative variables. Payments and claims are (as to be expected) highly correlated (r=0.9). Most financial and workload metrics are highly correlated as well (eg, net patient revenue and the number of employees; r=0.95). Owing to the large sample size, nearly all correlations are statistically significant at the alpha=.05 level (see Figure 3). The matrix was produced using ggcorrplot [23].

The inclusion of obesity in this study is because of a correlational finding that the number of CPT 63 procedures appears to be influenced by state obesity rates at the aggregate level (Figure 4 [24]). The question, though, is whether this apparent correlation in the logs is sustained when other financial, geographic, technical, and temporal variables are considered.

Figure 3. The correlation matrix depicts the variable relationships. The X’s indicate no statistically significant correlation. Owing to the large sample size, nearly all correlations are statistically significant at the alpha=.05 level. The matrix was produced using ggcorrplot.
View this figure
Figure 4. Correlation between the natural logarithm of obesity rates and the natural logarithm of the number of CPT 63 surgeries performed by hospitals.
View this figure

Exploratory Data Analysis: Feature Engineering and Transformations

Although random forest regression models are scale invariant, traditional regression techniques such as stepwise, lasso, ridge, and elastic net are not [21]. Investigating transformations to achieve multivariate normality (assuming random-effects regression) is important to meet model assumptions. Furthermore, time series forecasting often benefits from these same techniques [16]. In addition, investigating additional features that might be generated from the existing ones through linear combinations and other methods often results in disentangling collinear variables and finding interesting relationships that might otherwise remain undiscovered.

A multivariate Box-Cox transformation was run using the car package in R [25] for all modeled quantitative variables simultaneously after these variables had been location adjusted to make the variables strictly positive, definite, and scale adjusted by dividing by the standard deviation. Multivariate Box-Cox seeks to find power transformations (values of λ for each variable) that make the data multivariate normal enough for use in traditional linear models [26]. These transformations help alleviate the problem of collinearity and address multivariate normal assumptions of random-effects regression. The null hypothesis is that the proposed transformation generated through the transformation is a good fit. The alternative is that it is not a good fit. The proposed transformation was a vector of primarily natural logarithms (values near zero) with some exceptions. The likelihood ratio test resulted in a P value >.99, indicating that the assumption of multivariate normality cannot be rejected. The actual vector of transformations follows: λ={0.1, 0.3, 0.33, 0.38, 0.21, −0.07, 1.66, 0.76, −0.03, 0.55, 0.28, 0.17, 1.03, 1.04} for x={number of claims, number of staffed beds, number of discharges, ER visits, total surgeries, net patient revenue, net income, cash, total assets, total liabilities, affiliated physicians, employees, percent Medicare/Medicaid, obesity rate}, respectively.

Univariate histograms for the number of claims and obesity pre- and posttransformation are in Figure 5. The transformed graph of the number of claims shows some slight skew but is otherwise unremarkable. However, the graph of obesity rates is telling. Although the transformation fails to reject the assumption of multivariate normality, the obesity graph is bimodal. It is possible that linear-in-parameter models will not be able to correctly fit the importance of this variable, whereas tree-based models will find patterns.

Figure 5. The untransformed and transformed histograms of the number of claims and the obesity rate variables.
View this figure

Geospatial Analysis Results: Zip Code Unit of Analysis

Geospatial heat map analysis of CPT 63 number of claims by year and parsed by zip code is shown in panels (Figure 6). The maximum scale is approximately 7000 claims for each diagram to allow for comparison across years. Multimedia Appendix 1 shows this analysis in video format.

In 2012, there was very little high-intensity activity (Houston and Dallas, Texas, primarily, with some activity in the Carolinas). The Eastern seaboard has activity, but it is not intense, and the Western seaboard has minimal activity, except near Seattle.

By 2013, the Eastern seaboard (particularly New Jersey) has increased in intensity, and the areas around Chicago and Salem, Oregon, are emerging as well. Houston and Dallas remain the most intense regions for the number of claims.

In 2014, the number of claims in Seattle and San Antonio, Texas, shifted these cities to high-intensity areas (some red visible). It must be noted that 3 of the 4 cities with visible red tint are in Texas (San Antonio, Dallas, and Houston).

The year 2015 saw increasing intensity in both the New Jersey area and Chicago, Illinois. These 2 areas joined Houston, Dallas, San Antonio, and Seattle as high-intensity claims areas. Despite their populations, neither California nor Florida experienced the claims intensity of Texas.

Houston, Dallas, San Antonio, Seattle, Longview (Texas), Oklahoma City, the New Jersey area, and St. Louis were the notable areas of high intensity in 2016. The California coast became more intense along with Salt Lake City.

By 2017, the Eastern seaboard intensified (New Jersey, Delaware, Pennsylvania, Virginia, and District of Columbia) along with Phoenix, Arizona, and Atlanta, Georgia areas. The most intense areas remained Houston and Dallas.

Overall, the maps may suggest small area variations in practice patterns [27]. Although California and Florida have large populations, none of their major population centers reached the high-intensity scale of major cities in Texas. Furthermore, the Eastern seaboard’s increasing intensity suggests that something has changed. The questions then become are these changes in demand forecastable and how might they be explained.

Figure 6. Geospatial analysis of all CPT 63 claims from 2012 through 2017.
View this figure

Forecasting Results

Number of Claims

Being able to forecast demand is necessary for decision makers to investigate both supply- and demand-side interventions. To that end, HTS for state, census bureau region, and the nation using both ETS and ARIMA components were built on 2012-2015 training dataset and compared with the 2016-2017 test set using the hts package in R [16]. Bottom-up, top-down, middle-out, and combination approaches to this forecasting were analyzed.

The ETS models performed better on the test set in terms of both variance and bias as shown (Table 6), and the middle-out model performed better on all bias (mean error and mean percentage error) as well as variance (root mean squared error, mean absolute error, and mean absolute percentage error) metrics. The overall forecast from the ETS middle-out model for the unobserved years {2016, 2017} was {454,720.3, 482,049.9}, whereas the actual overall claims were {464,323, 497,325}, resulting in mean absolute percent error (MAPE) of {2.0%, 3.1%}. Table 7 illustrates the forecast and actual number of claims at the region-level hierarchy for the best performing model, whereas Table 8 provides the state-by-state forecasts.

Table 6. The performance metrics of the various hierarchical models show that the exponential, trend, seasonality middle-out model performed best on the test set.
ModelMean errorRoot mean squared errorMean absolute errorMean percent error (%)Mean absolute percent error (%)
ETSa-bottom up860.422905.641371.813.1610.71
ETS-top down688.262423.581266.133.0310.54
ETS-middle out611.752256.701219.592.7010.50
ARIMAb-bottom up5732.6316496.725762.2024.9326.65
ARIMA-top down5214.6114953.595312.3723.2625.66
ARIMA-middle out4606.3813420.784799.8820.7024.20

aETS: exponential, trend, seasonality.

bARIMA: autoregressive integrated moving average.

Table 7. Region-level forecasts demonstrate small error. The average mean absolute percent error (MAPE) for {2016, 2017} was {3.4%, 6.2%}, respectively.
Region2016 forecast2016 actual2017 forecast2017 actualMAPE 2016 (%)MAPE 2017 (%)
East North Central67,01967,67270,63072,7201.02.9
East South Central42,76542,74744,47747,1070.05.6
Middle Atlantic35,04437,53238,30942,8606.610.6
New England17,76319,47617,76320,7338.814.3
South Atlantic94,65096,093100,266103,5461.53.2
West North Central45,65346,55048,07548,6001.91.1
West South Central70,97275,35275,19779,5535.85.5
Table 8. Forecasts produced by the exponential, trend, seasonality middle-out model by state for 2016 and 2017 have an average mean absolute percent error (MAPE) of 10.1% and 13.2%, respectively.
State2016 forecast2016 actual2017 forecast2017 actualMean absolute error 2016 (%)Mean absolute error 2017 (%)
Dist. of Columbia10521015107010413.62.8
North Carolina21,17322,42921,52724,3145.611.5
North Dakota183916722017201310.00.2
New Hampshire30003073289033822.414.5
New Jersey66437067710266596.06.7
New Mexico135115701350175513.923.1
New York12,87614,20614,01415,4429.49.2
Rhode Island84169591475021.021.9
South Carolina11,05010,92111,85813,4941.212.1
South Dakota289636982977345421.713.8
West Virginia361940943680509611.627.8

HTS with the middle-out approach and ETS methods was refit on the entire dataset to generate forecasts. Figure 7 shows the regional forecasts for 2018 and 2019. The East North Central region of the country is likely to experience the largest growth in claims. The overall demand for 2018 and 2019 is forecasted to be {529,777, 562,023}, which represents growth of 6.52% growth in the first year (from 497,325 procedures in 2017 to 529,777 in 2018) and 13.00% by 2019 (from 497,325 procedures in 2017 to 562,023 in 2019). At US $5000 average per claim (a simple linear model would suggest US $4910 in 2018 and US $5123 in 2019), the net increase in cost for 2018 would be US $162.2 million for 2018 and US $323.9 million for 2019. The next question becomes what explains the predicted growth of these claims other than possibly practice variation.

Figure 7. Regional forecasts generated by the hierarchical time series middle-out model with exponential, trend, seasonality components.
View this figure

Explanatory Modeling Results

To investigate explanatory variables, several models were explored. Stepwise regression for the number of claims at the hospital level using the transformed variables and an 80% training set was successfully able to predict the number of claims on the withheld test set with some accuracy (adjusted R2=0.39 on the training set and adjusted R2=0.38 on the test set). This indicates that the sum of squared regression accounted for 38% of the variance of the sum of squared total on the test set. Payments and charges were excluded from the analysis as they are necessarily functions of claims. The variables evaluated were the number of staffed beds, discharges, surgeries, net patient revenue, net income, total assets, total liabilities, affiliated physicians, employees, percentage Medicare/Medicaid, state, year, urban/rural status, ownership, medical school status, and hospital type. Table 9 provides the remaining variables generated from the stepwise regression at the hospital unit of analysis. It should be noted that obesity did not remain in the final model.

Stepwise regression for the number of claims with data aggregated (mean) by state and by year (N=306 observations, 51 states/territories × 6 years) resulted in an impressive model using an 80% training set to predict a 20% withhold set. The adjusted R2 was 0.87 on the training set and 0.77 on the test set after dropping insignificant variables from the analysis. The variables in this model included state, year, number of discharges, and total liabilities (a parsimonious model; Table 10). Again, there is no evidence that obesity rates are predictive of CPT 63 surgery in this model.

Lasso, ridge, and elastic net regression models were able to predict the unaggregated test set with some accuracy (R2=0.38, 0.37, 0.38, respectively.) None of these penalty-weighted models improved upon the stepwise analysis significantly, although elastic net tied. Obesity was not retained in these models. For the aggregated set (state and year), the associated R2 were 0.78, 0.75, and 0.78, respectively. The lasso and elastic net models were slightly superior to the stepwise regression model (Figure 5). The top 10 variables by effect size in the state-aggregated elastic net model are shown in Table 11. The effect size of obesity was near zero (0.0098). If one were to make a conclusion using traditional and constrained linear models, obesity would not be a factor for explaining the number of claims; however, random forests would prove otherwise.

Table 9. Variables below from the stepwise regression predicted a withhold set with adjusted R2=0.38.
VariableSum of squaresMean squared errorF value (df)P value
Staffed beds95.7895.782423.29 (1)<.001
Discharges21.9321.93554.77 (1)<.001
ER visits17.5017.50442.68 (1)<.001
Surgeries41.1541.151041.03 (1)<.001
Net patient revenue5.575.57140.99 (1)<.001
Net income2.532.5363.97 (1)<.001
Total liabilities4.454.45112.70 (1)<.001
Affiliated physicians0.240.246.08 (1)<.01
Employees6.286.28158.78 (1)<.001
Percentage Medicare/Medicaid1.161.1629.34 (1)<.001
State52.351.0526.49 (50)<.001
Year8.091.6240.91 (5)<.001
Urban rural status1.991.9950.43 (1)<.001
Ownership10.760.9022.68 (12)<.001
Medical school affiliation2.160.5413.63 (4)<.001
Hospital type0.640.133.24 (5)<.01
Table 10. Variables in the analysis by state and by year.
VariableSum of squaresMean squared errorF value (df)P value
State1.160.0226.40 (50)<.01
Year0.170.0339.16 (5)<.01
Discharges0.060.0663.19 (1)<.01
Net income0.0040.0044.45 (1).04
Total liabilities0.0040.0044.65 (1).03
Table 11. Top 10 coefficients by effect size of the elastic net.
Total assets−1.539
Net patient revenue−1.044
Number of staffed beds0.212
Number of discharges0.186
New Jersey−0.178
Total surgeries0.162
New York−0.159

Gradient-boosted random forests with hyperparameter tuning outperformed all models: stepwise, lasso, ridge, elastic net regression. On the unaggregated withhold set, a well-pruned model (depth 4) with 2000 epoch runs and a slow learning rate of 0.1 accounted for more than 78.5% of the variability (R2=0.79) on the unobserved test set. Comparing this value with the approximately 38% variability accounted for in the other models suggests that the random forest model is superior. Figure 8 is a plot of the gain (the average improvement when the feature is used in a tree) for the top 5 items in the importance matrix, whereas Figure 9 is a plot of the cover (the average proportion of samples affected by splitting using this feature) for the top 5 items of the unaggregated model. These figures illustrate that obesity is one of the prominent features in both gain and cover of the unaggregated model.

Figure 8. Gain plot for the top 5 variables, unaggregated model.
View this figure
Figure 9. Cover plot for the top 5 variables, unaggregated model.
View this figure

Despite the exceptional gains of the extreme gradient-boosted random forests on the unaggregated, hospital-level data, the application of hyperparameter-tuned models to the aggregated data (by state and year) yielded only nominal improvement over the constrained regression methods, possibly because of the smaller sample due to aggregation. A well-pruned model (depth=3) after 3000 epochs with a slow learning rate (0.1) achieved an R2 of 0.80. The gain and cover graphs are shown in Figures 10 and 11, and obesity rate is the most important feature at the state-aggregated level.

Most importantly, the gradient-boosted random forests identified obesity as the second most important factor for gain at the hospital level and as the most important factor for both gain and cover at the state level of analysis. Furthermore, the gradient-boosted random forests performed better than any other model considered on a blinded test set.

Figure 10. Gain plot for the top 5 variables, aggregated model.
View this figure
Figure 11. Cover plot for the top 5 variables, aggregated model.
View this figure

Principal Findings

In this analysis, we evaluated the location, magnitude, and reasons for the growth of CPT 63 back surgeries in the United States. The GIS heat map analysis shows large-scale growth, particularly in the Northeastern region of the United States, and sustained activity in Texas. The entirety of the Eastern seaboard has seen growth in these procedures, and the associated increased cost is estimated to be US $323.9 million by the end of 2019.

The principal findings of this study are described here. Each of the following results includes a discussion of significance and (if appropriate) policy:

  1. The Northeastern seaboard is likely to see continued growth in CPT 63 procedures. The implication for states in this region is that they may see more unplanned expenditures on health care, affecting their budgets. Furthermore, cost controls and reduction of practice variation based on evidence will become more important.
  2. The cost associated with these procedures is outstripping inflation and will likely result in national expenditures in the triple-digit billions. The federal government may need to evaluate its own evidence-based, best practice policies associated with funding of procedures that link selected interventions with outcomes and that reasonably limit reimbursement.
  3. Interstate practice variation appears to be extreme. For example, large population centers in California have fewer claims than large population centers in Texas. States should also investigate intrastate variation.
  4. Hierarchical forecasts suggest an increase in the number of claims of 6.5% for 2018 and 13% in 2019. The initial models were built on a blind test set and performed well. These types of forecasts are reasonably effective for claims analysis.
  5. Explanatory regression models for nation-level claims data had only some success in internal predictions. These models excluded obesity as a predictor. Regression models were more successful at predicting aggregated state/year models, though. These traditional models should be abandoned in favor of random forests.
  6. Extreme gradient-boosted random forest models were highly successful in predicting both hospital-level unit of analysis number of claims and aggregate-level claims on an unobserved test set. These models identified obesity as an important factor in estimating the number of claims. Furthermore, the use of these models underscores that even after multivariate transformations, nonlinear functions may exist in modeled data. Random forests unearthed patterns not visible to regression and constrained regression models.


There are many limitations in this work. First, the algorithms used by Definitive Healthcare to extrapolate CMS data to all-payor data are not divulged. This omission is problematic for verification but understandable because of parochial concerns. Second, only ETS and ARIMA models were considered for the HTS fitting as these models are implemented in the R HTS package. There are an infinite number of models for forecasting, including random forest time series that might have performed better. Third, the explanatory variables are limited to those tracked by CMS and the BRFSS.


Hospital-based back surgeries are likely to increase dramatically over the next several years, yet the supply of neurosurgeons is constant. With that increase, the cost of the procedures (mostly borne by third-party payers) will increase as well. Practice variation appears to be prevalent across the country; however, obesity itself is a factor that must be considered as a significant influence. Policy interventions must be considered at many levels.

Clinical practice variation is something that may require intervention at the federal level. For example, a study in Scandinavia found significant differences among Norway, Sweden, and Denmark in the use of concomitant arthrodesis without any difference in treatment efficacy, increasing the cost without improving outcomes [28]. Controlling costs across states may require federal (and state) reimbursement interventions and incentives.

States should continue educational and financial interventions targeting obesity in adults and children. As the obesity epidemic continues to grow, the medical intervention costs are likely to grow accordingly. Furthermore, states should evaluate county-to-county practice variation as these variations often increase cost without improving quality [27].

Local interventions should consider the targeting of food deserts (urban areas where fresh, quality food is difficult to find) for eradication as well as educational interventions. Several studies have shown that the food environment is directly linked to obesity [29-31]. Eliminating or at least reducing the number of food deserts requires incentivizing grocery stores to populate areas where it may not be as lucrative because of poverty or demand.

Insurance companies themselves have a vested interest in both reducing obesity and controlling practice variation. Obesity is linked to numerous health disorders such as heart disease, type 2 diabetes, and bone and joint disease [32], any of which may result in additional costs to the health care system and insurer. Funding prevention efforts and establishing policies to reduce practice area variation are likely to benefit them as well as the population health over time.

Federal, state, and local policy makers need to address the increasing obesity epidemic and the likely associated increase in demand for back surgeries. The implications of not doing so are increased cost, questionable quality/cost trade-offs, and reduced access because of the small and steady number of available neurosurgeons. The fattening of America and the costs associated with it are likely to continue increasing otherwise.

Conflicts of Interest

None declared.

Multimedia Appendix 1

This video depicts heat maps for the number of claims from 2012 through 2017.

MP4 File (MP4 Video), 4334 KB

  1. American Association of Neurological Surgeons. 2012. Ensuring an Adequate Neurosurgical Workforce for the 21st Century   URL: [accessed 2019-09-03] [WebCite Cache]
  2. Greenwood B. Houston Chronicle. 2018. What Is the Job Outlook for a Neurosurgeon?   URL: [accessed 2019-05-05] [WebCite Cache]
  3. Definitive Healthcare: Healthcare Analytics & Provider Data. 2019.   URL: [accessed 2019-05-05] [WebCite Cache]
  4. SuperCoder: Medical Coding & Billing Tools. 2019. CPT® Codes — Surgical Procedures on the Spine and Spinal Cord   URL: [accessed 2019-08-31] [WebCite Cache]
  5. Azad TD, Vail D, O'Connell C, Han SS, Veeravagu A, Ratliff JK. Geographic variation in the surgical management of lumbar spondylolisthesis: characterizing practice patterns and outcomes. Spine J 2018 Dec;18(12):2232-2238. [CrossRef] [Medline]
  6. Alvin MD, Lubelski D, Alam R, Williams SK, Obuchowski NA, Steinmetz MP, et al. Spine surgeon treatment variability: the impact on costs. Global Spine J 2018 Aug;8(5):498-506 [FREE Full text] [CrossRef] [Medline]
  7. The State of Obesity – Better Policies for a Healthier America. 2019. National Obesity Rates & Trends   URL: [accessed 2019-08-31] [WebCite Cache]
  8. Finkelstein EA, Trogdon JG, Cohen JW, Dietz W. Annual medical spending attributable to obesity: payer-and service-specific estimates. Health Aff (Millwood) 2009;28(5):w822-w831. [CrossRef] [Medline]
  9. Chan AK, Bisson EF, Bydon M, Glassman SD, Foley KT, Potts EA, et al. Obese patients benefit, but do not fare as well as nonobese patients, following lumbar spondylolisthesis surgery: an analysis of the quality outcomes database. Neurosurgery 2018 Dec 12:- (epub ahead of print). [CrossRef] [Medline]
  10. Zhang TT, Liu Z, Liu YL, Zhao JJ, Liu DW, Tian QB. Obesity as a risk factor for low back pain: a meta-analysis. Clin Spine Surg 2018 Feb;31(1):22-27. [CrossRef] [Medline]
  11. Chronic Disease and Health Promotion Data & Indicators - CDC. 2019. BRFSS: Table of Overweight and Obesity (BMI)   URL: https:/​/chronicdata.​​Behavioral-Risk-Factors/​BRFSS-Table-of-Overweight-and-Obesity-BMI-/​fqb7-mgjf [accessed 2019-05-05] [WebCite Cache]
  12. Lee DC, Yi SS, Athens JK, Vinson AJ, Wall SP, Ravenell JE. Using geospatial analysis and emergency claims data to improve minority health surveillance. J Racial Ethn Health Disparities 2018 Aug;5(4):712-720 [FREE Full text] [CrossRef] [Medline]
  13. MacQuillan EL, Curtis AB, Baker KM, Paul R, Back YO. Using GIS mapping to target public health interventions: examining birth outcomes across GIS techniques. J Community Health 2017 Aug;42(4):633-638. [CrossRef] [Medline]
  14. Côté MJ, Smith MA. Forecasting the demand for radiology services. Health Syst (Basingstoke) 2018;7(2):79-88 [FREE Full text] [CrossRef] [Medline]
  15. Brookmeyer R, Johnson E, Ziegler-Graham K, Arrighi HM. Forecasting the global burden of Alzheimer's disease. Alzheimers Dement 2007 Jul;3(3):186-191. [CrossRef] [Medline]
  16. Hyndman R, Lee A, Wang E, Wickramasuriya S. RDDR. 2018. HTS-package: Hierarchical and Grouped Time Series   URL: [accessed 2019-08-31]
  17. The R Development Core Team. Servidor de software libre de la Universidad de Zaragoza. 2018. R: A Language and Environment for Statistical Computing   URL: [accessed 2019-09-03]
  18. MacNab YC. Hierarchical Bayesian modeling of spatially correlated health service outcome and utilization rates. Biometrics 2003 Jun;59(2):305-316. [CrossRef] [Medline]
  19. Hyndman RJ, Athanasopoulos G. Forecasting: Principles and Practice. Second Edition. Boston, Massachusetts: OTexts; 2019.
  20. Holt CC. Forecasting seasonals and trends by exponentially weighted moving averages. Int J Forecast 2004 Jan;20(1):5-10. [CrossRef]
  21. Hastie T, Tibshirani R, Friedman J. TheElements of Statistical Learning: Data Mining, Inference, and Prediction. Second Edition. New York: Springer; 2019.
  22. Chen T, He T, Benesty M, Khotilovich V, Tang Y, Cho H. XGBoost Documentation. 2018. XGBoost: Extreme Gradient Boosting   URL: [accessed 2019-08-31]
  23. Kassambara A. Statistical Tools for High-Throughput Data Analysis. 2018. ggcorrplot: Visualization of a Correlation Matrix Using ggplot2   URL: [accessed 2019-08-31]
  24. Softonic. 2016. Microsoft Excel 2016   URL: [accessed 2019-08-31]
  25. Fox J, Weisberg S. An R Companion to Applied Regression. Second Edition. Thousand Oaks, CA: Sage Publications; 2011.
  26. Box GE, Cox DR. An analysis of transformations. J R Stat Soc Series B Stat Methodol 2018 Dec 5;26(2):211-243 [FREE Full text] [CrossRef]
  27. Wennberg J, Gittelsohn N. Small area variations in health care delivery. Science 1973 Dec 14;182(4117):1102-1108. [CrossRef] [Medline]
  28. Lønne G, Fritzell P, Hägg O, Nordvall D, Gerdhem P, Lagerbäck T, et al. Lumbar spinal stenosis: comparison of surgical practice variation and clinical outcome in three national spine registries. Spine J 2019 Jan;19(1):41-49. [CrossRef] [Medline]
  29. Morland K, Wing S, Roux AD. The contextual effect of the local food environment on residents' diets: the atherosclerosis risk in communities study. Am J Public Health 2002 Nov;92(11):1761-1767. [CrossRef] [Medline]
  30. Morland K, Roux AV, Wing S. Supermarkets, other food stores, and obesity: the atherosclerosis risk in communities study. Am J Prev Med 2006 Apr;30(4):333-339. [CrossRef] [Medline]
  31. Holsten JE. Obesity and the community food environment: a systematic review. Public Health Nutr 2009 Mar;12(3):397-405. [CrossRef] [Medline]
  32. Stanford Health Care (SHC) - Stanford Medical Center. 2019. Effects of Obesity   URL: [accessed 2019-08-31]

ARIMA: autoregressive integrated moving average models
BRFSS: Behavioral Risk Factor Surveillance System
CPT: Current Procedural Terminology
CMS: Centers for Medicare and Medicaid Services
ETS: exponential, trend, seasonality
HTS: hierarchical time series
MAPE: mean absolute percent error

Edited by G Eysenbach; submitted 05.05.19; peer-reviewed by C Mavrot, A Kotlo; comments to author 26.07.19; revised version received 31.07.19; accepted 09.08.19; published 29.10.19


©Lawrence Fulton, Clemens Scott Kruse. Originally published in the Journal of Medical Internet Research (, 29.10.2019.

This is an open-access article distributed under the terms of the Creative Commons Attribution License (, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on, as well as this copyright and license information must be included.