Transcatheter aortic valve replacement (TAVR) has revolutionized the treatment of severe aortic stenosis and has become the gold standard treatment for patients with severe symptomatic aortic stenosis as approved by the US Food and Drug Administration (FDA) (1). Annual TAVR volume in the United States has increased steadily with more than 500% growth rate from approximately 5,000 in 2012 to almost 250,000 in 2019 (2). More recent trials (3-6) have expanded indications for TAVR to include patients with intermediate surgical risk, and results of the PARTNER 3 trial imply that TAVR will soon be the treatment of choice for low-risk candidates (7). With an aging population, increased prevalence of aortic stenosis, and expansion of TAVR to low-risk and younger patients, demand will rise, and adequate resource utilization is therefore of paramount importance (8). There is an increased emphasis on improving healthcare quality in the United States. Healthcare reimbursement models are shifting from “payment for volume” to “payment for value.” In this scenario, hospital systems are increasingly motivated to curb hospitalization costs.
Given increasing healthcare costs, there is interest in developing machine learning (ML) prediction models for estimating hospitalization charges. ML models have been used to predict hospitalization charges for colorectal (9) and gastric cancer (10). Similarly, Muhlestein et al. (11) developed ensemble ML models for estimating charges following trans-sphenoidal surgery for pituitary tumors. However, to the best of our knowledge, no cost prediction models exist for cardiovascular procedures, including TAVR. Herein, we use ML algorithms to predict hospitalization charges for patients undergoing transfemoral TAVR (TF-TAVR) utilizing the National Inpatient Sample (NIS) database. We present the following article in accordance with the TRIPOD reporting checklist (available at https://cdt.amegroups.com/article/view/10.21037/cdt-21-717/rc).
Data source and study population
We used the Agency for Healthcare Research and Quality’s (AHRQ) NIS, the largest all-payer database of hospitalized patients in the United States. Patients aged ≥18 years with a discharge diagnosis of aortic valve stenosis who underwent TF-TAVR [International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) procedure codes 35.05 or 35.06 and International Classification of Diseases, Tenth Revision, Clinical Modification (ICD-10-CM) procedure codes 02RF37H, 02RF37Z, 02RF38H, 02RF38Z, 02RF3JH, 02RF3JZ, 02RF3KH, 02RF3KZ] from 2012 to 2016 were included in the study. Because the study used de-identified data, it was exempted from Institutional Review Board (IRB) approval. The study was conducted in accordance with the Declaration of Helsinki (as revised in 2013).
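The inclusion criteria above can be expressed as a simple filter. The following is a minimal sketch only: the record layout (`age`, `has_aortic_stenosis`, `procedure_codes`) is hypothetical and does not reflect the actual NIS column names or code formatting, and only the procedure codes themselves are taken from the text.

```python
# Procedure codes listed in the text; everything else here is illustrative.
ICD9_TF_TAVR = {"35.05", "35.06"}
ICD10_TF_TAVR = {
    "02RF37H", "02RF37Z", "02RF38H", "02RF38Z",
    "02RF3JH", "02RF3JZ", "02RF3KH", "02RF3KZ",
}
TF_TAVR_CODES = ICD9_TF_TAVR | ICD10_TF_TAVR


def is_tf_tavr(record):
    """Return True if a (hypothetical) discharge record meets the inclusion criteria."""
    has_procedure = bool(set(record["procedure_codes"]) & TF_TAVR_CODES)
    return record["age"] >= 18 and record["has_aortic_stenosis"] and has_procedure


# Two made-up records: one with a listed TF-TAVR code, one without.
records = [
    {"age": 82, "has_aortic_stenosis": True, "procedure_codes": ["02RF38Z"]},
    {"age": 79, "has_aortic_stenosis": True, "procedure_codes": ["02703ZZ"]},
]
cohort = [r for r in records if is_tf_tavr(r)]
```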
Candidate variables and outcomes
Fifty-nine variables, including patient and hospital characteristics, were collected for each hospitalization (description of variables in supplementary table). Patient comorbidities were identified using the Elixhauser Comorbidity Software administered by AHRQ.
The primary outcome was total hospitalization charges, calculated in US dollars. All the charges were adjusted for inflation.
Missing values were imputed using the k-nearest neighbors (KNN) algorithm. This algorithm uses ‘feature similarity’ to estimate missing values: it finds the k closest neighbors of the observation with missing data and imputes each missing value from the non-missing values in that neighborhood. Imputation was performed after the training/testing data split.
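A minimal sketch of KNN imputation performed after the split, using scikit-learn's `KNNImputer`. The data are synthetic, and `n_neighbors=5` is an assumption, since the value of k is not reported in the text. Fitting the imputer on the training rows only is one common way to keep test rows from influencing the imputation.

```python
import numpy as np
from sklearn.impute import KNNImputer
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))            # synthetic stand-in for the study features
X[rng.random(X.shape) < 0.1] = np.nan    # remove roughly 10% of values at random

# Split first, then impute, as described in the text.
X_train, X_test = train_test_split(X, test_size=0.2, random_state=0)

# k is an assumption; neighbors are always drawn from the training rows.
imputer = KNNImputer(n_neighbors=5)
X_train_imp = imputer.fit_transform(X_train)
X_test_imp = imputer.transform(X_test)
```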
ML model development and validation
The study dataset was divided into 80% training and 20% testing sets for the development and validation of ML algorithms, respectively. We used the following ML regression algorithms: random forest, gradient boosting, KNN, multi-layer perceptron and linear regression. The important features were selected using the random forest algorithm. A grid search strategy based on cross-validation was used to identify the best combination of hyperparameters for the listed ML algorithms. The searched parameters included max_depth (range from 4 to 8), max_features (auto, sqrt, log2), and n_estimators (range from 10 to 200). The optimal values for the random forest model were max_depth = 8, max_features = sqrt, and n_estimators = 100. We built ML algorithms for 3 stages: Stage 1, including variables that were known pre-procedurally (prior to TF-TAVR) at the time of admission; Stage 2, including variables that were known post-procedurally; Stage 3, including length of stay (LOS) in addition to the stage 2 variables.
The performance of the ML models was compared to average and median models using four evaluation metrics: R2 score, mean absolute error (MAE), root mean squared error (RMSE), and root mean squared logarithmic error (RMSLE). A higher R2 score and lower MAE, RMSE and RMSLE signify better model performance.
Lift charts were generated in order to visualize how accurately the ensemble ML model predicts the LOS and hospitalization charges in the validation cohort. To generate these charts, we ranked the predictions of the best performing ML model, divided them into 10 ‘bins’ (deciles), and calculated the average predicted LOS and hospitalization charges for each bin. We then calculated the average actual LOS and hospitalization charges for each decile and plotted the average predicted values against the average actual values.
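The hyperparameter search described above could look roughly like the following sketch with scikit-learn's `GridSearchCV`. The data are synthetic, the grid points within each range and the scoring metric are assumptions, and `auto` is omitted from `max_features` because recent scikit-learn releases no longer accept it for `RandomForestRegressor`.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))  # synthetic features
# Synthetic "charges": arbitrary coefficients chosen only for illustration.
y = 200_000 + 30_000 * X[:, 0] + 10_000 * X[:, 1] ** 2 + rng.normal(scale=5_000, size=200)

# 80/20 split as in the text.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Ranges follow the text; the specific grid points are assumptions.
param_grid = {
    "max_depth": [4, 6, 8],
    "max_features": ["sqrt", "log2"],
    "n_estimators": [10, 100, 200],
}

search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    cv=5,                                # 5-fold cross-validation on the training set
    scoring="neg_mean_absolute_error",   # assumed selection metric
)
search.fit(X_train, y_train)

best_model = search.best_estimator_      # refit on the full training set
test_r2 = best_model.score(X_test, y_test)
```

Selecting hyperparameters on the training folds and scoring once on the held-out 20% keeps the test set out of the tuning loop.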
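As an illustration of the four metrics and of the mean and median baseline models, the sketch below uses scikit-learn's metric functions on a few made-up charge values. The helper name `evaluate` and all numbers are hypothetical; the baselines are assumed to predict the training-set mean or median for every test patient.

```python
import numpy as np
from sklearn.metrics import (
    mean_absolute_error,
    mean_squared_error,
    mean_squared_log_error,
    r2_score,
)


def evaluate(y_true, y_pred):
    """Compute the four metrics named in the text for one set of predictions."""
    return {
        "MAE": mean_absolute_error(y_true, y_pred),
        "R2": r2_score(y_true, y_pred),
        "RMSE": np.sqrt(mean_squared_error(y_true, y_pred)),
        "RMSLE": np.sqrt(mean_squared_log_error(y_true, y_pred)),
    }


# Made-up charges in US dollars.
y_train = np.array([150_000.0, 180_000.0, 220_000.0, 400_000.0])
y_test = np.array([160_000.0, 200_000.0, 300_000.0])

# Baselines: a constant prediction taken from the training set.
mean_baseline = np.full_like(y_test, y_train.mean())
median_baseline = np.full_like(y_test, np.median(y_train))

mean_scores = evaluate(y_test, mean_baseline)
median_scores = evaluate(y_test, median_baseline)
```

A constant predictor cannot achieve a positive R2 score on the test set, which is why any R2 above zero already represents an improvement over these baselines.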
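The decile computation behind a lift chart can be sketched with NumPy as follows. The helper `lift_table` and the synthetic charges are illustrative assumptions; plotting the two returned columns against each other gives the chart.

```python
import numpy as np


def lift_table(y_true, y_pred, n_bins=10):
    """Rank by prediction, split into bins, and average predicted and actual values per bin."""
    order = np.argsort(y_pred)              # indices from lowest to highest prediction
    bins = np.array_split(order, n_bins)    # near-equal-sized bins (deciles when n_bins=10)
    return np.array([[y_pred[idx].mean(), y_true[idx].mean()] for idx in bins])


rng = np.random.default_rng(0)
y_true = rng.gamma(shape=3.0, scale=70_000.0, size=500)  # synthetic right-skewed charges
# Synthetic predictions that are shrunk toward the mean and noisy.
y_pred = 0.5 * y_true + 0.5 * y_true.mean() + rng.normal(scale=20_000.0, size=500)

table = lift_table(y_true, y_pred)  # column 0: mean predicted, column 1: mean actual
```

A well-calibrated model shows the two columns tracking each other closely across all bins.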
Partial dependence plots
Partial dependence plots allow one to visualize how a model reacts to changes in a single variable. For each test value of the variable, predictions are made for all observations with the variable set to that value, and the mean of these predictions is calculated. The mean prediction is then plotted against the test values to generate a visual representation of the model’s response to changes in the variable.
Additional statistical analysis was performed to describe patient characteristics. Continuous variables were compared using the 2-tailed Student’s t-test, whereas chi-square or Fisher exact tests were used for categorical data as appropriate. The analysis was conducted using Python 3.6.9. The Python libraries used for this project were SciPy, scikit-learn and NumPy.
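The averaging procedure for a one-variable partial dependence curve can be written out by hand, as in the sketch below. The helper `partial_dependence_1d`, the two synthetic features (LOS and a ventilation flag) and the coefficients used to simulate charges are all assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def partial_dependence_1d(model, X, feature, grid):
    """For each grid value, set one column to that value for all rows and average the predictions."""
    averaged = []
    for value in grid:
        X_mod = X.copy()
        X_mod[:, feature] = value           # override the variable of interest
        averaged.append(model.predict(X_mod).mean())
    return np.array(averaged)


rng = np.random.default_rng(0)
los = rng.integers(1, 15, size=400).astype(float)   # synthetic length of stay (days)
vent = rng.integers(0, 2, size=400).astype(float)   # synthetic ventilation flag
X = np.column_stack([los, vent])
# Arbitrary simulated relationship, not an estimate from the study.
y = 150_000 + 20_000 * los + 100_000 * vent + rng.normal(scale=10_000, size=400)

model = RandomForestRegressor(n_estimators=50, max_depth=8, random_state=0).fit(X, y)

grid = np.arange(1, 15, dtype=float)
pd_los = partial_dependence_1d(model, X, feature=0, grid=grid)  # plot pd_los against grid
```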
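For reference, the three tests named above are available in SciPy. The sketch below runs them on made-up data; the group sizes, counts and charge values are hypothetical. A common rule of thumb is to prefer the Fisher exact test when expected cell counts in a 2×2 table are small.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Continuous variable: charges in two made-up groups (two-sided test by default).
charges_group_a = rng.normal(loc=250_000, scale=60_000, size=80)
charges_group_b = rng.normal(loc=200_000, scale=60_000, size=120)
t_stat, p_ttest = stats.ttest_ind(charges_group_a, charges_group_b)

# Categorical variable: 2x2 contingency table of made-up counts.
table = np.array([[30, 70],
                  [45, 55]])
chi2, p_chi2, dof, expected = stats.chi2_contingency(table)

# Small counts: Fisher exact test on a 2x2 table.
small_table = np.array([[3, 9],
                        [8, 2]])
odds_ratio, p_fisher = stats.fisher_exact(small_table)
```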
A total of 18,793 individuals aged >18 years undergoing transfemoral TAVR from 2012 to 2016 were reviewed for the analysis. The mean age of the study population was 81.48 years and 46.6% were female. The mean (SD) adjusted hospitalization cost was $220,725.2 ($137,675.1) and the median adjusted hospitalization cost was $187,212.0 ($137,971.0–264,824.8). The distribution of adjusted hospitalization charges is described in Figure 1. In our study, around 14.2% of patients had acute renal failure post-TAVR. About 2.45% (n=461) of patients had cardiogenic shock and 1.78% required the use of a mechanical circulatory support device. Table 1 shows the patient and hospital characteristics along with the mean (SD) adjusted hospitalization charges. The baseline characteristics and in-hospital outcomes of the study cohort are listed in Table S1.
|Patient characteristics/co-morbidities and complications||Adjusted hospitalization charges ($), mean (SD)||P value|
|Stage 1 variables|
|African American||220,776.41 (143,384.43)|| |
|Hospital bed size|| ||0.001|
|Urban non-teaching||196,018.42 (124,189.82)|| |
|Urban teaching||220,734.69 (134,003.58)|| |
|Fluid and electrolyte disorder|| ||0.001|
|Congestive heart failure|| ||0.001|
|Coronary artery disease|| ||0.0015|
|Carotid artery disease|| ||0.001|
|Peripheral vascular disease|| ||0.90|
|Chronic lung disease|| ||0.001|
|Solid tumor without metastasis|| ||0.56|
|ESRD requiring dialysis|| ||0.001|
|CKD stage 5|| ||0.75|
|CKD stage 4|| ||0.003|
|CKD stage 3|| ||0.68|
|CKD stage 1–2|| ||0.89|
|Stage 2 variables|
|Mechanical circulatory support device|| ||0.001|
|Acute renal failure|| ||0.001|
|New pacemaker insertion|| ||0.001|
|In-hospital sepsis|| ||0.001|
PCI, percutaneous coronary intervention; HTN, high blood pressure; ESRD, end-stage renal disease; CKD, chronic kidney disease; STEMI, ST segment elevation myocardial infarction; NSTEMI, non-STEMI; DM, diabetes mellitus.
ML regression algorithms’ predictive performance for adjusted hospitalization charges
Table 2 shows the predictive performance of various ML regression algorithms in comparison to mean and median models for estimating the adjusted hospitalization charges in patients undergoing TF-TAVR. All the ML algorithms performed significantly better than the mean or median models. The random forest regression algorithm outperformed the other ML algorithms at all stages with higher R2 score and lower MAE, RMSE and RMSLE (Stage 1: MAE 79,979.11, R2 0.157; Stage 2: MAE 76,200.09, R2 0.256; Stage 3: MAE 69,350.09, R2 0.453). Apart from random forest regression, gradient boosting regression for stage 1 variables and KNN regression for stage 3 variables performed better than the linear regression algorithm. As expected, there was an increase in the predictive performance from Stage 1 to Stage 3 given the addition of variables.
|Algorithm||MAE||R2 score||RMSE||RMSLE|
|Stage 1|
|Random forest regression||79,979.11||0.157||122,091.48||0.499|
|Gradient boosting regression||81,544.21||0.114||125,124.47||0.509|
|Stage 2|
|Random forest regression||76,200.09||0.256||114,665.82||0.480|
|Gradient boosting regression||80,541.18||0.146||125,124.47||0.502|
|Stage 3|
|Random forest regression||69,350.09||0.453||98,307.96||0.444|
|Gradient boosting regression||74,903.48||0.27||114,208.19||0.463|
TAVR, transcatheter aortic valve replacement; MAE, mean absolute error; RMSE, root mean squared error; RMSLE, root mean squared logarithmic error; KNN, k-nearest neighbors; MLP, multilayer perceptron.
Predictors of hospitalization charges and partial dependence plots
Features selected for building ML algorithms at each stage, in order of their importance, are depicted in Figure 2. The top features were ranked by variable importance from the random forest regression algorithm. At the time of admission (pre-procedurally), hospital region, fluid and electrolyte disorders, age, race and elective admission were the most significant predictors of hospitalization charges. Hospitalizations for TAVR in the west region [$263,406.45 ($183,016.34)] were more expensive than hospitalizations in the north-east region [$236,153.86 ($138,075.82)], followed by the south [$204,133.57 ($107,372.72)] and mid-west [$182,217.13 ($98,036.83)] regions. Hospitalization charges were higher amongst the Hispanic [$279,291.35 ($167,575.43)] and African-American populations [$220,776.41 ($143,384.43)] compared to Caucasians. Patients undergoing elective TF-TAVR were likely to have lower hospitalization charges [$202,184.10 ($112,270.59) vs. $254,383.26 ($167,198.82)]. There was a negative correlation of age with adjusted hospitalization charges (Pearson correlation coefficient −0.0559). Individuals aged 60–75 years had higher hospitalization charges compared to those aged ≥75 years ($227,806.93 vs. $219,196.70).
Amongst the stage 2 variables, mechanical ventilation, acute renal failure, cardiogenic shock, use of mechanical circulatory support devices, new pacemaker insertion and in-hospital sepsis were prominent predictors of increased hospitalization charges. Use of mechanical ventilation was associated with an around 2.5-fold increase in mean adjusted hospitalization charges [$504,642.89 ($336,264.50) vs. $210,434.02 ($115,121.91)]. Cardiogenic shock [$404,308.31 ($355,471.88) vs. $212,766.58 ($118,381.33)] and mechanical circulatory support device use [$389,175.93 ($294,305.99) vs. $214,489.27 ($126,256.49)] were each associated with an around 2-fold increase in hospitalization charges, and acute renal failure with an around 1.5-fold increase [$314,855.94 ($219,052.69) vs. $201,970.45 ($105,287.01)].
For stage 3, LOS was the most important predictor of hospitalization charges. Figure 2 depicts the two-way partial dependence interaction between LOS and the second most important variable (i.e., mechanical ventilation).
The actual and stage-wise predicted hospitalization charges for the first 20 patients are shown in Figure 3. The stage-wise lift charts for the testing (validation) cohort are depicted in Figure 4. Decile-wise actual and stage-wise predicted hospitalization costs in patients undergoing TF-TAVR can be seen in Table 3. The accuracy of various ML algorithms for predicted versus measured hospitalization charges is shown in the supplement.
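The approximate fold increases quoted for the stage 2 variables are the ratios of the reported mean charges with and without each complication. The short check below uses only the means given in the text; the variable names are illustrative.

```python
# (mean charges with the complication, mean charges without), in US dollars, from the text.
mean_charges = {
    "mechanical ventilation": (504_642.89, 210_434.02),
    "cardiogenic shock": (404_308.31, 212_766.58),
    "mechanical circulatory support": (389_175.93, 214_489.27),
    "acute renal failure": (314_855.94, 201_970.45),
}

# Fold increase = ratio of the two means.
fold_increase = {name: with_c / without_c for name, (with_c, without_c) in mean_charges.items()}

for name, ratio in fold_increase.items():
    print(f"{name}: {ratio:.2f}-fold")
```

These are unadjusted ratios of group means, so they describe association only and do not isolate the effect of any single complication.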
|Deciles||Actual hospitalization cost ($)||Stage 1 predicted hospitalization cost ($)||Stage 2 predicted hospitalization cost ($)||Stage 3 predicted hospitalization cost ($)|
TF-TAVR, transfemoral transcatheter aortic valve replacement
Hospitalization charges are an important indicator of resource utilization (11), and understanding the predictors of higher hospitalization charges provides practitioners an opportunity to address potentially reversible drivers of charges. In our study, we found that ML regression algorithms performed significantly better than mean and median models for predicting adjusted hospitalization charges in patients undergoing TF-TAVR. Amongst the ML algorithms, random forest regression outperformed the others at all stages. LOS was the most significant predictor of adjusted hospitalization charges.
We found that LOS is by far the strongest predictor of adjusted hospitalization charges. The influence of this variable is so strong that the relative impact of all other variables is nearly negligible. Decreasing LOS is thus of paramount importance for any cost-reduction strategy in this patient population. LOS reflects the cumulative effects of multiple factors including the patient’s baseline characteristics, co-morbidities, clinical presentation, urgency of procedure, post-procedure complications, and individual hospital protocols. The adjusted costs for next-day discharge (NDD) following TAVR are nearly $7,500 lower compared with non-NDD (12).
Mechanical ventilation, acute renal failure, cardiogenic shock, use of mechanical circulatory support device, in-hospital sepsis, and new pacemaker insertion were all associated with increased hospitalization costs. Acute kidney injury (AKI) occurs frequently following TAVR and has been associated with worse outcomes (13). AKI is expensive and consumes a considerable amount of health care resources. Even the most conservative estimates attribute approximately $1,700 in excess costs to each episode of AKI and $11,000 in excess costs to each episode of dialysis-requiring AKI (14). Since it is not possible to determine from the NIS database whether cardiogenic shock or use of a mechanical circulatory support device occurred before or after the procedure, we included these variables in stage 2 in our study.
Other predictors of hospitalization charges
In our study cohort, there was a negative correlation of age with adjusted hospitalization charges. Higher costs in the younger population are likely because younger patients undergoing TAVR are relatively sicker and thus are at an increased risk of post-procedure complications.
The hospitalization charges for patients undergoing TF-TAVR varied across regions, with the highest in the west followed by the north-east, south and mid-west regions. Healthcare expenditure in general varies widely by hospital region in the United States. It is thus important to understand the regional differences in practice and attempt to reduce wasteful charges for TAVR.
Patients undergoing elective TAVR were likely to incur lower hospitalization charges [$202,184.10 ($112,270.59)] than those undergoing urgent/emergent TAVR [$254,383.26 ($167,198.82)]. This is because of increased complications, including AKI or dialysis requirement, and increased mortality in individuals undergoing urgent/emergent TAVR (13).
It is well established that racial disparities exist in the healthcare system. In our study population, around 90% of the patient population were Caucasians and 4% were African-Americans and Hispanics. There is increased racial disparity in the utilization of structural heart disease interventions (15). In our study population, there were significant differences in adjusted hospitalization charges, which were higher in Hispanics and African-Americans compared with Caucasians. It is possible that decreased access to healthcare results in minority race patients presenting with advanced disease, driving up hospitalization charges.
Fluid and electrolyte disorders were another significant predictor of increased hospitalization charges. In patients undergoing TAVR, fluid and electrolyte disorders have been shown to be an independent predictor of mortality (16). Fluid and electrolyte disorders could be a modifiable predictor of adjusted hospitalization charges in patients undergoing TAVR, and efforts should be geared towards reducing their occurrence.
Strengths and limitations
Our study has several strengths. First, to the best of the authors’ knowledge, this is the first attempt to develop an ML prediction model for estimating hospitalization charges in patients undergoing TAVR. Second, we assessed the performance of ML algorithms at various stages, from the time of admission until procedure completion and finally taking LOS into account. Third, a robust performance assessment was done for various ML algorithms using multiple evaluation metrics to identify which algorithm most accurately predicts our outcome of interest, i.e., hospitalization charges in patients undergoing TF-TAVR. Fourth, ML algorithms were compared not only amongst themselves but also against the mean and median models.
ML models are often criticized for overfitting. To overcome this, we validated our ML regression algorithms internally using a rigorous 5-fold cross-validation technique. However, the models developed have not been externally validated on a separate cohort. We used the NIS database, which has inherent limitations that have been described before. Certain variables were not available for analysis, including type of valve (balloon-expandable or self-expandable), echocardiographic characteristics, acuity of condition, degree of pre-procedural shock, and others.
In conclusion, we built ML algorithms that predict hospitalization charges with good accuracy in patients undergoing TF-TAVR at different stages of hospitalization and that can be used by healthcare providers to better understand the drivers of charges. LOS was the strongest predictor of hospitalization charges. Other predictors were post-procedure complications (need for mechanical ventilation, acute renal failure, cardiogenic shock, use of mechanical support devices, in-hospital mortality, and need for pacemaker insertion), fluid and electrolyte disorders, age, hospital region and race.
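Internal validation with 5-fold cross-validation can be sketched with scikit-learn's `cross_val_score`, as below. The data are synthetic, the random forest settings mirror the optimal values reported earlier (max_depth 8, max_features sqrt) with a smaller forest for speed, and the shuffling, seed and R2 scoring are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold, cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))  # synthetic features
# Arbitrary simulated relationship, used only to make the example runnable.
y = 200_000 + 30_000 * X[:, 0] + 15_000 * X[:, 1] + rng.normal(scale=10_000, size=200)

model = RandomForestRegressor(
    n_estimators=50, max_depth=8, max_features="sqrt", random_state=0
)

# Each fold is held out once while the model is trained on the other four.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
fold_r2 = cross_val_score(model, X, y, cv=cv, scoring="r2")

mean_r2, sd_r2 = fold_r2.mean(), fold_r2.std()
```

A large gap between training performance and the cross-validated scores would be one sign of overfitting.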
Reporting Checklist: The authors have completed the TRIPOD reporting checklist. Available at https://cdt.amegroups.com/article/view/10.21037/cdt-21-717/rc
Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://cdt.amegroups.com/article/view/10.21037/cdt-21-717/coif). The authors have no conflicts of interest to declare.
Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki (as revised in 2013).
Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.
1. Smith CR, Leon MB, Mack MJ, et al. Transcatheter versus surgical aortic-valve replacement in high-risk patients. N Engl J Med 2011;364:2187-98.
2. TVT Registry Datamart Data. TAVR Update: New Insights and Perspectives from the U.S.: National STS/ACC TVT Registry: National Cardiovascular Data Registry; 2019. Available online: https://www.sts.org/sites/default/files/102419%201645.%20Bavaria.%20TVT.pdf
3. Leon MB, Smith CR, Mack MJ, et al. Transcatheter or Surgical Aortic-Valve Replacement in Intermediate-Risk Patients. N Engl J Med 2016;374:1609-20.
4. Adams DH, Popma JJ, Reardon MJ, et al. Transcatheter aortic-valve replacement with a self-expanding prosthesis. N Engl J Med 2014;370:1790-8.
5. Leon MB, Smith CR, Mack M, et al. Transcatheter aortic-valve implantation for aortic stenosis in patients who cannot undergo surgery. N Engl J Med 2010;363:1597-607.
6. Reardon MJ, Van Mieghem NM, Popma JJ, et al. Surgical or Transcatheter Aortic-Valve Replacement in Intermediate-Risk Patients. N Engl J Med 2017;376:1321-31.
7. Mack MJ, Leon MB, Thourani VH, et al. Transcatheter Aortic-Valve Replacement with a Balloon-Expandable Valve in Low-Risk Patients. N Engl J Med 2019;380:1695-705.
8. Osnabrugge RL, Mylotte D, Head SJ, et al. Aortic stenosis in the elderly: disease prevalence and number of candidates for transcatheter aortic valve replacement: a meta-analysis and modeling study. J Am Coll Cardiol 2013;62:1002-12.
9. Lee SM, Kang JO, Suh YM. Comparison of hospital charge prediction models for colorectal cancer patients: neural network vs. decision tree models. J Korean Med Sci 2004;19:677-81.
10. Wang J, Li M, Hu YT, et al. Comparison of hospital charge prediction models for gastric cancer patients: neural network vs. decision tree models. BMC Health Serv Res 2009;9:161.
11. Muhlestein WE, Akagi DS, McManus AR, et al. Machine learning ensemble models predict total charges and drivers of cost for transsphenoidal surgery for pituitary tumor. J Neurosurg 2018;131:507-16.
12. Lauck SB, Baron SJ, Sathananthan J, et al. Exploring the Reduction in Hospitalization Costs Associated with Next-Day Discharge following Transfemoral Transcatheter Aortic Valve Replacement in the United States. Structural Heart 2019;3:423-30.
13. Kolte D, Khera S, Vemulapalli S, et al. Outcomes Following Urgent/Emergent Transcatheter Aortic Valve Replacement: Insights From the STS/ACC TVT Registry. JACC Cardiovasc Interv 2018;11:1175-85.
14. Silver SA, Chertow GM. The Economic Consequences of Acute Kidney Injury. Nephron 2017;137:297-301.
15. Alkhouli M, Alqahtani F, Holmes DR, et al. Racial Disparities in the Utilization and Outcomes of Structural Heart Disease Interventions in the United States. J Am Heart Assoc 2019;8:e012125.
16. Akinseye OA, Shahreyar M, Nwagbara CC, et al. Modifiable Predictors of In-Hospital Mortality in Patients Undergoing Transcatheter Aortic Valve Replacement. Am J Med Sci 2018;356:135-40.