Machine learning ensemble models predict total charges and drivers of cost for transsphenoidal surgery for pituitary tumor

OBJECTIVE

Efficient allocation of resources in the healthcare system enables providers to care for more and needier patients. Identifying drivers of total charges for transsphenoidal surgery (TSS) for pituitary tumors, which are poorly understood, represents an opportunity for neurosurgeons to reduce waste and provide higher-quality care for their patients. In this study the authors used a large, national database to build machine learning (ML) ensembles that directly predict total charges in this patient population. They then interrogated the ensembles to identify variables that predict high charges.

METHODS

The authors created a training data set of 15,487 patients who underwent TSS between 2002 and 2011 and were registered in the National Inpatient Sample (NIS). Thirty-two ML algorithms were trained to predict total charges from 71 collected variables, and the most predictive algorithms were combined to form an ensemble model. The model was internally and externally validated to demonstrate generalizability. Permutation importance and partial dependence analyses were performed to identify the strongest drivers of total charges. Because length of stay (LOS) had an overwhelming influence, a second ensemble that excluded LOS as a predictor was built to identify additional drivers of total charges.
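The two interrogation techniques named above can be illustrated with a minimal sketch. It is not the authors' pipeline: `toy_model`, its coefficients, the three feature names, and the synthetic rows are all hypothetical stand-ins, and mean squared error is used only for simplicity. Permutation importance measures how much the error grows when one variable's values are shuffled across patients; partial dependence averages the prediction after forcing one variable to a fixed value for every patient.

```python
import random
from statistics import mean


def mse(model, rows, targets):
    """Mean squared error of the model's predictions."""
    return mean((model(r) - t) ** 2 for r, t in zip(rows, targets))


def permutation_importance(model, rows, targets, feature, n_repeats=20, seed=0):
    """Average increase in error after shuffling one feature across rows."""
    rng = random.Random(seed)
    baseline = mse(model, rows, targets)
    increases = []
    for _ in range(n_repeats):
        shuffled = [r[feature] for r in rows]
        rng.shuffle(shuffled)
        permuted = [{**r, feature: v} for r, v in zip(rows, shuffled)]
        increases.append(mse(model, permuted, targets) - baseline)
    return mean(increases)


def partial_dependence(model, rows, feature, grid):
    """Mean prediction when `feature` is set to each grid value for every row."""
    return {v: mean(model({**r, feature: v}) for r in rows) for v in grid}


# Hypothetical stand-in for a trained model (NOT the authors' ensemble).
def toy_model(row):
    return 10000 + 5000 * row["los"] + 3000 * row["nonelective"]


# Synthetic rows; "noise" is deliberately ignored by the toy model.
rng = random.Random(1)
rows = [
    {"los": rng.randint(1, 10), "nonelective": rng.randint(0, 1), "noise": rng.random()}
    for _ in range(200)
]
targets = [toy_model(r) for r in rows]

importance = {
    f: permutation_importance(toy_model, rows, targets, f)
    for f in ("los", "nonelective", "noise")
}
# Scale so that the most important variable has the value 1.0.
top = max(importance.values())
relative = {f: v / top for f, v in importance.items()}

pd_los = partial_dependence(toy_model, rows, "los", [1, 2, 3])
```

In this toy setup the ignored `noise` variable receives zero importance, and the partial dependence on `los` rises by the per-day coefficient built into `toy_model`. With a real ensemble, both quantities would be estimated from held-out data.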

RESULTS

An ensemble model comprising 3 gradient boosted tree models best predicted total charges (root mean square logarithmic error [RMSLE] = 0.446; 95% CI 0.439–0.453; holdout RMSLE = 0.455). LOS was by far the strongest predictor of total charges, increasing total predicted charges by approximately $5000 per day.
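For readers unfamiliar with the metric, the sketch below computes a root mean square logarithmic error using the common definition based on log(1 + x). That this exact variant matches the authors' software is an assumption; the function name `rmsle` is illustrative.

```python
import math


def rmsle(predicted, actual):
    """Root mean square logarithmic error, using log(1 + x) on both inputs."""
    if not predicted or len(predicted) != len(actual):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [
        (math.log1p(p) - math.log1p(a)) ** 2 for p, a in zip(predicted, actual)
    ]
    return math.sqrt(sum(squared) / len(squared))
```

Because the error is measured on a log scale, it penalizes relative rather than absolute misses. For large dollar amounts, where the added 1 is negligible, an RMSLE near 0.45 loosely corresponds to predictions that are typically off by a factor of about e^0.45, or roughly 1.6.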

In the absence of LOS, the strongest predictors of total charges were admission type, hospital region, race, any postoperative complication, and hospital ownership type.

CONCLUSIONS

ML ensembles predict total charges for TSS with good fidelity. The authors identified extended LOS, nonelective admission type, non-Southern hospital region, minority race, postoperative complication, and private investor hospital ownership as drivers of total charges and potential targets for cost-lowering interventions.

ABBREVIATIONS: LOS = length of stay; ML = machine learning; NIS = National (Nationwide) Inpatient Sample; RMSLE = root mean square logarithmic error; TSS = transsphenoidal surgery.

Article Information

Correspondence: Whitney E. Muhlestein, Vanderbilt University, Vanderbilt University Medical Center, Nashville, TN. whitney.muhlestein@gmail.com.

INCLUDE WHEN CITING: Published online September 21, 2018; DOI: 10.3171/2018.4.JNS18306.

Disclosures: D.S.A. is an employee of and data scientist at DataRobot, Inc. W.E.M. is married to D.S.A.

© AANS, except where prohibited by US copyright law.

Figures

  • Histogram demonstrating distribution of total charges (in dollars). The y-axis values represent the numbers of admissions.

  • Chart demonstrating changes in length of hospitalization and total charges over the years included in the training database. Length of hospitalization is measured in days, and total charges are measured to the nearest dollar. Solid line denotes length of hospitalization; dashed line denotes total charges.

  • A and B: Lift charts demonstrating graphically the accuracy of predicted total charges relative to actual total charges for each ensemble (with and without LOS as a variable). Predicted total charges are divided into 10 equal bins, or deciles. Mean predicted total charges and mean actual total charges are calculated and plotted for each decile bin. Solid line denotes actual total charges; dashed line denotes predicted total charges.

  • A and B: Permutation importance analyses demonstrating the relative importance of the 5 most influential variables on the predictions of both ensembles. The most important variable is assigned the value “1.0” and all other variables are assigned numerical values based on their importance relative to the most important variable.

  • Partial dependence plots demonstrating the independent impact of individual variables on the ensemble models. Left-side y-axis represents patient incidence for each patient group and corresponds to bars. Right-side y-axis represents predicted total charges and corresponds to round heads. A–E: Graphs depicting variables in ensemble 1 (with LOS). F–J: Graphs illustrating variables in ensemble 2 (without LOS). NFP = not for profit.
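The decile binning behind the lift charts described in the captions above can be sketched as follows. This is an illustrative reading of the caption, not the authors' code: the function name `lift_table`, the choice to split by rank into near-equal bins, and the synthetic numbers are assumptions.

```python
def lift_table(predicted, actual, n_bins=10):
    """Sort cases by predicted value, split them into n_bins near-equal bins,
    and return (bin number, mean predicted, mean actual) for each bin."""
    pairs = sorted(zip(predicted, actual), key=lambda pair: pair[0])
    n = len(pairs)
    if n < n_bins:
        raise ValueError("need at least one case per bin")
    table = []
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        mean_act = sum(a for _, a in chunk) / len(chunk)
        table.append((b + 1, mean_pred, mean_act))
    return table


# Synthetic example: actual values run 10% above the predictions.
predicted = [float(i) for i in range(1, 101)]
actual = [p * 1.1 for p in predicted]
table = lift_table(predicted, actual)
```

Plotting the second and third columns against the bin number gives the two lines of a lift chart; the closer they track each other across deciles, the better calibrated the predictions.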

