Free access

Dooman Arefan, Matthew Pease, Shawn R. Eagle, David O. Okonkwo, and Shandong Wu

OBJECTIVE

An estimated 1.5 million people die every year worldwide from traumatic brain injury (TBI). Physicians are relatively poor at predicting long-term outcomes early in patients with severe TBI. Machine learning (ML) has shown promise at improving prediction models across a variety of neurological diseases. The authors sought to explore the following: 1) how various ML models performed compared to standard logistic regression techniques, and 2) if properly calibrated ML models could accurately predict outcomes up to 2 years posttrauma.

METHODS

A secondary analysis of a prospectively collected database of patients with severe TBI treated at a single level 1 trauma center between November 2002 and December 2018 was performed. Neurological outcomes were assessed at 3, 6, 12, and 24 months postinjury with the Glasgow Outcome Scale. The authors used ML models including support vector machine, neural network, decision tree, and naïve Bayes models to predict outcome across all 4 time points by using clinical information available on admission, and they compared performance to a logistic regression model. The authors attempted to predict unfavorable versus favorable outcomes (Glasgow Outcome Scale scores of 1–3 vs 4–5), as well as mortality. Models’ performance was evaluated using the area under the receiver operating characteristic (ROC) curve (AUC) with 95% confidence interval and balanced accuracy.
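
The two evaluation metrics named above can be sketched in a few lines. This is a minimal illustration rather than the authors' code: the function names, the toy Glasgow Outcome Scale scores, the predicted risks, and the 0.5 decision threshold are all assumptions made for the example. Only the dichotomization (scores of 1–3 unfavorable, 4–5 favorable) comes from the abstract.

```python
from typing import Sequence


def dichotomize_gos(gos: int) -> int:
    """Return 1 for an unfavorable outcome (GOS 1-3) and 0 for favorable (GOS 4-5)."""
    if gos not in (1, 2, 3, 4, 5):
        raise ValueError("Glasgow Outcome Scale scores run from 1 to 5")
    return 1 if gos <= 3 else 0


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the chance that a random positive case outranks a random negative one.

    Ties count as one half (the Mann-Whitney formulation of the AUC).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def balanced_accuracy(labels: Sequence[int], predictions: Sequence[int]) -> float:
    """Mean of sensitivity and specificity, which is robust to class imbalance."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    return (tp / (tp + fn) + tn / (tn + fp)) / 2


# Toy data invented for illustration; not taken from the study.
gos_scores = [1, 3, 2, 5, 4, 4, 3, 5]
predicted_risk = [0.9, 0.7, 0.4, 0.2, 0.5, 0.1, 0.6, 0.3]

labels = [dichotomize_gos(g) for g in gos_scores]
predictions = [1 if r >= 0.5 else 0 for r in predicted_risk]  # assumed threshold

auc = roc_auc(labels, predicted_risk)
bal_acc = balanced_accuracy(labels, predictions)
```

The AUC uses the continuous risk scores, whereas balanced accuracy depends on the chosen threshold, so the two can rank models differently.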

RESULTS

Of the 599 patients in the database, the authors included 501, 537, 469, and 395 at 3, 6, 12, and 24 months posttrauma, respectively. Across all time points, the AUCs ranged from 0.71 to 0.85 for mortality and from 0.62 to 0.82 for unfavorable outcomes with various modeling strategies. Decision tree models performed worse than all other modeling approaches for multiple time points regarding both unfavorable outcomes and mortality. There were no statistically significant differences between any other models. After proper calibration, the models had little variation (0.02–0.05) across various time points.

CONCLUSIONS

The ML models tested herein performed with equivalent success compared with logistic regression techniques for prognostication in TBI. The TBI prognostication models could predict outcomes beyond 6 months, out to 2 years postinjury.

Free access

Sivaram Emani, Akshay Swaminathan, Ben Grobman, Julia B. Duvall, Ivan Lopez, Omar Arnaout, and Kevin T. Huang

OBJECTIVE

Machine learning (ML) has become an increasingly popular tool for use in neurosurgical research. Interest in the field has grown, and publications have recently expanded significantly in both number and complexity. However, this also places a commensurate burden on the general neurosurgical readership to appraise this literature and decide if these algorithms can be effectively translated into practice. To this end, the authors sought to review the burgeoning neurosurgical ML literature and to develop a checklist to help readers critically review and digest this work.

METHODS

The authors performed a literature search of recent ML papers in the PubMed database with the terms "neurosurgery" AND "machine learning," with additional modifiers "trauma," "cancer," "pediatric," and "spine" also used to ensure a diverse selection of relevant papers within the field. Papers were reviewed for their ML methodology, including the formulation of the clinical problem, data acquisition, data preprocessing, model development, model validation, model performance, and model deployment.

RESULTS

The resulting checklist consists of 14 key questions for critically appraising ML models and development techniques; these are organized according to their timing along the standard ML workflow. In addition, the authors provide an overview of the ML development process, as well as a review of key terms, models, and concepts referenced in the literature.

CONCLUSIONS

ML is poised to become an increasingly important part of neurosurgical research and clinical care. The authors hope that dissemination of education on ML techniques will help neurosurgeons to critically review new research better and more effectively integrate this technology into their practices.

Free access

Shane Shahrestani, Andrew K. Chan, Erica F. Bisson, Mohamad Bydon, Steven D. Glassman, Kevin T. Foley, Christopher I. Shaffrey, Eric A. Potts, Mark E. Shaffrey, Domagoj Coric, John J. Knightly, Paul Park, Michael Y. Wang, Kai-Ming Fu, Jonathan R. Slotkin, Anthony L. Asher, Michael S. Virk, Giorgos D. Michalopoulos, Jian Guan, Regis W. Haid, Nitin Agarwal, Dean Chou, and Praveen V. Mummaneni

OBJECTIVE

Spondylolisthesis is a common operative disease in the United States, but robust predictive models for patient outcomes remain limited. The development of models that accurately predict postoperative outcomes would be useful to help identify patients at risk of complicated postoperative courses and determine appropriate healthcare and resource utilization for patients. As such, the purpose of this study was to develop k-nearest neighbors (KNN) classification algorithms to identify patients at increased risk for extended hospital length of stay (LOS) following neurosurgical intervention for spondylolisthesis.

METHODS

The Quality Outcomes Database (QOD) spondylolisthesis data set was queried for patients receiving either decompression alone or decompression plus fusion for degenerative spondylolisthesis. Preoperative and perioperative variables were queried, and Mann-Whitney U-tests were performed to identify which variables would be included in the machine learning models. Two KNN models were implemented (k = 25) with a standard training set of 60%, validation set of 20%, and testing set of 20%, one with arthrodesis status (model 1) and the other without (model 2). Feature scaling was implemented during the preprocessing stage to standardize the independent features.
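
The pipeline described above (a 60/20/20 split, feature scaling, and a KNN classifier with k = 25) can be sketched with the standard library alone. This is a minimal sketch under stated assumptions, not the authors' implementation: the two features (operative minutes and estimated blood loss), the synthetic rows, the random seeds, and every function name are invented for illustration, and Euclidean distance with a majority vote is assumed because the abstract does not specify the distance metric.

```python
import math
import random
from collections import Counter


def make_toy_rows(n_per_class=50, seed=42):
    """Synthetic (features, label) rows; NOT registry data."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_per_class):
        # Label 0: no extended stay (shorter surgery, less blood loss).
        rows.append(([rng.uniform(90, 140), rng.uniform(100, 300)], 0))
        # Label 1: extended length of stay.
        rows.append(([rng.uniform(250, 300), rng.uniform(600, 800)], 1))
    return rows


def split_60_20_20(rows, seed=0):
    """Shuffle, then cut into training, validation, and testing partitions."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * 6 // 10
    n_val = len(shuffled) * 2 // 10
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])


def fit_scaler(xs):
    """Column means and standard deviations, estimated on training data only."""
    cols = list(zip(*xs))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
           for c, m in zip(cols, means)]
    return means, sds


def transform(x, means, sds):
    """Standardize one feature vector with the training statistics."""
    return [(v - m) / s for v, m, s in zip(x, means, sds)]


def knn_predict(train, query, k):
    """Majority vote among the k training rows nearest to the query."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def accuracy(train, rows, k):
    hits = sum(knn_predict(train, x, k) == y for x, y in rows)
    return hits / len(rows)


rows = make_toy_rows()
train, val, test = split_60_20_20(rows)
means, sds = fit_scaler([x for x, _ in train])


def scaled(part):
    return [(transform(x, means, sds), y) for x, y in part]


train_s, val_s, test_s = scaled(train), scaled(val), scaled(test)
K = 25  # the value reported in the abstract
val_acc = accuracy(train_s, val_s, K)
test_acc = accuracy(train_s, test_s, K)
```

Fitting the scaler on the training partition only, and reusing its statistics for the validation and testing partitions, keeps information from held-out patients out of the preprocessing step. Scaling matters for KNN because unscaled features with large ranges would otherwise dominate the distance.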

RESULTS

Of 608 enrolled patients, 544 met prespecified inclusion criteria. The mean age of all patients was 61.9 ± 12.1 years (± SD), and 309 (56.8%) patients were female. The model 1 KNN had an overall accuracy of 98.1%, sensitivity of 100%, specificity of 84.6%, positive predictive value (PPV) of 97.9%, and negative predictive value (NPV) of 100%. Additionally, a receiver operating characteristic (ROC) curve was plotted for model 1, showing an overall area under the curve (AUC) of 0.998. Model 2 had an overall accuracy of 99.1%, sensitivity of 100%, specificity of 92.3%, PPV of 99.0%, and NPV of 100%, with the same ROC AUC of 0.998.
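
For readers less familiar with the metrics reported above, the following sketch shows how each is derived from a 2 × 2 confusion matrix. The counts are invented for illustration and are not taken from the study, which does not report them in the abstract; the class name is likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives

    @property
    def accuracy(self) -> float:
        total = self.tp + self.fp + self.tn + self.fn
        return (self.tp + self.tn) / total

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    @property
    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    @property
    def npv(self) -> float:
        return self.tn / (self.tn + self.fn)


# Hypothetical, imbalanced test set: 90 positives and 10 negatives.
toy = ConfusionCounts(tp=90, fp=2, tn=8, fn=0)
```

With these made-up counts, accuracy is 0.98 while specificity is only 0.80, because the few negative cases carry little weight in the overall accuracy. This is one reason to read accuracy alongside the class-specific metrics when classes are imbalanced.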

CONCLUSIONS

Overall, these findings demonstrate that nonlinear KNN machine learning models have incredibly high predictive value for LOS. Important predictor variables include diabetes, osteoporosis, socioeconomic quartile, duration of surgery, estimated blood loss during surgery, patient educational status, American Society of Anesthesiologists grade, BMI, insurance status, smoking status, sex, and age. These models may be considered for external validation by spine surgeons to aid in patient selection and management, resource utilization, and preoperative surgical planning.

Free access

Tatsat R. Patel, Aakash Patel, Sricharan S. Veeturi, Munjal Shah, Muhammad Waqas, Andre Monteiro, Ammad A. Baig, Nandor Pinter, Elad I. Levy, Adnan H. Siddiqui, and Vincent M. Tutino

OBJECTIVE

Computed tomography angiography (CTA) is the most widely used imaging modality for intracranial aneurysm (IA) management, yet it remains inferior to digital subtraction angiography (DSA) for IA detection, particularly of small IAs in the cavernous carotid region. The authors evaluated a deep learning pipeline for segmentation of vessels and IAs from CTA using coregistered, segmented DSA images as ground truth.

METHODS

Using 50 paired CTA-DSA images, the authors trained (n = 27), validated (n = 3), and tested (n = 20) a deep learning model (3D DeepMedic) for cerebrovasculature segmentation from CTA. A landmark-based coregistration algorithm was used for registration and upsampling of CTA images to paired DSA images. Segmented vessels from the DSA were used as the ground truth. Accuracy of the model for vessel segmentation was evaluated using conventional metrics (Dice similarity coefficient [DSC]) and vessel segmentation–specific metrics, like connectivity-area-length (CAL). On the test cases (20 IAs), 3 expert raters attempted to detect and segment IAs. For each rater, the authors recorded the rate of IA detection; for detected IAs, raters segmented the aneurysm and calculated important IA morphology parameters so that rater segmentations could be compared with DeepMedic segmentations. The agreement between raters, DeepMedic, and ground truth was assessed using Krippendorff’s alpha.
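
The Dice similarity coefficient mentioned above measures overlap between a predicted mask and a ground-truth mask as 2|A ∩ B| / (|A| + |B|). The sketch below is a minimal illustration on flattened binary masks; the function name, the convention of returning 1.0 when both masks are empty, and the toy masks are assumptions, not details from the study.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice similarity coefficient for two binary masks of equal length.

    Masks are flattened sequences of 0/1 voxel labels.
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    overlap = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    size_sum = sum(pred) + sum(truth)
    if size_sum == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2 * overlap / size_sum


# Toy 8-voxel masks invented for illustration.
predicted_mask = [0, 1, 1, 1, 0, 0, 1, 0]
reference_mask = [0, 1, 1, 0, 0, 1, 1, 0]
dsc = dice_coefficient(predicted_mask, reference_mask)
```

Because the DSC weights every voxel equally, it says little about whether thin vessels stay connected, which is presumably why a vessel-specific metric such as CAL was reported alongside it.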

RESULTS

In testing, the DeepMedic model yielded a CAL of 0.971 ± 0.007 and a DSC of 0.868 ± 0.008. The model prediction delineated all IAs and resulted in average error rates of < 10% for all IA morphometrics. Conversely, average IA detection accuracy by the raters was 0.653 (undetected IAs were present to a significantly greater degree on the ICA, likely due to those in the cavernous region, and were significantly smaller). Error rates for IA morphometrics in rater-segmented cases were significantly higher than in DeepMedic-segmented cases, particularly for neck (p = 0.003) and surface area (p = 0.04). For IA morphology, agreement between the raters was acceptable for most metrics, except for the undulation index (α = 0.36) and the nonsphericity index (α = 0.69). Agreement between DeepMedic and ground truth was consistently higher compared with that between expert raters and ground truth.

CONCLUSIONS

This CTA segmentation network (DeepMedic trained on DSA-segmented vessels) provides a high-fidelity solution for CTA vessel segmentation, particularly for vessels and IAs in the carotid cavernous region.

Free access

Abdul Karim Ghaith, Oluwaseun O. Akinduro, A. Yohan Alexander, Anshit Goyal, Antonio Bon-Nieves, Leonardo de Macedo Filho, Andrea Otamendi-Lopez, Karim Rizwan Nathani, Kingsley Abode-Iyamah, Mark E. Jentoft, Bernard R. Bendok, Michelle J. Clarke, Michael J. Link, Jamie J. Van Gompel, Alfredo Quiñones-Hinojosa, and Mohamad Bydon

OBJECTIVE

Chordomas are rare tumors from notochordal remnants and account for 1%–4% of all primary bone malignancies, often arising from the clivus and sacrum. Despite margin-negative resection and postoperative radiotherapy, chordomas often recur. Further, immunohistochemical (IHC) markers have not been assessed as predictive of chordoma recurrence. The authors aimed to identify the IHC markers that are predictive of postoperative long-term (≥ 1 year) chordoma recurrence by training multiple tree-based machine learning (ML) algorithms.

METHODS

The authors reviewed the records of patients who had undergone treatment for clival and spinal chordomas between January 2017 and June 2021 across the Mayo Clinic enterprise (Minnesota, Florida, and Arizona). Demographics, type of treatment, histopathology, and other relevant clinical factors were abstracted from each patient record. Decision tree and random forest classifiers were trained and tested to predict long-term recurrence based on unseen data using an 80/20 split.

RESULTS

One hundred fifty-one patients diagnosed and treated for chordomas were identified: 58 chordomas of the clivus, 48 chordomas of the mobile spine, and 45 chordomas sacrococcygeal in origin. Patients diagnosed with cervical chordomas were the oldest among all groups (58 ± 14 years, p = 0.009). Most patients were male (n = 91, 60.3%) and White (n = 139, 92.1%). Most patients underwent resection with or without radiation therapy (n = 129, 85.4%). Subtotal resection followed by radiation therapy (n = 51, 33.8%) was the most common treatment modality, followed by gross-total resection then radiation therapy (n = 43, 28.5%). Multivariate analysis showed that S100 and pan-cytokeratin are more likely to predict the increase in the risk of postoperative recurrence (OR 3.67, 95% CI 1.09–12.42, p = 0.03; and OR 3.74, 95% CI 0.05–2.21, p = 0.02, respectively). In the decision tree analysis, a clinical follow-up > 1897 days was found in 37% of encounters and was associated with a 90% chance of being classified as recurrence (accuracy = 77%). Random forest analysis (n = 500 trees) showed that patient age, type of surgical treatment, location of tumor, S100, pan-cytokeratin, and EMA are the factors predicting long-term recurrence.

CONCLUSIONS

The IHC and clinicopathological variables combined with tree-based ML tools successfully demonstrated a high capacity to identify recurrence patterns with an accuracy of 77%. S100, pan-cytokeratin, and EMA were the IHC drivers of recurrence. This shows the power of ML algorithms in analyzing and predicting outcomes of rare conditions with small sample sizes.

Free access

Megan G. Anderson, Dana Jungbauer, Nathan K. Leclair, Edward S. Ahn, Petronella Stoltz, Jonathan E. Martin, David S. Hersh, and Markus J. Bookland

OBJECTIVE

Sagittal craniosynostosis is the most common form of craniosynostosis and typically results in scaphocephaly, which is characterized by biparietal narrowing, compensatory frontal bossing, and an occipital prominence. The cephalic index (CI) is a simple metric for quantifying the degree of cranial narrowing and is often used to diagnose sagittal craniosynostosis. However, patients with variant forms of sagittal craniosynostosis may present with a "normal" CI, depending on the part of the suture that is closed. As machine learning (ML) algorithms are developed to assist in the diagnosis of cranial deformities, metrics that reflect the other phenotypic features of sagittal craniosynostosis are needed. In this study the authors sought to describe the posterior arc angle (PAA), a measurement of biparietal narrowing that is obtained with 2D photographs, and elucidate the role of PAA as an adjunct to the CI in characterizing scaphocephaly and the potential relevance of PAA in new ML model development.

METHODS

The authors retrospectively reviewed 1013 craniofacial patients treated during the period from 2006 to 2021. Orthogonal top-down photographs were used to calculate the CI and PAA. Distribution densities, receiver operating characteristic (ROC) curves, and chi-square analyses were used to describe the relative predictive utility of each method for sagittal craniosynostosis.

RESULTS

In total, 1001 patients underwent paired CI and PAA measurements and a clinical head shape diagnosis (sagittal craniosynostosis, n = 122; other cranial deformity, n = 565; normocephalic, n = 314). The area under the ROC curve (AUC) for the CI was 98.5% (95% confidence interval 97.8%–99.2%, p < 0.001), with an optimum specificity of 92.6% and sensitivity of 93.4%. The PAA had an AUC of 97.4% (95% confidence interval 96.0%–98.8%, p < 0.001) with an optimum specificity of 94.9% and sensitivity of 90.2%. In 6 of 122 (4.9%) cases of sagittal craniosynostosis, the PAA was abnormal while the CI was normal. This means that adding a PAA cutoff branch to a partition model increases the detection of sagittal craniosynostosis.
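
The effect of adding a second branch to a partition model can be illustrated with a toy rule that flags a patient when either the CI or the PAA is abnormal. This is a sketch under stated assumptions, not the authors' model: the CI cutoff of 75, the head measurements, the boolean PAA flags, and all names are hypothetical. The only formula taken as given is the standard definition of the cephalic index (maximum head width divided by maximum head length, times 100).

```python
from typing import List, NamedTuple


class Patient(NamedTuple):
    width_mm: float       # maximum biparietal width
    length_mm: float      # maximum anteroposterior length
    paa_abnormal: bool    # PAA beyond an (unspecified) cutoff
    sagittal: bool        # clinical diagnosis of sagittal craniosynostosis


CI_CUTOFF = 75.0  # hypothetical cutoff; a low CI indicates a long, narrow head


def cephalic_index(p: Patient) -> float:
    return 100.0 * p.width_mm / p.length_mm


def flag_ci_only(p: Patient) -> bool:
    return cephalic_index(p) < CI_CUTOFF


def flag_ci_or_paa(p: Patient) -> bool:
    """Two-branch rule: an abnormal CI, or a normal CI with an abnormal PAA."""
    return flag_ci_only(p) or p.paa_abnormal


def sensitivity(patients: List[Patient], rule) -> float:
    cases = [p for p in patients if p.sagittal]
    return sum(rule(p) for p in cases) / len(cases)


def specificity(patients: List[Patient], rule) -> float:
    controls = [p for p in patients if not p.sagittal]
    return sum(not rule(p) for p in controls) / len(controls)


# Toy cohort invented for illustration; not study data.
cohort = [
    Patient(120, 180, True, True),
    Patient(125, 175, True, True),
    Patient(130, 170, True, True),    # variant case: normal CI, abnormal PAA
    Patient(135, 172, False, True),   # missed by both rules
    Patient(140, 175, False, False),
    Patient(145, 170, False, False),
    Patient(138, 176, True, False),   # false positive from the PAA branch
    Patient(150, 180, False, False),
]

sens_ci = sensitivity(cohort, flag_ci_only)
sens_both = sensitivity(cohort, flag_ci_or_paa)
spec_ci = specificity(cohort, flag_ci_only)
spec_both = specificity(cohort, flag_ci_or_paa)
```

In this toy cohort the extra branch raises sensitivity and lowers specificity, which is the usual trade-off of an either-or rule. Where a real model lands depends on the cutoffs chosen for each measurement.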

CONCLUSIONS

Both CI and PAA are excellent discriminators for sagittal craniosynostosis. Using an accuracy-optimized partition model, the addition of the PAA to the CI increased model sensitivity compared to using the CI alone. Using a model that incorporates both CI and PAA could assist in the early identification and treatment of sagittal craniosynostosis via automated and semiautomated algorithms that utilize tree-based ML models.

Free access

Haosu Zhang, Kartikay Tehlan, Sebastian Ille, Maximilian Schwendner, Zhenyu Gong, Axel Schroeder, Bernhard Meyer, and Sandro M. Krieg

OBJECTIVE

Language-related networks have been recognized in functional maintenance, which has also been considered the mechanism of plasticity and reorganization in patients with cerebral malignant tumors. However, the role of interhemispheric connections (ICs) in language restoration remains unclear at the network level. Navigated transcranial magnetic stimulation (nTMS) and diffusion tensor imaging fiber tracking data were used to identify language-eloquent regions and their corresponding subcortical structures, respectively.

METHODS

Preoperative image–based IC networks and nTMS mapping data were investigated using fully connected layer-based deep learning (FC-DL) analysis to weight ICs in three groups: 30 patients without preoperative or postoperative aphasia (nonaphasia group), 30 patients with preoperative and postoperative aphasia (glioma-induced aphasia [GIA] group), and 30 patients without preoperative aphasia who developed aphasia after the operation (surgery-related aphasia group).

RESULTS

GIA patients had more weighted ICs than the patients in the other groups. Weighted ICs between the left precuneus and right paracentral lobule, and between the left and right cuneus, were significantly different among these three groups. The FC-DL approach for modeling functional and structural connectivity was also tested for its potential to predict postoperative language levels, and both the achieved sensitivity and specificity were greater than 70%. Weighted IC was reorganized more in GIA patients to compensate for language loss.

CONCLUSIONS

The authors’ method offers a new perspective to investigate brain structural organization and predict functional prognosis.

Free access

Mohamad Bydon, John H. Shin, Shelly D. Timmons, and Eric A. Potts

Free access

Andrew Abumoussa, Vivek Gopalakrishnan, Benjamin Succop, Michael Galgano, Sivakumar Jaikumar, Yueh Z. Lee, and Deb A. Bhowmick

OBJECTIVE

The goal of this work was to methodically evaluate, optimize, and validate a self-supervised machine learning algorithm capable of real-time automatic registration and fluoroscopic localization of the spine using a single radiograph or fluoroscopic frame.

METHODS

The authors propose a two-dimensional to three-dimensional (2D-3D) registration algorithm that maximizes an image similarity metric between radiographic images to identify the position of a C-arm relative to a 3D volume. This work utilizes digitally reconstructed radiographs (DRRs), which are synthetic radiographic images generated by simulating the x-ray projections as they would pass through a CT volume. To evaluate the algorithm, the authors used cone-beam CT data for 127 patients obtained from an open-source de-identified registry of cervical, thoracic, and lumbar scans. They systematically evaluated and tuned the algorithm, then quantified the convergence rate of the model by simulating C-arm registrations with 80 randomly simulated DRRs for each CT volume. The endpoints of this study were time to convergence, accuracy of convergence for each of the C-arm’s degrees of freedom, and overall registration accuracy based on a voxel-by-voxel measurement.
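
The core idea above, searching pose parameters for the simulated radiograph that best matches the observed one, can be reduced to a one-parameter toy. Everything in this sketch is an assumption made for illustration: the abstract does not name its similarity metric or optimizer, so normalized cross-correlation and an exhaustive grid search are stand-ins, and `toy_drr` is a hypothetical one-dimensional substitute for a real DRR renderer that would project a CT volume.

```python
import math
from typing import Iterable, List


def toy_drr(offset: float, width: int = 64) -> List[float]:
    """Hypothetical stand-in for a DRR: a 1D intensity profile with one
    Gaussian bump whose position depends on a single pose parameter."""
    return [math.exp(-((i - offset) ** 2) / (2 * 4.0 ** 2)) for i in range(width)]


def ncc(a: List[float], b: List[float]) -> float:
    """Normalized cross-correlation of two equally sized images (1.0 = identical shape)."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    norm_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return num / (norm_a * norm_b)


def register(observed: List[float], candidates: Iterable[float]) -> float:
    """Return the candidate pose whose simulated image is most similar to the observed one."""
    return max(candidates, key=lambda pose: ncc(toy_drr(pose), observed))


# Simulate an "observed" frame at a known pose, then try to recover that pose.
true_offset = 37
observed_frame = toy_drr(true_offset)
recovered_offset = register(observed_frame, range(10, 55))
```

A real 2D-3D registration searches several degrees of freedom at once and cannot afford an exhaustive grid, so the limits on the search space matter for whether the optimizer converges.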

RESULTS

A total of 10,160 unique radiographic images were simulated from 127 CT scans. The algorithm successfully converged to the correct solution 82% of the time with an average of 1.96 seconds of computation. The radiographic images for which the algorithm converged to the solution demonstrated 99.9% registration accuracy despite utilizing only single-precision computation for speed. The algorithm was found to be optimized for convergence when the search space was limited to a ± 45° offset in the right anterior oblique/left anterior oblique, cranial/caudal, and receiver rotation angles with the radiographic isocenter contained within 8000 cm³ of the volumetric center of the CT volume.

CONCLUSIONS

The investigated machine learning algorithm has the potential to aid surgeons in level localization, surgical planning, and intraoperative navigation through a completely automated 2D-3D registration process. Future work will focus on algorithmic optimizations to improve the convergence rate and speed profile.

Free access

Anmol Warman, Anita L. Kalluri, and Tej D. Azad

OBJECTIVE

In recent years, machine learning models for clinical prediction have become increasingly prevalent in the neurosurgical literature. However, little is known about the quality of these models, and their translation to clinical care has been limited. The aim of this systematic review was to empirically determine the adherence of machine learning models in neurosurgery to standard reporting guidelines specific to clinical prediction models.

METHODS

Studies describing the development or validation of machine learning predictive models published between January 1, 2020, and January 10, 2023, across five neurosurgery journals (Journal of Neurosurgery, Journal of Neurosurgery: Spine, Journal of Neurosurgery: Pediatrics, Neurosurgery, and World Neurosurgery) were included. Studies where the TRIPOD (Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis) guidelines were not applicable, radiomic studies, and natural language processing studies were excluded.

RESULTS

Forty-seven studies featuring a machine learning–based predictive model in neurosurgery were included. The majority (53%) of studies were single-center studies, and only 15% of studies externally validated the model in an independent cohort of patients. The median compliance across all 47 studies was 82.1% (IQR 75.9%–85.7%). Giving details of treatment (n = 17 [36%]), including the number of patients with missing data (n = 11 [23%]), and explaining the use of the prediction model (n = 23 [49%]) were identified as the TRIPOD criteria with the lowest rates of compliance.

CONCLUSIONS

Improved adherence to TRIPOD guidelines will increase transparency in neurosurgical machine learning predictive models and streamline their translation into clinical care.