Adult spinal deformity (ASD) surgery is demanding for both patient and surgeon, with long operating times, substantial blood loss, and a high complication risk. Postoperatively, patients face a recovery of 6–12 months or longer. Even after a successful recovery, the risk of postoperative complications such as proximal junctional kyphosis (PJK) or proximal junctional failure (PJF), implant failure, rod fracture, or reoperation lingers. Prior studies have estimated the total risk of reoperation after ASD surgery to be as high as 52.6%.1 Before any ASD surgery, a thorough discussion of potential complications and their associated risks is paramount.
The Scoliosis Research Society (SRS)–Schwab ASD Classification provides ideal spinal alignment values for the deformity surgeon to aim for, which include the sagittal vertical axis (SVA), pelvic incidence (PI), sacral slope (SS), L1–S1 lordosis, and pelvic tilt (PT).1–3 Although previous scoring systems have identified specific targets for correction, shortcomings include usage of purely numerical values that incompletely account for PI magnitude, no assessment of lordosis distribution, the inability to account for negative balance, and inaccurate global alignment masked by pelvic retroversion.1–3 Moreover, standardized definitions are lacking for “normal” and “pathologic” alignment, and how this may correlate with both radiographic and clinical outcomes. Drs. Yilgor and Alanay, leaders among the European Spine Study Group, developed a scoring system to predict the risk of mechanical complications after ASD surgery known as the Global Alignment and Proportion (GAP) score. The GAP score stratifies patients into one of three spinopelvic states: 1) proportioned (P), 2) moderately disproportioned (MD), and 3) severely disproportioned (SD), based on 5 parameters: 1) relative pelvic version (measured minus the ideal SS); 2) relative lumbar lordosis (measured minus the ideal lumbar lordosis); 3) lordosis distribution index (L4–S1 lordosis divided by the L1–S1 lordosis multiplied by 100); 4) relative spinopelvic alignment (measured minus ideal global tilt [GT]); and 5) an age factor.4 In their original report, the GAP score produced an excellent ability to predict mechanical complications, with an area under the curve (AUC) of 0.92. Those with a proportioned spinopelvic state (i.e., P group) had a 6% mechanical complication rate, whereas patients in the MD and SD groups had 47% and 95% mechanical complication rates, respectively.2
As with any new scoring system, external validation is required to ensure maximum generalizability and validity in all ASD populations. Since its initial publication, several groups have produced mixed results replicating the GAP score’s ability to predict mechanical complications. Whereas some groups have shown good correlation between the GAP score and predicting clinical and radiographic outcomes,5,6 others have reported weak predictive value.7
The current objective was to examine the ability of the GAP score to predict mechanical complications after ASD surgery at a high-volume single center for spinal deformity surgery. We hypothesized that the GAP score would accurately predict the risk of mechanical complications after ASD surgery at our center.
Methods
Study Design
This study was a retrospective analysis of consecutive surgical patients treated at a single center by a single deformity surgeon with at least 2 years of follow-up or mechanical failure at any time point. Following institutional review board approval, data were collected on spinal deformity operations that were performed between June 2015 and December 2018 by a single surgeon (L.G.L.).
Patient Population
Preoperative deformity criteria for enrollment were identical to those of the original GAP study—age > 18 years and at least one of the following: coronal Cobb angle > 20°, SVA > 5 cm, PT > 25°, or thoracic kyphosis > 60°.4 Operative requirements for study inclusion were also identical to the original GAP study: > 4 levels of posterior instrumented fusion, ≥ 2 years of follow-up, and radiographs at predetermined set points for review (immediately postoperatively, at 6 weeks, and at 2 years).
Data Collection
Demographic data collected included age at time of surgery, last date of follow-up, sex, diagnosis, and history of prior spine surgery (decompression, fusion, or both). Operative data included levels instrumented, 3-column osteotomies (3COs), and pelvic fixation. Radiographic data were collected at 2 time points—preoperatively and 6 weeks postoperatively—and included PI, SS, L1–S1 lordosis, L4–S1 lordosis, and GT. All radiographic measurements were performed with a validated image system by a neurosurgical spinal deformity fellow familiar with spinal measurements.
Outcome data collected included any mechanical complication, which included PJK (defined as > 10° of kyphosis between the upper instrumented vertebra [UIV] and UIV + 2 between early postoperative and later follow-up); PJF (fracture of UIV or UIV + 1, pullout of instrumentation at UIV, and/or sagittal subluxation); distal junctional kyphosis (DJK) or distal junctional failure (DJF) (> 10° postoperative increase in kyphosis between the lower instrumented vertebra [LIV] and LIV − 1 and/or pullout of instrumentation at LIV); rod breakage (single or double); or other implant-related complication (screw loosening, breakage, or pullout of interbody graft, hook, or set screw).
GAP Score Calculation
In addition to the previously described raw data, additional “ideal” values were calculated to produce a final GAP score in keeping with the original methodology as follows: ideal SS (0.59 × PI + 9); ideal lumbar lordosis (0.62 × PI + 29); and ideal GT (0.48 × PI − 15).4 Both the patient’s raw values and the calculated ideal values were then used to calculate the 5 components of the GAP score.
1. Relative pelvic version was measured minus the ideal SS: < −15° was severe retroversion (3 points), −15° to −7.1° was moderate retroversion (2 points), −7° to 5° was aligned (0 points), and > 5° was anteversion (1 point).
2. Relative lumbar lordosis was measured minus the ideal lumbar lordosis: < −25° was severe hypolordosis (3 points), −25° to −14.1° was moderate hypolordosis (2 points), −14° to 11° was aligned (0 points), and > 11° was hyperlordosis (3 points).
3. The lordosis distribution index was L4–S1 lordosis divided by L1–S1 lordosis, multiplied by 100. An index < 40% was severe hypolordotic maldistribution (2 points), 40%–49% was moderate hypolordotic maldistribution (1 point), 50%–80% was aligned (0 points), and > 80% was hyperlordotic maldistribution (3 points).
4. Relative spinopelvic alignment was measured minus ideal GT, a parameter that measures spinal alignment and pelvic compensation not affected by changes in patient positioning that correlates well with quality of life domains.8 Relative spinopelvic alignment > 18° was severe positive malalignment (3 points), 10.1° to 18° was moderate positive malalignment (1 point), 10° to −7° was aligned (0 points), and < −7° was negative malalignment (1 point).
5. Age was dichotomized as adults < 60 years (0 points) and elderly adults ≥ 60 years (1 point).
These 5 values were used to categorize patients into one of three spinopelvic groups in accordance with the original paper: P, for total score 0–2; MD, for total score 3–6; or SD, for total score ≥ 7 (Fig. 1).
Fig. 1. Chart showing how to calculate the GAP score as defined by the original study (Yilgor et al.4). Figure is available in color online only.
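The scoring rules described above can be sketched in code. The following is a minimal Python sketch under stated assumptions, not the authors' implementation: the names `Measurements` and `gap_score` are hypothetical, the example measurements are made up, and any value that falls between two published bands (for example, a lordosis distribution index between 49% and 50%) is assigned to the moderate band.

```python
from dataclasses import dataclass


@dataclass
class Measurements:
    """Hypothetical container: angles in degrees, age in years."""
    pi: float      # pelvic incidence
    ss: float      # sacral slope
    l1_s1: float   # L1-S1 lordosis
    l4_s1: float   # L4-S1 lordosis
    gt: float      # global tilt
    age: float


def gap_score(m):
    """Return (total score, group) following the bands listed in the text."""
    if m.l1_s1 == 0:
        raise ValueError("L1-S1 lordosis of 0 leaves the distribution index undefined")

    # "Ideal" values computed from pelvic incidence.
    ideal_ss = 0.59 * m.pi + 9
    ideal_ll = 0.62 * m.pi + 29
    ideal_gt = 0.48 * m.pi - 15

    rpv = m.ss - ideal_ss           # relative pelvic version
    rll = m.l1_s1 - ideal_ll        # relative lumbar lordosis
    ldi = m.l4_s1 / m.l1_s1 * 100   # lordosis distribution index, in percent
    rsa = m.gt - ideal_gt           # relative spinopelvic alignment

    # Assumption: values between two published bands go to the moderate band.
    rpv_pts = 3 if rpv < -15 else 2 if rpv < -7 else 0 if rpv <= 5 else 1
    rll_pts = 3 if rll < -25 else 2 if rll < -14 else 0 if rll <= 11 else 3
    ldi_pts = 2 if ldi < 40 else 1 if ldi < 50 else 0 if ldi <= 80 else 3
    rsa_pts = 3 if rsa > 18 else 1 if rsa > 10 else 0 if rsa >= -7 else 1
    age_pts = 1 if m.age >= 60 else 0

    total = rpv_pts + rll_pts + ldi_pts + rsa_pts + age_pts
    group = "P" if total <= 2 else "MD" if total <= 6 else "SD"
    return total, group


# Made-up measurements, for illustration only (not study data).
aligned = Measurements(pi=50, ss=38.5, l1_s1=60, l4_s1=36, gt=9, age=45)
severe = Measurements(pi=60, ss=20, l1_s1=30, l4_s1=10, gt=40, age=70)
print(gap_score(aligned))
print(gap_score(severe))
```

Keeping the band cutoffs in one place makes it easy to see that the maximum possible total is 13 (3 + 3 + 3 + 3 + 1), which matches the upper end of the score range reported in this cohort.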
Statistical Analysis
All study data were collected and managed using the Research Electronic Data Capture (REDCap) tool at Columbia University Medical Center. REDCap is a secure, web-based application designed to support data capture for research studies, providing the following: 1) an intuitive interface for validated data entry; 2) audit trails for tracking data manipulation and export procedures; 3) automated export procedures for seamless data downloads to common statistical packages; and 4) procedures for importing data from external sources.9 Descriptive statistical analysis was performed, which included the mean ± standard deviation for continuous variables and the number and percentage for count variables. After calculation of each GAP score, patients were categorized into one of three groups (P, MD, SD). A chi-square analysis was performed to compare rates of mechanical complications between each group. In addition, a receiver operating characteristic curve was used to assess the ability of the GAP score to predict mechanical complications with an AUC calculation. The Cochran-Armitage test was used to test a linear association. Statistical significance was set at p < 0.05. All analyses were performed using R studio version 2.14.0.
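The three tests named above can be illustrated with a short, dependency-free Python sketch. This is not the authors' R code: the function names are hypothetical, the counts and scores are invented for illustration, the chi-square p-value shortcut holds only for 2 degrees of freedom (three groups), and the Cochran-Armitage variance shown is one common form (some software uses a slightly different denominator).

```python
import math


def chi_square_3x2(events, totals):
    """Pearson chi-square for three groups x (event, no event); df = 2."""
    n = sum(totals)
    rate = sum(events) / n
    stat = 0.0
    for e, t in zip(events, totals):
        exp_event = t * rate
        exp_none = t * (1 - rate)
        stat += (e - exp_event) ** 2 / exp_event
        stat += ((t - e) - exp_none) ** 2 / exp_none
    p = math.exp(-stat / 2)  # chi-square survival function, valid for df = 2 only
    return stat, p


def cochran_armitage(events, totals, scores=(0, 1, 2)):
    """Trend test across ordered groups; returns (z, two-sided p)."""
    n = sum(totals)
    rate = sum(events) / n
    t_stat = sum(s * (e - t * rate) for s, e, t in zip(scores, events, totals))
    s1 = sum(t * s for s, t in zip(scores, totals))
    s2 = sum(t * s * s for s, t in zip(scores, totals))
    var = rate * (1 - rate) * (s2 - s1 ** 2 / n)
    z = t_stat / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return z, p


def auc(scores, outcomes):
    """AUC as the chance a random case outscores a random non-case (ties = 0.5)."""
    pos = [s for s, y in zip(scores, outcomes) if y]
    neg = [s for s, y in zip(scores, outcomes) if not y]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical counts for the P, MD, and SD groups (not the study data).
events = [2, 5, 8]
totals = [20, 20, 20]
chi2, p_chi2 = chi_square_3x2(events, totals)
z, p_trend = cochran_armitage(events, totals)

# Hypothetical GAP scores and complication flags (1 = mechanical complication).
example_auc = auc([1, 4, 6, 9, 11], [0, 0, 1, 0, 1])
print(chi2, p_chi2, z, p_trend, example_auc)
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, which is the scale on which the values reported in this study and in the original GAP report are read.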
Results
Demographics and Operative Data
A total of 67 patients treated between 2015 and 2017 met inclusion criteria and were included in the final analysis (Table 1). Fifty patients (74.6%) were female, and the mean patient age was 52.5 years (range 18–75 years). Thirty-one patients (46%) were ≥ 60 years old. The mean follow-up was 2.0 years (range 0.1–3.3 years). Patients with < 2 years of follow-up were included only if they had an early mechanical complication. The mean number of levels instrumented was 14.7, and 22% of patients underwent a 3CO.
Table 1. Preoperative, intraoperative, and postoperative variables in 67 patients with ASD
Variable | Value |
---|---|
No. of pts | 67 |
Preop | |
Age in yrs, mean (range) | 52.5 (18–75) |
Follow-up in yrs, mean (range) | 2.0 (0.1–3.3) |
Male, no. (%) | 17 (25.4%) |
Prior surgery, no. (%) | 37 (55%) |
Decompression | 30 (44.8%) |
Fusion | 36 (53.7%) |
Diagnosis, no. (%) | |
Idiopathic | 37 (55%) |
Degenerative | 18 (27%) |
Neuromuscular | 7 (10%) |
Congenital | 2 (3%) |
Scheuermann’s kyphosis | 1 (2%) |
Posttraumatic | 2 (3%) |
Intraop | |
Levels instrumented, mean ± standard deviation | 14.7 ± 5.2 |
3CO, no. (%) | 15 (22%) |
Pelvic fixation, no. (%) | 55 (82%) |
Postop | |
PI, mean (range) | 56.2° (27.1–95.4°) |
SS, mean (range) | 34.7° (11.6–67.0°) |
L1–S1 lordosis, mean (range) | 48.6° (6.1–83.6°) |
L4–S1 lordosis, mean (range) | 36.6° (0–65.1°) |
GT, mean (range) | 21.6° (0.8–43.4°) |
GAP score, mean (range) | 4.8 (0–13) |
GAP groups, no. (%) | |
P | 21 (31.3%) |
MD | 23 (34.3%) |
SD | 23 (34.3%) |
Pts = patients.
Postoperative Data
All radiographs were measured at 6 weeks postoperatively (Table 2). The mean PI was 56.2° (range 27.1°–95.4°), the mean SS was 34.7° (range 11.6°–67.0°), the mean L1–S1 lordosis was 48.6° (range 6.1°–83.6°), the mean L4–S1 lordosis was 36.6° (range 0°–65.1°), and the mean GT was 21.6° (range 0.8°–43.4°). The mean GAP score was 4.8 (range 0–13). Categorizing patients using the GAP scores gave the following results: P, 21 patients (31.3%); MD, 23 patients (34.3%); and SD, 23 patients (34.3%).
Table 2. Lumbosacral measurements at 6 weeks postoperatively in 67 patients with ASD
Postop Measurement | Value |
---|---|
PI, mean (range) | 56.2° (27.1–95.4°) |
SS, mean (range) | 34.7° (11.6–67.0°) |
L1–S1 lordosis, mean (range) | 48.6° (6.1–83.6°) |
L4–S1 lordosis, mean (range) | 36.6° (0–65.1°) |
GT, mean (range) | 21.6° (0.8–43.4°) |
GAP score, mean (range) | 4.8 (0–13) |
GAP groups, no. (%) | |
P | 21 (31.3%) |
MD | 23 (34.3%) |
SD | 23 (34.3%) |
Mechanical Complications
Twenty patients (29.8%) had a mechanical complication during the study period with the following breakdown: 11 PJK, 11 rod fractures, 3 pseudarthroses, 2 DJK, and 1 PJF (Table 3). Thirteen patients (19.4%) required surgical revision. The rate of mechanical complications for each group was as follows: P, 19.0%; MD, 30.4%; and SD, 39.1% (Fig. 2). There was no statistically significant difference in complication rates among the three groups (χ2 = 1.70, p = 0.19). The Cochran-Armitage test showed no significant linear trend (p = 0.10) among the three groups. Assessing the GAP score’s ability to accurately predict the occurrence of a mechanical complication, the AUC was 0.621 (Fig. 3), indicating poor discriminative ability of the GAP score. Examples of patients with GAP scores incongruent with their postoperative course are seen in Figs. 4 and 5.
Table 3. Mechanical complications in 20 patients with ASD
Variable | Value |
---|---|
No. of pts | 20 |
Mechanical complication, no. (%) | 20 (29.8%) |
PJK | 11 |
Rod fracture | 11 |
Pseudarthrosis | 3 |
DJK | 2 |
PJF | 1 |
Reop, no. (%) | |
Entire cohort | 13 (19.4%) |
Pts w/ mechanical complications | 13 (65.0%) |
Fig. 2. Rate of mechanical complications by study.
Fig. 3. Receiver operating characteristic curve of the GAP score’s ability to predict mechanical complications (AUC = 0.621). Figure is available in color online only.
Fig. 4. Radiographs obtained in a 69-year-old woman with a long-standing primary adult idiopathic deformity with degenerative changes and lumbar stenosis with a low GAP score of 3 in the P group at 6 weeks postoperatively and with PJK noted at 6 months postoperatively but not requiring revision thus far. TLIF = transforaminal lumbar interbody fusion.
Fig. 5. Radiographs obtained in a 58-year-old woman with a prior L4–5 PSF presenting with severe fixed coronal and sagittal imbalance with a high GAP score of 10 in the SD group at 6 weeks postoperatively, with no complications at 2 years postoperatively. PSO = pedicle subtraction osteotomy.
Discussion
The current objective was to validate the GAP score’s ability to predict mechanical complications after ASD surgery. As prior studies have shown,6 we hypothesized that the GAP score would accurately predict the risk of mechanical complications after ASD surgery. However, these data suggest that the GAP score poorly discriminated between patients who sustained mechanical complications and those who did not. The GAP score may underestimate mechanical complications in patients with a P score and overestimate the risk in patients with an MD or SD score.
The literature reveals a mixed ability of the GAP score to predict mechanical complications. Jacobs et al.6 performed a two-center, retrospective cohort study of 39 patients and concluded that both the GAP score and the Schwab-SRS classification were capable of predicting radiographic evidence of mechanical failure, yet the GAP score performed better with higher correlation.3 Another single-center study performed by Bari and colleagues7 attempted to validate the GAP score with their retrospective cohort of 149 consecutive patients who underwent deformity correction. With an overall mechanical complication rate of 51% and reoperation rate of 35%, the authors reported an AUC of 0.50 and no linear association between GAP score and occurrence of mechanical failure (p = 0.28) or revision surgery (p = 0.58). Similar to our cohort, they noted a more heterogeneous study population and several key methodological differences with the original GAP study. Ohba et al.5 examined their cohort of 128 patients treated by two surgeons at a single center. Unfortunately, an all-encompassing outcome of mechanical complication was not used; however, rod fracture rates were similar across all three GAP groups (P, 18.5%; MD, 13%; SD, 18.9%). That said, the authors found strong correlations between total GAP score and Oswestry Disability Index and increased proximal junctional angle 2 years after surgery, suggesting that one element of the GAP score (GT) had good predictive power, but not the entire score.
There are several differences between the current study cohort and the original GAP cohort that may account for the poor external validation. First, our mean follow-up of 2.0 years was shorter than the 2.4 years in the original GAP study, and it is possible that some complications in that study occurred during the longer follow-up. In a review of 643 patients undergoing ASD surgery, Pichelmann et al.10 cited a revision rate of 9.0%, in which 29.3% of the revision surgeries occurred between 2 and 5 years postoperatively, 12.1% occurred between 5 and 10 years postoperatively, and 13.8% (8/58) occurred > 10 years postoperatively. It is conceivable that some later complications were missed; however, in comparison to the original GAP study, our follow-up was only 5 months shorter. Second, the original GAP cohort was heterogeneous, involving many surgeons, several centers, and varying intraoperative and postoperative practices. Our center is a tertiary and quaternary referral center specifically for spinal deformity, leading to a potentially more homogeneous cohort of patients who have self-selected to undergo surgery at a single center from a group of three similarly trained surgeons. To this end, the diagnoses differed heavily between cohorts. Our cohort had 55% idiopathic and 27% degenerative deformities, compared with 37% and 43%, respectively, in the original derivation cohort. Third, the type of surgical approach and implants may have differed. All our cases were done posteriorly with a mean of 14.7 levels instrumented. Nearly all of our surgeons used the same implants and construct designs, including multirod constructs and large-diameter pedicle screws and rods, which all introduce further selection bias. Unfortunately, the original GAP paper does not mention the type of surgical approach or any surgical details. However, two of its figures showcase an L1 to sacrum fusion and a T12 to sacrum fusion, both substantially shorter than our mean of 14.7 instrumented levels. These important potential differences may explain our disparate results.
Despite clear benefits of the GAP score, these results raise the possibility that certain unmeasured elements contribute to mechanical complications. Several patient characteristics were not considered in the original score, such as osteoporosis, the frailty score, and other comorbidities. Noh and colleagues11 created a modified GAPB (GAP + BMI + bone mineral density) score that accounted for patient BMI and bone mineral density and found that the GAPB score had the highest AUC (0.885), followed by the original GAP score (0.798), age-adjusted alignment goals (0.568), and SRS-Schwab classification (0.532). Given the differences in the length of our constructs, it is conceivable that these patient-specific factors become more important when the extent of surgery increases. Additionally, our high proportion of patients with adult idiopathic scoliosis, in whom coronal balance often outweighs the importance of sagittal correction, may require different criteria to predict mechanical failures. Despite the simplicity and usability of the GAP score, it is possible that more complex and robust models that account for many preoperative variables may have superior predictive capabilities. Scheer et al.12 reported an AUC of 0.89 for a model predicting perioperative complications; its predictors included pain levels, osteoporosis, a comorbidity index, and several patient-reported outcomes. However, the outcome in this more complex model was all complications, not solely mechanical complications.
The current study is not without limitation. As always with a retrospective analysis, there is an inherent selection bias and a prospective analysis for validation is warranted. Our cohort also had a relatively small sample size compared with larger studies, but had a very high mean number of instrumented/fused levels of 14.7, highlighting a potential source of variability among an already heterogeneous diagnosis of ASD. Furthermore, these results represent the practice of one surgeon rather than a diverse group of surgeons, and so may not be highly generalizable, although the single-surgeon aspect of this analysis decreases the variability in treatment philosophies. While all measurements were taken by an experienced research team familiar with spine surgery, reliability calculations were not done. Future studies should aim to establish intraobserver and interobserver reliability. Furthermore, the slightly shorter follow-up time versus the original study may have underestimated the true prevalence of mechanical complications within our cohort of patients with spinal deformity, although all patients without a mechanical complication were followed for a minimum of 2 years postoperatively. Last, there may be patient-specific differences between European and North American spinopelvic parameters that make the derivation of a predictive score different in each of these patient populations. To definitively determine whether this score has external validity, a multicenter North American validation study is necessary to ensure appropriate control for the bias observed in our cohort.
Conclusions
This single-center, single-surgeon study of consecutive surgically treated patients with ASD found that the GAP score poorly discriminated between patients who sustained a mechanical complication and those who did not. Several reasons exist for the poor external validity, including varying diagnoses, patient differences, varying surgical interventions, and institution- and surgeon-specific factors. Future studies including alternate variables in addition to lumbopelvic parameters may more accurately predict mechanical complications after ASD surgery. Additionally, separate predictive scores for idiopathic versus degenerative deformity pathologies may be warranted.
Disclosures
Dr. Baum reports being a consultant for Stryker. Dr. Sardar reports receiving honoraria from Medtronic and Stryker Spine. Dr. Lenke reports being a paid consultant for Medtronic (monies donated to a charitable foundation), EOS Technologies, and Acuity Surgical; receiving royalties from Medtronic and Quality Medical Publishing; receiving reimbursement for airfare/hotel from Broadwater, the Seattle Science Foundation, the Scoliosis Research Society, Stryker Spine, the Spinal Research Foundation, and AO Spine; receiving grant support (monies to institution) from the Scoliosis Research Society, EOS, the Setting Scoliosis Straight Foundation, and AO Spine; being an expert witness in a patent infringement case for Fox Rothschild; receiving philanthropic research funding from a grateful patient/family (Evans Family donation and Fox Family Foundation); and receiving fellowship support to his institution from AO Spine.
Author Contributions
Conception and design: Cerpa, Baum, Ha, Lenke. Acquisition of data: Cerpa, Baum, Ha, Zuckerman, Lin, Menger, Osorio, Morr, Leung. Analysis and interpretation of data: Cerpa, Baum, Ha, Zuckerman, Menger, Osorio, Morr, Leung, Lenke. Drafting the article: Baum, Ha, Zuckerman, Lin, Menger, Osorio, Morr, Leung. Critically revising the article: all authors. Reviewed submitted version of manuscript: all authors. Approved the final version of the manuscript on behalf of all authors: Cerpa. Statistical analysis: Cerpa, Baum. Administrative/technical/material support: Cerpa, Lehman, Sardar, Lenke. Study supervision: Lehman, Sardar, Lenke.
Supplemental Information
Previous Presentations
This abstract was presented at the Scoliosis Research Society 2019 Annual Meeting, Montreal, Quebec, Canada, September 18–21, 2019.
References
1. Schwab F, Patel A, Ungar B, et al. Adult spinal deformity-postoperative standing imbalance: how much can you tolerate? An overview of key parameters in assessing alignment and planning corrective surgery. Spine (Phila Pa 1976). 2010;35(25):2224–2231.
2. Schwab FJ, Blondel B, Bess S, et al. Radiographical spinopelvic parameters and disability in the setting of adult spinal deformity: a prospective multicenter analysis. Spine (Phila Pa 1976). 2013;38(13):E803–E812.
3. Schwab F, Ungar B, Blondel B, et al. Scoliosis Research Society–Schwab adult spinal deformity classification: a validation study. Spine (Phila Pa 1976). 2012;37(12):1077–1082.
4. Yilgor C, Sogunmez N, Boissiere L, et al. Global Alignment and Proportion (GAP) score: development and validation of a new method of analyzing spinopelvic alignment to predict mechanical complications after adult spinal deformity surgery. J Bone Joint Surg Am. 2017;99(19):1661–1672.
5. Ohba T, Ebata S, Oba H, et al. Predictors of poor global alignment and proportion score after surgery for adult spinal deformity. Spine (Phila Pa 1976). 2019;44(19):E1136–E1143.
6. Jacobs E, van Royen BJ, van Kuijk SMJ, et al. Prediction of mechanical complications in adult spinal deformity surgery—the GAP score versus the Schwab classification. Spine J. 2019;19(5):781–788.
7. Bari TJ, Ohrt-Nissen S, Hansen LV, et al. Ability of the global alignment and proportion score to predict mechanical failure following adult spinal deformity surgery—validation in 149 patients with two-year follow-up. Spine Deform. 2019;7(2):331–337.
8. Obeid I, Boissière L, Yilgor C, et al. Global tilt: a single parameter incorporating spinal and pelvic sagittal parameters and least affected by patient positioning. Eur Spine J. 2016;25(11):3644–3649.
9. Harris PA, Taylor R, Thielke R, et al. Research electronic data capture (REDCap)—a metadata-driven methodology and workflow process for providing translational research informatics support. J Biomed Inform. 2009;42(2):377–381.
10. Pichelmann MA, Lenke LG, Bridwell KH, et al. Revision rates following primary adult spinal deformity surgery: six hundred forty-three consecutive patients followed-up to twenty-two years postoperative. Spine (Phila Pa 1976). 2010;35(2):219–226.
11. Noh SH, Ha Y, Obeid I, et al. Modified global alignment and proportion scoring with body mass index and bone mineral density (GAPB) for improving predictions of mechanical complications after adult spinal deformity surgery. Spine J. 2020;20(5):776–784.
12. Scheer JK, Smith JS, Schwab F, et al. Development of a preoperative predictive model for major complications following adult spinal deformity surgery. J Neurosurg Spine. 2017;26(6):736–743.