Learning curves in robot-assisted spine surgery: a systematic review and proposal of application to residency curricula

OBJECTIVE Spine robots have seen increased utilization over the past half decade with the introduction of multiple new systems. Market research expects this expansion to continue over the next half decade at an annual rate of 20%. However, because of the novelty of these devices, there is limited literature on their learning curves and how they should be integrated into residency curricula. With the present review, the authors aimed to address these two points.
METHODS A systematic review of the published English-language literature on PubMed, Ovid, Scopus, and Web of Science was conducted to identify studies describing the learning curve in spine robotics. Included articles described clinical results in patients using one of the following endpoints: operative time, screw placement time, fluoroscopy usage, and instrumentation accuracy. Systems examined included the Mazor series, the ExcelsiusGPS, and the TiRobot. Learning curves were reported in a qualitative synthesis, given as the mean improvement in the endpoint per case performed or screw placed where possible. All studies were level IV case series with a high risk of reporting bias.
RESULTS Of 1579 unique articles, 97 underwent full-text review and 21 met the inclusion and exclusion criteria; 62 articles were excluded for not presenting primary data for one of the above-described endpoints. Of the 21 articles, 18 noted the presence of a learning curve in spine robots, which ranged from 3 to 30 cases or 15 to 62 screws. Only 12 articles performed regressions of one of the endpoints (most commonly operative time) as a function of screws placed or cases performed. Among these, increasing experience was associated with a 0.24- to 4.6-minute decrease in operative time per case performed. All but one series described the experience of attending surgeons, not residents.
CONCLUSIONS Most studies of learning curves with spine robots have found them to be present, with the most common threshold being 20 to 30 cases performed. Unfortunately, all available evidence is level IV data, limited to case series. Given the ability of residency to allow trainees to safely perform these cases under the supervision of experienced senior surgeons, it is argued that a curriculum should be developed for senior-level residents specializing in spine comprising a minimum of 30 performed cases.

surgery, including several meta-analyses 4,5 and one small randomized, unblinded prospective trial. 6 The identified benefits have included increased instrumentation accuracy, 5,7 reduced radiation dosage, 7,8 reduced blood loss, 9 and a shorter hospital length of stay. 9 However, the heterogeneity of the published series makes it unclear as to whether the results of robot-assisted spine surgery are dependent on the experience level of the user. It is known from the orthopedic arthroplasty 10 and general surgery literature 11 that there exists a learning curve, a certain minimum number of cases that must be performed for the user to become proficient. Systematic reviews of the robotic laparoscopy and arthroplasty literature have suggested that the number to achieve baseline proficiency is 15 to 35 cases. 10,11 To this end, many authors describing their experience with spine robots have similarly documented the presence of a learning curve. 9,12-25 This is important, as the rates of complication may be higher early on in the learning curve and may actually make robotic surgery less safe than more conventional means as surgeons and trainees master these assistive devices.
Given the potential for robotic systems to be less efficient and less effective, at least early in the experience, the question arises whether the teaching of robotic systems should be integrated within neurosurgery residency curricula. The objectives of the present review are to summarize the literature on learning curves in spine robotics and to propose a means by which the teaching of spinal robotics may be integrated within current residency curricula.

Methods
To identify the current evidence for learning curves in spine robotic surgery, a systematic review of the literature was performed on July 22, 2021. Databases queried were PubMed/MEDLINE, Ovid MEDLINE, Web of Science, and Scopus. The bibliographies of included articles were queried for additional relevant references. Search queries are listed in Table 1. Articles were included if they described primary data from a cohort, trial, or series examining spine surgical robots used to place pedicle screws or other spine instrumentation in patients. To be included, articles must have presented data on one of the following endpoints as a surrogate of improvement in proficiency with the device: pedicle screw accuracy (graded or rate of misplacement), pedicle screw placement time, robot registration time, fluoroscopy utilization (time or radiation delivered), or operative time. Studies using any of the commercially approved robots (ExcelsiusGPS, Mazor SpineAssist, Mazor Renaissance, Mazor X, Mazor X Stealth, TiRobot, ROSA ONE Spine, and Cirq) were included. Articles were excluded if they did not present primary data (i.e., were case reports, reviews, commentaries, letters to the editor, or methods descriptions), if they described the use of a non-spine robot (e.g., DaVinci, Intuitive Surgical), if they studied learning outside the clinical setting (e.g., in cadavers, polyurethane models, or animals), or if they studied robot skills learning in non-spine surgery.
Articles were screened independently by two reviewers for inclusion and exclusion criteria according to Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines (Fig. 1). In cases of disagreement, a third reviewer was recruited to resolve the disagreement. Data extraction was then performed by the two reviewers with additional review by the third, independent reviewer. As the primary endpoint of the study was the presence or absence of a learning curve (i.e., a difference in the examined outcomes as a function of screws placed or cases performed) and none of the series provided patient-level data, a pooled analysis was not possible. Consequently, only a qualitative analysis of the gathered data is presented. Gathered endpoints were summarized as ranges and the number of studies reporting each outcome. All included studies were classified as level IV therapeutic studies according to the North American Spine Society levels of evidence guidelines, 26 and were at high risk of reporting bias.
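To make the working definition of a learning curve concrete (a change in an endpoint as a function of cases performed), the following minimal Python sketch illustrates two ways such a change might be quantified: a least-squares slope of the endpoint against case number, and a dichotomous comparison of early versus late cases around a predetermined threshold. This is an illustration only and not the method of any included study; the function names, the threshold of 5 cases, and the operative times are hypothetical and are not drawn from the reviewed series.

```python
from statistics import mean


def slope_per_case(times):
    """Ordinary least-squares slope of an endpoint against case number (1, 2, 3, ...)."""
    cases = range(1, len(times) + 1)
    x_bar = mean(cases)
    y_bar = mean(times)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(cases, times))
    sxx = sum((x - x_bar) ** 2 for x in cases)
    return sxy / sxx


def early_vs_late(times, threshold):
    """Dichotomous analysis: mean of the first `threshold` cases vs. the remaining cases."""
    return mean(times[:threshold]), mean(times[threshold:])


# Hypothetical operative times (minutes) for 10 consecutive cases; NOT study data.
operative_times = [240, 236, 232, 228, 224, 220, 216, 212, 208, 204]

slope = slope_per_case(operative_times)
early, late = early_vs_late(operative_times, threshold=5)  # threshold is an assumption

print(f"change per additional case: {slope:.2f} min")
print(f"early mean: {early:.1f} min, late mean: {late:.1f} min")
```

A negative slope corresponds to a decrease in the endpoint with each additional case. The dichotomous comparison depends on the chosen threshold, which is one reason reported cut-points can vary between series.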

Learning Curve
Of the 21 included articles, 18 observed the presence of a learning curve using ≥ 1 of the endpoints analyzed, 9,12-25,28,31,32 including 4 of the 5 series considering the experiences of surgeons who had employed the robot in ≥ 50 cases. 12,13,15,16 Among the 8 articles using screw accuracy as a measure of the learning curve, 12,17,19,23,27,28,30,31 4 observed the presence of a learning curve, ranging from 10 to 30 cases 12,17,19 or 30 screws, as in the analysis of Siddiqui et al. 23 When considering total operative time, 8 of the 10 articles reported the presence of a learning curve, ranging from 20 to 30 cases 9,14-16,24,25,32 or 62 screws, as in the analysis of Onen et al. 18 Four of the 8 articles using screw placement/insertion time as a marker of proficiency observed the presence of a learning curve, ranging from 3 to 15 cases 17,20,22 or 62 screws, again, as in Onen et al. 18 Three of the 7 articles using fluoroscopy time per placed screw observed the presence of a learning curve, ranging from 8 to 10 cases 21,29 or 62 screws (Onen et al. 18 ). One of the 2 articles considering system registration time observed a decrease with increased proficiency. 13 The majority (n = 13) of included studies performed dichotomous analyses with a predetermined case or screw threshold. 12,13,16-23,27,29,31 For those performing this analysis, the described cut-points were 3 to 30 cases or 15 to 62 screws. Ten articles performed linear regressions of the examined endpoint as a function of the number of cases performed or screws placed. 9,14,16,17,20,24,25,28,30,32 For those performing said linear regressions, the improvement in screw placement time was estimated at 0.10 minutes per screw for each additional case performed, 20 and the improvement in operative time ranged from 0.24 to 4.6 minutes per case for each additional case performed. 9,16,24,25,32 Two studies, those of Avrumova et al. 13 and Chen et al., 15 performed logistic regressions. Avrumova et al. 13 performed a regression for screw placement time as a function of screws placed in addition to a dichotomous analysis. In this analysis they found that the learning curve was rela-

Discussion
In the present systematic review, 21 articles were identified that investigated the presence of a learning curve in spinal robotics, of which 18 provided evidence for the existence of a learning curve, ranging from 3 to 30 cases or 15 to 62 screws, depending on the endpoint examined. 9,12-25,28,31,32 Importantly, of those series with the greatest experiences described (5 series in which the surgeon had used the robot in ≥ 50 cases 12,13,15,16,19 ), 80% documented the presence of a statistically significant learning curve. Improvements were most commonly noted in total operative time, screw accuracy, and screw placement time. Of the 3 articles failing to document a learning curve, 27,30,32 2 described very small experiences of 13 and 20 cases. 30,32 Given that a significant plurality of the series documented a learning curve of 20 to 30 cases to achieve proficiency, it is likely that these series were too small to determine the presence or absence of a learning curve. Similarly, the third article tested for the presence of a learning curve using a threshold of 30 screws, 27 which is far below the number of screws that would be placed in 20 single-level fusions. Consequently, the data would argue for the presence of a learning curve given that a sufficient number of cases are examined.
Despite the evidence gathered supporting the existence of a learning curve in spinal robotics, the body of evidence is of low quality, comprising single-surgeon series. A pooled analysis of prior experiences would help to generate high-level evidence and enable more powerful conclusions to be drawn. Additionally, such a pooled analysis would help to control for baseline surgeon experience, as greater familiarity with spine surgery, in general, would be expected to serve as a sounder knowledge base onto which the intricacies of robotic surgery could be overlaid. Only 2 of the identified studies examined surgeon experience or level of training as a factor in the learning curve for robotic systems. 22,23 Urakov et al. 22 saw no difference in the time of screw insertion for junior residents (postgraduate years [PGYs] 1-5) compared with senior residents (PGYs 6-7) or fellows using the Mazor Renaissance. However, the number of cases performed by each resident was quite small, and the interaction of learning curve with level of training was not reported. Similarly, Siddiqui et al. 23 suggested that there was minimal interaction between level of training and learning curve in a three-surgeon series composed of a single attending and two postgraduate neurosurgical spine fellows using the ExcelsiusGPS. All three surgeons experienced significant improvement in screw tail placement accuracy over the first 30 screws placed. Only one of the fellows failed to show improvement in screw tip placement; however, their baseline performance was already superior to that of the attending and other fellow. The reason for this is unclear; however, it may be that all three surgeons had the minimum knowledge base required to take advantage of the robot, whereas a more junior resident might not.

Influence of Surgical Robots on Residency Curricula
Although novel for the field of neurosurgery, the issue of whether and how best to integrate surgical robotics within residency curricula is not unique. A similar issue was faced by the general surgery, obstetrics/gynecology, and urology fields in the past decade as they were forced to determine how best to integrate the increased utilization of surgical robots across their surgical disciplines. 33,34 From a top-down perspective, all three fields have elected to consider robot-assisted procedures differently. Both obstetrics/gynecology and general surgery treat robot-assisted cases as a subtype of laparoscopic surgery, which had already been integrated into the case minimums outlined by the Accreditation Council for Graduate Medical Education (ACGME), the accrediting body for US residency programs. 35,36 By contrast, effective as of 2021, ACGME mandates that graduating urology residents complete a minimum of 80 robot-assisted cases. 37 The latter move has been suggested to have been driven, at least in part, by the increasing volume of robot-assisted urological procedures, which comprised nearly half of all major urological surgeries during the 2016 to 2017 year. 38 Consequently, reworking case minimums to better equip residents for the reality of clinical practice was a consideration to improve job prospects.
Penetration of surgical robotics within general surgery and obstetrics/gynecology is far lower, however, accounting for only 29% of uterine procedures and 8% of colorectal procedures between 2001 and 2015 (compared with 77% of prostate procedures). 39 Additionally, many general surgery 40 and obstetrics/gynecology 41 programs do not regularly expose trainees to robotic cases. Consequently, imposing case log minimums for these procedures would place an unreasonable burden on trainees. Additionally, it has been noted that increased use of surgical robots is associated with decreased trainee participation in surgical cases and decreased surgical volume in other tracked categories. 42

Application of Spinal Robotics to Neurosurgery Residency Curricula
Unlike robotics in general surgery, urology, and obstetrics/gynecology, spine surgery robots are relatively new. Therefore, few programs have these devices available, thereby precluding their inclusion in ACGME case minimums. Additionally, their indications are relatively limited, as their main purpose is to facilitate the accurate placement of pedicle screw instrumentation. As a result, the proportion of neurosurgical cases in which such a device could be used comprises the minority of cases that will be seen by or require logging by neurosurgical residents. Nevertheless, formalization of a spine robot curriculum for residents pursuing specialization in spine surgery can help to ensure these residents experience their robot learning curves during residency, and, thereby, shift the poorer outcomes from the time during which they are practicing independently to one in which they are supervised by a senior surgeon adept at the use of the robot. Additionally, acquisition of these skills may increase the marketability of residency graduates, as the majority of the public perceives surgical robots to be associated with better results and fewer complications, irrespective of the available data. 43,44 Based on the evidence identified in the literature review here, we propose a curriculum in which senior-level residents (≥ PGY5) can obtain a certificate verifying competence with spine surgery robots. Under this curriculum, residents must perform a minimum of 30 surgical cases with one of the currently approved and marketed spine robots for which learning curves have been previously described (i.e., Mazor X, Mazor X Stealth, and ExcelsiusGPS). In addition, these residents should also complete any industry-sponsored training programs that are offered to surgeons who will be using these technologies, as these may highlight unique features of the varied systems, some of which may not be present at the residents' home institutions.
To participate, residents should have met their ACGME-mandated minimums for spine cases (300 in the latest ACGME guidelines). 45 Having met these minimums helps to ensure that residents have acquired the basic anatomical knowledge necessary to verify the fidelity of robotic trajectories. It also helps to ensure that residents can troubleshoot cases in which the robot fails to register patient anatomy (up to 8% of cases) 12 or otherwise forces screws to be placed manually after registration (up to 9% in one large series). 12 The ability of surgeons to fall back on more conventional placement techniques (image guided, freehand, or fluoroscopy guided) is essential and ensures that robots are only used as tools to facilitate superior outcomes for spine surgeons as opposed to being employed by those who are incapable of performing the primary procedure without assistive technology. 46 In truth, no curriculum can ensure that surgeons are competent with robotics or any procedure for that matter, but akin to residency training in general, it can improve the probability that users are competent with the device. An alternative to the aforementioned structure would require a minimum number of screws (vs cases) be placed to achieve certification. However, some of the articles identified in this review, such as that of Chen et al., 15 found that the major driver of decreased operative time was a decrease in overall robot usage time (vs screw instrumentation time alone). Consequently, the learning curve (and, therefore, assessment of a trainee's progression along said curve) should be determined by both the number of screws placed and the number of times the trainee has gone through the workflow of setting up the robot and plotting screw trajectories. Accomplishing both occurs at the case level; therefore, it is argued that ensuring a minimum number of robotic cases performed helps to better ensure a minimum level of competency.

Limitations
The present review has several limitations, foremost of which is the low level of evidence. All included studies were level IV case series, the majority of which were single-surgeon series. Ideally, individual, patient-level data could be included in a future analysis to reach more generalizable results, as well as to investigate the relative influence of user training level and robotic platform on the learning curve. Additionally, the source data are all retrospective, and many series are nonconsecutive, meaning that included cases could have been selected for robot assistance preoperatively based on surgeon-identified factors that would facilitate a better outcome. It is impossible to know the degree to which this affected the present results, and future studies that prospectively investigate and document surgeon results with surgical robots are required. Furthermore, the evidence in the present review is drawn from both current-generation robots (ExcelsiusGPS, Mazor X Stealth, Mazor X, and TiRobot) and prior versions (Mazor SpineAssist and Mazor Renaissance). The latter employed more fragile platforms and less-refined imaging software to plot screw trajectories. Consequently, they may have been associated with longer learning curves as surgeons adjusted to their varying results.
Another limitation of the present review is that a diverse set of endpoints is reported to describe the learning curves associated with spine robot use. This may help to account for the variability in the identified minimum number of cases required to achieve proficiency. However, it must be noted that few studies assessed total robot usage time, with most looking at screw placement time, screw accuracy, or screw fluoroscopy time. It has previously been noted that robot system registration time represents a nontrivial time cost associated with spine robot system utilization. Time expended with robot system positioning in the operating room, reference frame placement and registration, and reregistration of the system in the case of long-segment constructs represents a greater proportion of the total case time than does actual screw placement. Consequently, the learning curve associated with robot system usage may prove to be different when focusing on these steps. Further investigation of robotic learning curves that examines both changes in screw placement time and system setup and registration times is necessary to better appreciate the true learning curve of these devices.
Lastly, the proposed development of a certificate program for residency curricula is contingent upon residency programs having a sufficient robotic case volume. As no good data yet exist for the percentage of spine surgeries being performed robotically, or the percentage of training centers actively using robots for their spine cases, it is unclear whether such a certificate program is feasible at this time. Nevertheless, given the increasing spine robot case volumes reported by market research, 2,3 and the propensity for these advanced technologies to be found at large academic centers with sufficient financial infrastructure and case volume to support their use, it seems likely that in the near future a sufficient number of training programs will see sufficient volume to render the proposed certificate program feasible. Consequently, we advocate for earlier discussion of such curriculum changes in anticipation that they will be needed in the near future. Such discussions will also broach topics of import, such as whether a generalized curriculum is sufficient, or whether platform-specific curricula will be necessary. This decision will likely benefit from better data on the learning curves associated with spine robot workflow steps other than screw placement.

Conclusions
In the present systematic review, we found that the majority of published series demonstrate the presence of a learning curve in spinal robotics. Descriptions of the length of the curve and the metrics used to define the curve are highly heterogeneous. However, multiple studies have described that surgeons experienced continued improvement in outcomes up to a nadir of 20 to 30 cases, which coincides with our previously described experience. Based on this, we propose that an optional robotic spine surgery certificate program could be developed for senior-level residents pursuing spine specialization who have already completed the ACGME minimum requirements for spine surgery cases. However, additional investigation in prospective series using standardized outcomes (i.e., screw placement time, operative time, and instrumentation placement accuracy) on currently employed systems is required to account for potential reporting bias present in the extant series.