Aneurysm occlusion has been used as a surrogate marker of aneurysm treatment efficacy. Aneurysm occlusion scales are used to evaluate the outcome of endovascular aneurysm treatment and to monitor recurrence. These scales, however, require subjective interpretation of imaging data, which can reduce their utility and reliability and the validity of clinical studies reporting aneurysm occlusion rates. Core labs with independent blinded reviewers have been used to enhance the validity of occlusion rate assessments in clinical trials. The degree of agreement between core labs and treating physicians has not been well studied with prospectively collected data.
In this study, the authors analyzed data from the Hydrogel Endovascular Aneurysm Treatment (HEAT) trial to assess the interrater agreement between the treating physician and the blinded core lab. The HEAT trial included 600 patients with intracranial aneurysms treated with coiling across 46 sites. The treating site and the core lab independently reviewed immediate postoperative imaging and follow-up imaging (at 3–12 and 18–24 months) using the Raymond-Roy occlusion classification (RROC) scale, Meyer scale, and recanalization survey. A post hoc analysis was performed to calculate interrater reliability using Cohen’s kappa. Further analysis was performed to assess whether the degree of agreement varied with the scale used, timing of imaging, size of the aneurysm, imaging modality, location of the aneurysm, dome-to-neck ratio, and rupture status.
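As an illustration of the statistic named above, Cohen's kappa compares the observed proportion of agreement between two raters with the agreement expected by chance from each rater's marginal grade frequencies. The following is a minimal sketch only: the function name `cohens_kappa` and the example grades are hypothetical and are not taken from the HEAT trial data or from the authors' analysis code.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters grading the same cases."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must grade the same non-empty set of cases")

    n = len(rater_a)

    # Observed agreement: fraction of cases given the same grade by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of the two raters' marginal proportions,
    # summed over every grade.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[g] * counts_b[g] for g in counts_a) / (n * n)

    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")

    return (observed - expected) / (1.0 - expected)


# Hypothetical grades on a three-category scale (illustrative values only).
site_grades = [1, 1, 1, 2, 2, 3, 1, 2, 3, 1]
core_grades = [1, 2, 1, 2, 3, 3, 1, 1, 3, 2]
kappa = cohens_kappa(site_grades, core_grades)
print(f"kappa = {kappa:.2f}")
```

This sketch uses the unweighted form of kappa, which treats every disagreement equally regardless of how far apart the two grades are; whether the study used a weighted or unweighted form is not stated here.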
Minimal interrater agreement was noted between the core lab reviewers and the treating physicians for assessing aneurysm occlusion using the RROC grading scale (κ = 0.39, 95% CI 0.38–0.40) and Meyer scale (κ = 0.23, 95% CI 0.14–0.38). The degree of agreement between groups was slightly better but still weak for assessing recanalization (κ = 0.45, 95% CI 0.38–0.52). Factors that significantly improved the degree of agreement were use of scales with fewer variables, longer time to follow-up, digital subtraction angiography as the imaging modality, and wide-neck aneurysms.
Assessment of aneurysm treatment outcome with commonly used aneurysm occlusion scales carries a risk of poor interrater agreement. This supports the use of independent core labs for validation of outcome data to minimize reporting bias. Use of outcome tools with fewer point categories is likely to provide better interrater reliability. Therefore, such simpler tools are ideal for clinical outcome assessment, provided that they are sensitive enough to detect a clinically significant change.