Editorial. When can we be positive about p values?


To and Jukes have performed a very valuable service to the neurosurgical community by undertaking an exhaustive analysis of the frequency and features of p values appearing in the abstracts of articles published in 4 high-impact neurosurgical journals between 1990 and 2017 [19]. Using a computer-executed search strategy, they show that the frequency of abstracts containing p values has increased by 0.86% per year over that 27-year interval (95% confidence interval [CI] 0.61%–1.2%, p < 0.001), from 8.52% in 1990 to 34.0% in 2017. There has been a corresponding increase in the number of p values per

Article Information

Correspondence: Michael Glantz, Penn State Milton S. Hershey Medical Center, Hershey, PA. mglantz@pennstatehealth.psu.edu.

ACCOMPANYING ARTICLE DOI: 10.3171/2018.8.JNS172897.

INCLUDE WHEN CITING Published online February 8, 2019; DOI: 10.3171/2018.10.JNS182494.

Disclosures: The authors report no conflict of interest.

© AANS, except where prohibited by US copyright law.



Illustration of common situations in which p values are routinely misinterpreted. Study 1 is not statistically significant, and we cannot be certain about its clinical importance. The intervention may be great or it may be terrible; the effect size estimate is too imprecise (i.e., the CI is too wide) to tell. This result provides no guidance. Study 2 is not statistically significant and not clinically important. Even the upper limit of the CI is not clinically important. A definitively “negative” result. Study 3 is both clinically and statistically significant. The effect size is large, favors the intervention, and the CI lies entirely within the “important” range. A definitively “positive” result. Study 4 is statistically significant, but clinically not important. The effect size falls entirely within the “unimportant” range. Again, a definitively “negative” result. Study 5 is statistically significant, but we cannot tell about its clinical importance. The CI includes both clinically important and unimportant effects. The effect size estimate is too imprecise to provide any guidance. Figure is available in color online only.
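The five situations in the figure can be sketched as a small decision rule. This is only an illustration under stated assumptions, not a method taken from the editorial: the effect is assumed to be scaled so that larger values favor the intervention, the no-effect value is 0, `mcid` is a hypothetical threshold for the smallest clinically important benefit, and all CI limits below are invented numbers chosen to reproduce the five patterns. The rule looks only at benefit; it does not separately flag clinically important harm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Verdict:
    statistically_significant: bool
    clinical_importance: str  # "important", "not important", or "uncertain"


def interpret(ci_low: float, ci_high: float, mcid: float, null: float = 0.0) -> Verdict:
    """Classify a result from its confidence interval (CI).

    Assumptions (hypothetical, for illustration only): larger effects favor
    the intervention, `null` is the no-effect value, and `mcid` is the
    smallest benefit regarded as clinically important.
    """
    if ci_low > ci_high:
        raise ValueError("ci_low must not exceed ci_high")
    if mcid <= null:
        raise ValueError("mcid must lie above the no-effect value")

    # Statistically significant when the CI excludes the no-effect value.
    significant = not (ci_low <= null <= ci_high)

    if ci_low >= mcid:
        importance = "important"       # whole CI in the important range
    elif ci_high < mcid:
        importance = "not important"   # even the upper limit is unimportant
    else:
        importance = "uncertain"       # CI straddles the threshold
    return Verdict(significant, importance)


# Invented CI limits mirroring Studies 1-5 in the figure, with mcid = 10.
STUDIES = {
    "Study 1": (-15.0, 25.0),
    "Study 2": (-3.0, 6.0),
    "Study 3": (12.0, 30.0),
    "Study 4": (1.0, 6.0),
    "Study 5": (4.0, 18.0),
}

if __name__ == "__main__":
    for name, (low, high) in STUDIES.items():
        print(name, interpret(low, high, mcid=10.0))
```

Under these assumptions, only the Study 3 pattern comes out both statistically significant and clinically important. Studies 2 and 4 are classified as not important regardless of their p values, and Studies 1 and 5 remain uncertain because their intervals cross the threshold.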



1. Austin PC, Mamdani MM, Juurlink DN, Hux JE: Testing multiple statistical hypotheses resulted in spurious associations: a study of astrological signs and health. J Clin Epidemiol 59:964–969, 2006

2. Bedard PL, Krzyzanowska MK, Pintilie M, Tannock IF: Statistical power of negative randomized controlled trials presented at American Society for Clinical Oncology annual meetings. J Clin Oncol 25:3482–3487, 2007

3. Bilimoria KY, Stewart AK, Winchester DP, Ko CY: The National Cancer Data Base: a powerful initiative to improve cancer care in the United States. Ann Surg Oncol 15:683–690, 2008

4. Chavalarias D, Wallach JD, Li AHT, Ioannidis JPA: Evolution of reporting p-values in the biomedical literature 1990-2015. JAMA 315:1141–1148, 2016

5. Cleophas TJ, Zwinderman AH: Clinical trials are often false positive: a review of simple methods to control this problem. Curr Clin Pharmacol 1:1–4, 2006

6. Engel J Jr, McDermott MP, Wiebe S, Langfitt JT, Stern JM, Dewar S: Early surgical therapy for drug-resistant temporal lobe epilepsy: a randomized trial. JAMA 307:922–930, 2012

7. French JA: Proof of efficacy trials: endpoints. Epilepsy Res 45:53–59, 2001

8. French JA, Krauss GL, Steinhoff BJ, Squillacote D, Yang H, Kumar D: Evaluation of adjunctive perampanel in patients with refractory partial-onset seizures: results of randomized global phase III study 305. Epilepsia 54:117–125, 2013

9. Gamble C, Krishan A, Stocken D, Lewis S, Juszczak E, Doré C: Guidelines for the content of statistical analysis plans in clinical trials. JAMA 318:2337–2343, 2017

10. Held L: A nomogram for P values. BMC Med Res Methodol 10:21–26, 2010

11. Ioannidis JPA: Why most published research findings are false. PLoS Med 2:e124, 2005

12. McGirt MJ, Speroff T, Dittus RS, Harrell FE Jr, Asher AL: The National Neurosurgery Quality and Outcomes Database (N2QOD): general overview and pilot-year project description. Neurosurg Focus 34(1):E6, 2013

13. Neuhäuser M: How to deal with multiple endpoints in clinical trials. Fundam Clin Pharmacol 20:515–523, 2006

14. Ocana A, Tannock IF: When are “positive” clinical trials in oncology truly positive? J Natl Cancer Inst 103:16–20, 2011

15. Page MJ, McKenzie JE, Kirkham J, Dwan K, Kramer S, Green S: Bias due to selective inclusion and reporting of outcomes and analyses in systematic reviews of randomised trials of healthcare interventions. Cochrane Database Syst Rev 10:MR000035, 2014

16. Pocock SJ, Geller NL, Tsiatis AA: The analysis of multiple endpoints in clinical trials. Biometrics 43:487–498, 1987

17. Schmidt S, Franke J, Rauschmann M, Adelt D, Bonsanto MM, Sola S: Prospective, randomized, multicenter study with 2-year follow-up to compare the performance of decompression with and without interlaminar stabilization. J Neurosurg Spine 28:406–415, 2018

18. Shiloach M, Frencher SK Jr, Steeger JE, Rowell KS, Bartzokis K, Tomeh MG: Toward robust information: data quality and inter-rater reliability in the American College of Surgeons National Surgical Quality Improvement Program. J Am Coll Surg 210:6–16, 2010

19. To MS, Jukes A: Reporting trends of p values in the neurosurgical literature. J Neurosurg [epub ahead of print February 8, 2019. DOI: 10.3171/2018.8.JNS172897]

20. Vera-Badillo FE, Shapiro R, Ocana A, Amir E, Tannock IF: Bias in reporting of end points of efficacy and toxicity in randomized, clinical trials for women with breast cancer. Ann Oncol 24:1238–1244, 2013

21. Wang D, Li Y, Wang X, Liu X, Fu B, Lin Y: Overview of multiple testing methodology and recent development in clinical trials. Contemp Clin Trials 45 (Pt A):13–20, 2015

22. Wiebe S, Blume WT, Girvin JP, Eliasziw M: A randomized, controlled trial of surgery for temporal-lobe epilepsy. N Engl J Med 345:311–318, 2001



