Statistical Methods: Need for a Rethink

Correspondence

Indian Pediatr 2017;54: 65

*Tamoghna Biswas and #Subhabrata Majumdar
From *Department of Pediatric Medicine, Medical College Kolkata, India; and #School of Statistics, University of Minnesota, Twin Cities, Minneapolis, USA.
Email: [email protected]
We read with interest the recent article [1] in Indian Pediatrics on reporting statistical results, and would like to commend the authors on lucidly summarizing such an important topic. Indeed, as the British economist Ronald Coase said, "if you torture the data long enough, it will confess." Arguably, nowhere is this sentence more apt than in describing the biomedical fraternity’s obsession with statistical significance. A recent analysis [2] found a significant increase in the reporting of P-values in Medline abstracts over the past twenty-five years, and unsurprisingly most of them reported significant results. Moreover, P-values between 0.041 and 0.049 have increased manifold in the last couple of decades [3]. While this might imply an increase in ‘P-hacking’, the consequences of which are being debated worldwide, a more worrisome trend is the over-reliance on frequentist inference and the widespread misunderstanding of the P-value. Sterne and Davey Smith [4], in their authoritative piece in the British Medical Journal, stated "an arbitrary division of results, into ‘significant’ or ‘non-significant’ according to the P value, was not the intention of the founders of statistical inference." The American Statistical Association (ASA) has acknowledged common misuses of the P-value and categorically asserted "P-values do not measure the probability that the studied hypothesis is true, or the probability that the data were produced by random chance alone" [5]. At best, the P-value can inform our decision on whether the data under consideration are compatible with a particular null hypothesis, but by itself it is not sufficient to comment on either the truth of the null hypothesis or the biological or clinical significance of the results. While statisticians have voiced such concerns for years, we continue to teach, learn, use, over-use and misinterpret P-values in our literature.
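The last point can be illustrated with a minimal sketch. All numbers here are hypothetical and not drawn from any of the cited studies, and the function name `two_sided_p` is our own. It assumes a two-sample z-test with a known common standard deviation and compares two trials that observe the same mean difference with very different sample sizes.

```python
import math


def two_sided_p(mean_diff, sd, n_per_group):
    """Two-sided P-value for a two-sample z-test with a known common SD."""
    se = sd * math.sqrt(2.0 / n_per_group)  # standard error of the difference
    z = mean_diff / se
    # P(|Z| >= |z|) for a standard normal Z, expressed via erfc
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical numbers: the same 0.5-unit mean difference (SD 10) in two trials.
effect, sd = 0.5, 10.0
p_small_trial = two_sided_p(effect, sd, n_per_group=100)
p_large_trial = two_sided_p(effect, sd, n_per_group=100_000)

print(f"n = 100 per group:     P = {p_small_trial:.3f}")
print(f"n = 100000 per group:  P = {p_large_trial:.3g}")
```

Under these assumptions the smaller trial is "non-significant" and the larger one highly "significant", although the estimated effect is identical in both. Whether a 0.5-unit difference matters clinically is a separate judgment that the P-value cannot make.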
These concerns may sound merely semantic, but they are by no means irrelevant when we consider the larger landscape of publication bias. The use of Bayesian inference can potentially circumvent some of the problems [4], for example by generating a reliable ‘credible interval’ for an estimated parameter even for small sample sizes. Relatively modern statistical techniques such as the bootstrap and confidence distributions are worth exploring as well. Although the biomedical community has traditionally been slow in taking up and implementing new approaches, we hope that a greater awareness of proper statistical methods and a willingness to adopt new techniques can change the scenario. This can be made possible through closer collaboration between biomedical scientists and statisticians.

References

1. Khan AM, Ramji S. Reporting statistics in biomedical research literature: The numbers say it all. Indian Pediatr. 2016;53:811-4.
2. Chavalarias D, Wallach JD, Li AH, Ioannidis JP. Evolution of reporting P values in the biomedical literature, 1990-2015. JAMA. 2016;315:1141-8.
3. de Winter JC, Dodou D. A surge of p-values between 0.041 and 0.049 in recent decades (but negative results are increasing rapidly too). PeerJ. 2015;3:e733.
4. Sterne JA, Davey Smith G. Sifting the evidence-what’s wrong with significance tests? BMJ. 2001;322:226-31.
5. Wasserstein RL, Lazar NA. The ASA’s statement on p-values: Context, process, and purpose. The American Statistician. 2016;70:129-33.