Reporting Statistics in Biomedical Research Literature: The Numbers Say it All

the art and science of writing a paper

Indian Pediatr 2016;53: 811-814

Amir Maroof Khan and *Siddarth Ramji

From the Department of Community Medicine, University College of Medical Sciences and GTB Hospital, and *Department of Neonatology, Maulana Azad Medical College and associated Lok Nayak Hospital; New Delhi, India.

Correspondence to: Dr. Amir Maroof Khan, Associate Professor, Department of Community Medicine, University College of Medical Sciences and GTB Hospital, Dilshad Garden, Delhi, India. [email protected]

Editor’s Note: Writing a scholarly article (and getting it accepted too) is both ‘art’ and ‘science’. Most reputed journals have a high rejection rate, and extensive editing is required in most of the manuscripts that are accepted. There is no formal training in paper writing during medical schooling, but faculty members of medical colleges are expected to write papers in high impact medical journals for career promotions. Consequently, there is increasing incidence of plagiarism, duplicate publication and fraud in paper-writing. The huge trap of predatory journals is also a challenge for the scientific community. The articles in this series will aim to help and guide the readers in writing articles for medical journals. Simplicity will be the key ‘mantra’ for this series. I hope that readers will find the series useful; any comments and feedback are welcome. These may be directly communicated to the authors or to the journal office at jiap.nic.in. Comments can also be posted on the relevant thread at www.facebook.com/indianpediatrics.

Statistics is the cornerstone of evidence-based quantitative research. Statistical analyses and their outputs pave the way for clinical and policy decision-making, thereby affecting the health status of countless individuals across the world. Several studies have shown that statistical reporting in biomedical journals is often inappropriate or incorrect [1-3]. Uniformity and transparency in statistical reporting strengthen the validity and reliability of the scientific literature. Many biomedical researchers are not comfortable with statistics, and hence the reporting of statistical output is often varied, confusing, and of little meaning to the reader. This article attempts to provide an overview of good practices of reporting statistics in biomedical research literature.

The statistical reporting guidelines and styles presented in this article draw from the instructions for authors of Indian Pediatrics, the International Committee of Medical Journal Editors (ICMJE) guidelines, the Enhancing the Quality and Transparency of Health Research (EQUATOR) guidelines, and current practices observed in the published research literature [4-6].

As different journals have their specific requirements regarding reporting of statistics, it is necessary that the authors go through the instructions for authors and some already published articles of the journals to which they intend to submit their manuscript for publication. However, the guiding principle of reporting statistical analyses is to

"Describe statistical methods with enough detail to enable a knowledgeable reader with access to the original data to verify the reported results [4]."

Report Statistics When Relevant

"Can we do some statistical analysis and report from this data here?"

This is a common, non-specific question that biomedical researchers ask the colleagues who help them with data analysis. The aim in such cases is to insert some statistical test result into the manuscript, without considering how it fits the research objectives. The question reflects the investigator’s lack of clarity about the study objectives. Had the objectives been clear, the question would have been something like: "How do I compare these means?" or "What is the effect of this variable on another?" or "Is there an association between these variables?" A point of serious concern is that even the peer-review process, at times, fails to enquire pointedly about the relevance and the resultant interpretation of the statistical tests applied.

At times, biomedical researchers are not even aware of the purpose of the statistics being applied in their research. They want to include them because it is a ‘cool’ thing to do, or because they believe it increases the chances of publication. Some hold the notion that a study without a ‘P-value’ is considered irrelevant. An understanding of the conceptual framework of the study is important before embarking on the statistical analysis. The appropriate application of statistical tests then aids the meaningful deconstruction and interpretation of the study results, and makes the discussion and conclusion sections more meaningful. Statistics is a vital part of biomedical literature, but only when it is relevant; statistical tests should be applied and reported where they are relevant, and not just for the sake of reporting them.

Statistical tests are used in biomedical research broadly for two reasons:

1. Estimation studies. In this type of study, there is no hypothesis statement. The research question is to find a ‘population estimate’ from given ‘sample data’ [7]. Examples of such estimates include the mean weight-for-age Z-score (WAZ) of under-five children, or the proportion of underweight births in the newborn population. In these two examples, the mean WAZ and the proportion are the quantities to be estimated from the sample dataset, and no hypothesis testing is involved. Statistical applications for such objectives will primarily be restricted to reporting standard errors of means and proportions and their related confidence intervals (CI), and will be devoid of any P-value. Statistics is employed here to extrapolate the result from the sample to the population in quantitative terms. Hence, statistical tests and the accompanying P-values are irrelevant when reporting population estimates.

2. Inferential studies (hypothesis testing). These studies intend to determine an association between two or more variables, and are usually designed to test a hypothesis [7]. Case-control studies, experimental studies (randomized or non-randomized, with or without a control group), and typical cohort studies have a null hypothesis at the start and fall in this category. Statistical tests of significance, and the accompanying P-values and effect sizes, become relevant and should be reported in studies that have hypothesis testing as an objective.
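As an illustration of what an estimation study reports, the following minimal Python sketch computes a mean and a proportion with their standard errors and 95% CIs. The function names and the numbers are hypothetical, and the intervals use a large-sample normal approximation (an assumption made here; for small samples a t-based or exact interval would usually be preferred).

```python
import math
from statistics import NormalDist, mean, stdev

# Two-sided 95% critical value of the standard normal distribution (about 1.96).
Z95 = NormalDist().inv_cdf(0.975)


def mean_ci(values):
    """Return (mean, standard error, 95% CI) using a normal approximation."""
    n = len(values)
    m = mean(values)
    se = stdev(values) / math.sqrt(n)  # standard error of the mean
    return m, se, (m - Z95 * se, m + Z95 * se)


def proportion_ci(events, n):
    """Return (proportion, standard error, 95% Wald CI)."""
    p = events / n
    se = math.sqrt(p * (1 - p) / n)  # standard error of a proportion
    return p, se, (p - Z95 * se, p + Z95 * se)


# Hypothetical data: 50 underweight births among 200 newborns.
p, se, (low, high) = proportion_ci(50, 200)
print(f"Proportion = {p:.2f} (95% CI {low:.2f} to {high:.2f})")
```

Note that no P-value appears anywhere in this output, which is consistent with the point made above for estimation studies.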

In a research manuscript, statistical methods and statistical outputs are reported in the Methods section and the Results section, respectively. What and how you report the statistical methods and outputs in your manuscript will also depend on the journal you are submitting your manuscript to.

Reporting of Statistical Methods in the Methods Section

The statistical methods reported in the methods section will depend on the study design and objective of the study. Box 1 presents the key points to be noted while reporting statistical methods in the methodology section of a biomedical research manuscript.

Box 1 Statistical Information to be Included in the Methods Section

• All the statistical methods employed in the study.

• Any data transformation done for the purpose of statistical analysis.

• Identify any uncommon statistical method applied, with a reference.

• The order of the statistical methods described should follow that of the objectives mentioned.

• Report the statistical tests in the context of the research objectives, and not as generic statements.

• Use appropriate statistical tests for analyzing paired data.

• Consider the type of distribution while selecting and reporting statistical tests.

• Mention the statistical analysis software packages used for data analysis only for complex analysis.

In the case of observational studies, this section should report all methods used, including those for confounder control. The section must state how missing data were handled (including loss to follow-up in cohort studies), any subgroup analyses that were planned, and how cases and controls were matched (in case-control studies). When reporting randomized trials, the methods used to compare the primary and secondary outcomes, and any additional analyses that were planned, must also be reported in this section. The details of what to report in the Methods section are available in the reporting guidelines for each study design [6]. The statistical tests employed should be mentioned with respect to the variables being analysed, rather than as standalone general statements. Some examples of stating the test used could be "Categorical data to test for presence of association between failure to regain birth weight and the likely risk factors was analysed using Fishers exact test", or "The strength of association between the factors and failure to regain birth weight among the infants studied was determined using odds ratios and confidence intervals" [8].

This section must also report any methods used to transform raw data prior to analysis, such as converting a non-normal distribution to a normal one or collapsing categories in categorical data. No citations are needed when reporting commonly used tests such as the Chi-square test, Fisher’s exact test, Student’s t-test, or linear regression. However, when reporting more complex analyses, the authors must cite the source, which should preferably be a standard textbook [9]. There is no need to report the analytical software used for basic descriptive analysis; for statistical analysis involving hypothesis testing, reporting the analytical software is useful and very much required. The alpha level used to define statistical significance must also be mentioned in this section.
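To make the point about data transformation concrete, here is a small Python sketch of one common case: log-transforming right-skewed data and back-transforming the result to a geometric mean with a 95% CI. The function name and the data are hypothetical, and the normal approximation for the interval is an assumption of this sketch. Whatever transformation is actually used is what the Methods section should describe.

```python
import math
from statistics import NormalDist, mean, stdev

Z95 = NormalDist().inv_cdf(0.975)  # two-sided 95% critical value


def geometric_mean_ci(values):
    """Log-transform positive values, then back-transform mean and 95% CI."""
    if any(v <= 0 for v in values):
        raise ValueError("log transformation requires strictly positive values")
    logs = [math.log(v) for v in values]
    m = mean(logs)
    se = stdev(logs) / math.sqrt(len(logs))
    # Exponentiating returns the estimate and its limits to the original scale.
    return math.exp(m), (math.exp(m - Z95 * se), math.exp(m + Z95 * se))


# Hypothetical right-skewed measurements (arbitrary units).
gm, (low, high) = geometric_mean_ci([1.2, 1.5, 2.1, 2.4, 3.0, 9.8, 15.6])
print(f"Geometric mean = {gm:.2f} (95% CI {low:.2f} to {high:.2f})")
```

A matching Methods statement might read: "Values were log-transformed before analysis and are presented as geometric means with 95% CI."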

Reporting of Statistical Results

The numerical results must be presented keeping the study objectives in mind. Box 2 presents the important points to be kept in mind while reporting statistics in the results section of a research manuscript.

Box 2 Reporting Statistical Outputs in the Results Section of the Manuscript

• Avoid nontechnical uses of technical terms in statistics.

• Explicitly state the groups being compared.

• Report exact P-values, and not just as significant or non-significant.

• Do not report P-values as 0.000. In such cases, report as P<0.001.

• Report the effect sizes with their confidence intervals.

• Usually P-values with effect sizes and the associated CI are sufficient while reporting the statistical test results, unless the journal asks for additional details.

• Do not use the term ‘correlation’, a statistical method to assess the relationship between two continuous variables, to describe ‘association’.

• Refer to relevant reporting guidelines for the study design e.g., STROBE, CONSORT.

What must be included? For descriptive statistics, the point estimate and its 95% confidence interval should be reported. For comparative studies, rates, risks, ratios or the mean difference must be reported along with a measure of their precision, such as the 95% confidence interval or standard deviation.

If P-values are included, the actual value to 1 or 2 decimal places may be reported (e.g., P=0.2 or P=0.41); values less than 0.001 should be reported as P<0.001. P-values should not be reported as "not significant" or "NS".
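The rule above can be captured in a small helper. The Python sketch below is one possible implementation; the function name is hypothetical, and the choice to show three decimal places for values between 0.001 and 0.01 (so that they are not rounded to 0.00) is an assumption of this sketch, not a requirement stated in the guidelines cited here.

```python
def format_p(p: float) -> str:
    """Format a P-value for reporting in a manuscript."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("a P-value must lie between 0 and 1")
    if p < 0.001:
        # Never print P=0.000; report the threshold instead.
        return "P<0.001"
    if p < 0.01:
        # Assumption: keep three decimals so small values do not round to 0.00.
        return f"P={p:.3f}"
    return f"P={p:.2f}"


print(format_p(0.0000004))  # P<0.001
print(format_p(0.41))       # P=0.41
print(format_p(0.004))      # P=0.004
```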

When reporting outcome results, the results of the primary outcomes must be reported first, followed by those of the secondary outcomes and any subgroup analyses. Post-hoc analyses that had not been pre-specified must not be reported.

What can be omitted? Most statistical software packages churn out a plethora of outputs during analysis. Typically, the outputs include the test statistic (e.g., chi-square statistic, t-statistic, F-statistic), the P-value, and the degrees of freedom; all of these can be omitted, including P-values (although some reviewers would insist on the reporting of P-values). In manuscripts reporting randomized controlled trials, P-values should not be reported when comparing baseline variables/characteristics. Similarly, regression analysis produces multiple outputs, which may include coefficients, R², standard errors and P-values. It is best to avoid reporting these in the results unless they serve a useful interpretive function. It is also best not to include complex statistical formulas [10].

Study Design-specific Statistical Results

Observational studies: For these, provide unadjusted estimates and, if applicable, confounder-adjusted estimates and their precision (e.g., 95% confidence interval). Make clear which confounders were adjusted for and why they were included. Report category boundaries when continuous variables were categorized [11]. If relevant, consider translating estimates of relative risk into absolute risk for a meaningful time period.
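For a simple two-by-two table, an unadjusted estimate and its precision could be obtained as in the Python sketch below, which computes a crude odds ratio with a 95% CI on the log scale (Woolf's method). The function name, argument layout and counts are hypothetical; confounder-adjusted estimates would need a regression model, which is beyond this sketch.

```python
import math
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.975)  # two-sided 95% critical value


def odds_ratio_ci(a, b, c, d):
    """Crude odds ratio and 95% CI for a 2x2 table.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this method")
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log(OR)
    low = math.exp(math.log(odds_ratio) - Z95 * se_log)
    high = math.exp(math.log(odds_ratio) + Z95 * se_log)
    return odds_ratio, (low, high)


# Hypothetical counts: 20/100 exposed and 10/100 unexposed infants had the outcome.
or_, (low, high) = odds_ratio_ci(20, 80, 10, 90)
print(f"Unadjusted OR = {or_:.2f} (95% CI {low:.2f} to {high:.2f})")
```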

Randomized controlled trials: For these, provide results for each primary and secondary outcome, and the estimated effect size and its precision (such as 95% confidence interval). For binary outcomes, presentation of both absolute and relative effect sizes is recommended [12].
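As a sketch of what reporting both absolute and relative effect sizes for a binary outcome can look like, the Python function below returns the risk difference and the risk ratio, each with a 95% CI. The function name and trial counts are hypothetical, and the intervals rely on large-sample normal approximations (an assumption of this sketch).

```python
import math
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.975)  # two-sided 95% critical value


def binary_effects(events_t, n_t, events_c, n_c):
    """Absolute (risk difference) and relative (risk ratio) effects with 95% CIs."""
    if events_t <= 0 or events_c <= 0:
        raise ValueError("the risk ratio CI needs at least one event per arm")
    risk_t = events_t / n_t
    risk_c = events_c / n_c

    # Absolute effect: risk difference with a Wald interval.
    rd = risk_t - risk_c
    se_rd = math.sqrt(risk_t * (1 - risk_t) / n_t + risk_c * (1 - risk_c) / n_c)
    rd_ci = (rd - Z95 * se_rd, rd + Z95 * se_rd)

    # Relative effect: risk ratio with an interval built on the log scale.
    rr = risk_t / risk_c
    se_log_rr = math.sqrt(1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c)
    rr_ci = (math.exp(math.log(rr) - Z95 * se_log_rr),
             math.exp(math.log(rr) + Z95 * se_log_rr))
    return {"rd": rd, "rd_ci": rd_ci, "rr": rr, "rr_ci": rr_ci}


# Hypothetical trial: 15/100 events with treatment versus 30/100 with control.
res = binary_effects(15, 100, 30, 100)
print(f"Risk difference = {res['rd']:.2f} "
      f"(95% CI {res['rd_ci'][0]:.2f} to {res['rd_ci'][1]:.2f})")
print(f"Risk ratio = {res['rr']:.2f} "
      f"(95% CI {res['rr_ci'][0]:.2f} to {res['rr_ci'][1]:.2f})")
```

Presenting the two measures side by side lets the reader judge both the size of the benefit in absolute terms and its size relative to the control risk.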

When reporting correlation, identify the type of correlation coefficient being reported: Pearson or Spearman. Report the 95% CI and the P-value. Regression analysis is best presented in a tabular format.
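For the Pearson coefficient, a 95% CI can be obtained with Fisher's z-transformation, as in the minimal Python sketch below. The function name and data are hypothetical; the sketch covers only the Pearson coefficient and its CI (not Spearman, and not the P-value), and it assumes roughly bivariate normal data with at least four pairs.

```python
import math
from statistics import NormalDist, mean

Z95 = NormalDist().inv_cdf(0.975)  # two-sided 95% critical value


def pearson_r_ci(x, y):
    """Pearson correlation coefficient with a 95% CI via Fisher's z."""
    n = len(x)
    if n != len(y) or n < 4:
        raise ValueError("need paired data with at least 4 observations")
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)
    z = math.atanh(r)          # Fisher z-transformation (fails if r is exactly +/-1)
    se = 1 / math.sqrt(n - 3)  # approximate standard error of z
    return r, (math.tanh(z - Z95 * se), math.tanh(z + Z95 * se))


# Hypothetical paired measurements.
r, (low, high) = pearson_r_ci([1, 2, 3, 4, 5, 6], [2, 1, 4, 3, 6, 5])
print(f"Pearson r = {r:.2f} (95% CI {low:.2f} to {high:.2f})")
```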

Conclusions

Reporting statistics in biomedical research literature has become more transparent, uniform and reliable. The medical researcher should report relevant statistics and provide meaningful interpretations in their research manuscripts. P-values should not be reported alone; if reported at all, they should be accompanied by effect sizes and their confidence intervals. The test statistic and degrees of freedom can usually be omitted. Both the summary statistics and the inferential statistics should be given careful consideration while reporting. Specific statistical tests require specific components to be reported. Various guidelines are now available to aid the medical researcher in reporting methods. It is important that the specific journal guidelines regarding reporting of statistics are strictly followed when preparing and sending a manuscript for publication.

References

1. Hassan S, Yellur R, Subramani P, Adiga P, Gokhale M, Iyer MS, et al. Research design and statistical methods in Indian medical journals: a retrospective survey. PLoS One. 2015;10: e0121268.

2. Jaykaran, Preeti Y. Quality of reporting statistics in two Indian pharmacology journals. J Pharmacol Pharmacother. 2011;2:85-9.

3. Horton NJ, Switzer SS. Statistical methods in the journal. N Engl J Med. 2005;353:1977-9.

4. International Committee of Medical Journal Editors. Recommendations for the Conduct, Reporting, Editing and Publication of Scholarly Work in Medical Journals. Available from: http://www.icmje.org/news-and-editorials/icmje-recommendations_annotated_dec15.pdf. Accessed August 18, 2016.

5. Indian Pediatrics. Statistics, Instruction to Authors. Available from: http://indianpediatrics.net/author1.htm#Statistics. Accessed July 16, 2016.

6. Enhancing the Quality and Transparency of Health Research (EQUATOR) Guidelines. Available from: http://www.equator-network.org/ Accessed July 16, 2016.

7. Hypothesis testing and estimation. In: Peat J, Barton B, Elliott E, editors. Statistics Workbook for Evidence-based Health Care. 1st ed. West Sussex, UK: John Wiley & Sons; 2009.

8. Namiiro FB, Mugalu J, McAdams RM, Ndeezi G. Poor birth weight recovery among low birth weight/preterm infants following hospital discharge in Kampala, Uganda. BMC Pregnancy Childbirth. 2012;12:1.

9. Haruhiko F, Yasuo O. A Guideline for Reporting Results of Statistical Analysis in Japanese Journal of Clinical Oncology. Jpn J Clin Oncol. 1997;27:21-7.

10. Lang TA, Altman DG. Basic statistical reporting for articles published in clinical medical journals: the SAMPL Guidelines. In: Smart P, Maisonneuve H, Polderman A (eds). Science Editors’ Handbook, European Association of Science Editors, 2013.

11. von Elm E, Altman DG, Egger M, Pocock SJ, Gotzsche PC, Vandenbroucke JP. The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) Statement: guidelines for reporting observational studies. Ann Intern Med. 2007;147:573-7.

12. Schulz KF, Altman DG, Moher D, for the CONSORT Group. CONSORT 2010 Statement: updated guidelines for reporting parallel group randomised trials. Ann Intern Med. 2010;152:726-32.