It has been a popular method to convey results

If you have come across a research article in the biomedical field, there is a high chance that you will see the letter ‘P’, followed by an equality or inequality sign and then a number between 0 and 1 inclusive, whether you read the article thoroughly or simply search for it with ‘Control-F’. This is known in statistics as the P-value.

Statistical testing has been a popular method for conveying study results in biomedicine and many other research fields. The most common way of reporting statistical findings is the P-value, which has provided the basis of evidence for findings across much of the biomedical literature. However, has the use of P-values actually made results more reliable? Are we quoting and using them correctly?

The P-value, which ranges between 0 and 1 inclusive, indicates whether the result of a study provides evidence in favour of the hypothesis the study sets out to test. A value closer to zero indicates high statistical significance, meaning the result can serve as evidence to support the hypothesis. A value closer to one indicates low statistical significance, meaning the result should not be used as evidence to support the hypothesis. There is also a threshold value that serves as a reference for deciding whether to reject or fail to reject the null hypothesis. This threshold is usually set at 0.05; a P-value below it is considered statistically significant, and the result can then be used as evidence to support the hypothesis.
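To make this concrete, here is a minimal sketch of how a P-value is typically computed and compared against the 0.05 threshold. It assumes Python with NumPy and SciPy, and the two groups of data are simulated purely for illustration; nothing here comes from any particular study.

```python
import numpy as np
from scipy import stats

# Simulated data, purely for illustration: outcomes for a control
# and a treatment group drawn from normal distributions.
rng = np.random.default_rng(0)
control = rng.normal(loc=10.0, scale=2.0, size=50)
treatment = rng.normal(loc=11.0, scale=2.0, size=50)

# Two-sample t-test: the null hypothesis is that the two group means
# are equal. The P-value is the probability of seeing a test statistic
# at least this extreme if the null hypothesis were true.
t_stat, p_value = stats.ttest_ind(treatment, control)

alpha = 0.05  # conventional significance threshold
print(f"t = {t_stat:.3f}, P = {p_value:.4f}")
if p_value < alpha:
    print("Reject the null hypothesis (statistically significant).")
else:
    print("Fail to reject the null hypothesis (not significant).")
```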

While the use of P-values in research articles has become increasingly common over the years, so has the bias of selecting and highlighting results that are statistically significant while omitting the ‘negative’ results that are inconvenient to the study, especially in abstracts, as reported in the research article “Evolution of Reporting P Values in the Biomedical Literature, 1990-2015” by David Chavalarias, Joshua David Wallach, Alvin Ho Ting Li et al.

The article notes a ‘strong clustering around some specific rounded P values, most commonly P-values of .05 and of .001 or smaller’, suggesting that, among the texts analysed, many that cited P-values as statistical evidence reported only those values that supported the study’s purpose. It also reports that ‘more “strongly” significant results (ie, P values of .001 or smaller) were reported more commonly in abstracts than in the PMC full-text articles’, indicating a deliberate use of P-values to attract readers by presenting the best result. The abstract, which is where most readers start and often all they read, therefore gives a somewhat distorted picture of the evidence. So is it still reliable to depend on P-values?

Yet this does not mean we should abandon P-values completely; they remain a useful and important tool for presenting statistical findings. Rather, they should be complemented by other approaches, such as Bayesian methods, effect sizes and measures of uncertainty, to give a fuller understanding of the results (a small illustration of this kind of reporting follows below). ‘Negative’ results should also be included, and results should be communicated without selective emphasis. So, the next time you read a research article, do not accept every number just because it is small enough.
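As a purely illustrative sketch of reporting an effect size and a measure of uncertainty alongside the P-value, here is one way to compute Cohen’s d and a 95% confidence interval for the difference in means. It again assumes Python with NumPy and SciPy, and the data are simulated; the confidence interval is offered only as one common example of an uncertainty measure.

```python
import numpy as np
from scipy import stats

# Simulated data again, purely for illustration.
rng = np.random.default_rng(1)
control = rng.normal(loc=10.0, scale=2.0, size=50)
treatment = rng.normal(loc=11.0, scale=2.0, size=50)

# P-value from a two-sample t-test, as before.
t_stat, p_value = stats.ttest_ind(treatment, control)

# Effect size: Cohen's d, the mean difference expressed in units of
# the pooled standard deviation.
n1, n2 = len(treatment), len(control)
pooled_sd = np.sqrt(((n1 - 1) * treatment.var(ddof=1) +
                     (n2 - 1) * control.var(ddof=1)) / (n1 + n2 - 2))
cohens_d = (treatment.mean() - control.mean()) / pooled_sd

# 95% confidence interval for the difference in means, based on the
# t distribution with pooled degrees of freedom.
diff = treatment.mean() - control.mean()
se = pooled_sd * np.sqrt(1 / n1 + 1 / n2)
t_crit = stats.t.ppf(0.975, df=n1 + n2 - 2)
ci_low, ci_high = diff - t_crit * se, diff + t_crit * se

print(f"P = {p_value:.4f}, d = {cohens_d:.2f}, "
      f"95% CI for the difference: [{ci_low:.2f}, {ci_high:.2f}]")
```

Reporting the effect size and interval together with the P-value lets a reader judge not just whether an effect is ‘significant’, but how large it is and how precisely it has been estimated.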