The “P” in p-value stands for “probability”: how likely it is that you would see data like yours from random sampling alone, with no actual difference behind it. So if the p-value is small, that's good news in the sense that the result of your experiment is unlikely to be just a fluke of sampling.
P values are used in hypothesis testing to help decide whether to reject the null hypothesis. The smaller the p value, the more likely you are to reject the null hypothesis.
If the p-value is less than 0.05, the result is judged “significant,” and if the p-value is greater than or equal to 0.05, it is judged “not significant.” However, since the significance level is set by the researcher according to the circumstances of each study, it does not necessarily have to be 0.05.
A p-value > 0.05 would be interpreted by many as "not statistically significant," meaning that there was not sufficiently strong evidence to reject the null hypothesis and conclude that the groups are different. This does not mean that the groups are the same.
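To make the decision rule concrete, here is a minimal Python sketch (the data, group labels, seed, and the 0.05 threshold are illustrative assumptions, not taken from the text above):

```python
# Minimal sketch: run a two-sample t-test and compare the p-value to a
# preselected significance level alpha. All data here are made up.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
group_a = rng.normal(loc=10.0, scale=2.0, size=30)   # hypothetical "control" measurements
group_b = rng.normal(loc=11.0, scale=2.0, size=30)   # hypothetical "treatment" measurements

alpha = 0.05                                          # chosen by the researcher, not fixed at 0.05
t_stat, p_value = stats.ttest_ind(group_a, group_b)

if p_value < alpha:
    print(f"p = {p_value:.4f} < {alpha}: reject the null hypothesis")
else:
    print(f"p = {p_value:.4f} >= {alpha}: fail to reject the null "
          "(this does not prove the groups are the same)")
```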
Is a large or small p-value good?
The lower the p-value, the greater the statistical significance of the observed difference. A p-value of 0.05 or lower is generally considered statistically significant. The p-value can be reported in addition to, or instead of, a preselected significance level for hypothesis testing.
A P-value less than 0.05 is conventionally treated as statistically significant, while a value higher than 0.05 means the evidence is not strong enough to reject the null hypothesis; it does not show that the null hypothesis is true, so the result is simply not statistically significant.
In reality, a p value can never be exactly zero. Any data collected for a study are subject to at least some chance (random) error, so for any set of data the computed p value will be greater than zero. It can, however, be very small in some cases.
A small P-value signifies that the evidence in favour of the null hypothesis is weak: differences as large as those observed would rarely arise by chance alone if the null hypothesis were true, so the data cast doubt on it.
How do I interpret P values? If the P value is less than your chosen significance level (alpha), you reject the null hypothesis. If it is equal to or greater than that level, you fail to reject the null hypothesis. Keep in mind, smaller is "better" when it comes to interpreting P values for significance.
A p-value of 0.99 means the data look almost exactly as the null hypothesis would predict: the sample shows essentially no sign of an effect, association, or correlation between the two variables beyond ordinary sampling variation. It does not prove that no effect exists; it only means the data provide no evidence of one.
In an extremely high-powered experiment (e.g., 99% power) the p-value will be smaller than 0.01 in approximately 96% of the tests, and between 0.01 and 0.05 in only 3.5% of the tests. In general, the higher the statistical power of a test, the less likely it is to observe relatively high p-values (e.g., p > 0.02).
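A rough simulation makes this concrete. The Python sketch below (sample sizes, effect size, and seed are assumptions chosen for the demonstration, not from any particular study) repeatedly runs a two-sample t-test on data with a real underlying difference and tallies where the p-values fall:

```python
# Illustrative simulation: with a true effect and reasonably large samples
# (high power), most p-values land well below 0.01.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_sims, n = 10_000, 150
p_values = np.empty(n_sims)
for i in range(n_sims):
    a = rng.normal(0.0, 1.0, n)          # draws with no shift
    b = rng.normal(0.5, 1.0, n)          # true standardized effect of 0.5
    p_values[i] = stats.ttest_ind(a, b).pvalue

print("share of p < 0.01:        ", np.mean(p_values < 0.01))
print("share of 0.01 <= p < 0.05:", np.mean((p_values >= 0.01) & (p_values < 0.05)))
```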
The p-value is less than or equal to alpha. In this case, we reject the null hypothesis. When this happens, we say that the result is statistically significant. In other words, we are reasonably sure that there is something besides chance alone that gave us an observed sample.
A reported P-value of 0 can mean either or both of the P-value being (1) too small to calculate or (2) smaller than the reported resolution. In Stata, for example, a P reported as 0.000 just means <0.0005 (and further decimal places can usually be retrieved with some effort).
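A quick sketch of that rounding behaviour (Python with SciPy, purely illustrative):

```python
# Why "p = 0.000" usually means "p < 0.0005" rather than exactly zero:
# the underlying value is tiny but positive; only the printed resolution hides it.
from scipy import stats

p = stats.norm.sf(6.0) * 2       # two-sided p-value for a z-statistic of 6
print(f"{p:.3f}")                # prints 0.000 at three decimal places
print(f"{p:.2e}")                # roughly 2e-09: very small, but not zero
```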
What if the p-value is smaller than the critical value?
When the test statistic exceeds the critical value, we reject the null hypothesis. The p-value could be less than 0.05 and the test statistic could still be below the critical value; that would mean the chosen α was smaller than 0.05, and we would fail to reject the null.
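The two decision rules line up whenever they use the same α, as this small Python sketch with made-up data shows:

```python
# Two equivalent decision rules for a two-sided t-test: compare the p-value
# to alpha, or compare |t| to the critical value for that same alpha.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
a = rng.normal(0.0, 1.0, 40)              # invented data for illustration
b = rng.normal(0.4, 1.0, 40)

alpha = 0.05
res = stats.ttest_ind(a, b)
df = len(a) + len(b) - 2
t_crit = stats.t.ppf(1 - alpha / 2, df)   # two-sided critical value at this alpha

print("reject by p-value:       ", res.pvalue < alpha)
print("reject by test statistic:", abs(res.statistic) > t_crit)
```

With the same α on both sides, the two checks always give the same verdict; a disagreement only appears if the critical value was computed for a different α than the one the p-value is compared against.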
The lower the p-value is, the lower the probability of getting that result if the null hypothesis were true. A result is said to be statistically significant if it allows us to reject the null hypothesis. All other things being equal, smaller p-values are taken as stronger evidence against the null hypothesis.
And although 0.05 or below is generally regarded as the threshold for significant results, that doesn't always mean a test result falling between 0.05 and 0.1 isn't worth looking at. It just means that the evidence against the null hypothesis is weak.
When there is a meaningful null hypothesis, the strength of evidence against it should be indexed by the P value. The smaller the P value, the stronger is the evidence.
If the p-value is less than 0.05, we reject the null hypothesis that there's no difference between the means and conclude that a significant difference does exist. If the p-value is larger than 0.05, we cannot conclude that a significant difference exists. That's pretty straightforward, right? Below 0.05, significant.
A P-value of 0.01 means that, assuming the postulated null hypothesis is correct, a difference as large as the one observed (or an even bigger, “more extreme” difference) would occur in 1 in 100 (1%) of the times the study was repeated.
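One way to see this interpretation in action is a permutation test: under the null hypothesis the group labels are exchangeable, so the p-value is approximately the share of relabelled datasets whose mean difference is at least as extreme as the observed one. A Python sketch with invented data:

```python
# Permutation illustration of "how often would a difference this extreme
# occur if the null hypothesis were true". Data and seed are made up.
import numpy as np

rng = np.random.default_rng(3)
a = rng.normal(0.0, 1.0, 25)
b = rng.normal(0.8, 1.0, 25)
observed = abs(a.mean() - b.mean())

pooled = np.concatenate([a, b])
n_perm, count = 10_000, 0
for _ in range(n_perm):
    rng.shuffle(pooled)                               # relabel under the null
    diff = abs(pooled[:25].mean() - pooled[25:].mean())
    count += diff >= observed                         # at least as extreme as observed

print("permutation p-value:", count / n_perm)
```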
It is inappropriate to interpret a p value of, say, 0.06, as a trend towards a difference. A p value of 0.06 means there is a 6% probability of obtaining that result, or a more extreme one, by chance when the treatment has no real effect. Because we set the significance level at 5%, the null hypothesis should not be rejected.
A small p-value suggests the result is larger than chance alone would produce, i.e., something happened; the test is significant. A large p-value indicates the result is within the range of chance or normal sampling error, i.e., nothing detectable happened; the test is not significant. P values range from 0 to 1.
P-values only indicate whether an observed effect is statistically significant, but do not provide any information about the size of the effect. In contrast, effect sizes provide a measure of the practical or clinical significance of an effect, which can be more relevant for making medical decisions.
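A short Python sketch of that contrast (sample sizes and the tiny true difference are assumptions chosen to make the point): with a very large sample, even a negligible effect can produce a small p-value, while the effect size stays close to zero.

```python
# Statistical significance vs. effect size: a tiny difference becomes
# "significant" with enough data, but Cohen's d shows it is negligible.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n = 200_000
a = rng.normal(0.00, 1.0, n)
b = rng.normal(0.02, 1.0, n)             # tiny true difference

res = stats.ttest_ind(a, b)
pooled_sd = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
cohens_d = (b.mean() - a.mean()) / pooled_sd

print(f"p-value:   {res.pvalue:.4g}")    # very likely "significant"
print(f"Cohen's d: {cohens_d:.3f}")      # but the effect is tiny
```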
If p=0.051, then the result is “insignificant”, a mere “trend” in the data which is easily dismissed. However, if one data point is added causing a drop to p=0.049 then the result is suddenly, magically significant.