 AIPMED

## P-VALUES AND CONFIDENCE INTERVALS – FACTS AND FARCES

P values and confidence intervals are reported in almost all scientific writing and are used in interpreting the results of statistical analysis. Medical researchers and other investigators commonly ask questions such as ‘Is the result significant?’ or ‘What is the p value?’ Many clinicians worry when they carry out a statistical analysis and find no significant results. This article describes some facts about the p value and confidence intervals.

The reporting of p values and confidence intervals usually follows hypothesis testing or significance testing. Most scientific investigations involve the testing of hypotheses: formal procedures for deciding whether findings from an investigation are compatible with a so-called null hypothesis. Hypotheses are statements concerning the situation being investigated, usually framed as two mutually exclusive options: a null hypothesis and an alternative hypothesis. The null hypothesis states that there is no association between variables or no difference in group means, while the alternative hypothesis states that there is a difference or an association. The interests of medical researchers are varied, and research questions lead to corresponding hypotheses. Examples of such questions are: Is there a difference in the proportion of low birth weight babies delivered to mothers with singleton and multiple pregnancies? Is there a difference in the effects of three antiretroviral drugs on reduction in viral load? Is there a correlation between body mass index and systolic blood pressure? Is there a difference in the reduction in blood sugar between a standard hypoglycemic drug and a new drug? The null hypothesis for the third question would be ‘There is no correlation between body mass index and systolic blood pressure’. The use and interpretation of p values and confidence intervals will now be discussed.
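To make the logic of testing a null hypothesis concrete, the sketch below runs a simple permutation test for the last example question (standard versus new hypoglycemic drug). Under the null hypothesis that the two drugs produce the same reduction in blood sugar, group labels can be shuffled freely, and the p value is the fraction of shuffles that produce a difference in means at least as large as the one observed. The patient values are purely illustrative, not real trial data.

```python
import random

def permutation_test(group_a, group_b, n_perm=10_000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Under the null hypothesis the two groups come from the same
    distribution, so the labels can be reshuffled at random."""
    rng = random.Random(seed)
    mean = lambda xs: sum(xs) / len(xs)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:  # at least as extreme as the observed result
            extreme += 1
    return extreme / n_perm

# Hypothetical reductions in blood sugar (mmol/L) -- illustrative only.
standard = [1.1, 0.9, 1.3, 1.0, 0.8, 1.2, 1.1, 0.9]
new_drug = [1.6, 1.4, 1.8, 1.5, 1.7, 1.3, 1.6, 1.5]
p = permutation_test(standard, new_drug)
print(f"p = {p:.4f}")
```

With these illustrative numbers the shuffled differences almost never reach the observed one, so the p value falls well below the conventional 0.05 level and the null hypothesis would be rejected.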

There are different definitions of the ‘p value’. Perhaps the most popular is ‘the probability of obtaining a value as extreme as, or more extreme than, the one found in the study if the null hypothesis were true’.1 Put more simply, it can be defined as the probability that the observed result is due to chance alone.2 An important point to note in these definitions is the use of the phrases ‘found in the study’ and ‘observed result’. The p value only tells us whether what we have observed – which is usually obvious – is statistically significant. For example, in a study examining the difference in the prevalence of low birth weight deliveries between singleton and multiple pregnancies, the prevalence might have been 12.5% for multiple pregnancies and 3.6% for singletons. No amount of statistical jargon about p values and confidence intervals negates the fact that the proportion of low birth weight babies delivered to mothers with multiple pregnancies is higher than the proportion for singletons. Hence the initial descriptive statistics used to summarize variables, such as proportions and means, already give us an idea of the results of our study; the p value only helps to ‘endorse’ their statistical significance. The interpretation of p values is based on reference to a particular cut-off for the probability, the so-called level of significance, which is conventionally set at 0.05. Hence p values less than this number are deemed significant while those above it are not.
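The low birth weight example can be sketched as a two-proportion z-test, a standard way to obtain a p value for comparing two prevalences. The article quotes only the percentages (12.5% and 3.6%), so the sample sizes below (200 multiple pregnancies, 1000 singletons) are hypothetical assumptions chosen to match those figures; the normal cumulative distribution is computed from the standard library's `math.erf`.

```python
import math

def two_proportion_p_value(x1, n1, x2, n2):
    """Two-sided p-value for H0: p1 == p2, using a pooled z-test
    and the normal approximation."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Standard normal CDF via math.erf, then a two-sided tail probability
    cdf = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    return 2 * (1 - cdf)

# Hypothetical counts matching the quoted prevalences:
# 25/200 = 12.5% (multiple pregnancies), 36/1000 = 3.6% (singletons)
p = two_proportion_p_value(25, 200, 36, 1000)
print(f"p = {p:.6f}")
print("significant at 0.05" if p < 0.05 else "not significant at 0.05")
```

With these assumed sample sizes the p value is far below 0.05, which ‘endorses’ what the raw proportions already suggest: the prevalence of low birth weight is higher in multiple pregnancies.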