Fisher's Exact Test

Introduction

Fisher's Exact Test is a statistical significance test used in the analysis of contingency tables. Although in practice it is employed mainly when sample sizes are small, it is valid for all sample sizes. It is named after its inventor, Ronald Fisher, and belongs to a class of exact tests, so called because the significance of the deviation from a null hypothesis (e.g., the p-value) can be calculated exactly, rather than through an approximation that becomes exact only in the limit as the sample size grows to infinity, as in many other statistical tests.


History and Development

Fisher's Exact Test was first published by Ronald Fisher in 1922. Fisher is said to have devised the test after a colleague, Dr. Muriel Bristol, claimed that she could tell whether the tea or the milk had been added first to a cup. Testing that claim, in what became known as the lady tasting tea problem, led Fisher to develop the test as a rigorous way to analyze small data sets.

Mathematical Formulation

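Fisher's Exact Test is based on the hypergeometric distribution. Consider a 2 x 2 contingency table with cell counts a, b, c, and d:

                  Column 1    Column 2    Row total
    Row 1         a           b           a + b
    Row 2         c           d           c + d
    Column total  a + c       b + d       N

where N = a + b + c + d.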

With the row and column totals held fixed, the probability of obtaining any such arrangement of frequencies in a contingency table under the null hypothesis of independence is given by the hypergeometric distribution:

P = [(a+b)! (c+d)! (a+c)! (b+d)!] / [a! b! c! d! N!]

where "!" denotes the factorial function.
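
As a rough illustration of the formula above, the following Python sketch evaluates P for a single 2 x 2 table. The function names and the example counts are assumptions made here for illustration. The binomial-coefficient version is algebraically the same expression as the factorial ratio, and math.comb requires Python 3.8 or later.

```python
from math import comb, factorial

def table_probability(a, b, c, d):
    """Hypergeometric probability of one 2 x 2 table with fixed margins."""
    n = a + b + c + d
    # C(a+b, a) * C(c+d, c) / C(N, a+c) is the same quantity as
    # (a+b)! (c+d)! (a+c)! (b+d)! / (a! b! c! d! N!)
    return comb(a + b, a) * comb(c + d, c) / comb(n, a + c)

def table_probability_factorial(a, b, c, d):
    """The same probability, written literally as the factorial ratio."""
    n = a + b + c + d
    numerator = (factorial(a + b) * factorial(c + d)
                 * factorial(a + c) * factorial(b + d))
    denominator = (factorial(a) * factorial(b) * factorial(c)
                   * factorial(d) * factorial(n))
    return numerator / denominator

# Hypothetical counts: a=3, b=1, c=1, d=3, so N = 8.
p = table_probability(3, 1, 1, 3)
print(p)  # 16/70, roughly 0.2286
```

Both functions should agree up to floating-point rounding; the binomial form simply keeps the intermediate integers smaller.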

Assumptions and Requirements

Before applying Fisher's Exact Test, certain assumptions and requirements must be met. These include:

1. The data must be categorical (i.e., each observation belongs to one of a set of distinct categories).
2. The sampling method must be random.
3. The cells of the contingency table must be mutually exclusive (i.e., no individual can be counted in more than one cell).
4. The data must be in the form of frequencies (counts), not percentages or transformed data.
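
Some of these requirements can be checked mechanically before running the test. The sketch below is a hypothetical helper (the name check_counts is an assumption, not part of any library) that only verifies the table shape and the frequency requirement; it cannot tell whether the sampling was random or whether the categories are truly mutually exclusive.

```python
def check_counts(table):
    """Raise ValueError unless `table` is a 2 x 2 grid of non-negative integer counts."""
    if len(table) != 2 or any(len(row) != 2 for row in table):
        raise ValueError("expected a 2 x 2 table")
    for row in table:
        for cell in row:
            # Percentages or rescaled values typically show up as floats.
            if isinstance(cell, bool) or not isinstance(cell, int):
                raise ValueError(f"cell {cell!r} is not an integer count")
            if cell < 0:
                raise ValueError(f"cell {cell!r} is negative")
    if sum(sum(row) for row in table) == 0:
        raise ValueError("table contains no observations")
    return table

check_counts([[3, 1], [1, 3]])            # passes silently
# check_counts([[0.75, 0.25], [0.25, 0.75]])  # would raise ValueError
```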

Application and Use Cases

Fisher's Exact Test is widely used in fields such as medicine, psychology, marketing, and sociology. In medicine, for instance, it is often used in clinical trials to compare treatment responses between two groups when the sample size is small. In psychology, it may be used to test the association between two categorical variables.
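
To make the clinical-trial use case concrete, here is a minimal pure-Python sketch of a two-sided test on a 2 x 2 table. The counts are invented for illustration, and the function name is an assumption. The sketch uses one common convention for the two-sided p-value: it adds up the probabilities of every table with the observed row and column totals that is no more probable than the observed table. Other conventions exist, so results may differ slightly from a given statistics package.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided p-value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    # Values the top-left cell can take while the margins stay fixed.
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)

    def weight(x):
        # Numerator of the hypergeometric probability; every table shares
        # the denominator C(n, col1), so integer weights compare exactly.
        return comb(row1, x) * comb(row2, col1 - x)

    observed = weight(a)
    extreme = sum(weight(x) for x in range(lowest, highest + 1)
                  if weight(x) <= observed)
    return extreme / comb(n, col1)

# Hypothetical trial: 7 of 10 treated patients respond, 2 of 10 controls do.
p_value = fisher_exact_two_sided(7, 3, 2, 8)
print(p_value)  # 11720/167960, roughly 0.07
```

Comparing the integer numerators rather than floating-point probabilities avoids rounding problems when two tables are exactly equally likely.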

Limitations and Considerations

While Fisher's Exact Test is a powerful tool, it is not without limitations. The test is exact for any sample size, but it is most useful for small samples, where large-sample approximations such as the chi-squared test can be unreliable. With large counts or with tables larger than 2 x 2, it can become computationally intensive. Additionally, like all statistical tests, it can give misleading results when its assumptions are violated.
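
One common way to keep the arithmetic manageable when cell counts are large is to work with logarithms of factorials instead of the factorials themselves. The sketch below is an illustration under that assumption, with hypothetical function names and counts. It addresses large counts in a 2 x 2 table only, not the cost of enumerating tables with more rows or columns.

```python
from math import exp, lgamma

def log_factorial(n):
    """log(n!) via the log-gamma function, using n! = gamma(n + 1)."""
    return lgamma(n + 1)

def log_table_probability(a, b, c, d):
    """Natural log of the hypergeometric probability of a 2 x 2 table."""
    n = a + b + c + d
    return (log_factorial(a + b) + log_factorial(c + d)
            + log_factorial(a + c) + log_factorial(b + d)
            - log_factorial(a) - log_factorial(b)
            - log_factorial(c) - log_factorial(d)
            - log_factorial(n))

# Small table: exponentiating recovers the ordinary probability (about 16/70).
print(exp(log_table_probability(3, 1, 1, 3)))

# Hypothetical large counts: the probability is far too small for a float,
# but its logarithm is still an ordinary number.
log_p = log_table_probability(30000, 20000, 20000, 30000)
print(log_p)
```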