Hypergeometric distribution
Latest revision as of 20:46, 8 October 2024
Introduction
The hypergeometric distribution is a discrete probability distribution that gives the probability of k successes in n draws, made without replacement, from a finite population of size N that contains exactly K successes. It matters most when the sample is a sizable fraction of the population, because each draw then noticeably changes the composition of what remains. This distinguishes it from the binomial distribution, which models independent draws with a constant success probability (equivalently, sampling with replacement).
Definition
Formally, the probability mass function (PMF) of the hypergeometric distribution is given by:
\[ P(X = k) = \frac{\binom{K}{k} \binom{N-K}{n-k}}{\binom{N}{n}} \]
where:
- \( \binom{a}{b} \) denotes a binomial coefficient, which is the number of ways to choose b elements from a set of a elements.
- \( N \) is the population size.
- \( K \) is the number of success states in the population.
- \( n \) is the number of draws.
- \( k \) is the number of observed successes.
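As an illustration, the PMF can be evaluated directly with exact integer binomial coefficients. The following is a minimal sketch; the function name `hypergeom_pmf` and the parameters in the usage line are chosen here for illustration only.

```python
from math import comb


def hypergeom_pmf(k: int, N: int, K: int, n: int) -> float:
    """P(X = k): k successes in n draws from N items, K of which are successes."""
    # Values of k that cannot occur have probability zero.
    if k < max(0, n + K - N) or k > min(K, n):
        return 0.0
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


# Illustrative parameters: 10 items, 3 successes, 4 draws, exactly 1 success.
print(hypergeom_pmf(1, N=10, K=3, n=4))  # 0.5
```

Here the result is \( \binom{3}{1} \binom{7}{3} / \binom{10}{4} = 105 / 210 \).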
Properties
Mean and Variance
The mean \( \mu \) and variance \( \sigma^2 \) of a hypergeometric distribution are given by:
\[ \mu = n \frac{K}{N} \]
\[ \sigma^2 = n \frac{K}{N} \left(1 - \frac{K}{N}\right) \frac{N - n}{N - 1} \]
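The mean is the same as that of a binomial distribution with success probability \( K/N \). The variance is the corresponding binomial variance multiplied by the finite population correction \( \frac{N - n}{N - 1} \), which accounts for sampling without replacement: the factor equals 1 when \( n = 1 \) and falls to 0 when \( n = N \), since drawing the entire population leaves no uncertainty.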
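The closed-form mean and variance can be checked numerically by summing over the PMF. The sketch below does this for one set of parameters, which are hypothetical and chosen only for illustration.

```python
from math import comb, isclose

N, K, n = 50, 12, 10  # hypothetical population, successes, and draws

# Build the PMF over every value of k that can occur.
lo, hi = max(0, n + K - N), min(K, n)
pmf = {k: comb(K, k) * comb(N - K, n - k) / comb(N, n) for k in range(lo, hi + 1)}

# Moments computed from the definition.
mean = sum(k * p for k, p in pmf.items())
var = sum((k - mean) ** 2 * p for k, p in pmf.items())

# Moments from the closed-form expressions.
mean_formula = n * K / N
var_formula = n * (K / N) * (1 - K / N) * (N - n) / (N - 1)

print(isclose(mean, mean_formula), isclose(var, var_formula))
```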
Support
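The support of the hypergeometric distribution is the set of integers \( k \) for which the PMF is non-zero. Specifically, \( k \) can take values from \( \max(0, n + K - N) \) to \( \min(K, n) \). The lower bound is positive only when the number of draws exceeds the \( N - K \) failures in the population, which forces some successes to be drawn. The upper bound reflects that the number of successes cannot exceed either the \( K \) available or the \( n \) draws.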
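A short sketch of the support bounds follows. The helper name `support` and the parameters are illustrative. The final check relies on Vandermonde's identity, which guarantees that the PMF numerators over the support add up to \( \binom{N}{n} \).

```python
from math import comb


def support(N: int, K: int, n: int) -> range:
    """All values of k with non-zero probability."""
    return range(max(0, n + K - N), min(K, n) + 1)


# 10 items, 8 successes, 5 draws: only 2 failures exist,
# so at least 3 of the 5 draws must be successes.
N, K, n = 10, 8, 5
print(list(support(N, K, n)))  # [3, 4, 5]

# The PMF numerators over the support sum to the denominator exactly.
total = sum(comb(K, k) * comb(N - K, n - k) for k in support(N, K, n))
print(total == comb(N, n))  # True
```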
Applications
The hypergeometric distribution is used in several fields, including the following.
Quality Control
In quality control, the hypergeometric distribution can be used to model the number of defective items in a sample drawn from a batch. For example, if a factory produces a batch of 1000 items with 50 defective items, the distribution can help determine the probability of finding a certain number of defective items in a sample of 10.
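The batch described above (1000 items, 50 defective, sample of 10) can be worked through as follows. This is a sketch using exact binomial coefficients; the variable names are chosen here for illustration.

```python
from math import comb

N, K, n = 1000, 50, 10  # batch size, defective items, sample size


def pmf(k: int) -> float:
    """Probability of exactly k defective items in the sample."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


# Probability of each small defect count.
for k in range(4):
    print(k, round(pmf(k), 4))

# Probability that the sample reveals at least one defective item.
p_at_least_one = 1 - pmf(0)
print(round(p_at_least_one, 4))
```

A calculation of this kind indicates how likely a sample of a given size is to miss the defects entirely, which can inform the choice of sample size.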
Ecology
Ecologists use the hypergeometric distribution in capture-recapture studies to estimate population sizes. Researchers capture, tag, and release a number of individuals, and later capture a second sample. If the tagged animals have mixed back into the population, the number of tagged individuals in the second sample follows a hypergeometric distribution in which the total population size N is the unknown quantity to be estimated.
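One way to turn a recapture count into an estimate is to choose the population size that makes the observed count most probable. The sketch below does this by brute-force search. All numbers, including the search ceiling `N_max`, are hypothetical, and the approach assumes a closed population with thorough mixing.

```python
from math import comb


def likelihood(N: int, K: int, n: int, k: int) -> float:
    """Hypergeometric probability of k tagged animals among n recaptured."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


K, n, k = 100, 60, 14  # hypothetical: tagged, recaptured, tagged among recaptured

N_min = K + (n - k)    # the population holds at least every distinct animal seen
N_max = 5000           # hypothetical ceiling for the search

# Maximum-likelihood estimate: the N under which the observation is most probable.
N_hat = max(range(N_min, N_max + 1), key=lambda N: likelihood(N, K, n, k))
print(N_hat)  # 428, which equals floor(K * n / k)
```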
Card Games
In card games, the hypergeometric distribution can be used to calculate the probability of drawing a given number of cards of a particular kind from a deck. For example, in poker it gives the likelihood that a five-card hand contains a certain number of aces or a certain number of cards of one suit.
Calculation Methods
Direct Computation
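Direct computation of the hypergeometric PMF requires three binomial coefficients. For large populations these coefficients become extremely large, so evaluating them naively can be slow or can overflow fixed-width number types. A common remedy is to work with the logarithms of the coefficients, and statistical libraries in many programming languages provide routines for these calculations.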
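A minimal sketch of a log-space evaluation is shown below, using the log-gamma function from Python's standard library. The function names are chosen here for illustration, and the approach trades a small loss of floating-point precision for the ability to handle very large parameters.

```python
from math import exp, lgamma


def log_comb(a: int, b: int) -> float:
    """Natural log of the binomial coefficient C(a, b), via log-gamma."""
    return lgamma(a + 1) - lgamma(b + 1) - lgamma(a - b + 1)


def hypergeom_pmf_log(k: int, N: int, K: int, n: int) -> float:
    """Hypergeometric PMF computed by adding logs instead of multiplying."""
    if k < max(0, n + K - N) or k > min(K, n):
        return 0.0
    log_p = log_comb(K, k) + log_comb(N - K, n - k) - log_comb(N, n)
    return exp(log_p)


# Illustrative large-population call: a million items, 50,000 successes.
print(hypergeom_pmf_log(50, N=10**6, K=5 * 10**4, n=1000))
```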
Approximation
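For large populations, the hypergeometric distribution can be approximated by the binomial distribution or the normal distribution. The binomial approximation, with success probability \( K/N \), is suitable when the sample size is small relative to the population size; a common rule of thumb is a sample of no more than about 5% of the population. The normal approximation, using the mean and variance of the hypergeometric distribution, is used when both the sample size and the number of successes are large.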
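The quality of the binomial approximation can be examined by comparing the two PMFs directly. In the sketch below both cases share the same success proportion (0.05) and sample size; the parameter values and helper names are illustrative.

```python
from math import comb


def hypergeom_pmf(k: int, N: int, K: int, n: int) -> float:
    if k < max(0, n + K - N) or k > min(K, n):
        return 0.0
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def binom_pmf(k: int, n: int, p: float) -> float:
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def max_abs_diff(N: int, K: int, n: int) -> float:
    """Largest pointwise gap between the hypergeometric PMF and its binomial stand-in."""
    p = K / N
    return max(abs(hypergeom_pmf(k, N, K, n) - binom_pmf(k, n, p)) for k in range(n + 1))


diff_large_pop = max_abs_diff(N=1000, K=50, n=10)  # sample is 1% of the population
diff_small_pop = max_abs_diff(N=20, K=1, n=10)     # sample is 50% of the population
print(diff_large_pop, diff_small_pop)
```

The gap is far smaller in the first case, where the sample is a small fraction of the population.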
Relationship to Other Distributions
Binomial Distribution
The hypergeometric distribution is closely related to the binomial distribution. The binomial distribution assumes independent trials, as in sampling with replacement, whereas the hypergeometric distribution deals with dependent trials, as in sampling without replacement. When the population size is large compared to the sample size, removing a few items barely changes the proportion of successes, and the hypergeometric distribution approaches the binomial distribution with success probability \( K/N \).
Negative Hypergeometric Distribution
The negative hypergeometric distribution is a variation where the number of failures is fixed, and the number of draws is random. It describes the probability of drawing a certain number of successes before a specified number of failures occurs.
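The following sketch evaluates the standard PMF of the negative hypergeometric distribution, \( P(X = k) = \binom{k + r - 1}{k} \binom{N - r - k}{K - k} / \binom{N}{K} \), where \( r \) is the fixed number of failures. The symbol \( r \) and the function name are introduced here for illustration and do not appear elsewhere in this article.

```python
from math import comb


def neg_hypergeom_pmf(k: int, N: int, K: int, r: int) -> float:
    """P(exactly k successes are drawn before the r-th failure).

    N: population size, K: successes in the population,
    r: number of failures at which drawing stops (1 <= r <= N - K).
    """
    if not 0 <= k <= K:
        return 0.0
    return comb(k + r - 1, k) * comb(N - r - k, K - k) / comb(N, K)


# Illustrative case: 3 items, 1 success, stop at the first failure.
# No success is seen first exactly when the first draw is a failure (2 of 3 items).
print(neg_hypergeom_pmf(0, N=3, K=1, r=1))

# Deck of 52 cards with 4 aces: number of aces seen before the 2nd non-ace.
probs = [neg_hypergeom_pmf(k, N=52, K=4, r=2) for k in range(5)]
print(probs)
```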
Examples
Consider a standard deck of 52 playing cards, which contains 4 aces. If 5 cards are drawn without replacement, the probability of drawing exactly 2 aces follows from the PMF with \( N = 52 \), \( K = 4 \), \( n = 5 \), and \( k = 2 \):
\[ P(X = 2) = \frac{\binom{4}{2} \binom{48}{3}}{\binom{52}{5}} \]
Calculating the binomial coefficients:
\[ \binom{4}{2} = 6 \]
\[ \binom{48}{3} = 17296 \]
\[ \binom{52}{5} = 2598960 \]
Thus,
\[ P(X = 2) = \frac{6 \times 17296}{2598960} \approx 0.0399 \]
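The arithmetic above can be reproduced in a few lines. This sketch also prints the probability of every possible number of aces in the hand; the variable names are illustrative.

```python
from math import comb

N, K, n = 52, 4, 5  # deck size, aces in the deck, cards drawn

# The three binomial coefficients used in the worked example.
print(comb(4, 2), comb(48, 3), comb(52, 5))  # 6 17296 2598960

# Probability of exactly 2 aces.
p_two_aces = comb(K, 2) * comb(N - K, n - 2) / comb(N, n)
print(round(p_two_aces, 4))  # 0.0399

# Full distribution of the number of aces in a 5-card hand.
for k in range(min(K, n) + 1):
    print(k, comb(K, k) * comb(N - K, n - k) / comb(N, n))
```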