Random Variable

From Canonica AI

Definition

A random variable is a variable whose possible values are numerical outcomes of a random phenomenon. There are two main types of random variables, discrete and continuous.

Discrete Random Variables

A discrete random variable is one which may take on only a countable number of distinct values, such as 0, 1, 2, 3, 4, and so on. Discrete random variables are usually (but not necessarily) counts. If a random variable can take only a finite number of distinct values, then it must be discrete. Examples of discrete random variables include the number of children in a family, the Friday night attendance at a cinema, the number of patients in a doctor's surgery, and the number of defective light bulbs in a box of ten.

A photograph of a die; the outcome of a roll is a common example of a discrete random variable.

Continuous Random Variables

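A continuous random variable is one which can take any value in an interval of real numbers, so its possible values are uncountably infinite. (A discrete random variable may also have infinitely many possible values, but only countably many.) Continuous random variables are usually measurements. Examples include height, weight, the amount of sugar in an orange, and the time required to run a mile.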

Probability Distribution

The probability distribution of a random variable is a function that describes the likelihood of obtaining the possible values that the random variable can assume. In other words, it specifies how the total probability is spread over those values.

Discrete Probability Distribution

The probability distribution of a discrete random variable is a list of probabilities associated with each of its possible values. It is also sometimes called the probability function or the probability mass function. More formally, suppose a random variable X may take k different values, with the probability that X = x_i defined to be P(X = x_i) = p_i. Then the probabilities p_i must satisfy the following:

1. 0 ≤ p_i ≤ 1 for each i
2. p_1 + p_2 + ... + p_k = 1.
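The two conditions can be checked mechanically. The following is a minimal sketch in Python, assuming the distribution is stored as a dictionary from values to probabilities; the function name is_valid_pmf and the two example distributions are illustrative assumptions, not part of any standard library. Exact fractions are used so that the sum is not affected by floating-point rounding.

```python
from fractions import Fraction

def is_valid_pmf(pmf):
    """Return True if every probability lies in [0, 1] and they sum to 1."""
    probs = list(pmf.values())
    in_range = all(0 <= p <= 1 for p in probs)
    return in_range and sum(probs) == 1

# Illustrative example: a fair six-sided die, each face with probability 1/6.
fair_die = {face: Fraction(1, 6) for face in range(1, 7)}

# Illustrative counterexample: the probabilities sum to 9/10, not 1.
broken = {0: Fraction(1, 2), 1: Fraction(2, 5)}

print(is_valid_pmf(fair_die))  # True
print(is_valid_pmf(broken))    # False
```

With floating-point probabilities, the exact comparison with 1 would normally be replaced by a tolerance check such as math.isclose.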

Continuous Probability Distribution

The probability distribution of a continuous random variable is described by a probability density function. The probability of observing any single exact value is equal to 0, since the random variable can assume uncountably many values; probabilities are instead assigned to ranges of values. For example, a random variable X may take all values over an interval of real numbers. Then the probability that X is in the set of outcomes A, P(A), is defined to be the area above A and under a curve. The curve, which represents a function p(x), must satisfy the following:

1. The curve has no negative values (p(x) ≥ 0 for all x)
2. The total area under the curve is equal to 1.

A curve meeting these requirements is often known as a density curve.
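The area interpretation can be illustrated numerically. The sketch below assumes the standard normal (bell-shaped) density as the example curve and approximates areas with the trapezoidal rule; the helper names normal_pdf and area_under are illustrative choices, and the integration limits of -10 and 10 stand in for the whole real line.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def area_under(f, a, b, steps=10_000):
    """Approximate the area under f between a and b with the trapezoidal rule."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h

# Total area: the tails beyond +/-10 are negligible, so this is close to 1.
total_area = area_under(normal_pdf, -10, 10)

# P(-1 <= X <= 1): the area above the interval [-1, 1], roughly 0.68.
within_one_sd = area_under(normal_pdf, -1, 1)

# A single point is an interval of width zero, so its area is 0.
single_point = area_under(normal_pdf, 0.5, 0.5)

print(total_area, within_one_sd, single_point)
```

The last value shows why a single exact outcome has probability 0 even though the density at that point is positive.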

A photograph of a bell curve, a common example of a continuous probability distribution.

Expectation and Variance

The expected value (or mean) of X, where X is a discrete random variable, is a weighted average of the possible values that X can take, each value being weighted according to the probability of that value occurring. If X takes the values x_1, ..., x_k with probabilities p_1, ..., p_k, then E(X) = x_1 p_1 + x_2 p_2 + ... + x_k p_k. The expected value of X is usually written as E(X) or μ.

The variance of X measures how far, on average, the values of a discrete random variable lie from the expected value. It is the expected squared deviation from the mean, Var(X) = E[(X − μ)²], where μ = E(X). The variance is usually written as Var(X) or σ².
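As a worked example, the sketch below computes the expected value and variance of a fair six-sided die directly from the definitions. The dictionary representation and the function names expected_value and variance are assumptions made for illustration; exact fractions keep the arithmetic free of rounding.

```python
from fractions import Fraction

def expected_value(pmf):
    """Weighted average of the values, each weighted by its probability."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """Expected squared deviation of the values from the expected value."""
    mu = expected_value(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

# Illustrative distribution: a fair die, faces 1 to 6, each with probability 1/6.
fair_die = {face: Fraction(1, 6) for face in range(1, 7)}

print(expected_value(fair_die))  # 7/2
print(variance(fair_die))        # 35/12
```

The expected value 7/2 is not itself a possible outcome of a roll, which shows that the mean describes the long-run average rather than a typical single value.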

Covariance and Correlation

Covariance is a measure of how much two random variables vary together. It’s similar to variance, but where variance tells you how a single variable varies, covariance tells you how two variables vary together.

The correlation coefficient is a statistical measure of the strength and direction of the linear relationship between two variables. It is the covariance divided by the product of the two standard deviations, so its values range between -1.0 and 1.0. A calculated number greater than 1.0 or less than -1.0 means that there was an error in the calculation.
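Both quantities can be estimated from paired observations. The following sketch assumes equal-length lists of numbers and uses the population form of the covariance (dividing by n); the function names and the small data sets are made up for illustration.

```python
import math

def covariance(xs, ys):
    """Population covariance of paired observations (divides by n)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n

def correlation(xs, ys):
    """Covariance divided by the product of the two standard deviations."""
    # covariance(v, v) is the variance of v, so the square root of the
    # product of the two variances is the product of the standard deviations.
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))

# Illustrative data: ys_up rises with xs, ys_down falls as xs rises.
xs = [1, 2, 3, 4, 5]
ys_up = [2, 4, 6, 8, 10]
ys_down = [10, 8, 6, 4, 2]

print(covariance(xs, ys_up))     # positive: the variables move together
print(correlation(xs, ys_up))    # close to 1 for a perfect increasing line
print(correlation(xs, ys_down))  # close to -1 for a perfect decreasing line
```

The covariance depends on the units of the data, whereas the correlation is unit-free, which is why the latter is easier to compare across data sets.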

See Also