Kendall rank correlation coefficient
Introduction
The Kendall rank correlation coefficient, also known as Kendall's tau coefficient, is a statistic used to measure the ordinal association between two measured quantities. A tau test is a non-parametric hypothesis test for statistical dependence based on the tau coefficient. It is named after Maurice Kendall, who developed it in 1938.
Definition
The Kendall correlation between two variables is high when observations have similar ranks in both variables, where the rank is the relative position of an observation within a variable (1st, 2nd, 3rd, etc.). It reaches 1 when the two rankings are identical. It is low when the rankings are dissimilar, and reaches -1 when one ranking is the exact reverse of the other.
Mathematical Representation
The Kendall rank correlation coefficient can be expressed mathematically as:
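τ = (number of concordant pairs - number of discordant pairs) / (total number of pairs)
Here, for n observations, the total number of pairs is n(n - 1)/2. A pair of observations (x_i, y_i) and (x_j, y_j) is concordant if both variables order the two observations the same way (x_i < x_j and y_i < y_j, or x_i > x_j and y_i > y_j), and discordant if they order them in opposite ways. A pair with x_i = x_j or y_i = y_j is tied and counts as neither concordant nor discordant.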
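The formula above can be sketched directly in Python by checking every pair of observations. This is a minimal illustration rather than a reference implementation: the function name kendall_tau is chosen here for the example, the divisor is the total number of pairs n(n - 1)/2, and tied pairs are simply counted as neither concordant nor discordant.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau computed by brute force over all pairs (O(n^2))."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    if n < 2:
        raise ValueError("at least two observations are required")

    concordant = 0
    discordant = 0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        # The sign of the product tells whether both variables
        # order the two observations the same way.
        s = (xi - xj) * (yi - yj)
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
        # s == 0 means a tie in x or y; it counts as neither.

    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Illustrative data: 7 concordant and 3 discordant pairs out of 10.
tau = kendall_tau([1, 2, 3, 4, 5], [3, 1, 2, 5, 4])
print(tau)  # 0.4
```

The pairwise loop is quadratic in the number of observations, which is acceptable for small samples; faster algorithms exist but are not shown here.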
Properties
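The Kendall tau rank correlation coefficient is a robust measure of association: because it depends only on the ordering of the observations and not on their magnitudes, it is relatively insensitive to strong outliers. For the same reason it is invariant under strictly increasing transformations of either variable, while a strictly decreasing transformation of one variable only reverses its sign. This makes it a valuable tool for analyzing ordinal data, where numerical values are replaced by rank orders.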
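The invariance property can be checked numerically. The sketch below uses made-up data and a brute-force helper named kendall_tau (an illustrative name, not a library function); it applies strictly increasing transformations to both variables and then negates one of them.

```python
import math
from itertools import combinations


def kendall_tau(x, y):
    """Brute-force Kendall's tau; ties count as neither concordant nor discordant."""
    n = len(x)
    score = 0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        s = (xi - xj) * (yi - yj)
        score += (s > 0) - (s < 0)  # +1 concordant, -1 discordant, 0 tied
    return score / (n * (n - 1) // 2)


# Hypothetical sample values, chosen only for illustration.
x = [0.5, 1.2, 2.0, 3.1, 4.7]
y = [1.0, 0.8, 2.5, 2.9, 6.0]

tau_raw = kendall_tau(x, y)

# exp and cubing are strictly increasing, so the ordering is preserved.
tau_transformed = kendall_tau([math.exp(v) for v in x], [v ** 3 for v in y])

# Negation is strictly decreasing, so every pair switches type.
tau_flipped = kendall_tau([-v for v in x], y)

print(tau_raw, tau_transformed, tau_flipped)  # 0.8 0.8 -0.8
```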
Applications
The Kendall tau rank correlation coefficient is widely used in the social sciences, particularly in psychology and education, as well as in medicine, engineering, and other fields. It is particularly useful when dealing with ordinal variables, or when the relationship between variables is not linear.
Advantages and Limitations
The main advantages of the Kendall tau rank correlation coefficient are its robustness and its ability to handle ordinal data and monotonic but non-linear relationships. Its main limitation is that, when the relationship is in fact linear and the assumptions behind Pearson's correlation coefficient hold, it can be less powerful than Pearson's coefficient at detecting that relationship.
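The contrast with Pearson's coefficient can be illustrated on a relationship that is perfectly monotonic but far from linear. The example below is a sketch on synthetic data (y = exp(x)); both helper functions are written out for the example and are not taken from any particular library.

```python
import math
from itertools import combinations


def kendall_tau(x, y):
    """Brute-force Kendall's tau; ties count as neither concordant nor discordant."""
    n = len(x)
    score = 0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        s = (xi - xj) * (yi - yj)
        score += (s > 0) - (s < 0)
    return score / (n * (n - 1) // 2)


def pearson_r(x, y):
    """Pearson's product-moment correlation coefficient."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Synthetic data: y grows exponentially with x, so the ordering of y
# matches the ordering of x exactly, but the points are not on a line.
x = list(range(1, 11))
y = [math.exp(v) for v in x]

tau = kendall_tau(x, y)
r = pearson_r(x, y)
print(tau)  # 1.0, every pair is concordant
print(r)    # noticeably below 1, since the relationship is not linear
```

On data like these, tau reports a perfect monotonic association while Pearson's coefficient is pulled down by the curvature. The reverse trade-off, lower power on truly linear data, is not demonstrated by this example.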