Geary's C

From Canonica AI

Introduction

Geary's C is a statistic used in spatial analysis to measure spatial autocorrelation, the degree to which values observed at nearby locations resemble one another. It is named after the Irish statistician Roy C. Geary, who defined it in 1954.

A mathematical formula representing Geary's C

Definition and Calculation

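Geary's C compares the weighted sum of squared differences between the values at pairs of locations with the overall variation of the values around their mean. The numerator is N - 1 times the sum, over all pairs of locations, of the spatial weight multiplied by the squared difference between the two values. The denominator is twice the sum of all spatial weights multiplied by the sum of squared deviations from the mean. The formula for Geary's C is: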

C = [ (N - 1) * Σi Σj wij * (xi - xj)^2 ] / [ 2 * W * Σi (xi - x̄)^2 ]

Where:
- N is the total number of locations,
- W is the sum of all spatial weights,
- wij is the spatial weight between locations i and j,
- xi and xj are the values at locations i and j, respectively, and
- x̄ is the mean of all values.

The spatial weight, wij, is typically defined as 1 if locations i and j are neighbors, and 0 otherwise. However, other definitions of spatial weight can be used, depending on the specific application.
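
The calculation can be sketched in a few lines of Python with NumPy. This is a minimal illustration under stated assumptions, not a reference implementation: the function name gearys_c, the four-location example, and the binary neighbor matrix are all hypothetical and chosen only to make the arithmetic easy to follow.

```python
import numpy as np

def gearys_c(x, w):
    """Geary's C for values x (length N) and an N x N spatial weight matrix w."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    n = x.size
    w_sum = w.sum()                              # W: sum of all spatial weights
    diff_sq = (x[:, None] - x[None, :]) ** 2     # (xi - xj)^2 for every pair (i, j)
    numerator = (n - 1) * (w * diff_sq).sum()
    denominator = 2.0 * w_sum * ((x - x.mean()) ** 2).sum()
    return numerator / denominator

# Hypothetical data: four locations on a line, each a neighbor of the next.
values = [1.0, 2.0, 3.0, 4.0]
weights = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])

c = gearys_c(values, weights)
print(c)  # approximately 0.3
```

Here N = 4 and W = 6. Each of the six directed neighbor pairs contributes a squared difference of 1, so the numerator is 3 * 6 = 18. The squared deviations from the mean of 2.5 sum to 5, so the denominator is 2 * 6 * 5 = 60, giving C = 0.3.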

Interpretation

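Geary's C is non-negative and typically falls between 0 and 2, although values above 2 can occur. A value close to 1 indicates no spatial autocorrelation, as would be expected if the values were arranged at random. Values less than 1 indicate positive spatial autocorrelation (neighboring locations tend to have similar values), and values greater than 1 indicate negative spatial autocorrelation (neighboring locations tend to have dissimilar values).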
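
The two directions of departure from 1 can be illustrated with a small sketch. The helper names gearys_c and chain_weights, and the six-location data, are assumptions made for this example; neighbors are taken to be adjacent positions on a line.

```python
import numpy as np

def gearys_c(x, w):
    """Geary's C for values x (length N) and an N x N spatial weight matrix w."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    diff_sq = (x[:, None] - x[None, :]) ** 2
    numerator = (x.size - 1) * (w * diff_sq).sum()
    denominator = 2.0 * w.sum() * ((x - x.mean()) ** 2).sum()
    return numerator / denominator

def chain_weights(n):
    """Binary weights for n locations on a line: wij = 1 if |i - j| = 1."""
    w = np.zeros((n, n))
    idx = np.arange(n - 1)
    w[idx, idx + 1] = 1
    w[idx + 1, idx] = 1
    return w

w = chain_weights(6)

# Similar values sit next to each other: positive autocorrelation, C < 1.
c_clustered = gearys_c([1, 1, 1, 5, 5, 5], w)

# High and low values alternate: negative autocorrelation, C > 1.
c_alternating = gearys_c([1, 5, 1, 5, 1, 5], w)

print(c_clustered, c_alternating)  # approximately 0.333 and 1.667
```

Both arrangements contain the same six values, so the denominator is identical. Only the placement of the values relative to their neighbors changes the result.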

Geary's C is a global measure of spatial autocorrelation: it summarizes the degree of spatial autocorrelation across the entire study area in a single value. It does not indicate where within the study area any local patterns of spatial autocorrelation occur.

Applications

Geary's C is used in fields such as geography, epidemiology, economics, and environmental science to analyze spatial patterns in data such as disease incidence, economic indicators, and environmental variables.

For example, in epidemiology, Geary's C might be used to assess whether disease incidence is spatially clustered across a region rather than randomly distributed. In economics, it could be used to assess whether income or unemployment rates in neighboring areas tend to be similar. Because it is a global measure, it indicates whether such patterns exist overall, not where individual clusters are located.

Limitations and Criticisms

Despite its wide use, Geary's C has been criticized for several reasons. First, it assumes that the spatial distribution of the data is stationary, meaning that the spatial pattern does not change over the study area. This assumption is often violated in practice.

Second, Geary's C is sensitive to the definition of neighbors and the choice of spatial weights. Different definitions can lead to different results, which can make interpretation difficult.
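
A small sketch can make this sensitivity concrete. The example below is hypothetical: it places a checkerboard of 0 and 1 values on a 4 x 4 grid and compares two common binary neighbor definitions, rook contiguity (shared edge) and queen contiguity (shared edge or corner). The names gearys_c, rook, and queen are assumptions for this illustration.

```python
import numpy as np

def gearys_c(x, w):
    """Geary's C for values x (length N) and an N x N spatial weight matrix w."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    diff_sq = (x[:, None] - x[None, :]) ** 2
    numerator = (x.size - 1) * (w * diff_sq).sum()
    denominator = 2.0 * w.sum() * ((x - x.mean()) ** 2).sum()
    return numerator / denominator

# Row and column index of each of the 16 cells in a 4 x 4 grid.
rows, cols = np.divmod(np.arange(16), 4)

# Checkerboard pattern: every cell differs from the cells sharing an edge with it.
values = (rows + cols) % 2

# Absolute row and column offsets between every pair of cells.
dr = np.abs(rows[:, None] - rows[None, :])
dc = np.abs(cols[:, None] - cols[None, :])

rook = ((dr + dc) == 1).astype(float)              # neighbors share an edge
queen = (np.maximum(dr, dc) == 1).astype(float)    # neighbors share an edge or a corner

c_rook = gearys_c(values, rook)
c_queen = gearys_c(values, queen)
print(c_rook, c_queen)  # approximately 1.875 and 1.071
```

With rook weights every neighbor pair holds different values, so C is far above 1. With queen weights the diagonal neighbors, which hold equal values, are also counted, and C drops close to 1 for exactly the same data.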

Finally, as a global measure of spatial autocorrelation, Geary's C does not provide information about local patterns of spatial autocorrelation. This can be a significant limitation in studies where local patterns are of interest.

See Also