Getis-Ord Gi* Statistics

From Canonica AI

Introduction

The Getis-Ord Gi* statistic is a local measure of spatial association used in geographic information systems (GIS) and spatial statistics to identify statistically significant spatial clusters of high or low values among features that carry an attribute value. Developed by Arthur Getis and J. K. Ord, this local statistic compares the weighted sum of values in the neighborhood of a feature with the sum that would be expected from the global mean of all features.

Concept and Calculation

The Getis-Ord Gi* statistic is calculated for each feature in a dataset, and the result is essentially a z-score. A large positive z-score indicates that the feature and its neighbors have high values relative to the global mean; this situation indicates a hot spot. Conversely, a large negative z-score indicates that the feature and its neighbors have low values; this situation indicates a cold spot. A z-score near zero indicates no apparent clustering of high or low values around the feature.
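The interpretation of the z-score can be sketched in a few lines of Python. This is a minimal illustration, not a definitive rule: the function name classify_gi_star is hypothetical, the cutoff of 1.96 assumes a two-sided test at the 95% level under a normal approximation, and no correction for multiple testing is applied. The z-scores in the usage lines are made-up illustrative inputs.

```python
def classify_gi_star(z_score, critical_value=1.96):
    """Label a Gi* z-score as a hot spot, a cold spot, or not significant.

    critical_value is an assumed cutoff (1.96 corresponds to a two-sided
    95% level for a standard normal distribution).
    """
    if z_score >= critical_value:
        return "hot spot"
    if z_score <= -critical_value:
        return "cold spot"
    return "not significant"


# Illustrative z-scores only; these are not results from real data.
for z in (2.5, -2.5, 0.3):
    print(z, classify_gi_star(z))
```

In practice the cutoff is a design choice: a stricter critical value reduces false positives when many features are tested at once.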

The formula for the Getis-Ord Gi* statistic is as follows:

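$$
G_i^* = \frac{\sum_{j=1}^{n} w_{ij} x_j - \bar{X} \sum_{j=1}^{n} w_{ij}}{S \sqrt{\dfrac{n \sum_{j=1}^{n} w_{ij}^2 - \left( \sum_{j=1}^{n} w_{ij} \right)^2}{n - 1}}}
$$

where:
- $w_{ij}$ is the spatial weight between feature i and j (for Gi*, feature i is included in its own neighborhood, so $w_{ii}$ is nonzero),
- $x_j$ is the attribute value for feature j,
- $\bar{X}$ is the mean of the attribute values for all features, $\bar{X} = \frac{1}{n} \sum_{j=1}^{n} x_j$,
- $S$ is the standard deviation of the attribute values for all features, $S = \sqrt{\frac{1}{n} \sum_{j=1}^{n} x_j^2 - \bar{X}^2}$,
- $n$ is the total number of features.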
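The formula can be translated into code fairly directly. The following is a minimal sketch in plain Python, assuming a dense n-by-n weight matrix given as a list of lists, with the standard deviation S computed using n in the denominator. The function name getis_ord_gi_star, the six example values, and the binary "self plus adjacent neighbors" weights are all illustrative assumptions, not part of any particular GIS package.

```python
import math


def getis_ord_gi_star(values, weights):
    """Return one Gi* z-score per feature.

    values  : list of attribute values x_j
    weights : n x n list of lists, weights[i][j] = w_ij; the diagonal
              w_ii should be nonzero so each feature is its own neighbor
    """
    n = len(values)
    mean = sum(values) / n
    # Standard deviation with n in the denominator (assumed convention).
    # It is zero if all values are equal, in which case Gi* is undefined.
    s = math.sqrt(sum(x * x for x in values) / n - mean * mean)

    scores = []
    for i in range(n):
        row = weights[i]
        sum_w = sum(row)
        sum_w_sq = sum(w * w for w in row)
        numerator = sum(w * x for w, x in zip(row, values)) - mean * sum_w
        # The denominator is zero when a feature's neighborhood covers every
        # feature with equal weight; this sketch does not guard against that.
        denominator = s * math.sqrt((n * sum_w_sq - sum_w ** 2) / (n - 1))
        scores.append(numerator / denominator)
    return scores


# Hypothetical example: six features along a line, high values on the left
# and low values on the right.
example_values = [10, 9, 8, 1, 2, 1]
# Binary weights: 1 for the feature itself and its adjacent neighbors.
example_weights = [
    [1 if abs(i - j) <= 1 else 0 for j in range(6)] for i in range(6)
]

for index, z in enumerate(getis_ord_gi_star(example_values, example_weights)):
    print(index, round(z, 3))
```

With these inputs the features on the left receive positive z-scores and those on the right receive negative ones, which matches the hot spot and cold spot interpretation. Production tools typically use sparse weights and handle the degenerate cases that this sketch only notes in comments.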

Applications

The Getis-Ord Gi* statistic is widely used in fields such as epidemiology, urban planning, environmental science, and criminology. In epidemiology, for instance, it can identify clusters of disease cases in a geographical area. In urban planning, it can identify areas of high or low property values. In environmental science, it can identify areas of high or low pollution levels. In criminology, it can identify hot spots of crime.

Limitations

While the Getis-Ord Gi* statistic is a powerful tool for identifying spatial clusters, it has limitations. One of the main limitations is the Modifiable Areal Unit Problem (MAUP): the results of spatial analysis can change significantly depending on how the spatial data are aggregated into areal units. Another limitation is the edge effect: the results for features near the edge of the study area may be biased because their neighborhoods are cut off by the boundary, so the spatial weights for these features are based on incomplete data.
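The edge effect can be made concrete by counting neighbors. The sketch below assumes a hypothetical 5-by-5 grid of features with unit spacing and a fixed distance band of 1.5 units (so diagonal cells count as neighbors); the function name neighbor_counts and these parameters are illustrative choices only.

```python
import math


def neighbor_counts(rows, cols, band=1.5):
    """Count, for each grid cell, the cells within `band` of it (itself included)."""
    cells = [(r, c) for r in range(rows) for c in range(cols)]
    counts = {}
    for r, c in cells:
        counts[(r, c)] = sum(
            1 for r2, c2 in cells if math.hypot(r - r2, c - c2) <= band
        )
    return counts


counts = neighbor_counts(5, 5)
print("interior cell:", counts[(2, 2)])
print("edge cell:", counts[(0, 2)])
print("corner cell:", counts[(0, 0)])
```

Cells on the boundary have fewer neighbors than interior cells, so their Gi* values rest on less information, which is the bias described above.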

See Also