# High/Low Clustering (Getis-Ord General G) (Spatial Statistics)

Measures the degree of clustering for either high values or low values.

## Usage tips

• This tool honors the environment output coordinate system. Feature geometry is projected to the output coordinate system prior to analysis, so the units associated with values entered for the Distance Band/Threshold Distance parameter should match those specified in the output coordinate system. All mathematical computations are based on the output coordinate system spatial reference.

• Calculations based on either Euclidean or Manhattan distance require projected data to accurately measure distances.

• If you will be running several analyses on a single dataset (e.g., analyzing several different fields) or if you have a dataset with more than 3000 features, it is recommended that you construct the spatial weights matrix file prior to analysis.

• The General G tool calculates the value of the General G index, associated Z score and p-value for a given input feature class. These values are written to the Command Line message window and passed as derived output.

• The Z score and p-value are measures of statistical significance that tell you whether or not to reject the null hypothesis. For this tool, the null hypothesis states that the values associated with features are randomly distributed.

• The higher (or lower) the Z score, the stronger the intensity of the clustering. A Z score near zero indicates no apparent clustering within the study area. A positive Z score indicates clustering of high values. A negative Z score indicates clustering of low values.

• For line and polygon features, true geometric feature centroids are used in computations.

• The input field you select should contain only non-negative numeric values. The General G statistic was not designed to work with negative values.

• The input field should contain a variety of non-negative values. The math for this statistic requires some variation in the variable being analyzed; it cannot solve if all input values are 1, for example. If you have incident data, and want to analyze incident intensity, consider aggregating your incident data or using Integrate with the Collect Events tool prior to analysis.

• When using shapefiles, keep in mind that they cannot store null values. Tools or other procedures that create shapefiles from non-shapefile inputs may store or interpret null values as zero. This can lead to unexpected results.

• The Conceptualization of Spatial Relationships used for analysis should be based on your understanding of spatial interaction among the features being analyzed.

• For the Fixed Distance option, the distance band used for analysis should be based on your understanding of spatial interaction among the features being analyzed. Alternatively, features may be evaluated for a range of distance values or at the specific distance where spatial autocorrelation is maximized.

• For Inverse Distance conceptualization options: when zero is entered for the "Distance Band or Threshold Distance" parameter all features are considered neighbors of all other features; when this parameter is left blank, a default threshold distance will be applied.

• When the spatial conceptualization is an Inverse Distance method (Inverse Distance, Inverse Distance Squared, or Zone of Indifference) any two points that are coincident will be given a weight of one to avoid zero division. This assures features are not excluded from analysis.

• With inverse distance conceptualizations, weights for distances less than 1 become unstable. Features separated by less than 1 unit of distance (common with Geographic Coordinate System projections) are therefore given a weight of 1.

• Analysis on features with a Geographic Coordinate System projection is not recommended with any of the inverse distance based spatial conceptualization methods.

• The "Display Output Graphically" parameter will only work on the Windows operating system. When set to true it will display the results of the tool graphically.

• When output is shown graphically, a separate graphics dialog box will be displayed. If you use the tool in a script, set the Display_Output_Graphically parameter to "false"; otherwise, your script will not complete until you click "Close" on the popup graphic.

• In ArcGIS version 9.2, the "Global" standardization option was removed. Global standardization returns the same results as no standardization. Models built with previous versions of ArcGIS that use the Global standardization option may need to be rebuilt.

• Current map layers may be used to define the input feature class. When using layers, only the currently selected features are included in the analysis.
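
Several of the tips above refer to the General G index, its Z score, and its p-value. The sketch below is not the tool's implementation. It assumes the standard Getis-Ord definition of General G (the weighted sum of cross-products `x_i * x_j` divided by the unweighted sum, both over `i != j`) and a standard normal distribution for converting a Z score to a two-tailed p-value. The function names and the toy data are hypothetical.

```python
import math


def general_g(values, weights):
    """Observed General G for a list of values and a square weights matrix.

    Assumed definition: sum of w_ij * x_i * x_j divided by the sum of
    x_i * x_j, both taken over all pairs with i != j.
    """
    if any(v < 0 for v in values):
        raise ValueError("General G expects non-negative values")
    n = len(values)
    numerator = 0.0
    denominator = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            cross = float(values[i] * values[j])
            numerator += weights[i][j] * cross
            denominator += cross
    if denominator == 0:
        raise ValueError("at least two values must be non-zero")
    return numerator / denominator


def expected_g(weights):
    """Expected General G under the null hypothesis (assumed standard form)."""
    n = len(weights)
    total = sum(weights[i][j] for i in range(n) for j in range(n) if i != j)
    return total / float(n * (n - 1))


def two_tailed_p(z):
    """Two-tailed p-value for a Z score under a standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical toy data: four features in a row, each linked to its
# immediate neighbors with a binary weight.
chain_weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered_values = [10, 10, 1, 1]   # the two high values are neighbors
constant_values = [1, 1, 1, 1]      # no variation at all

g_clustered = general_g(clustered_values, chain_weights)  # 222 / 282, about 0.787
g_constant = general_g(constant_values, chain_weights)    # 6 / 12 = 0.5
g_expected = expected_g(chain_weights)                    # 6 / 12 = 0.5
```

In this toy case the observed index for the clustered values exceeds the expected index, which corresponds to a positive Z score (clustering of high values). With constant values the observed index equals the expected index whatever the weights are, which illustrates why the statistic needs some variation in the input field.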

## Syntax

HighLowClustering_stats (Input_Feature_Class, Input_Field, Display_Output_Graphically, Conceptualization_of_Spatial_Relationships, Distance_Method, Standardization, Distance_Band_or_Threshold_Distance, Weights_Matrix_File)
Parameter Explanation Datatype
Input Feature Class (Required)

The feature class for which the General G statistic will be calculated.

Feature Layer
Input Field (Required)

The numeric field used for analysis (incidents or weights).

Field
Display Output Graphically (Required)

Specifies whether the tool will display General G results graphically.

• True—The output will be displayed graphically.
• False—The output will not be displayed graphically.

Boolean
Conceptualization of Spatial Relationships (Required)

Specifies how spatial relationships among features are conceptualized.

• Inverse Distance—All features impact/influence all other features, but the farther away something is, the smaller the impact it has.
• Inverse Distance Squared—Same as Inverse Distance except that the slope is sharper so influence drops off more quickly and only a target feature's closest neighbors will exert substantial influence in computations for that feature.
• Fixed Distance Band—Each feature is analyzed within the context of those neighboring features within some specified critical distance. Features outside the critical distance of a target feature do not influence calculations for that feature.
• Zone of Indifference—Features within the specified critical distance of a target feature are included in analyses for that feature. Once the critical distance is exceeded, the level of impact quickly drops off.
• Polygon Contiguity (First Order)—The neighbors of each feature are only those with which the feature shares a boundary. All other features have no influence on computations. Requires an ArcInfo license.
• Get Spatial Weights From File—Spatial relationships are defined in a spatial weights file. The pathname to the spatial weights file is specified in the Weights Matrix File parameter.

String
Distance Method (Required)

Specifies how feature to feature distances are calculated.

• Euclidean (as the crow flies)—The straight-line distance between two points.
• Manhattan (city block)—The distance between two points measured along axes at right angles. Calculated by summing the (absolute) difference between the X and the Y coordinates.

String
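
As a rough illustration of the two distance methods described above, the following sketch computes both for a pair of planar (projected) coordinates. The function names are hypothetical and the points are assumed to be simple `(x, y)` tuples.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def manhattan_distance(a, b):
    """Sum of the absolute differences of the x and y coordinates."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


origin = (0.0, 0.0)
target = (3.0, 4.0)
straight = euclidean_distance(origin, target)    # 5.0
city_block = manhattan_distance(origin, target)  # 7.0
```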
Standardization (Required)

Row standardization is recommended whenever the distribution of your features is potentially biased due to sampling design or an imposed aggregation scheme.

• None—No standardization of spatial weights is applied.
• Row—Spatial weights are standardized; each weight is divided by its row sum (the sum of the weights of all neighboring features).

String
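
Row standardization as described above can be sketched in a few lines. The function name is hypothetical, and leaving a feature with no neighbors as a row of zeros is an assumption made here, not documented tool behavior.

```python
def row_standardize(weights):
    """Divide each weight by the sum of the weights in its row."""
    standardized = []
    for row in weights:
        total = float(sum(row))
        if total == 0:
            # Assumption: a feature with no neighbors keeps all-zero weights.
            standardized.append([0.0 for _ in row])
        else:
            standardized.append([w / total for w in row])
    return standardized


raw = [
    [0, 1, 1],
    [2, 0, 0],
    [0, 0, 0],
]
standardized = row_standardize(raw)
# Each row that has neighbors now sums to 1.
```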
Distance Band or Threshold Distance (Required)

Specifies a cutoff distance for Inverse Distance and Fixed Distance options. Features outside the specified cutoff for a target feature are ignored in analyses for that feature. However, for Zone of Indifference, the influence of features outside the given distance is reduced with distance while those inside the distance threshold are equally considered. The value entered should be in units that match those of the Output Coordinate System. For the Inverse Distance conceptualizations of spatial relationships: A value of zero for this parameter indicates that no threshold distance is applied; when this parameter is left blank, a default threshold value will be computed and applied. This parameter has no effect when "Polygon Contiguity" or "Get Spatial Weights From File" spatial conceptualizations are selected.

Double
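
The threshold rules above, together with the usage tips on coincident points and distances below 1, can be sketched as follows. This is an illustration under stated assumptions, not the tool's code: the function names are hypothetical, the cutoff is assumed to be inclusive, and a threshold of zero is taken to mean "no cutoff" for inverse distance. Zone of Indifference is omitted because the exact form of its drop-off is not specified here.

```python
def fixed_distance_weight(distance, threshold):
    """Binary weight: neighbors inside the band count, others do not."""
    return 1.0 if distance <= threshold else 0.0


def inverse_distance_weight(distance, threshold=0.0, exponent=1):
    """Inverse distance weight; exponent=2 gives Inverse Distance Squared."""
    # A positive threshold acts as a cutoff; zero means no cutoff is applied.
    if threshold > 0 and distance > threshold:
        return 0.0
    # Coincident features and distances below 1 unit get a weight of 1,
    # which avoids division by zero and unstable, very large weights.
    if distance < 1.0:
        return 1.0
    return 1.0 / distance ** exponent


w_near = inverse_distance_weight(0.5)                # 1.0
w_far = inverse_distance_weight(4.0)                 # 0.25
w_far_squared = inverse_distance_weight(4.0, exponent=2)  # 0.0625
w_cut = inverse_distance_weight(4.0, threshold=3.0)  # 0.0
```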
Weights Matrix File (Optional)

The pathname to a file containing spatial weights that define spatial relationships among features in the Input Feature Class.

File

## Script Example

```python
# Analyze the spatial distribution of 911 calls in a metropolitan area
# using the High/Low Clustering (Getis-Ord General G) tool

# Import system modules
import arcgisscripting

# Create the Geoprocessor object
gp = arcgisscripting.create(9.3)
gp.OverwriteOutput = 1

# Local variables...
# Raw string keeps the backslashes in the Windows path literal
workspace = r"C:\Data\911Calls"

try:
    # Set the current workspace (to avoid having to specify the full path
    # to the feature classes each time)
    gp.workspace = workspace

    # Copy the input feature class and integrate the points to snap
    # together at 500 feet
    # Process: Copy Features and Integrate
    cf = gp.CopyFeatures("911Calls.shp", "911Copied.shp",
                         "#", 0, 0, 0)
    integrate = gp.Integrate("911Copied.shp #", "500 Feet")

    # Use Collect Events to count the number of calls at each location
    # Process: Collect Events
    ce = gp.CollectEvents("911Copied.shp", "911Count.shp", "Count", "#")

    # Add a unique ID field to the count feature class
    # Process: Add Field and Calculate Field
    af = gp.AddField("911Count.shp", "MyID", "LONG", "#", "#", "#", "#",
                     "NON_NULLABLE", "NON_REQUIRED", "#",
                     "911Count.shp")
    calc = gp.CalculateField("911Count.shp", "MyID", "[FID]", "VB")

    # Create Spatial Weights Matrix for Calculations
    # Process: Generate Spatial Weights Matrix...
    swm = gp.GenerateSpatialWeightsMatrix("911Count.shp", "MyID",
                                          "euclidean6Neighs.swm",
                                          "K_NEAREST_NEIGHBORS",
                                          "#", "#", "#", 6,
                                          "NO_STANDARDIZATION")

    # Cluster Analysis of 911 Calls
    # Process: High/Low Clustering (Getis-Ord General G)
    # ICOUNT is the count field written by Collect Events; graphical
    # display is set to "false" so the script does not wait on a dialog box
    hs = gp.HighLowClustering("911Count.shp", "ICOUNT",
                              "false",
                              "Get Spatial Weights From File",
                              "Euclidean Distance", "None",
                              "#", "euclidean6Neighs.swm")
except:
    # If an error occurred when running the tool, print out the error message.
    print(gp.GetMessages())
```