# Average Nearest Neighbor (Spatial Statistics)

Calculates a nearest neighbor index based on the average distance from each feature to its nearest neighboring feature.

## Usage tips

• This tool honors the environment output coordinate system. Feature geometry is projected to the output coordinate system prior to analysis, so the units associated with values entered for the Area parameter should match those specified in the output coordinate system. All mathematical computations are based on the output coordinate system spatial reference.

• Calculations based on either Euclidean or Manhattan distance require projected data to accurately measure distances.

• The nearest neighbor index and associated Z score and p-value are written to the command window and passed as derived output.

• The Z score and p-value are measures of statistical significance that tell you whether or not to reject the null hypothesis. For Average Nearest Neighbor, the null hypothesis states that features are randomly distributed.

• The nearest neighbor index is expressed as the ratio of the observed mean distance to the expected mean distance. The expected distance is the average distance between neighbors in a hypothetical random distribution. If the index is less than 1, the pattern exhibits clustering; if the index is greater than 1, the trend is toward dispersion or competition.

• The Average Nearest Neighbor tool is most effective for comparing different features in a fixed study area. Use the Calculate Area tool on the study area polygon to get a value for the Area parameter.

• If an Area value is not specified, then the area of the minimum enclosing rectangle around the features is used. Unlike the extent, a minimum enclosing rectangle will not necessarily align with the x- and y-axes.

• The average nearest neighbor function is very sensitive to the Area value (small changes in the area can result in considerable changes in the results).

• The extent (and minimum enclosing rectangle) for a feature class containing only coincident points (all with the exact same X and Y coordinates) is zero. In this case, even though the points are highly clustered at a single location, the expected distribution for those points in a space with zero area is to have all points coincident. Because the observed and expected distributions are equivalent, the resultant Z score will be 0.0 and the resultant p-value will be 1.0, indicating that spatial clustering is not statistically significant. It is always best to supply an Area value when using the Average Nearest Neighbor tool.

• Although this tool will work with polygon or line data, it is really only appropriate for event, incident, or other fixed-point feature data. For line and polygon features, the true geometric centroid of each feature is used in the computations. For multipoint, polyline, or polygon features with multiple parts, the centroid is computed using the weighted mean center of all feature parts. The weighting for point features is 1, for line features is length, and for polygon features is area.

• When using shapefiles, keep in mind that they cannot store null values. Tools or other procedures that create shapefiles from non-shapefile inputs may store or interpret null values as zero. This can lead to unexpected results.

• The "Display Output Graphically" parameter will only work on the Windows operating system. When set to true it will display the results of the tool graphically.

• When output is shown graphically, a separate graphics dialog box will be displayed. If you use the tool in a script, set the Display_Output_Graphically parameter to "false"; otherwise, your script will not complete until you click "Close" on the popup graphic.

• Current map layers may be used to define the input feature class. When using layers, only the currently selected features are included in the analysis.
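
The index, Z score, and p-value described in the tips above can be reproduced outside the tool for small point sets. The following is a minimal sketch in plain Python, not the tool's implementation. It assumes Euclidean distance and the standard Clark and Evans formulas (expected mean distance `0.5 / sqrt(n / A)` and standard error `0.26136 / sqrt(n^2 / A)`); the helper name `average_nearest_neighbor` is hypothetical.

```python
import math


def average_nearest_neighbor(points, area):
    """Return (index, z_score, p_value) for a list of (x, y) tuples.

    Hypothetical helper: brute-force O(n^2) search, Euclidean distance only.
    """
    n = len(points)
    if n < 2 or area <= 0:
        raise ValueError("need at least two points and a positive area")

    # Observed mean distance from each point to its nearest neighbor.
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        total += min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
    observed = total / n

    # Expected mean distance and standard error for a random pattern
    # (assumed Clark and Evans formulas).
    expected = 0.5 / math.sqrt(n / float(area))
    std_error = 0.26136 / math.sqrt(n * n / float(area))

    index = observed / expected
    z_score = (observed - expected) / std_error
    # Two-tailed p-value from the standard normal distribution.
    p_value = math.erfc(abs(z_score) / math.sqrt(2.0))
    return index, z_score, p_value


# Four points on a 2 x 2 grid; every nearest neighbor distance is 1.0.
grid = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

# The same points evaluated against two different study areas.
index_small, z_small, p_small = average_nearest_neighbor(grid, area=4.0)
index_large, z_large, p_large = average_nearest_neighbor(grid, area=100.0)
print(index_small)
print(index_large)
```

With an area of 4 the index comes out at about 2.0 (dispersed), while an area of 100 gives about 0.4 (clustered) for the same points, which illustrates why the result is so sensitive to the Area value.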

## Syntax

AverageNearestNeighbor_stats (Input_Feature_Class, Distance_Method, Display_Output_Graphically, Area)

| Parameter | Explanation | Datatype |
| --- | --- | --- |
| Input Feature Class (Required) | The feature class, typically a point feature class, for which the average nearest neighbor distance will be calculated. | Feature Layer |
| Distance Method (Required) | Specifies how distances are calculated from each feature to its nearest neighboring feature.<br>• Euclidean (as the crow flies): The straight-line distance between two points.<br>• Manhattan (city block): The distance between two points measured along axes at right angles, calculated by summing the absolute differences between the x-coordinates and between the y-coordinates. | String |
| Display Output Graphically (Required) | Specifies whether the tool will display the nearest neighbor index graphically.<br>• True: The output will be displayed graphically.<br>• False: The output will not be displayed graphically. | Boolean |
| Area (Optional) | A numeric value representing the study area size. The default value is the area of the minimum enclosing rectangle that would encompass all features (or all selected features). Units should match those for the Output Coordinate System. | Double |
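
As a rough illustration of the two Distance Method options, the sketch below computes both distances for a single pair of points. The function names `euclidean_distance` and `manhattan_distance` are hypothetical and are not part of the geoprocessing API.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def manhattan_distance(a, b):
    """Sum of the absolute differences of the x- and y-coordinates."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


origin = (0.0, 0.0)
target = (3.0, 4.0)
print(euclidean_distance(origin, target))  # 5.0
print(manhattan_distance(origin, target))  # 7.0
```

Both measures assume planar coordinates, which is why the tool expects projected data.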

## Script Example

```
# Analyze crime data to determine if spatial patterns are statistically significant

# Import system modules
import arcgisscripting

# Create the Geoprocessor object
gp = arcgisscripting.create()

# Local variables...
workspace = "C:/project93/data"
crime_data = "burglaries.shp"

try:
    # Set the current workspace (to avoid having to specify the full path to the feature classes each time)
    gp.workspace = workspace

    # Obtain the nearest neighbor index, Z score, and p-value
    # Process: Average Nearest Neighbor...
    nn_output = gp.AverageNearestNeighbor_stats(crime_data, "Euclidean Distance", "false", "#")

    # The derived output is a semicolon-delimited string: index;z score;p-value
    nn_values = nn_output.split(";")
    print "The nearest neighbor index is: " + nn_values[0]
    print "The z score of the nearest neighbor index is: " + nn_values[1]
    print "The p-value of the nearest neighbor index is: " + nn_values[2]

except:
    # If an error occurred when running the tool, print out the error message.
    print gp.GetMessages()
```
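
Once the derived output has been split as in the script above, the three values can be turned into a plain-language reading of the pattern. The sketch below is one possible approach, not part of the tool: the helper `interpret_nn_output` is hypothetical, the 0.05 significance level is an assumed default, and the sample strings are made-up values rather than real tool output.

```python
def interpret_nn_output(nn_output, alpha=0.05):
    """Classify an 'index;z score;p-value' string (hypothetical helper)."""
    index, _z_score, p_value = [float(v) for v in nn_output.split(";")[:3]]

    # A large p-value means the null hypothesis of a random pattern stands.
    if p_value >= alpha:
        return "not significantly different from random"

    # Index below 1 suggests clustering; above 1 suggests dispersion.
    return "clustered" if index < 1.0 else "dispersed"


# Made-up values for illustration only.
print(interpret_nn_output("0.62;-8.4;0.0001"))
print(interpret_nn_output("1.02;0.3;0.76"))
```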