Geoprocessing considerations for shapefile output

Over the years, ESRI has developed three main data formats for storing geographic information: coverages, shapefiles, and geodatabases. Shapefiles were developed to provide a simple, nontopological format for storing geographic and attribute information. Because of this simplicity, they are a very popular open data transfer format. While shapefiles may seem an easy choice, they have limitations that geodatabases address, and you should be aware of them when using shapefiles. In broad terms:

These issues (and more) mean that shapefiles are a poor choice for active database management: they do not handle the modern life cycle of data creation, editing, versioning, and archiving.

When should I use a shapefile?

When should I not use a shapefile?

With some exceptions that are noted below, shapefiles are acceptable for storing simple feature geometry. However, shapefiles have serious problems with attributes. For example, they cannot store null values, they round up numbers, they have poor support for Unicode character strings, they do not allow field names longer than 10 characters, and they cannot store both a date and a time in a field. These are just the main issues. Additionally, they do not support capabilities found in geodatabases, such as domains and subtypes. So unless your attributes are very simple and you need no geodatabase capabilities, do not use shapefiles.
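The 10-character field name limit is easy to check before exporting. The following is a minimal sketch in plain Python with no ArcGIS dependency; the function name `find_long_field_names` and the sample field names are hypothetical and only illustrate the check.

```python
# Maximum field name length in a shapefile attribute table,
# per the limitation described above.
MAX_FIELD_NAME_LENGTH = 10


def find_long_field_names(field_names, limit=MAX_FIELD_NAME_LENGTH):
    """Return the field names that exceed the shapefile length limit."""
    return [name for name in field_names if len(name) > limit]


# Hypothetical field names from a geodatabase feature class.
fields = ["OBJECTID", "PARCEL_ID", "OWNER_FULL_NAME", "LAST_INSPECTION_DATE"]
too_long = find_long_field_names(fields)
print(too_long)
```

Any names this reports would need to be shortened, or the data kept in a geodatabase, before writing to a shapefile.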

Shapefile components and file extensions

Shapefiles are stored in three or more files that all have the same prefix and are stored in the same system folder (the shapefile workspace). The individual files are visible when you view the folder in Windows Explorer, but not in ArcCatalog.

Extension | Description | Required?
.shp | The main file that stores the feature geometry. No attributes are stored in this file, only geometry. | Yes
.shx | A companion file to the .shp that stores the position of individual feature IDs in the .shp file. | Yes
.dbf | The dBase table that stores the attribute information of features. | Yes
.sbn and .sbx | Files that store the spatial index of the features. | No
.atx | Created for each dBase attribute index created in ArcCatalog. | No
.ixs and .mxs | Geocoding index for read-write shapefiles. | No
.prj | The file that stores the coordinate system information. | No
.xml | Metadata for ArcGIS; stores information about the shapefile. | No
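Because a shapefile is really a set of sibling files, a copy or transfer can silently drop one of them. The sketch below, using only the Python standard library, checks for the three required components listed in the table above and notes whether a .prj file is present. The function name `check_shapefile_components` and the file name `roads.shp` are hypothetical, and the check assumes the file system matches the lowercase extensions as written.

```python
import tempfile
from pathlib import Path

# Extensions marked as required in the table above.
REQUIRED_EXTENSIONS = (".shp", ".shx", ".dbf")


def check_shapefile_components(shp_path):
    """Report missing required components for the shapefile at shp_path."""
    # Strip only the final ".shp" so prefixes containing dots stay intact.
    base = str(Path(shp_path).with_suffix(""))
    missing = [ext for ext in REQUIRED_EXTENSIONS
               if not Path(base + ext).is_file()]
    return {
        "missing_required": missing,
        "has_prj": Path(base + ".prj").is_file(),
    }


# Demonstration in a temporary folder that deliberately lacks the .shx file.
with tempfile.TemporaryDirectory() as folder:
    for ext in (".shp", ".dbf"):
        Path(folder, "roads" + ext).touch()
    report = check_shapefile_components(Path(folder, "roads.shp"))

print(report)
```

A check like this only confirms that the files exist; it does not validate their contents.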

Geometry limitations

Attribute limitations

Data type containing null value | Shapefile representation
Number, when a tool requires a NULL, infinity, or NaN (Not a Number) to be output | -1.7976931348623158e+308 (IEEE standard for the maximum negative value)
Number (all other geoprocessing tools) | 0
Text | "" (blank, no space)
Date | Stored as zero, but displays "<null>".
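When reading numeric attributes back from a shapefile, the large negative sentinel in the table above can be mapped back to a null. The following is a minimal sketch in plain Python; the names `NUMERIC_NULL_SENTINEL` and `looks_like_shapefile_null` and the sample values are hypothetical. Note that it cannot recover nulls that were written as 0, because those are indistinguishable from real zeros.

```python
# Sentinel from the table above, written when a tool must output
# NULL, infinity, or NaN to a numeric shapefile field.
NUMERIC_NULL_SENTINEL = -1.7976931348623158e+308


def looks_like_shapefile_null(value):
    """Return True if a numeric value is at or below the null sentinel."""
    return value <= NUMERIC_NULL_SENTINEL


# Hypothetical attribute values read from a shapefile field.
values = [12.5, 0.0, NUMERIC_NULL_SENTINEL, -3.0]

# Replace the sentinel with None; 0.0 is kept because it may be real data.
cleaned = [None if looks_like_shapefile_null(v) else v for v in values]
print(cleaned)
```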

Unsupported capabilities

Shapefiles have no extended data types at either the workspace or feature class level. Any conversion to shapefile from a geodatabase feature class or other format will result in the loss of the following:

Shapefiles and geoprocessing

Any geoprocessing tool that outputs a feature class allows you to choose either a shapefile or geodatabase feature class as the output format. Similarly, a tool that outputs a table allows you to choose either a dBase file (.dbf) or a geodatabase table as the output. You should always be aware of which format you use and the consequences of converting a geodatabase input to a shapefile output.

Geoprocessing tools auto-generate an output feature class or table for you. This auto-generated output is based on a number of factors as described in Specifying tool inputs and outputs. If your scratch workspace environment is set to a system folder, and not a geodatabase, the auto-generated output feature class will be a shapefile or dBase file, as illustrated below.

Shapefile and dBase output

You should set your scratch workspace to a file geodatabase so that the auto-generated output is written to a file geodatabase, not a shapefile or .dbf table.
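The rule described above can be expressed as a small check on the scratch workspace path. This is an illustrative sketch in plain Python, not ArcGIS's own logic: the function name `auto_output_kind` and the paths are hypothetical, and it assumes that a file geodatabase is identified by a folder name ending in `.gdb`, treating every other path as a system folder.

```python
from pathlib import PureWindowsPath


def auto_output_kind(scratch_workspace):
    """Guess the auto-generated output format from the scratch workspace path."""
    # Assumption: a file geodatabase is a folder whose name ends in ".gdb".
    suffix = PureWindowsPath(scratch_workspace).suffix.lower()
    if suffix == ".gdb":
        return "file geodatabase feature class or table"
    # Any other path is treated as a system folder.
    return "shapefile or dBase (.dbf) table"


# Hypothetical scratch workspace settings.
print(auto_output_kind(r"C:\data\scratch"))
print(auto_output_kind(r"C:\data\scratch.gdb"))
```

A check like this could be run at the start of a script to warn when intermediate data would be written as shapefiles.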

Learn more about geoprocessing environments

Because shapefiles write quickly, they are often used to write intermediate data in models since this makes for faster model execution. However, writing to a file geodatabase is almost as fast as writing to a shapefile, so unless execution speed is critical, you should always use a file geodatabase for intermediate and output data. If you do use shapefiles, be aware of their limitations as described above and only use shapefiles for simple features and attributes. An alternative to using shapefiles for intermediate data is to write features to the in_memory workspace.

Learn more about writing features to the in_memory workspace

See Also