The shapefile is a geospatial vector data format specified by ESRI. A shapefile groups the vector geometry of each feature (shape), an index of that geometry (shape index), and the attributes associated with each shape (attribute table).

The US Census Bureau publishes many freely available shapefiles, e.g. the TIGER/Line shapefiles.

Format Description

Mandatory files:

  1. .shp, shape format, the feature geometry (vector data);
  2. .shx, shape index format, an offset index of the geographic features;
  3. .dbf, dBase file format, the attribute table for the geographic features.
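Because a shapefile is really a set of sibling files sharing one base name, a common first step is to confirm that all three mandatory files are present. The following is a minimal sketch, not part of the ESRI specification: the function name missing_mandatory_files is hypothetical, and it assumes lowercase extensions and that the caller passes the path of the .shp file.

```python
from pathlib import Path

# The three mandatory members of a shapefile, as listed above.
MANDATORY_EXTENSIONS = (".shp", ".shx", ".dbf")


def missing_mandatory_files(shp_path):
    """Return the mandatory extensions that have no file next to shp_path.

    Assumption: shp_path points at the .shp file and all sibling files
    use the same base name with lowercase extensions.
    """
    shp = Path(shp_path)
    return [
        ext
        for ext in MANDATORY_EXTENSIONS
        # with_suffix replaces only the final extension, so base names
        # that contain dots are preserved.
        if not shp.with_suffix(ext).is_file()
    ]
```

An empty list means the mandatory set is complete. On case-sensitive file systems, data distributed with uppercase extensions (for example .SHP) would need an additional case-insensitive lookup.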

Optional files:

  • .prj, projection format, a plain text file describing the coordinate system and projection in the Well-Known Text (WKT) format;
  • .sbn and .sbx, a spatial index of the features, used only by ESRI and not documented;
  • .fbn and .fbx, a spatial index of the features for shapefiles that are read-only;
  • .qix, an alternative quadtree spatial index used by MapServer and GDAL/OGR software;
  • .ain and .aih, an attribute index of the active fields in a table;
  • .atx, an attribute index for the .dbf file in the form of shapefile.columnname.atx (ArcGIS 8 and later);
  • .ixs, a geocoding index for read-write datasets;
  • .mxs, a geocoding index for read-write datasets (ODB format);
  • .shp.xml, geospatial metadata in XML format;
  • .cpg, code page format, identifying the character encoding to be used (only for .dbf).
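Two of the optional files, .prj and .cpg, are small text files, so they can be inspected without a GIS library. The sketch below is illustrative only: the function name read_text_sidecars and the dictionary keys are hypothetical, and it assumes lowercase extensions and that the files decode as UTF-8 (undecodable bytes are replaced rather than raising an error). It returns the raw contents and does not interpret the WKT or map the code page name to a codec.

```python
from pathlib import Path


def read_text_sidecars(shp_path):
    """Return the raw text of the optional .prj and .cpg files.

    Assumption: shp_path points at the .shp file. A value of None
    means the corresponding optional file is absent.
    """
    shp = Path(shp_path)
    result = {}
    for key, ext in (("wkt", ".prj"), ("code_page", ".cpg")):
        sidecar = shp.with_suffix(ext)
        if sidecar.is_file():
            # Strip surrounding whitespace such as a trailing newline.
            text = sidecar.read_text(encoding="utf-8", errors="replace")
            result[key] = text.strip()
        else:
            result[key] = None
    return result
```

A missing .prj means the coordinate system is not recorded alongside the data, and a missing .cpg means the encoding of the .dbf attribute table has to be determined some other way.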

References

For more details, refer to ESRI's official documentation, ESRI Shapefile Technical Description - July 1998.