The format that refused to die
Esri shipped the Shapefile in 1993 as a quiet open spec accompanying ArcView 2. Thirty-two years later, it remains the lingua franca of GIS data exchange. Cadastral offices distribute it. The US Census, the EU Open Data Portal, and almost every municipal GIS portal still serve it. Most QGIS users encounter it on day one and never stop encountering it.
That persistence is not because the format is good. It persists because the network effect is enormous and because the alternatives — GeoPackage, FlatGeobuf, GeoParquet — arrived after the institutional knowledge had already calcified.
The four-file problem
A "Shapefile" is not one file. It is at minimum three (.shp, .shx, .dbf) and in practice usually five or six (.prj, .cpg, optionally .sbn/.sbx, .qix, .qmd). Email any of them in isolation and the recipient gets nothing. Zip them together and you've reinvented archive distribution badly.
This matters more than it sounds. Half the "broken Shapefile" support tickets are not broken at all — someone dragged .shp out of a ZIP without its siblings.
The 10-character field name limit
The attribute table lives in a .dbf file, a dBase III descendant from 1983. Column names are stored in a fixed 11-byte slot, of which one byte is a null terminator. That leaves 10 ASCII characters per attribute name.
QGIS will silently truncate population_density to population. Export the same layer with population_male and population_female and you now have two columns called population and population_1. Roundtrip through Shapefile a few times and the schema becomes unreadable.
The 2GB hard cap
.shp and .dbf both use 32-bit signed offsets, capping each file at 2,147,483,647 bytes. For a parcel layer with thousands of vertices per polygon, you hit the wall around 2 million features. There is no streaming, no chunking, no graceful failure — GDAL refuses to write past the limit and most tools refuse to read past it.
Compare to GeoPackage, which is just SQLite: file size capped at 140 TB, queryable in place, and a single .gpkg carries multiple layers.
No nulls, no booleans, no dates with time
The .dbf format predates SQL's null concept. A "missing" numeric value is stored as a sentinel — sometimes 0, sometimes a row of asterisks. A "missing" string is an empty string, indistinguishable from a deliberate empty. Booleans don't exist; they are coerced into T/F characters or 0/1 integers. Dates are stored as 8-character YYYYMMDD strings with no time component. Timestamps lose their hours and minutes on the way in.
If your data pipeline assumes that null != "", Shapefile will quietly corrupt it.
Geometry quirks
- No mixed geometry types. A Shapefile is point, line, or polygon — never a mix. A GeoJSON FeatureCollection that mixes
PointandPolygoncannot be written to a single Shapefile. - No native multipart support advertised in the type field. A "POLYGON" Shapefile silently contains multipolygons.
- Z and M values are stored, but unevenly supported. Round-trip through some tools and your elevations disappear.
Why it's still everywhere
Three reasons, all institutional:
- Procurement language. Government RFPs from the early 2000s specified "deliverables in Esri Shapefile format" and the language was copy-pasted forward for two decades.
- Desktop GIS habit. ArcMap, QGIS, MapInfo, and AutoCAD Map all open Shapefiles instantly with no driver decisions. The path of least resistance.
- The ZIP-able interchange property. A folder of Shapefiles is a self-contained dataset you can hand to anyone. GeoPackage is technically superior but requires the recipient to know what a
.gpkgis.
When to migrate
Migrate when any of the following are true:
- Your dataset will exceed 1.5 GB or 1.5 million features.
- You need attribute names longer than 10 characters, or true null distinction.
- You need multiple layers in a single deliverable.
- You're publishing to a web map and don't want to ship a ZIP.
In those cases, GeoPackage is the drop-in replacement (ogr2ogr -f GPKG out.gpkg in.shp) and GeoJSON is the right web target.
When to stay
Stay on Shapefile when the recipient explicitly asks for it (procurement, agency exchange) or when you're handing data to a legacy desktop user who will not accept a .gpkg. In every other case in 2025, you are paying a tax for nostalgia.