GeoPackage has quietly become the default vector format in QGIS, the recommended format in OGC publications, and the new standard for government data distribution in several countries. If you've been around GIS for a while and still treat it as "that newer thing", this article is for you.
What it actually is
A GeoPackage (.gpkg) is a SQLite 3 database file with a specific set of tables and conventions defined by the OGC GeoPackage Encoding Standard (current version: 1.3, published 2021). That's the whole format. Open a .gpkg in DB Browser for SQLite, and you'll see ordinary tables; what makes it a GeoPackage rather than a generic SQLite file is a small set of metadata tables. Three matter for vector data (gpkg_contents and gpkg_spatial_ref_sys are required in every GeoPackage, gpkg_geometry_columns whenever the file holds vector features):
- gpkg_contents: registers every spatial layer and its bounding box
- gpkg_geometry_columns: describes geometry columns and their CRS
- gpkg_spatial_ref_sys: stores CRS definitions in WKT
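Since these are ordinary SQLite tables, any SQLite client can read them. The following is a minimal sketch using Python's standard sqlite3 module. The helper name list_layers and the cut-down in-memory gpkg_contents table are illustrative assumptions; a real file carries more columns and would be opened with sqlite3.connect("my_data.gpkg").

```python
import sqlite3

def list_layers(conn):
    """Return (table_name, data_type, srs_id) for every registered layer."""
    rows = conn.execute(
        "SELECT table_name, data_type, srs_id "
        "FROM gpkg_contents ORDER BY table_name"
    )
    return rows.fetchall()

# Stand-in for a real file: an in-memory database holding a cut-down
# gpkg_contents table (hypothetical sample rows, not real data).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE gpkg_contents ("
    " table_name TEXT PRIMARY KEY,"
    " data_type TEXT NOT NULL,"
    " srs_id INTEGER)"
)
conn.executemany(
    "INSERT INTO gpkg_contents VALUES (?, ?, ?)",
    [("roads", "features", 4326), ("buildings", "features", 4326)],
)

layers = list_layers(conn)
for table_name, data_type, srs_id in layers:
    print(table_name, data_type, srs_id)
```

No GIS library is involved here, which is the point: the layer registry is plain SQL.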
Feature geometries are stored as binary blobs in a slight extension of WKB (Well-Known Binary), prefixed with a header containing the SRID and an envelope. Spatial indexing is provided by SQLite's R*Tree module, exposed as virtual tables that are kept in sync with feature tables through triggers.
Because it's built on SQLite, GeoPackage inherits:
- ACID transactions. A power loss mid-write rolls back to the last committed state instead of leaving a half-written file.
- Rich SQL. JOIN, WHERE, GROUP BY, and (since SQLite 3.25) window functions all work.
- A single-file footprint. No bundle of sidecars to lose.
- Cross-platform binary compatibility. Write on Windows, open on macOS, copy to Linux.
Why it's replacing Shapefile
GeoPackage addresses the best-known long-standing Shapefile pain points:
| Pain | Shapefile | GeoPackage |
|---|---|---|
| File count | 3–7 files per layer | 1 file for the whole dataset |
| Field name length | 10 ASCII chars | Unlimited |
| Encoding | Codepage roulette | UTF-8 native |
| Geometry types | One per file | Mixed within a single layer |
| Layers per file | One | Many |
| Size limit | 2 GB | 140 TB |
| Spatial index | No (or sidecar .qix) | Built-in R*Tree |
| Vendor | Esri proprietary | OGC open standard |
For any new dataset that doesn't have an external compatibility requirement, GeoPackage is strictly better.
What's inside a real GeoPackage
Let's open one with ogrinfo:
```bash
ogrinfo -so my_data.gpkg
```
Typical output:
```text
INFO: Open of `my_data.gpkg'
      using driver `GPKG' successful.
1: roads (Multi Line String)
2: buildings (Polygon)
3: addresses (Point)
4: census_tracts (Multi Polygon)
```
Four layers, different geometry types, all in one file. Now query one:
```bash
ogrinfo -al -so my_data.gpkg roads
```
The summary returns layer extent, geometry type, CRS, feature count, and the full attribute schema with column names and types. Because the file is a real SQL database, you can also run arbitrary queries:
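```bash
ogrinfo my_data.gpkg -sql "
  SELECT COUNT(*) AS road_count, SUM(ST_Length(geom)) AS total_length
  FROM roads
  WHERE highway = 'primary'"
```
That's spatial SQL against a single-file portable database, something the Shapefile format has no native equivalent for. One caveat: ST_Length is not part of core SQLite. It depends on your GDAL build providing it, typically through SpatiaLite support, and it returns lengths in the units of the layer's CRS.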
How vector geometry is encoded
Geometry columns store a GeoPackage binary blob: a short header (magic bytes, version, flags, SRID, optional envelope) followed by standard WKB. The header lets the reader skip to the bounding box without parsing the geometry — useful for spatial filters when the R*Tree index is unavailable.
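WKB itself is the standard OGC binary encoding: a byte-order marker, a geometry-type code, then coordinate doubles. Each coordinate value takes a fixed 8 bytes, so the result is usually considerably smaller than equivalent WKT text at full double precision (the exact ratio depends on how many digits the text carries), and it parses in O(n) without lexer overhead.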
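The header layout described above can be decoded with a few struct calls. This is a sketch under stated assumptions rather than a full reader: parse_gpkg_header is a hypothetical helper name, the sample blob is built by hand, and extended geometry types and WKB parsing beyond a single point are left out.

```python
import struct

# Number of doubles in the envelope for each envelope-indicator code.
ENVELOPE_DOUBLES = {0: 0, 1: 4, 2: 6, 3: 6, 4: 8}

def parse_gpkg_header(blob):
    """Split a GeoPackage geometry blob into header fields and the WKB offset."""
    if blob[:2] != b"GP":
        raise ValueError("missing 'GP' magic bytes")
    version, flags = blob[2], blob[3]
    endian = "<" if flags & 0x01 else ">"   # bit 0: byte order of header values
    envelope_code = (flags >> 1) & 0x07      # bits 1-3: envelope layout
    if envelope_code not in ENVELOPE_DOUBLES:
        raise ValueError(f"invalid envelope code {envelope_code}")
    count = ENVELOPE_DOUBLES[envelope_code]
    (srs_id,) = struct.unpack_from(endian + "i", blob, 4)
    envelope = struct.unpack_from(f"{endian}{count}d", blob, 8)
    return {
        "version": version,
        "empty": bool(flags & 0x10),         # bit 4: empty-geometry flag
        "srs_id": srs_id,
        "envelope": envelope,                # (minx, maxx, miny, maxy, ...)
        "wkb_offset": 8 + 8 * count,         # where the standard WKB starts
    }

# Hand-built sample: little-endian header with an XY envelope and SRS 4326,
# followed by a little-endian WKB point (type code 1) at (10, 20).
header = b"GP" + bytes([0, 0b00000011]) + struct.pack(
    "<i4d", 4326, 10.0, 10.0, 20.0, 20.0
)
wkb_point = struct.pack("<BIdd", 1, 1, 10.0, 20.0)
blob = header + wkb_point

info = parse_gpkg_header(blob)
byte_order, geom_type, x, y = struct.unpack_from("<BIdd", blob, info["wkb_offset"])
print(info["srs_id"], info["envelope"], geom_type, x, y)
```

Reading only the envelope is enough for a bounding-box filter; the WKB payload is touched only for features that pass it.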
How spatial indexing works
A GeoPackage spatial index is an SQLite R*Tree virtual table named rtree_<table>_<geom_column>. It is automatically populated by triggers on the feature table — INSERT, UPDATE, and DELETE on roads keep rtree_roads_geom in sync. Spatial queries can then use SQL like:
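```sql
-- :minx, :miny, :maxx, :maxy describe the query window
SELECT * FROM roads
WHERE id IN (
  SELECT id FROM rtree_roads_geom
  WHERE minx <= :maxx AND maxx >= :minx
    AND miny <= :maxy AND maxy >= :miny
)
```
Here id is assumed to be the feature table's integer primary key (GDAL names it fid by default). The R*Tree compares bounding boxes only, so an exact geometry test may still be needed on the candidates it returns. With the index in place, a viewport query that touches a few thousand of 50 million features typically runs in milliseconds.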
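To make the bounding-box lookup concrete, here is a small sketch using Python's sqlite3 module. It assumes the bundled SQLite was compiled with the R*Tree module (true of most builds), and it fills the index by hand instead of through the GeoPackage triggers. The sample roads, their boxes, and the helper roads_in_window are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE roads (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE VIRTUAL TABLE rtree_roads_geom "
    "USING rtree(id, minx, maxx, miny, maxy)"
)

# (id, name, minx, maxx, miny, maxy): made-up bounding boxes.
features = [
    (1, "Main St", 0.0, 10.0, 0.0, 1.0),
    (2, "Harbour Rd", 50.0, 60.0, 50.0, 52.0),
    (3, "Ring Rd", 8.0, 20.0, 0.0, 30.0),
]
for fid, name, minx, maxx, miny, maxy in features:
    conn.execute("INSERT INTO roads VALUES (?, ?)", (fid, name))
    # In a real GeoPackage, triggers on the feature table do this insert.
    conn.execute(
        "INSERT INTO rtree_roads_geom VALUES (?, ?, ?, ?, ?)",
        (fid, minx, maxx, miny, maxy),
    )

def roads_in_window(conn, minx, miny, maxx, maxy):
    """Names of roads whose bounding box intersects the query window."""
    sql = """
        SELECT name FROM roads
        WHERE id IN (
            SELECT id FROM rtree_roads_geom
            WHERE minx <= :maxx AND maxx >= :minx
              AND miny <= :maxy AND maxy >= :miny
        )
        ORDER BY name
    """
    params = {"minx": minx, "miny": miny, "maxx": maxx, "maxy": maxy}
    return [row[0] for row in conn.execute(sql, params)]

hits = roads_in_window(conn, minx=5.0, miny=0.0, maxx=12.0, maxy=5.0)
print(hits)
```

Named parameters make the pairing explicit: each stored minimum is compared with the window's maximum and vice versa, which is the standard box-intersection test.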
Raster support
GeoPackage isn't just vector. The standard defines a tile matrix encoding that stores raster pyramids — base map tiles at multiple zoom levels — alongside vector data. This is hugely useful for offline field-collection apps: a single .gpkg can hold the base satellite imagery, the project's vector layers, and the user's draft edits.
Open-source field apps like QField and Mergin Maps build their entire offline workflow around this capability.
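Tiles live in ordinary tables too: each row of a tile pyramid table holds one encoded image keyed by zoom_level, tile_column, and tile_row. A minimal lookup sketch, assuming a hypothetical tile table named basemap and a placeholder payload in place of a real PNG or JPEG:

```python
import sqlite3

def read_tile(conn, table, zoom, column, row):
    """Return the encoded image bytes for one tile, or None if it is absent."""
    # Table names cannot be bound as parameters, so quote the identifier.
    quoted = '"' + table.replace('"', '""') + '"'
    cur = conn.execute(
        f"SELECT tile_data FROM {quoted} "
        "WHERE zoom_level = ? AND tile_column = ? AND tile_row = ?",
        (zoom, column, row),
    )
    hit = cur.fetchone()
    return hit[0] if hit else None

# In-memory stand-in for a tile pyramid table in a real .gpkg file.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE basemap ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " zoom_level INTEGER NOT NULL,"
    " tile_column INTEGER NOT NULL,"
    " tile_row INTEGER NOT NULL,"
    " tile_data BLOB NOT NULL,"
    " UNIQUE (zoom_level, tile_column, tile_row))"
)
conn.execute(
    "INSERT INTO basemap (zoom_level, tile_column, tile_row, tile_data) "
    "VALUES (?, ?, ?, ?)",
    (3, 4, 2, b"\x89PNG placeholder"),
)

tile = read_tile(conn, "basemap", 3, 4, 2)
missing = read_tile(conn, "basemap", 3, 0, 0)
print(len(tile), missing)
```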
Where GeoPackage still falls short
It's not perfect:
- Multi-writer concurrency is limited. SQLite uses file-level locking; two users editing the same .gpkg simultaneously will block or fail. For multi-user editing, PostGIS is the right tool, with GeoPackage as a distribution snapshot.
- Not human-readable. You need a tool. GeoJSON wins for diff-able, hand-editable workflows.
- Older ArcGIS Desktop (10.x) support is limited and depends on the release. ArcGIS Pro 2.0+ reads it directly, but legacy environments may need conversion.
- No web-browser support. Browsers cannot consume a binary SQLite file directly. For web maps, convert to GeoJSON or vector tiles.
- VACUUM needed after large deletes. Like any SQLite database, deletes mark pages free but don't shrink the file. Run VACUUM periodically.
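The VACUUM point is easy to check for yourself. The sketch below uses a throwaway SQLite file with a hypothetical features table of dummy 1 KB blobs (not a valid GeoPackage), and compares the file size after a bulk delete and after VACUUM. It assumes SQLite's default setting where auto_vacuum is off.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.gpkg")

# isolation_level=None puts the connection in autocommit mode;
# VACUUM cannot run inside an open transaction.
conn = sqlite3.connect(path, isolation_level=None)
conn.execute("CREATE TABLE features (fid INTEGER PRIMARY KEY, geom BLOB)")

# Load 2000 dummy 1 KB blobs in a single explicit transaction.
conn.execute("BEGIN")
conn.executemany(
    "INSERT INTO features (geom) VALUES (?)",
    [(b"\x00" * 1024,) for _ in range(2000)],
)
conn.execute("COMMIT")
size_full = os.path.getsize(path)

# Deleting frees pages inside the file but does not return them to the OS.
conn.execute("DELETE FROM features")
size_after_delete = os.path.getsize(path)

# VACUUM rebuilds the database file without the free pages.
conn.execute("VACUUM")
size_after_vacuum = os.path.getsize(path)
conn.close()

print(size_full, size_after_delete, size_after_vacuum)
```

VACUUM rewrites the whole database, so it needs temporary disk space and takes time on large files; running it after bulk deletes rather than after every edit is usually enough.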
How to migrate from Shapefile
The one-liner is:
```bash
ogr2ogr -f GPKG output.gpkg input.shp
```
For a folder of Shapefiles into a single GeoPackage with one layer per file:
```bash
for f in *.shp; do
  ogr2ogr -f GPKG -update -append \
    -nln "$(basename "$f" .shp)" \
    output.gpkg "$f"
done
```
The -update -append flags add layers to an existing database; -nln names each new layer after its source file. The result is one .gpkg containing all your former Shapefiles as separate, queryable layers.
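If the batch conversion needs to live inside a script rather than a shell session, the same loop can be sketched in Python. The function names build_commands and convert_folder are hypothetical; the flags are the ones used in the shell loop, and ogr2ogr is assumed to be on the PATH when convert_folder is called.

```python
import subprocess
from pathlib import Path

def build_commands(folder, output):
    """Return one ogr2ogr argument list per Shapefile found in `folder`."""
    commands = []
    for shp in sorted(Path(folder).glob("*.shp")):
        commands.append([
            "ogr2ogr", "-f", "GPKG", "-update", "-append",
            "-nln", shp.stem,        # layer name = file name without .shp
            str(output), str(shp),
        ])
    return commands

def convert_folder(folder, output):
    """Run the commands in order; raises CalledProcessError on the first failure."""
    for cmd in build_commands(folder, output):
        subprocess.run(cmd, check=True)

if __name__ == "__main__":
    # Dry run: print what would be executed for the current directory.
    for cmd in build_commands(".", "output.gpkg"):
        print(" ".join(cmd))
```

Keeping command construction separate from execution allows a dry run first, and passing an argument list (not a shell string) avoids quoting problems with file names that contain spaces.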
Our online Shapefile to GeoPackage converter handles single uploads of zipped Shapefiles.
Should you switch?
If you're starting a new project, yes — make GeoPackage your default. Use GeoJSON for anything that hits a browser, Shapefile only when an external party demands it, and PostGIS when you need multi-user editing or server-side analysis.
If you have an existing pipeline that's all Shapefile, there's no urgent reason to migrate. But every new dataset you create is a chance to pick the better format, and the cost of conversion is one ogr2ogr command.