Deep Dive | 10 min read | November 21, 2024

What is GeoPackage and Why Is It Replacing Shapefile?

GeoPackage looks like just one more file format, but it is actually a portable SQLite database. Here's why QGIS made it the default, and the technical details that matter.

GeoPackage has quietly become the default vector format in QGIS, the recommended format in OGC publications, and the new standard for government data distribution in several countries. If you've been around GIS for a while and still treat it as "that newer thing", this article is for you.

What it actually is

A GeoPackage (.gpkg) is a SQLite 3 database file with a specific set of tables and conventions defined by the OGC GeoPackage Encoding Standard (current version: 1.3, published 2021). That's the whole format. Open a .gpkg in DB Browser for SQLite, and you'll see ordinary tables; what makes it a GeoPackage rather than a generic SQLite file is a small set of metadata tables. gpkg_contents and gpkg_spatial_ref_sys are mandatory in every GeoPackage, and gpkg_geometry_columns is required whenever the file holds vector features:

  • gpkg_contents — registers every spatial layer and its bounding box
  • gpkg_geometry_columns — describes geometry columns and their CRS
  • gpkg_spatial_ref_sys — stores CRS definitions in WKT
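Because these are ordinary tables, any SQLite client can read them. The following is a minimal sketch, using Python's standard sqlite3 module, of listing the layers a file registers. The demo database is an in-memory stand-in that carries only the columns the query touches, not the full schema the standard requires; list_layers and make_demo_db are illustrative names, not part of any library.

```python
import sqlite3


def list_layers(conn: sqlite3.Connection) -> list[tuple]:
    """Return (table_name, data_type, geometry_type_name, srs_id) per layer."""
    return conn.execute(
        """
        SELECT c.table_name, c.data_type, g.geometry_type_name, c.srs_id
        FROM gpkg_contents AS c
        LEFT JOIN gpkg_geometry_columns AS g ON g.table_name = c.table_name
        ORDER BY c.table_name
        """
    ).fetchall()


def make_demo_db() -> sqlite3.Connection:
    # Stand-in for sqlite3.connect("my_data.gpkg"). Simplified schema:
    # a real GeoPackage has more columns in both tables.
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE gpkg_contents (
            table_name TEXT PRIMARY KEY, data_type TEXT, srs_id INTEGER);
        CREATE TABLE gpkg_geometry_columns (
            table_name TEXT, column_name TEXT,
            geometry_type_name TEXT, srs_id INTEGER);
        INSERT INTO gpkg_contents VALUES
            ('roads', 'features', 4326), ('basemap', 'tiles', 3857);
        INSERT INTO gpkg_geometry_columns VALUES
            ('roads', 'geom', 'MULTILINESTRING', 4326);
        """
    )
    return conn


layers = list_layers(make_demo_db())
for layer in layers:
    print(layer)
```

Tile layers have no row in gpkg_geometry_columns, which is why the sketch uses a LEFT JOIN: their geometry type comes back as NULL.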

Feature geometries are stored as binary blobs in a slight extension of WKB (Well-Known Binary), prefixed with a header containing the SRID and an envelope. Spatial indexing is provided by SQLite's R*Tree module, exposed as virtual tables that are kept in sync with feature tables through triggers.

Because it's built on SQLite, GeoPackage inherits:

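  • ACID transactions. An interrupted write rolls back to the last committed state, so a crash or power loss should not leave a half-written file (provided the storage layer honors sync requests).
  • Real SQL. JOIN, WHERE, GROUP BY, and window functions (SQLite 3.25 and later) all work on your layers.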
  • A single-file footprint. No bundle of sidecars to lose.
  • Cross-platform binary compatibility. Write on Windows, open on macOS, copy to Linux.
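The transaction guarantee is easy to see from any SQLite client. This sketch uses Python's sqlite3 module and a hypothetical in-memory roads table (not a real GeoPackage) to show a failed batch edit leaving the earlier committed data untouched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE roads (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO roads (name) VALUES ('Main Street')")
conn.commit()

try:
    # Used as a context manager, the connection commits if the block
    # succeeds and rolls back if it raises.
    with conn:
        conn.execute("INSERT INTO roads (name) VALUES ('High Street')")
        conn.execute("INSERT INTO roads (name) VALUES (NULL)")  # violates NOT NULL
except sqlite3.IntegrityError:
    pass  # the whole batch, including 'High Street', was rolled back

names = [row[0] for row in conn.execute("SELECT name FROM roads ORDER BY id")]
print(names)
```

Only the row committed before the failed batch remains, which is the behavior an editing session relies on when it is interrupted midway.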

Why it's replacing Shapefile

GeoPackage fixes every long-standing Shapefile pain point:

| Pain | Shapefile | GeoPackage |
| --- | --- | --- |
| File count | 3–7 files per layer | 1 file for the whole dataset |
| Field name length | 10 ASCII chars | Unlimited |
| Encoding | Codepage roulette | UTF-8 native |
| Geometry types | One per file | Mixed within a single layer |
| Layers per file | One | Many |
| Size limit | 2 GB | 140 TB |
| Spatial index | No (or sidecar .qix) | Built-in R*Tree |
| Vendor | Esri proprietary | OGC open standard |

For any new dataset that doesn't have an external compatibility requirement, GeoPackage is strictly better.

What's inside a real GeoPackage

Let's open one with ogrinfo:

ogrinfo -so my_data.gpkg

Typical output:

INFO: Open of `my_data.gpkg'
      using driver `GPKG' successful.
1: roads (Multi Line String)
2: buildings (Polygon)
3: addresses (Point)
4: census_tracts (Multi Polygon)

Four layers, different geometry types, all in one file. Now inspect one of them by naming it (the -al flag, which means "all layers", is not needed when a layer is named):

ogrinfo -so my_data.gpkg roads

The summary returns layer extent, geometry type, CRS, feature count, and the full attribute schema with column names and types. Because the file is a real SQL database, you can also run arbitrary queries:

ogrinfo my_data.gpkg -sql "
  SELECT COUNT(*) AS road_count, SUM(ST_Length(geom)) AS total_length
  FROM roads
  WHERE highway = 'primary'"

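That's spatial SQL against a single-file portable database, something the Shapefile format has no native way to do. One caveat: ST_Length is not part of core SQLite. It resolves here because GDAL supplies spatial functions to the query, which in many builds depends on SpatiaLite support, and the length comes back in the units of the layer's CRS.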
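Attribute-only queries need no GIS stack at all, since any SQLite client can run them. A small sketch with Python's sqlite3 module follows; the in-memory roads table stands in for a real layer, and the lanes column is hypothetical.

```python
import sqlite3

# Stand-in for sqlite3.connect("my_data.gpkg"); the table mimics the
# attribute columns of a roads layer and leaves out the geometry.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE roads (fid INTEGER PRIMARY KEY, highway TEXT, lanes INTEGER)"
)
conn.executemany(
    "INSERT INTO roads (highway, lanes) VALUES (?, ?)",
    [("primary", 4), ("primary", 2), ("residential", 2)],
)

summary = conn.execute(
    """
    SELECT highway, COUNT(*) AS road_count, SUM(lanes) AS total_lanes
    FROM roads
    GROUP BY highway
    ORDER BY highway
    """
).fetchall()
print(summary)
```

Spatial functions such as ST_Length are the part that needs GDAL or SpatiaLite; counting, grouping, and joining on attributes do not.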

How vector geometry is encoded

Geometry columns store a GeoPackage binary blob: a short header (magic bytes, version, flags, SRID, optional envelope) followed by standard WKB. The header lets the reader skip to the bounding box without parsing the geometry — useful for spatial filters when the R*Tree index is unavailable.
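The header layout can be decoded in a few lines. This is a sketch based on the layout the standard describes (2 magic bytes, 1 version byte, 1 flags byte, a 4-byte SRID, then an optional envelope of doubles); parse_gpkg_header is an illustrative name, and the sample blob is built by hand rather than read from a real file.

```python
import struct

# Envelope indicator (flag bits 1-3) -> number of doubles in the envelope:
# none, XY, XYZ, XYM, XYZM.
ENVELOPE_DOUBLES = {0: 0, 1: 4, 2: 6, 3: 6, 4: 8}


def parse_gpkg_header(blob: bytes) -> dict:
    """Split a GeoPackage geometry blob into header fields and the WKB offset."""
    if blob[:2] != b"GP":
        raise ValueError("missing 'GP' magic bytes")
    version, flags = blob[2], blob[3]
    endian = "<" if flags & 0x01 else ">"   # bit 0: byte order of the header
    envelope_code = (flags >> 1) & 0x07     # bits 1-3: envelope contents
    if envelope_code not in ENVELOPE_DOUBLES:
        raise ValueError(f"invalid envelope indicator {envelope_code}")
    (srs_id,) = struct.unpack_from(endian + "i", blob, 4)
    count = ENVELOPE_DOUBLES[envelope_code]
    # For XY envelopes the order is minx, maxx, miny, maxy.
    envelope = struct.unpack_from(f"{endian}{count}d", blob, 8)
    return {
        "version": version,
        "srs_id": srs_id,
        "is_empty": bool(flags & 0x10),     # bit 4: empty-geometry flag
        "envelope": envelope,
        "wkb_offset": 8 + 8 * count,        # standard WKB starts here
    }


# Hand-built sample: little-endian header, XY envelope, SRID 4326,
# followed by a WKB point at (10, 50).
wkb_point = struct.pack("<BIdd", 1, 1, 10.0, 50.0)
blob = (
    b"GP" + bytes([0, 0b00000011])
    + struct.pack("<i", 4326)
    + struct.pack("<4d", 10.0, 10.0, 50.0, 50.0)
    + wkb_point
)
header = parse_gpkg_header(blob)
print(header)
```

A reader that only needs a bounding-box filter can stop after the envelope and never touch the bytes from wkb_offset onward.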

WKB itself is the standard OGC binary encoding: a byte-order marker, a geometry-type code, then counts and coordinate doubles. Each coordinate value takes a fixed 8 bytes, so the encoding is usually more compact than full-precision WKT, and it parses in O(n) without lexer overhead.
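To make the byte layout concrete, here is a sketch that encodes a 2D LineString as little-endian WKB with Python's struct module and compares its size with the equivalent WKT. The helper names and sample coordinates are made up for illustration, and the size ratio depends entirely on how many digits the text form carries.

```python
import struct


def linestring_wkb(coords):
    """Encode 2D coordinates as a little-endian WKB LineString (type code 2)."""
    # 1 byte for byte order (1 = little-endian), 4 for type, 4 for point count.
    out = struct.pack("<BII", 1, 2, len(coords))
    for x, y in coords:
        out += struct.pack("<dd", x, y)  # 8 bytes per coordinate value
    return out


def linestring_from_wkb(wkb):
    """Decode the little-endian LineString produced above."""
    byte_order, geom_type, count = struct.unpack_from("<BII", wkb, 0)
    assert byte_order == 1 and geom_type == 2
    return [struct.unpack_from("<dd", wkb, 9 + 16 * i) for i in range(count)]


def linestring_wkt(coords):
    """Render the same coordinates as WKT text."""
    return "LINESTRING (" + ", ".join(f"{x!r} {y!r}" for x, y in coords) + ")"


coords = [
    (13.404953917, 52.520006599),
    (13.405112345, 52.520198765),
    (13.405398765, 52.520412345),
]
wkb = linestring_wkb(coords)
wkt = linestring_wkt(coords)
print(len(wkb), "bytes of WKB versus", len(wkt), "characters of WKT")
```

The WKB size is fully predictable (9 header bytes plus 16 per point), while the WKT size grows with coordinate precision.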

How spatial indexing works

A GeoPackage spatial index is an SQLite R*Tree virtual table named rtree_<table>_<geom_column>. It is automatically populated by triggers on the feature table — INSERT, UPDATE, and DELETE on roads keep rtree_roads_geom in sync. Spatial queries can then use SQL like:

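-- id is the feature table's integer primary key (GDAL names it fid by default)
SELECT * FROM roads
WHERE id IN (
  -- index entries whose bounding box overlaps the query window
  SELECT id FROM rtree_roads_geom
  WHERE minx <= :window_maxx AND maxx >= :window_minx
    AND miny <= :window_maxy AND maxy >= :window_miny
)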

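With the index in place, a viewport query that touches a few thousand of 50 million features can return in milliseconds, because the R*Tree prunes most features without reading them. The index compares bounding boxes only, so the result is a candidate set; apply an exact geometry test afterwards if you need precise intersection.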
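The trigger-plus-R*Tree mechanism can be reproduced in miniature. This sketch assumes a Python build whose SQLite includes the R*Tree module (most do), and it simplifies one thing loudly: the feature table stores its bounding box in plain columns, whereas real GeoPackage triggers derive the box from the geometry blob. The UPDATE trigger is omitted for brevity, and names_in_view is an illustrative helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Simplified feature table: bbox columns replace the geometry blob.
    CREATE TABLE roads (
        id INTEGER PRIMARY KEY, name TEXT,
        minx REAL, maxx REAL, miny REAL, maxy REAL);

    CREATE VIRTUAL TABLE rtree_roads_geom
        USING rtree(id, minx, maxx, miny, maxy);

    -- Keep the index in sync with the feature table.
    CREATE TRIGGER roads_after_insert AFTER INSERT ON roads BEGIN
        INSERT INTO rtree_roads_geom
        VALUES (NEW.id, NEW.minx, NEW.maxx, NEW.miny, NEW.maxy);
    END;
    CREATE TRIGGER roads_after_delete AFTER DELETE ON roads BEGIN
        DELETE FROM rtree_roads_geom WHERE id = OLD.id;
    END;
    """
)
conn.executemany(
    "INSERT INTO roads (name, minx, maxx, miny, maxy) VALUES (?, ?, ?, ?, ?)",
    [("A", 0.0, 1.0, 0.0, 1.0), ("B", 5.0, 6.0, 5.0, 6.0), ("C", 0.5, 5.5, 0.5, 5.5)],
)


def names_in_view(minx, miny, maxx, maxy):
    """Names of features whose bounding box overlaps the query window."""
    rows = conn.execute(
        """
        SELECT name FROM roads
        WHERE id IN (
            SELECT id FROM rtree_roads_geom
            WHERE minx <= :view_maxx AND maxx >= :view_minx
              AND miny <= :view_maxy AND maxy >= :view_miny)
        ORDER BY id
        """,
        {"view_minx": minx, "view_miny": miny, "view_maxx": maxx, "view_maxy": maxy},
    )
    return [row[0] for row in rows]


before = names_in_view(0, 0, 2, 2)
conn.execute("DELETE FROM roads WHERE name = 'A'")  # trigger removes the index entry
after = names_in_view(0, 0, 2, 2)
print(before, after)
```

Note the overlap test: each feature minimum is compared with the window maximum and vice versa, which is the pairing that positional ? placeholders make easy to get wrong.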

Raster support

GeoPackage isn't just vector. The standard defines a tile matrix encoding that stores raster pyramids — base map tiles at multiple zoom levels — alongside vector data. This is hugely useful for offline field-collection apps: a single .gpkg can hold the base satellite imagery, the project's vector layers, and the user's draft edits.

Open-source field apps like QField and Mergin Maps build their entire offline workflow around this capability.
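Reading a tile back is a plain keyed lookup. The sketch below assumes the tile pyramid table layout from the standard (zoom_level, tile_column, tile_row, tile_data); the table name basemap, the get_tile helper, and the placeholder bytes are illustrative, and the database is an in-memory stand-in rather than a real GeoPackage.

```python
import sqlite3


def get_tile(conn, table, zoom, column, row):
    """Return the encoded image bytes for one tile, or None if it is absent."""
    # The table name should come from gpkg_contents, never from user input,
    # because identifiers cannot be bound as SQL parameters.
    found = conn.execute(
        f'SELECT tile_data FROM "{table}" '
        "WHERE zoom_level = ? AND tile_column = ? AND tile_row = ?",
        (zoom, column, row),
    ).fetchone()
    return found[0] if found else None


conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE basemap (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        zoom_level INTEGER NOT NULL,
        tile_column INTEGER NOT NULL,
        tile_row INTEGER NOT NULL,
        tile_data BLOB NOT NULL,
        UNIQUE (zoom_level, tile_column, tile_row))
    """
)
# Placeholder payload: just the PNG file signature, not a decodable image.
png_signature = b"\x89PNG\r\n\x1a\n"
conn.execute(
    "INSERT INTO basemap (zoom_level, tile_column, tile_row, tile_data) "
    "VALUES (?, ?, ?, ?)",
    (12, 2200, 1343, png_signature),
)

tile = get_tile(conn, "basemap", 12, 2200, 1343)
missing = get_tile(conn, "basemap", 12, 0, 0)
print(len(tile), missing)
```

An offline app can serve its base map this way with no tile server, since every lookup hits the unique index on zoom level, column, and row.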

Where GeoPackage still falls short

It's not perfect:

  • Multi-writer concurrency is limited. SQLite uses file-level locking; two users editing the same .gpkg simultaneously will block or fail. For multi-user editing, PostGIS is the right tool, with GeoPackage as a distribution snapshot.
  • Not human-readable. You need a tool. GeoJSON wins for diff-able, hand-editable workflows.
  • Older ArcGIS (10.x) can't read it directly. ArcGIS Pro 2.0+ does, but legacy environments need conversion.
  • No web-browser support. Browsers cannot consume a binary SQLite file directly. For web maps, convert to GeoJSON or vector tiles.
  • VACUUM needed after large deletes. Like any SQLite database, deletes mark pages free but don't shrink the file. Run VACUUM periodically.
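The VACUUM point from the list above is easy to observe on disk. This sketch writes a throwaway SQLite file (a stand-in, not a valid GeoPackage) with a hypothetical features table, deletes every row, and compares file sizes before and after VACUUM. It assumes SQLite's default setting, where auto_vacuum is off.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.sqlite")
# isolation_level=None puts the connection in autocommit mode;
# VACUUM cannot run inside an open transaction.
conn = sqlite3.connect(path, isolation_level=None)
conn.execute("CREATE TABLE features (id INTEGER PRIMARY KEY, payload BLOB)")

conn.execute("BEGIN")
conn.executemany(
    "INSERT INTO features (payload) VALUES (?)",
    [(os.urandom(4096),) for _ in range(500)],
)
conn.execute("COMMIT")
size_full = os.path.getsize(path)

conn.execute("DELETE FROM features")   # pages go to the free list
size_after_delete = os.path.getsize(path)

conn.execute("VACUUM")                 # rebuilds the file without free pages
size_after_vacuum = os.path.getsize(path)
conn.close()

print(size_full, size_after_delete, size_after_vacuum)
```

VACUUM rewrites the whole database, so it needs free disk space of roughly the file's size while it runs.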

How to migrate from Shapefile

The one-liner is:

ogr2ogr -f GPKG output.gpkg input.shp

For a folder of Shapefiles into a single GeoPackage with one layer per file:

for f in *.shp; do
  ogr2ogr -f GPKG -update -append \
    -nln "$(basename "$f" .shp)" \
    output.gpkg "$f"
done

The -update -append flags add layers to an existing database; -nln names each new layer after its source file. The result is one .gpkg containing all your former Shapefiles as separate, queryable layers.
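If you would rather drive the batch from a script than from a shell loop, the same commands can be assembled in Python. This sketch only builds the argument lists, using the flags shown above; build_ogr2ogr_commands is an illustrative helper, and actually running the result assumes ogr2ogr is installed and on your PATH.

```python
from pathlib import Path


def build_ogr2ogr_commands(folder, output="output.gpkg"):
    """Return one ogr2ogr argument list per Shapefile found in folder."""
    commands = []
    for shp in sorted(Path(folder).glob("*.shp")):
        commands.append([
            "ogr2ogr", "-f", "GPKG", "-update", "-append",
            "-nln", shp.stem,        # layer name = file name without .shp
            str(output), str(shp),
        ])
    return commands


if __name__ == "__main__":
    # Print the commands for the current directory without executing them.
    for command in build_ogr2ogr_commands("."):
        print(" ".join(command))
```

To execute them, pass each list to subprocess.run(command, check=True). Argument lists avoid shell quoting problems with file names that contain spaces.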

Our online Shapefile to GeoPackage converter handles single uploads of zipped Shapefiles.

Should you switch?

If you're starting a new project, yes — make GeoPackage your default. Use GeoJSON for anything that hits a browser, Shapefile only when an external party demands it, and PostGIS when you need multi-user editing or server-side analysis.

If you have an existing pipeline that's all Shapefile, there's no urgent reason to migrate. But every new dataset you create is a chance to pick the better format, and the cost of conversion is one ogr2ogr command.
