Guide · 8 min read · April 22, 2025

How to Validate GIS Files Before Importing into QGIS

Most QGIS import failures have nothing to do with QGIS. They're caused by malformed GIS files that you can catch in 30 seconds with the right validation step.

When QGIS refuses to open a file, the temptation is to blame QGIS. In nine cases out of ten, the file is broken — and you can spot the breakage in advance with a quick validation pass. This guide covers the most common pre-flight checks for each major format.

Why validate first?

QGIS is permissive about some errors and strict about others. A subtly malformed GeoJSON may load with a warning that you miss, then crash an export step half an hour later. A Shapefile with a corrupt .dbf may load empty without any error message at all. A KML with swapped coordinates may render — in the wrong country.

Five minutes of validation up front saves the afternoon you'd otherwise spend tracking down 'why is the analysis broken'.

Shapefile: the bundle is the danger zone

The most common Shapefile problem isn't bad geometry — it's a missing sidecar. The minimum bundle is:

  • .shp (geometry)
  • .shx (positional index)
  • .dbf (attribute table)

A .prj is strongly recommended; without it, QGIS will warn that the CRS is unknown and ask you to assign one.

Common errors:

  1. Files nested in a sub-folder inside the ZIP. Many download portals zip a folder containing the Shapefile, not the Shapefile files themselves. QGIS handles this, but online tools and validators often don't. Always check the ZIP structure first.
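  2. Missing .dbf or .shx. A missing .shx can usually be rebuilt from the .shp (GDAL does this when the SHAPE_RESTORE_SHX configuration option is set to YES). A missing .dbf means the attribute table is gone, even if the geometry is still readable; request a fresh download.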
  3. Encoding garbled. Without a .cpg declaring the codepage, QGIS guesses. The guess is often wrong for non-English data. Add a .cpg containing UTF-8 if your data is UTF-8.
  4. Attribute names truncated weirdly. dBASE limits field names to 10 characters, so any longer name was truncated by whoever exported the file. Check whether you've lost meaningful distinctions: population_2020 and population_2025 both truncate to population, and the exporter then has to tell them apart with an arbitrary suffix.
  5. 2 GB file size cap exceeded. The .shp or .dbf is corrupt past offset 2,147,483,647 bytes. Re-export from the source in a format without this limit.

Validate with our Shapefile validator or run ogrinfo -al -so input.shp locally to see the geometry type, feature count, CRS, and field schema in one shot. Without -al (or an explicit layer name), ogrinfo only lists the layer names.
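The bundle checks above (nested folders, missing sidecars) can be approximated with the Python standard library before the ZIP ever reaches QGIS. The following is a minimal sketch, not a full validator: the function name check_shapefile_zip and the REQUIRED/RECOMMENDED sets are illustrative choices, and it only inspects file names, never file contents.

```python
import zipfile
from pathlib import PurePosixPath

REQUIRED = {".shp", ".shx", ".dbf"}   # minimum bundle
RECOMMENDED = {".prj", ".cpg"}        # CRS and codepage sidecars


def check_shapefile_zip(zip_file):
    """Return a list of problems found in a zipped Shapefile.

    zip_file may be a path or a file-like object, as accepted by zipfile.ZipFile.
    Hypothetical helper for illustration; it looks at member names only.
    """
    with zipfile.ZipFile(zip_file) as zf:
        names = [n for n in zf.namelist() if not n.endswith("/")]

    shp_members = [PurePosixPath(n) for n in names if n.lower().endswith(".shp")]
    if not shp_members:
        return ["no .shp file found in the archive"]

    problems = []
    for shp in shp_members:
        if len(shp.parts) > 1:
            problems.append(f"{shp} is nested in a sub-folder")
        # Sidecars must sit next to the .shp and share its base name.
        siblings = {
            PurePosixPath(n).suffix.lower()
            for n in names
            if PurePosixPath(n).parent == shp.parent
            and PurePosixPath(n).stem.lower() == shp.stem.lower()
        }
        for ext in sorted(REQUIRED - siblings):
            problems.append(f"{shp.name}: missing required {ext}")
        for ext in sorted(RECOMMENDED - siblings):
            problems.append(f"{shp.name}: missing recommended {ext}")
    return problems
```

Calling check_shapefile_zip("download.zip") (a placeholder file name) returns an empty list when the bundle is complete and sits at the top level of the archive.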

GeoJSON: JSON is easy to break

GeoJSON's text format makes it trivial to inspect — and trivial to corrupt by hand-editing.

Common errors:

  1. Invalid JSON. A trailing comma, an unquoted key, a stray BOM at the file start. Running jq . file.geojson quickly identifies syntactic issues with a clear error message.
  2. Unclosed polygon rings. RFC 7946 requires every polygon ring's first and last coordinate to be identical. Many writers get this wrong silently. QGIS will draw the polygon anyway (closing it implicitly) but stricter consumers refuse it.
  3. Mixed geometry types in a FeatureCollection. Legal per spec, but breaks tools that assume uniform layers. If a downstream tool needs uniform layers, split the features by geometry type into separate files.
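  4. Coordinate ranges out of bounds. GeoJSON positions are [longitude, latitude]: longitudes must be in [-180, 180], latitudes in [-90, 90]. Values far outside those ranges usually mean projected coordinates (metres rather than degrees); a latitude slightly outside [-90, 90] usually means swapped coordinates. A swap is not always caught by a range check: [53.5, 10.0] instead of [10.0, 53.5] is still in range, so also compare a sample point against a known location.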
  5. Legacy crs member. GeoJSON 2008 allowed a crs object at the FeatureCollection level. RFC 7946 dropped it; the format is locked to WGS 84. Modern validators flag this as a warning; most tools ignore it but the data may not be where you think.
  6. File too large. Browsers struggle past 50–100 MB. If your file is bigger, convert to GeoPackage or TopoJSON before loading.

Validate with our GeoJSON validator or ogrinfo -al -so file.geojson. For very large files, python -c "import json; json.load(open('file.geojson'))" confirms JSON validity without loading into a GIS.
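Several of the GeoJSON checks above (ring closure, coordinate ranges, the legacy crs member) are simple enough to script. The sketch below uses only the standard library; check_geojson and its helper names are hypothetical, it assumes a FeatureCollection as input, and it skips GeometryCollection and null geometries for brevity.

```python
import json


def iter_positions(coords):
    """Yield every [x, y, ...] position from an arbitrarily nested coordinates array."""
    if coords and isinstance(coords[0], (int, float)):
        yield coords
    else:
        for item in coords:
            yield from iter_positions(item)


def polygon_rings(geometry):
    """Yield the linear rings of a Polygon or MultiPolygon geometry."""
    if geometry["type"] == "Polygon":
        yield from geometry["coordinates"]
    elif geometry["type"] == "MultiPolygon":
        for polygon in geometry["coordinates"]:
            yield from polygon


def check_geojson(text):
    """Return a list of problems for a GeoJSON FeatureCollection given as a string."""
    data = json.loads(text.lstrip("\ufeff"))  # tolerate a leading BOM
    problems = []
    for index, feature in enumerate(data.get("features", [])):
        geometry = feature.get("geometry")
        if not geometry or "coordinates" not in geometry:
            continue  # null geometry or GeometryCollection: skipped in this sketch
        for lon, lat, *_ in iter_positions(geometry["coordinates"]):
            if not (-180 <= lon <= 180 and -90 <= lat <= 90):
                problems.append(f"feature {index}: position out of range [{lon}, {lat}]")
        for ring in polygon_rings(geometry):
            # A closed ring needs at least 4 positions with first == last.
            if len(ring) < 4 or ring[0] != ring[-1]:
                problems.append(f"feature {index}: unclosed or too-short ring")
    if "crs" in data:
        problems.append("legacy crs member present")
    return problems
```

The range test only flags values that are impossible as degrees. A swapped pair whose values both fall within [-90, 90] passes it, so a visual spot-check against a known location is still needed.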

KML: XML, CDATA, and the coordinate trap

KML's XML structure is forgiving — Google Earth opens almost anything labelled KML — but the permissiveness hides issues that bite downstream tools.

Common errors:

  1. Coordinate order swapped. KML expects lon,lat,alt separated by commas. A handwritten or AI-generated KML with lat,lon,alt will render points in completely wrong locations. Sample-check a few coordinates against a known location.
  2. Namespace omitted or wrong. The root <kml> element needs xmlns="http://www.opengis.net/kml/2.2". Without it, strict parsers refuse the file.
  3. CDATA inside descriptions broken. Description blocks often contain HTML wrapped in <![CDATA[...]]>. A misplaced ]]> ends CDATA prematurely and breaks the rest of the file.
  4. KMZ uploaded instead of KML. KMZ is a zipped KML. Validators expect plain .kml — extract the inner doc.kml first.
  5. Empty Placemarks. A <Placemark> without a geometry child is legal XML but useless. QGIS will skip them; counts may surprise you.

Validate with our KML validator or ogrinfo -al -so file.kml. Run xmllint --noout file.kml to confirm XML well-formedness.
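The namespace, coordinate, and empty-Placemark checks can also be scripted. This is a rough sketch using Python's xml.etree.ElementTree; check_kml and GEOMETRY_TAGS are illustrative names, and the list of geometry element names is an assumption that covers the common cases rather than the full KML schema.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"
# Element names treated as geometry in this sketch (not an exhaustive list).
GEOMETRY_TAGS = {"Point", "LineString", "LinearRing", "Polygon",
                 "MultiGeometry", "Model", "Track", "MultiTrack"}


def local_name(element):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return element.tag.rsplit("}", 1)[-1]


def check_kml(text):
    """Return a list of problems found in a KML document given as a string."""
    root = ET.fromstring(text)  # raises ParseError if the XML is not well-formed
    problems = []
    if root.tag != f"{{{KML_NS}}}kml":
        problems.append(f"unexpected root element {root.tag!r}")

    for element in root.iter():
        name = local_name(element)
        if name == "coordinates":
            # Tuples are whitespace-separated; each one is lon,lat[,alt].
            for tuple_text in (element.text or "").split():
                parts = tuple_text.split(",")
                lon, lat = float(parts[0]), float(parts[1])
                if not (-180 <= lon <= 180 and -90 <= lat <= 90):
                    problems.append(f"suspicious coordinate {tuple_text}")
        elif name == "Placemark":
            child_names = {local_name(child) for child in element}
            if not child_names & GEOMETRY_TAGS:
                problems.append("Placemark without geometry")
    return problems
```

As with GeoJSON, a lat,lon swap is only flagged when the swapped value is impossible as a latitude, so this complements the manual sample check rather than replacing it.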

GeoPackage: SQLite under the hood

GeoPackage's SQLite foundation makes most validation trivial — if the file opens in DB Browser for SQLite without error, the database itself is fine.

Common errors:

  1. File is not actually a GeoPackage. Someone renamed a SQLite database to .gpkg. It lacks gpkg_contents and the other mandatory metadata tables. Open it in DB Browser and verify those tables exist.
  2. Spatial layer registered but empty. gpkg_contents lists a feature table that contains zero rows. Often a sign of an interrupted export.
  3. Geometry column references undefined SRS. A row in gpkg_geometry_columns points to an srs_id that doesn't exist in gpkg_spatial_ref_sys. The layer will load but spatial operations fail.
  4. File not VACUUMed after large deletes. The file is much larger than the data warrants. Open in DB Browser and run VACUUM.
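  5. Corrupted geometry blobs. Rare but possible, for example after a power loss during a write. SQLite's PRAGMA integrity_check; catches corruption at the database level; a blob that SQLite stores intact but that is not a valid GeoPackage geometry only shows up when a GIS tool actually reads the features.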

Validate with our GeoPackage validator or ogrinfo -al -so file.gpkg to list all layers with their geometry types, CRSs, and feature counts.
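Because a GeoPackage is a SQLite database, the metadata checks above translate directly into SQL. The sketch below uses Python's built-in sqlite3 module; check_geopackage is a hypothetical helper, and it assumes a vector GeoPackage, so it treats gpkg_geometry_columns as required alongside gpkg_contents and gpkg_spatial_ref_sys.

```python
import sqlite3


def check_geopackage(conn):
    """Return a list of problems for an open sqlite3 connection to a .gpkg file."""
    tables = {row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")}
    # Assumption: a vector GeoPackage, so all three metadata tables are expected.
    required = {"gpkg_contents", "gpkg_spatial_ref_sys", "gpkg_geometry_columns"}
    missing = required - tables
    if missing:
        return [f"missing metadata table {name}" for name in sorted(missing)]

    problems = []
    if conn.execute("PRAGMA integrity_check").fetchone()[0] != "ok":
        problems.append("SQLite integrity_check failed")

    # Geometry columns whose srs_id is not defined in gpkg_spatial_ref_sys.
    orphans = conn.execute(
        "SELECT g.table_name, g.srs_id FROM gpkg_geometry_columns AS g "
        "LEFT JOIN gpkg_spatial_ref_sys AS s ON s.srs_id = g.srs_id "
        "WHERE s.srs_id IS NULL").fetchall()
    for table_name, srs_id in orphans:
        problems.append(f"{table_name}: undefined srs_id {srs_id}")

    # Feature tables registered in gpkg_contents that hold no rows.
    registered = conn.execute(
        "SELECT table_name FROM gpkg_contents WHERE data_type = 'features'").fetchall()
    for (table_name,) in registered:
        if table_name not in tables:
            problems.append(f"{table_name}: registered but table is missing")
        elif conn.execute(f'SELECT COUNT(*) FROM "{table_name}"').fetchone()[0] == 0:
            problems.append(f"{table_name}: registered but empty")
    return problems
```

For a file on disk, pass sqlite3.connect("file.gpkg"). Note that sqlite3.connect creates an empty database if the path does not exist, so confirm the file is there first.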

A 30-second pre-flight checklist

Before importing any file into QGIS:

  1. Open it in a validator (online or ogrinfo locally).
  2. Confirm feature count matches what you expect (not zero, not absurdly large).
  3. Check the CRS — known and sensible for the region.
  4. Look at coordinate ranges — within valid bounds, in the right hemisphere.
  5. Spot-check one geometry by clicking through to see attribute values.

If any of those fail, fix the file before importing. Loading a broken file into QGIS often leaves silent residue — invalid features in your project, incorrect joins, misaligned layers — that surface as bugs much later.
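If you run this checklist often, the pass/fail logic is easy to encode once you have the numbers from a validator or ogrinfo. The sketch below is deliberately detached from any GIS library: LayerSummary is a hypothetical container you would fill in yourself, the extent is assumed to be in degrees, and the max_features threshold is an arbitrary default you should adjust to your data.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

BBox = Tuple[float, float, float, float]  # (min_lon, min_lat, max_lon, max_lat)


@dataclass
class LayerSummary:
    """Hypothetical container for values read from a validator or ogrinfo output."""
    feature_count: int
    crs: Optional[str]  # e.g. "EPSG:4326", or None when the CRS is unknown
    extent: BBox        # assumed to be in degrees for this sketch


def preflight(summary, expected_region, max_features=1_000_000):
    """Return the checklist items that fail; an empty list means nothing was flagged."""
    failures = []
    if not 0 < summary.feature_count <= max_features:
        failures.append(f"unexpected feature count: {summary.feature_count}")
    if summary.crs is None:
        failures.append("CRS is unknown")

    min_lon, min_lat, max_lon, max_lat = summary.extent
    if min_lon < -180 or max_lon > 180 or min_lat < -90 or max_lat > 90:
        failures.append("extent outside valid lon/lat bounds")
    else:
        r_min_lon, r_min_lat, r_max_lon, r_max_lat = expected_region
        overlaps = (min_lon <= r_max_lon and max_lon >= r_min_lon
                    and min_lat <= r_max_lat and max_lat >= r_min_lat)
        if not overlaps:
            failures.append("extent does not overlap the expected region")
    return failures
```

The region test is what catches data that is in range but in the wrong place, such as a layer whose latitude and longitude were swapped on export. It does not replace the final step of spot-checking a feature by eye.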

Beyond the file: post-load validation

Once loaded into QGIS, run the Geometry Validity check from Vector → Geometry Tools → Check Validity. This finds self-intersecting polygons, duplicate vertices, and ring orientation errors that are valid per the file format but invalid per OGC Simple Features rules. Many spatial operations (intersection, buffering, union) fail on geometrically invalid features even when the file itself is valid.
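To see why a file can pass format validation and still fail geometry validation, consider a "bow-tie" polygon: its ring is closed, so it is acceptable GeoJSON, but two of its edges cross. The sketch below is a simplified illustration of that one rule, not how QGIS or GEOS implement validity checking. It only detects edges that properly cross (touching or overlapping edges are ignored) and it compares every pair of edges, so it is only suitable for small rings.

```python
def _orientation(a, b, c):
    """Sign of the cross product (b - a) x (c - a): 1 left turn, -1 right turn, 0 collinear."""
    cross = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (cross > 0) - (cross < 0)


def _segments_cross(p1, p2, p3, p4):
    """True when segments p1-p2 and p3-p4 cross at a single interior point."""
    return (_orientation(p1, p2, p3) * _orientation(p1, p2, p4) < 0
            and _orientation(p3, p4, p1) * _orientation(p3, p4, p2) < 0)


def ring_self_intersects(ring):
    """Check a closed ring (first point == last point) for crossing edges.

    Adjacent edges share an endpoint, which yields an orientation of 0,
    so they are never reported as crossing.
    """
    segments = list(zip(ring, ring[1:]))
    for i in range(len(segments)):
        for j in range(i + 1, len(segments)):
            if _segments_cross(*segments[i], *segments[j]):
                return True
    return False


square = [(0, 0), (2, 0), (2, 2), (0, 2), (0, 0)]
bowtie = [(0, 0), (2, 2), (2, 0), (0, 2), (0, 0)]  # edges cross at (1, 1)
print(ring_self_intersects(square))  # False
print(ring_self_intersects(bowtie))  # True
```

Both rings are closed and would pass a ring-closure check; only the second is invalid under OGC Simple Features rules, which is the kind of feature Check Validity reports.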
