Importing data

Download an OSM region and build the Barrelman search indexes.

Barrelman's import pipeline downloads an OSM PBF extract from Geofabrik, loads it into PostGIS via osm2pgsql, and builds the search indexes needed for full-text, fuzzy, and spatial queries.

All import tools (osm2pgsql, scripts, Lua styles) are baked into the barrelman-db image — nothing needs to be installed on the host.


1. Download a PBF extract

Copy an OSM PBF file into the barrelman_barrelman-osm-data Docker volume. The easiest way is to download directly inside the running container:

# North Carolina (~400 MB)
docker exec barrelman-db bash -c '
  wget -O /data/region.osm.pbf \
    https://download.geofabrik.de/north-america/us/north-carolina-latest.osm.pbf
'

Other region examples:

# Germany
https://download.geofabrik.de/europe/germany-latest.osm.pbf

# France
https://download.geofabrik.de/europe/france-latest.osm.pbf

# Full United States
https://download.geofabrik.de/north-america/us-latest.osm.pbf

Find all regions at download.geofabrik.de.
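The example URLs above all follow the same pattern: a region path followed by -latest.osm.pbf. As a minimal sketch, assuming that naming holds for the region you want, a small helper can build the URL from the path. The function name geofabrik_url is hypothetical and not part of Barrelman.

```shell
# Hypothetical helper: build a Geofabrik download URL from a region path.
# Assumes the <region-path>-latest.osm.pbf naming seen in the examples above.
geofabrik_url() {
  if [ -z "${1:-}" ]; then
    echo "usage: geofabrik_url <region-path>" >&2
    return 1
  fi
  printf 'https://download.geofabrik.de/%s-latest.osm.pbf\n' "$1"
}
```

For example, geofabrik_url europe/germany prints the Germany URL listed above, which can then be passed to wget in place of the North Carolina URL.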


2. Run the import

Run the import inside the barrelman-db container. Use -d to detach so it survives SSH disconnects. A detached docker exec does not stream output back to the host, so the redirect belongs inside the bash -c command and the log is written inside the container:

docker exec -d barrelman-db bash -c '
  {
    osm2pgsql \
      --create --slim \
      --output=flex \
      --style=/app/import/osm2pgsql-flex.lua \
      -d "$DATABASE_URL" \
      /data/region.osm.pbf \
    && psql "$DATABASE_URL" -f /app/import/post-import.sql \
    && echo IMPORT_COMPLETE \
    || echo IMPORT_FAILED
  } > /tmp/import.log 2>&1
'

Monitor progress:

# Watch the log (written inside the container)
docker exec barrelman-db tail -f /tmp/import.log

# Check row count once the import commits
docker exec barrelman-db psql -U barrelman -d barrelman \
  -c "SELECT count(*) FROM geo_places;"

Import time depends on region size. A US state (~400 MB PBF) takes roughly 20–40 minutes.

Avoid --flat-nodes unless importing a full planet extract. For regional files it creates a ~31 GB sparse file unnecessarily.
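The import command in step 2 ends by echoing either IMPORT_COMPLETE or IMPORT_FAILED, so a script can tell the state of a run from the log alone. The following is a rough sketch of such a check. The function name import_status is hypothetical, and it assumes that neither marker appears anywhere else in the log output.

```shell
# Hypothetical helper: read an import log on stdin and report its state.
# Relies only on the IMPORT_COMPLETE / IMPORT_FAILED markers echoed in step 2.
import_status() {
  log="$(cat)"
  case "$log" in
    *IMPORT_FAILED*)   echo failed ;;
    *IMPORT_COMPLETE*) echo complete ;;
    *)                 echo running ;;   # no marker yet: still in progress
  esac
}
```

Feed it the import log from wherever it was captured, for example import_status < import.log. An output of running only means that no marker has been written yet.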


3. Generate embeddings (optional)

Semantic search uses Ollama vector embeddings. All other search layers (full-text, fuzzy, abbreviation) work without this step.

# From the cloned repo (local dev only)
bun run import:embed

Embedding generation requires Ollama with nomic-embed-text pulled. Set OLLAMA_HOST in .env to point at your Ollama instance. Barrelman silently skips the semantic layer when Ollama is unreachable.
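Because the semantic layer is skipped silently, it can help to confirm that the model is available before running the embed step. This sketch assumes Ollama's GET /api/tags endpoint, which lists locally pulled models as JSON. The function name has_embed_model is hypothetical, and the match is a plain text search rather than a full JSON parse.

```shell
# Hypothetical helper: read the JSON body of Ollama's /api/tags on stdin and
# succeed (exit 0) if a model whose name starts with nomic-embed-text is listed.
has_embed_model() {
  grep -q '"name" *: *"nomic-embed-text'
}
```

One possible use, assuming OLLAMA_HOST holds a full base URL such as http://localhost:11434: curl -fsS "$OLLAMA_HOST/api/tags" | has_embed_model && echo "embedding model ready".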


Import pipeline details

The import runs these steps in sequence:

| Step | Tool | Description |
|------|------|-------------|
| 1 | osm2pgsql | Imports all OSM nodes/ways/relations via flex Lua style into geo_places |
| 2 | post-import.sql | Extracts structured address/contact fields, builds GiST + GIN indexes, computes area_m2 |
| 3 | generate-abbreviations.ts | Pre-computes name_abbrev for autocomplete (e.g. UNCC → UNC Charlotte) |
| 4 | tsvector rebuild | Rebuilds full-text search vectors to include abbreviations |

After import, the geo_places table contains all importable OSM objects with:

| Column | Description |
|--------|-------------|
| centroid | Point geometry for distance queries and spatial indexing |
| geom | Full geometry (polygon/linestring/point) for containment queries |
| tags | Raw OSM tag dictionary (JSONB) |
| categories | Array of id-tagging-schema category IDs |
| address | Structured address extracted from addr:* tags |
| area_m2 | Geographic area in square metres (for administrative areas) |
| ts | Pre-computed tsvector for full-text search |
| embedding | Vector for semantic search (populated by import:embed) |
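
To see how many rows actually carry the derived columns listed above, a coverage query is one option. The sketch below prints such a query. The function name coverage_sql is hypothetical, and it relies only on count(column) counting non-NULL values, so it assumes unpopulated columns are stored as NULL.

```shell
# Hypothetical helper: print a SQL query that counts how many geo_places rows
# have each derived column populated (count(col) ignores NULLs).
coverage_sql() {
  cat <<'SQL'
SELECT count(*)         AS total,
       count(ts)        AS with_tsvector,
       count(address)   AS with_address,
       count(embedding) AS with_embedding
FROM geo_places;
SQL
}
```

It can be run with the same psql invocation used for the row count in step 2, for example docker exec barrelman-db psql -U barrelman -d barrelman -c "$(coverage_sql)". A with_embedding count of 0 would be expected if step 3 was skipped.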

Re-importing

To replace existing data with a fresh PBF:

# Drop and re-import (destructive)
docker exec barrelman-db psql -U barrelman -d barrelman \
  -c "DROP TABLE IF EXISTS geo_places CASCADE;"

# Then re-run step 2 above