Everything about OpenStreetMap.

Data Structure

OSM's data model can be split into OSM elements and OSM changesets.

OSM elements:

  1. Type:
    1. Node: geographic coordinates defining a point.
    2. Way: an ordered list of node IDs representing a line connecting multiple nodes.
      • Closed way: a way whose first and last nodes are identical.
      • Area: a closed way that either carries no tag defaulting it to a line (highway, barrier, junction=roundabout) or is explicitly tagged with area=yes.
    3. Relation: an ordered list of member entity IDs with optional roles that defines their logical or geographic relationships.
  2. Metadata: common attributes of OSM entities.
    • id: entity ID, unique only within an entity type (and even that is not universally guaranteed); integers starting from 1 (node IDs need 64 bits). OSM IDs are not permanent: every split of a way leaves one half with a different ID from the original way.
    • visible: indicator of whether the entity is deleted or not; deleted entities are only returned by history calls.
    • version: edit version of the entity, in integers starting from 1.
    • changeset: an integer id for a group of changes made by a single user over a short period of time.
    • timestamp: time of the last modification in W3C date and time format (a subset of ISO 8601).
    • user: name of the last editor.
    • uid: OSM id of the last editor, integer.
  3. Tag: key-value pairs describing custom attributes of an entity; both key and value are character strings with a maximum length of 255 characters.
    • Keys can be qualified with prefixes, infixes, or suffixes, forming namespaces, e.g. lanes:bus:forward.
    • Values can be a number, free text, or one or more categorical values delimited by semicolons.
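
The area rule and the tag conventions above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the set of "linear by default" keys is limited to the three examples named in the list, whereas real consumers typically use longer lists.

```python
# Keys that keep a closed way linear unless area=yes is set.
# Assumption: only the examples named in the list above are covered here.
LINEAR_BY_DEFAULT = {"highway", "barrier"}


def is_closed(node_refs):
    """A way is closed when its first and last node IDs are identical."""
    return len(node_refs) > 2 and node_refs[0] == node_refs[-1]


def is_area(node_refs, tags):
    """Apply the area rule to one way (node ID list plus tag dict)."""
    if not is_closed(node_refs):
        return False
    if tags.get("area") == "yes":
        return True
    if tags.get("junction") == "roundabout":
        return False
    return not any(key in tags for key in LINEAR_BY_DEFAULT)


def split_key(key):
    """Split a namespaced key such as lanes:bus:forward into its parts."""
    return key.split(":")


def split_values(value):
    """Split a semicolon-delimited tag value into individual values."""
    return [part.strip() for part in value.split(";") if part.strip()]
```

For example, `split_key("lanes:bus:forward")` yields the three namespace parts, and a closed way tagged highway=pedestrian only counts as an area once area=yes is added.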

A changeset is a special OSM entity holding metadata about data edits, including id fields (id, user, uid), two timestamps (created_at, closed_at), special attributes (open, comments_count) and custom attributes via tags. The actual edits corresponding to a changeset are not part of the OSM data model, and are typically derived from OSM files using custom software.

A simplified example of OSM data represented in XML (the entire file should be enclosed in an <osm> tag pair).

<node id="42437959" visible="true" version="12" changeset="13150866"
timestamp="2012-09-18T02:14:34Z" user="RoadGeek_MD99" uid="475877"
lat="40.7229823" lon="-73.9885488">
    <tag k="highway" v="traffic_signals"/>
</node>
<node id="42443513" visible="true" version="5" changeset="5155439"
timestamp="2010-07-07T00:47:51Z" user="Dylan Semler" uid="31855"
lat="40.7232770" lon="-73.9884510">
    <tag k="highway" v="traffic_signals"/>
</node>
<way id="5672851" visible="true" version="37" changeset="41811184"
timestamp="2016-08-30T20:49:09Z" user="infinitesunrise" uid="4434293">
    <nd ref="42437959"/>
    <nd ref="42443513"/>
    <tag k="highway" v="primary"/>
    <tag k="name" v="1st Avenue"/>
    <tag k="oneway" v="yes"/>
</way>
<relation id="1077653" visible="true" version="3" changeset="21582260"
timestamp="2014-04-09T04:29:57Z" user="lxbarth" uid="589596">
    <member type="way" ref="22898643" role="from"/>
    <member type="way" ref="34080174" role="to"/>
    <member type="node" ref="42428201" role="via"/>
    <tag k="restriction" v="only_straight_on"/>
    <tag k="type" v="restriction"/>
</relation>
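
As a rough sketch of how such a file can be read, the following Python uses only the standard library to load nodes and ways and to resolve a way's node references into coordinates. The embedded XML is a trimmed copy of the example above wrapped in an <osm> element; the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the example above, enclosed in an <osm> element.
OSM_XML = """<osm version="0.6">
  <node id="42437959" version="12" lat="40.7229823" lon="-73.9885488">
    <tag k="highway" v="traffic_signals"/>
  </node>
  <node id="42443513" version="5" lat="40.7232770" lon="-73.9884510">
    <tag k="highway" v="traffic_signals"/>
  </node>
  <way id="5672851" version="37">
    <nd ref="42437959"/>
    <nd ref="42443513"/>
    <tag k="highway" v="primary"/>
    <tag k="name" v="1st Avenue"/>
    <tag k="oneway" v="yes"/>
  </way>
</osm>"""


def read_tags(element):
    """Collect the <tag k=... v=...> children of an element into a dict."""
    return {tag.get("k"): tag.get("v") for tag in element.findall("tag")}


def parse_osm(xml_text):
    """Return (nodes, ways) dictionaries keyed by integer ID."""
    root = ET.fromstring(xml_text)
    nodes = {}
    for node in root.findall("node"):
        nodes[int(node.get("id"))] = {
            "lat": float(node.get("lat")),
            "lon": float(node.get("lon")),
            "tags": read_tags(node),
        }
    ways = {}
    for way in root.findall("way"):
        ways[int(way.get("id"))] = {
            "refs": [int(nd.get("ref")) for nd in way.findall("nd")],
            "tags": read_tags(way),
        }
    return nodes, ways


def way_coordinates(way, nodes):
    """Resolve a way's ordered node references into (lat, lon) pairs."""
    return [(nodes[ref]["lat"], nodes[ref]["lon"]) for ref in way["refs"]]


nodes, ways = parse_osm(OSM_XML)
```

A way stores only node IDs, so its geometry is available only after the referenced nodes have been loaded; this is why nodes are read first.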

An OSM changeset represented in XML.

<changeset id="1077653" user="aeonesa" uid="61599" open="false" comments_count="0"
created_at="2009-05-04T18:36:38Z" closed_at="2009-05-04T19:36:38Z">
    <tag k="created_by" v="Potlatch 0.11a"/>
</changeset>
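
A minimal Python sketch for reading this changeset record follows. It assumes the timestamp layout shown in the example (UTC with a trailing Z) and that closed_at may be absent while a changeset is still open; the function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

CHANGESET_XML = """<changeset id="1077653" user="aeonesa" uid="61599" open="false" comments_count="0"
created_at="2009-05-04T18:36:38Z" closed_at="2009-05-04T19:36:38Z">
    <tag k="created_by" v="Potlatch 0.11a"/>
</changeset>"""


def parse_time(text):
    """Parse a timestamp such as 2009-05-04T18:36:38Z as an aware UTC datetime."""
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def summarize_changeset(xml_text):
    """Extract the ID, state, editor and open duration of one changeset."""
    element = ET.fromstring(xml_text)
    tags = {tag.get("k"): tag.get("v") for tag in element.findall("tag")}
    created = parse_time(element.get("created_at"))
    closed_text = element.get("closed_at")  # assumed missing while still open
    closed = parse_time(closed_text) if closed_text else None
    return {
        "id": int(element.get("id")),
        "open": element.get("open") == "true",
        "editor": tags.get("created_by"),
        "duration_seconds": (closed - created).total_seconds() if closed else None,
    }


summary = summarize_changeset(CHANGESET_XML)
```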

A simple OSM edit in osmChange format (.osc):

<osmChange version="0.6" generator="acme osm editor">
    <modify>
        <node id="1234" changeset="42" version="2" lat="12.1234567" lon="-8.7654321">
            <tag k="amenity" v="school"/>
        </node>
    </modify>
</osmChange>
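
The structure of an osmChange document can be walked with a short Python sketch like the one below. Only the modify action appears in the example above; the sketch assumes that other action blocks (such as create and delete) follow the same shape, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

OSC_XML = """<osmChange version="0.6" generator="acme osm editor">
    <modify>
        <node id="1234" changeset="42" version="2" lat="12.1234567" lon="-8.7654321">
            <tag k="amenity" v="school"/>
        </node>
    </modify>
</osmChange>"""


def list_changes(osc_text):
    """Return (action, element type, id, version) for every changed element."""
    root = ET.fromstring(osc_text)
    changes = []
    for action in root:          # each child is an action block, e.g. <modify>
        for element in action:   # each grandchild is a node, way or relation
            changes.append((
                action.tag,
                element.tag,
                int(element.get("id")),
                int(element.get("version")),
            ))
    return changes


changes = list_changes(OSC_XML)
```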

Tags

OSM Cheat Sheet: tags and keyboard shortcuts.

Names:

  • name, the primary tag used for common names;
  • ref, reference numbers or codes.
  • alt_name, less common names.

Road classification: highway is the main tag (confusingly) used to classify any kind of road, street or path:

  • Restricted access road: motorway (one-directional by default);
    • ramps or motorway junctions: motorway_link (speed depends on curvature).
  • Standard road network: trunk (high capacity road without limited access), primary, secondary, tertiary;
    • link roads: trunk_link, primary_link, secondary_link, tertiary_link.
  • Smaller road: residential (in residential area), unclassified (minor public roads, non-residential);
  • Special type: living_street (slow traffic has absolute right-of-way), service (in private property or parking lots), pedestrian (normally forbidden for motor vehicles), road (classification unknown);
  • Tagged on node: traffic_signals, stop (stop signs), bus_stop, mini_roundabout;
  • May be tagged on areas, i.e. area=yes or type=multipolygon.

Other tags that can classify ways as part of a road network:

  • lanes (lanes:forward, lanes:backward) indicate number of lanes (in a given direction);
  • turn:lanes (left, through, right) and parking:lane indicate turning lanes and parking lanes;
  • bridge, tunnel for bridges and tunnels;
  • junction=roundabout, a single lane roundabout (one-directional);

Restrictions:

  • access additionally describes legal access to an entity via all or particular forms of transport.
  • maxspeed, maximum legal speed limit (default in kilometers per hour).
  • oneway (yes, no, -1).
  • hgv, heavy goods vehicle.
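
A router has to turn these tags into usable values. The sketch below interprets oneway (including the implied one-way behaviour of motorways and roundabouts noted above) and maxspeed. The handling of a "mph" suffix goes beyond what is listed here and is an assumption about common tagging practice; the function names are hypothetical.

```python
def travel_directions(tags):
    """Return (forward_allowed, backward_allowed) relative to the way's node order."""
    oneway = tags.get("oneway")
    if oneway == "yes":
        return True, False
    if oneway == "-1":  # one-way against the node order
        return False, True
    if oneway == "no":
        return True, True
    # No explicit oneway tag: motorways and roundabouts are one-directional.
    if tags.get("highway") == "motorway" or tags.get("junction") == "roundabout":
        return True, False
    return True, True


def maxspeed_kmh(value):
    """Convert a maxspeed value to km/h, or return None if it is not numeric."""
    parts = value.strip().split()
    try:
        number = float(parts[0])
    except (ValueError, IndexError):
        return None
    if len(parts) == 1:
        return number  # unit omitted: kilometers per hour by default
    if len(parts) == 2 and parts[1] == "mph":  # assumed suffix convention
        return number * 1.609344
    return None
```

An explicit oneway=no overrides the implied default, so the explicit values are checked before the highway and junction tags.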

Spatial relation:

  • tagged as is_in;
  • inside an area tagged as place (continent, country, state, region, county, city, town, village, hamlet, suburb, island);
  • inside an area tagged as boundary=administrative and admin_level=8;

barrier, a physical structure that blocks or impedes movement (toll_booth, etc).

Relation

Turn restriction: member roles from, to, via, and tag type=restriction.

  • tag restriction
    • prohibitory: no_right_turn, no_left_turn, no_u_turn, no_straight_on;
    • mandatory: only_right_turn, only_left_turn, only_straight_on;
    • dead end: no_entry, no_exit;
  • tag except, exception by vehicle type.
  • tags for conditional restriction: day_on, day_off, hour_on, hour_off.
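
As an illustration, the following Python sketch reads a restriction relation (a copy of the one in the Data Structure example) and classifies it by the value groups listed above. The function name and the returned dictionary layout are hypothetical.

```python
import xml.etree.ElementTree as ET

RELATION_XML = """<relation id="1077653" visible="true" version="3" changeset="21582260">
    <member type="way" ref="22898643" role="from"/>
    <member type="way" ref="34080174" role="to"/>
    <member type="node" ref="42428201" role="via"/>
    <tag k="restriction" v="only_straight_on"/>
    <tag k="type" v="restriction"/>
</relation>"""

DEAD_END_VALUES = {"no_entry", "no_exit"}


def parse_restriction(xml_text):
    """Return members by role and the restriction kind, or None for other relations."""
    relation = ET.fromstring(xml_text)
    tags = {tag.get("k"): tag.get("v") for tag in relation.findall("tag")}
    if tags.get("type") != "restriction":
        return None
    members = {}
    for member in relation.findall("member"):
        entry = (member.get("type"), int(member.get("ref")))
        members.setdefault(member.get("role"), []).append(entry)
    value = tags.get("restriction", "")
    if value in DEAD_END_VALUES:
        kind = "dead end"
    elif value.startswith("no_"):
        kind = "prohibitory"
    elif value.startswith("only_"):
        kind = "mandatory"
    else:
        kind = "unknown"
    return {"value": value, "kind": kind, "members": members}


restriction = parse_restriction(RELATION_XML)
```

Members are grouped into lists per role because a role can in principle occur more than once in a relation.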

Multipolygon: member roles outer, inner, and tag type=multipolygon.

Other relations: route (a collection of ways, type=route), boundary.

File Format

File formats:

  • OpenStreetMap canonical formats
    • XML: .osm, data file with at most one version per object (may also be history file); .osh, history file, multiple versions of an object are allowed; .osc, osmChange file.
    • Protocol-buffer Binary Format (PBF): .pbf, the primary binary format based on protocol buffers.
    • O5M/O5C: a flat binary format that is rarely used in practice; .o5m for data files and .o5c for osmChange files.
  • Third-party specified formats
    • Object Per Line: .opl, a text format with one object per line, designed by osmium for command line processing.
    • DEBUG: .debug, a text format more human-readable than the XML and OPL formats, by osmium.
    • Vanilla Extract eXchange format: .vex, a binary format for better compression ratio and processing speed, invented by Conveyal.

OSM data files are almost always ordered in a specific way: nodes, ways, relations; each group ordered by ascending ID. History files additionally order each element by ascending version. Change files are usually ordered by changeset ID.
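
The ordering described above amounts to a three-part sort key. A minimal Python sketch, using plain dictionaries as stand-ins for OSM objects (the names and the extra version 11 of the node are hypothetical, added only to show history ordering):

```python
TYPE_RANK = {"node": 0, "way": 1, "relation": 2}


def sort_key(obj):
    """Order by type (nodes, ways, relations), then ID, then version."""
    return (TYPE_RANK[obj["type"]], obj["id"], obj.get("version", 0))


def is_canonically_ordered(objects):
    """Check that a sequence of objects already follows the usual file order."""
    keys = [sort_key(obj) for obj in objects]
    return all(a <= b for a, b in zip(keys, keys[1:]))


objects = [
    {"type": "way", "id": 5672851, "version": 37},
    {"type": "node", "id": 42443513, "version": 5},
    {"type": "relation", "id": 1077653, "version": 3},
    {"type": "node", "id": 42437959, "version": 12},
    {"type": "node", "id": 42437959, "version": 11},  # hypothetical older version
]
ordered = sorted(objects, key=sort_key)
```

Tools that stream a file in a single pass often rely on this order, so a check like `is_canonically_ordered` can be useful before processing data from an unknown source.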

Protocol buffers is a language-neutral, platform-neutral extensible mechanism for serializing structured data. Delta coding and variable-byte coding are applied throughout.

Characteristics of OSM PBF:

  1. multi-level structure: compressed Blobs - PrimitiveBlocks - PrimitiveGroups - OSM entities
  2. mapping of one Protobuf type to each OSM entity type is difficult due to the “dense nodes” data type and parallel arrays in OSM data.
  3. String tables replacing repetitive strings are redundant as Gzip will do an equivalent job.
  4. Separate Protobuf specifications are used for file block structure and OSM data, which requires implementers to write auxiliary code.
  5. The most effective Protobuf technique applied in PBF is varints for delta-coded fixed-precision coordinates.
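
To make the last point concrete, here is a small Python sketch of delta coding followed by zigzag and varint encoding, applied to fixed-precision coordinates. It illustrates the technique only and is not a PBF reader or writer: it assumes coordinates are stored as integers in units of 1e-7 degrees, and all function names are hypothetical.

```python
SCALE = 10_000_000  # assumed fixed precision: 1e-7 degree per unit


def zigzag(n):
    """Map signed integers to unsigned so small negatives stay small."""
    return n * 2 if n >= 0 else -n * 2 - 1


def unzigzag(z):
    """Inverse of zigzag."""
    return (z >> 1) ^ -(z & 1)


def encode_varint(value):
    """Encode a non-negative integer in 7-bit groups, low bits first."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varints(data):
    """Decode a byte string holding consecutive varints."""
    values, current, shift = [], 0, 0
    for byte in data:
        current |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7
        else:
            values.append(current)
            current, shift = 0, 0
    return values


def encode_coordinates(degrees):
    """Delta-code fixed-precision coordinates, then zigzag and varint them."""
    encoded, previous = bytearray(), 0
    for value in degrees:
        units = round(value * SCALE)
        encoded += encode_varint(zigzag(units - previous))
        previous = units
    return bytes(encoded)


def decode_coordinates(data):
    """Return the fixed-precision integer units stored in the byte string."""
    units, total = [], 0
    for z in decode_varints(data):
        total += unzigzag(z)
        units.append(total)
    return units


packed = encode_coordinates([40.7229823, 40.7232770])
```

The two latitudes from the XML example fit into 7 bytes here, because the second value is stored as a small difference from the first; two fixed 64-bit integers would take 16 bytes.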

Tools

Converting map data between OSM and external formats:

  • Import
    • JOSM plugin OpenData for Shapefile, KML, GML, CSV and others.
    • Merkaartor can directly import Shapefile (and write OSM).
    • GPSBabel can write to OSM file (including tags like created_by=GPSBabel-1.5.2).
    • osm-and-geojson for GeoJSON;
    • gml2osm for GML and csv2osm for CSV.
  • Export
    • GDAL/OGR has an OSM driver with read-only support, so ogr2ogr can write OSM data to any supported format.
    • QGIS can convert an OSM file into a SpatiaLite DB file, and export selected tags and geometry types.
    • osmtogeojson and OSM2GEO can also export to GeoJSON.
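
To show what such an export involves, the following Python sketch converts already-parsed nodes and ways into a GeoJSON FeatureCollection. It is a simplified illustration, not how osmtogeojson or OSM2GEO work internally: relations and areas are ignored, and the input dictionaries and function name are hypothetical.

```python
import json


def to_geojson(nodes, ways):
    """Build a GeoJSON FeatureCollection from node and way dictionaries."""
    features = []
    for node_id, node in nodes.items():
        if node["tags"]:  # untagged nodes usually only give shape to ways
            features.append({
                "type": "Feature",
                "id": f"node/{node_id}",
                "properties": node["tags"],
                # GeoJSON positions are [longitude, latitude]
                "geometry": {"type": "Point", "coordinates": [node["lon"], node["lat"]]},
            })
    for way_id, way in ways.items():
        coordinates = [[nodes[ref]["lon"], nodes[ref]["lat"]] for ref in way["refs"]]
        features.append({
            "type": "Feature",
            "id": f"way/{way_id}",
            "properties": way["tags"],
            "geometry": {"type": "LineString", "coordinates": coordinates},
        })
    return {"type": "FeatureCollection", "features": features}


nodes = {
    42437959: {"lat": 40.7229823, "lon": -73.9885488, "tags": {"highway": "traffic_signals"}},
    42443513: {"lat": 40.7232770, "lon": -73.9884510, "tags": {"highway": "traffic_signals"}},
}
ways = {
    5672851: {"refs": [42437959, 42443513], "tags": {"highway": "primary", "name": "1st Avenue"}},
}
collection = to_geojson(nodes, ways)
geojson_text = json.dumps(collection, indent=2)
```

Note the coordinate order: OSM attributes are usually written lat then lon, while GeoJSON expects longitude first.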

Web Services

OpenStreetMap.org services:

  • Entity viewer via endpoint /<type>/<id>;
  • The OSM Editing API (API v0.6) is a RESTful HTTP API via endpoint /api/0.6;
  • History via suffix /history.
  • iD is the default online map editor for OSM via endpoint /edit;

API v0.6 calls:

  • OSM and API Metadata
    • API Capabilities: /capabilities (0.25 sq degree, 300 seconds timeout);
    • API Connection Permissions: /permissions (not authorized, Basic Auth, OAuth);
    • OSM User Info: /user query by id with endpoints for the current authenticated user (gpx_files, details, preferences);
    • Geo-referenced text notes: /notes to edit (create, comment, close, reopen) and read (by bbox and text search);
  • OSM data:
    • Elements: /<type>/<id> to edit (create, update, delete) and read (history or specific version) an individual element; fetch referencing (ways, relations) or referenced (full) elements; fetch multiple elements via /<type>s with query "(nodes|ways|relations)=id[,id...]".
    • Extract: /map with query "bbox=left,bottom,right,top";
    • Changeset: /changeset/<id> to edit (create, upload, update, expand bounding box, close), read (get, download), or discuss (comment, subscribe, unsubscribe) individual changeset; fetch multiple /changesets with query (user, display_name; bbox, time, open/closed; changesets)
  • Raw GPS Traces: /trackpoints to query GPS track points (bbox, page) as GPX files; /gpx to upload (create) and download (details, data).
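
The read calls above are plain URLs, so they can be assembled without any client library. The sketch below only builds the URLs and performs no network access. The base URL is an assumption, the plural path and parameter names for multi-fetch reflect my understanding of API v0.6, and the 0.25 square degree limit is the value quoted for /capabilities above, which a real client should read from the server instead of hard-coding.

```python
API_BASE = "https://api.openstreetmap.org/api/0.6"  # assumed base URL
MAX_BBOX_AREA = 0.25  # square degrees, as quoted for /capabilities above
ELEMENT_TYPES = {"node", "way", "relation"}


def map_url(left, bottom, right, top):
    """URL of a bounding-box extract: /map?bbox=left,bottom,right,top."""
    if not (left < right and bottom < top):
        raise ValueError("bbox must satisfy left < right and bottom < top")
    if (right - left) * (top - bottom) > MAX_BBOX_AREA:
        raise ValueError("bbox exceeds the maximum area for /map")
    return f"{API_BASE}/map?bbox={left},{bottom},{right},{top}"


def element_url(kind, element_id, suffix=""):
    """URL of one element; suffix may be e.g. 'history', 'full' or a version number."""
    if kind not in ELEMENT_TYPES:
        raise ValueError(f"unknown element type: {kind}")
    url = f"{API_BASE}/{kind}/{element_id}"
    return f"{url}/{suffix}" if suffix != "" else url


def multi_fetch_url(kind, ids):
    """URL fetching several elements of one type in a single call."""
    if kind not in ELEMENT_TYPES:
        raise ValueError(f"unknown element type: {kind}")
    plural = kind + "s"
    return f"{API_BASE}/{plural}?{plural}=" + ",".join(str(i) for i in ids)
```

For instance, `element_url("way", 5672851, "full")` addresses a way together with the nodes it references, and `element_url("node", 42437959, "history")` addresses all versions of a node.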

Overpass API is a read-only HTTP API for extracting elements from an OpenStreetMap database. Two query languages are available for the Overpass API: Overpass XML and Overpass QL. Overpass Turbo is a Web frontend of Overpass API.
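
As a small illustration of Overpass QL, the Python sketch below builds a query for ways with a given highway value inside a bounding box and prepares it as a form-encoded request body. Nothing is sent over the network. The endpoint URL is an assumption (one public instance), and the function names are hypothetical. Note that Overpass orders a bounding box as south, west, north, east.

```python
from urllib.parse import urlencode

OVERPASS_URL = "https://overpass-api.de/api/interpreter"  # assumed public endpoint


def highway_query(south, west, north, east, highway="primary", timeout=25):
    """Overpass QL: ways with highway=<value> in the bbox, plus their nodes."""
    return (
        f"[out:json][timeout:{timeout}];"
        f'way["highway"="{highway}"]({south},{west},{north},{east});'
        "out body;>;out skel qt;"
    )


def request_body(query):
    """Form-encode the query under the 'data' parameter for an HTTP POST."""
    return urlencode({"data": query})


query = highway_query(40.72, -73.99, 40.73, -73.98)
body = request_body(query)
```

The `>` statement recurses down from the selected ways to their member nodes, which are otherwise not included in the result.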

planet.osm.org provides weekly dumps of the entire OpenStreetMap database in one file commonly referred to as Planet.osm. OSM data is provided in text (/planet/planet-latest.osm.bz2) and binary (/pbf/planet-latest.osm.pbf) formats. Diffs (change files) produced with osmosis are organized under /replication/(day|hour|minute), where state.txt provides metadata about the most recent change file and /AAA/BBB/CCC.osc.gz is the change file with sequence number AAABBBCCC. To keep a local copy current, use osmosis to retrieve and apply all replication diffs published since its last run. Changeset metadata is available as a full dump at /planet/changesets-latest.osm.bz2 and as diffs under /replication/changesets. Planet.osm has several mirrors.
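
The mapping from a sequence number to a change file path can be sketched in Python as follows. The state.txt parser assumes a simple key=value layout with backslash-escaped colons, and the sample text is made up for illustration; neither is taken from an actual server response.

```python
def replication_path(sequence):
    """Map sequence number AAABBBCCC to the relative path AAA/BBB/CCC.osc.gz."""
    if not 0 <= sequence <= 999_999_999:
        raise ValueError("sequence number must have at most nine digits")
    digits = f"{sequence:09d}"  # left-pad with zeros to nine digits
    return f"{digits[0:3]}/{digits[3:6]}/{digits[6:9]}.osc.gz"


def parse_state(text):
    """Parse key=value lines, skipping comments (assumed state.txt layout)."""
    state = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        state[key] = value.replace("\\:", ":")  # undo escaped colons
    return state


# Hypothetical sample, for illustration only.
SAMPLE_STATE = "#illustrative sample\nsequenceNumber=1234567\ntimestamp=2016-08-30T20\\:49\\:02Z\n"

state = parse_state(SAMPLE_STATE)
latest_path = replication_path(int(state["sequenceNumber"]))
```

A client that remembers the last sequence number it applied can fetch every path from that number plus one up to the number in state.txt, in order.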

Regional extracts are available from third parties and are updated periodically. GeoFabrik.de provides daily extracts for continents, countries and sub-country regions, also available as Shapefile. BBBike.org provides weekly extracts for cities and regions as Shapefile, CSV, SVG, OPL and formats for offline navigation (Garmin, Navit, maps.me, OsmAnd, mapsforge); it also offers custom bounding box or polygon extracts within 24M sq km and 768MB file size. BBBike.org extracts set the version of all map entities to -1 and all timestamps to 1969, which can cause trouble. Metro Extracts by Mapzen provides weekly bounding box extracts of popular cities and regions as Shapefile and GeoJSON files, split by geometry type of features (lines, points, polygons) or by logical groups (Roads, etc.) of OpenStreetMap tags (no relations); custom bounding box extracts with update support are available using the Mapzen API; coastline geometries of each area are also available as Shapefile.

OSM turn restrictions:

Standalone Executables

osmosis is a command line Java application for processing OSM data, which can extract data inside a bounding box or polygon.

osmconvert can convert and process OSM files faster than osmosis but has less functionality; it offers a few special functions (--all-to-nodes, --complex-ways and --out-statistics).

osmupdate can create planet change files and update OSM data files with them; faster than osmosis but skips history diffs and cannot update databases.

osmium-tool is a command line tool built on libosmium that can convert/concatenate files, derive/apply/merge changes, validate/recreate IDs, and extract headers, objects, or historical views.

JOSM (Java OpenStreetMap Editor) is an offline editor for OSM data with many plugins. Its validator can check and fix invalid data. JOSM plugins for fixing missing turn restrictions: ScoutSigns (road signs by Scout users), ImproveOsm (missing geometries and turn restrictions, by Scout), turnrestrictions (by skobbler).

Merkaartor is another OpenStreetMap editor, written in C++.

KeepRight, data consistency checks (quality assurance) for OSM.

PostgreSQL loaders: osm2pgsql; osm2pgrouting, which imports OSM topology into a PostgreSQL database; and other Java programs.

SQLite/SpatiaLite for OSM

Libraries & Frameworks

Programming frameworks for accessing, processing, map rendering (static, interactive), geocoding, and navigation with OpenStreetMap data. The Debian GIS Blend meta-package for OpenStreetMap already includes many of these tools and libraries.

Accessing:

  • osm-common (Java): accessing, processing, geocoding; supports Overpass.
  • osmapi (Python): Python wrapper for the OpenStreetMap API.
  • osmaR (R): access OpenStreetMap data from file or API, and convert to other classes.
  • Data access APIs for Ruby and PHP are also available.

Data processing / parsing:

  • C/C++: osmium (which also has Python bindings, pyosmium, and Node.js bindings, node-osmium); pbf2osm, osmpbf;
  • Go: Gosmparse, osmpbf;
  • Java: osmosis, osm4j, BasicOSMParser, OSMemory;
  • Python: imposm (import OSM to PostgreSQL/PostGIS), osmread;

Navigation:

  • OSRM (C++), Open Source Routing Machine, used by Mapbox Directions.
  • Valhalla (C++), Mapzen Turn-by-Turn backend.
  • GraphHopper (Java), used by OpenStreetMap.org for bike and pedestrian routing.

Geocoding: Gisgraphy (Java)

Map rendering:

  • Web map interface: Leaflet, OpenLayers 3 (supports rotation);
  • WebGL (supports camera position (bearing, pitch) and lighting): Mapbox GL JS, Tangram;
  • HTML5 Canvas: Kothic JS, Cartagen
  • SVG: Kartograph (Python, JavaScript)
