Who's On First — Bundles

Bundles (of Who's On First data) are still considered experimental. They are being actively used (by Mapzen itself) but they are still a work in progress, so some things may not be quite right yet.

A bundle is a collection of GeoJSON-formatted files (Who's On First data) grouped by a specific property, like placetype. Bundles make it easier to bulk download a subset of the entire Who's On First dataset. Currently there are only bundles by placetype, but eventually we will add a variety of different slices of the data as demand and interest require.

There is only ever one bundle per property at any given moment. When a new bundle is generated it replaces the old one. If you need to test whether a bundle has changed without downloading the bundle itself, the best way is to compare the SHA1 hash of the meta file for that property.

For example, if the bundle in question is wof-region-latest-bundle.tar.bz2 then its corresponding meta file will be wof-region-latest.csv.

You can determine the SHA1 hash for the meta file either by downloading the meta file directly and calculating its hash locally, or by downloading the wof-region-latest.csv.sha1.txt file, which is a plain text file containing the value of the hash.
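
The check described above can be sketched in shell as follows. This is a minimal sketch, not an official tool: the function names are hypothetical, it assumes the .sha1.txt file holds just the hash value (possibly followed by whitespace), and it assumes that either sha1sum or shasum is installed for the local calculation.

```shell
#!/bin/sh

# Base URL taken from the bundle locations mentioned in this document.
BASE_URL="https://whosonfirst.mapzen.com/bundles"

# Hypothetical helper: derive the meta file name from a bundle file name,
# e.g. wof-region-latest-bundle.tar.bz2 -> wof-region-latest.csv
meta_for_bundle() {
    echo "$1" | sed 's/-bundle\.tar\.bz2$/.csv/'
}

# Hypothetical helper: compute the SHA1 hash of a local file, using
# whichever of sha1sum (common on Linux) or shasum (common on OS X) exists.
local_sha1() {
    if command -v sha1sum >/dev/null 2>&1
    then
        sha1sum "$1" | awk '{ print $1 }'
    else
        shasum -a 1 "$1" | awk '{ print $1 }'
    fi
}

# Hypothetical helper: fetch the published hash for a bundle's meta file.
# Assumption: the .sha1.txt file holds just the hash value.
remote_sha1() {
    curl -s -f "${BASE_URL}/$(meta_for_bundle "$1").sha1.txt" | awk '{ print $1 }'
}

# Succeeds (exit status 0) when the local meta file is missing or its hash
# differs from the published one. Note that a failed download yields an
# empty hash, which will also be reported as a change.
bundle_has_changed() {
    META=$(meta_for_bundle "$1")
    [ ! -f "${META}" ] || [ "$(local_sha1 "${META}")" != "$(remote_sha1 "$1")" ]
}
```

With these functions loaded, something like bundle_has_changed wof-region-latest-bundle.tar.bz2 && echo "changed" would report whether the region bundle is worth fetching again, provided a copy of wof-region-latest.csv from the previous download is kept in the current directory.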

What follows is the list of bundles for administrative placetypes that are part of the whosonfirst-data repository. There are actually 560 individual bundles (the reasons why are discussed in detail elsewhere), so we've made a handy index file for all the bundles. It lives at https://whosonfirst.mapzen.com/bundles/index.txt and is a line-separated text file with the name of every available bundle. All the same naming conventions for meta files and SHA1 hashes, discussed above, apply to the bundles listed in the index file.
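
As a sketch of how the index file might be used, the following reads a line-separated list of bundle names and prints the matching meta file and hash file names for each one. It assumes that every line of index.txt is a bundle file name ending in -bundle.tar.bz2, which is inferred from the naming convention above and not confirmed here. The function name is hypothetical.

```shell
#!/bin/sh

# Hypothetical helper: read bundle names on standard input (one per line)
# and print "bundle meta-file hash-file" for each of them.
list_bundle_files() {
    while IFS= read -r BUNDLE
    do
        # Skip blank lines.
        if [ -z "${BUNDLE}" ]
        then
            continue
        fi

        # Apply the naming convention: drop "-bundle.tar.bz2", add ".csv".
        META=$(echo "${BUNDLE}" | sed 's/-bundle\.tar\.bz2$/.csv/')
        echo "${BUNDLE} ${META} ${META}.sha1.txt"
    done
}
```

For instance, curl -s https://whosonfirst.mapzen.com/bundles/index.txt | list_bundle_files would print one line per available bundle, which could then be fed into whatever download or comparison step you prefer.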

Placetypes

Common

Common Optional

Optional

Working with Bundles

Bundles all have the same directory structure: a single data directory and a wof-PLACETYPE-OR-LATEST-BUNDLE-NAME-latest.csv comma-separated "meta" file. It is left up to individual users of the data to decide whether and how data from multiple bundles should be merged.
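
Before merging, it can be useful to confirm that an expanded bundle has the layout described above. The following is a minimal sketch with a hypothetical function name. It assumes the meta file sits at the top level of the expanded bundle directory, alongside the data directory.

```shell
#!/bin/sh

# Hypothetical helper: succeed (exit status 0) only if the directory given
# as $1 contains a data directory and at least one wof-*-latest.csv file.
check_bundle_layout() {
    if [ ! -d "$1/data" ]
    then
        return 1
    fi

    for META in "$1"/wof-*-latest.csv
    do
        # If the pattern matched nothing it is left unexpanded, so this
        # test fails and the function falls through to "return 1".
        if [ -f "${META}" ]
        then
            return 0
        fi
    done

    return 1
}
```

A call such as check_bundle_layout wof-region-latest-bundle || echo "unexpected layout" could be run after expanding an archive and before copying its records anywhere else.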

Here is a working example script that will download the bundles for continents, countries, regions, localities and neighbourhoods and merge the records from each into a single data folder, located in the directory that the script is run from.

Please note that there is little to no error-checking in the example below so you should adjust this to taste. It also assumes that you are comfortable working on the command-line and that your computer has copies of the tar, curl and rsync utilities installed (this is typically true for Linux and OS X computers).

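#!/bin/sh

# Placetypes to download, and the folder that merged records are copied into.
PLACETYPES='continent country region locality neighbourhood'
DATA="data"

# Create the shared data folder if it does not already exist.
if [ ! -d "${DATA}" ]
then
    mkdir "${DATA}"
fi

for PT in ${PLACETYPES}
do

    BUNDLE="wof-${PT}-latest-bundle"
    COMPRESSED="${BUNDLE}.tar.bz2"

    # Remove any previously downloaded archive (rm -i asks for confirmation).
    if [ -e "${COMPRESSED}" ]
    then
        echo "remove ${COMPRESSED}"
        rm -i "${COMPRESSED}"
    fi

    # Remove any previously expanded bundle (again, with confirmation).
    if [ -d "${BUNDLE}" ]
    then
        echo "remove ${BUNDLE}"
        rm -ri "${BUNDLE}"
    fi

    echo "fetch ${COMPRESSED}"
    curl -s -o "${COMPRESSED}" "https://whosonfirst.mapzen.com/bundles/${COMPRESSED}"

    echo "expand ${COMPRESSED}"
    tar -xvjf "${COMPRESSED}"

    # Copy the bundle's records into the shared data folder.
    echo "sync ${BUNDLE}"
    rsync -av "${BUNDLE}/data/" "${DATA}/"

done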