Getting data out of OpenStreetMap (OSM) presents more technical challenges than putting data into OSM. When you put data into OSM, you can choose from a number of different editors. You can use any tags you want, although you should of course try to follow established tagging conventions.
In contrast, when you get data out of OSM, you have to deal with the following:
Complicating matters is the fact that OSM returns data in its own structure of XML, which is not immediately readable by many GIS applications. Therefore, getting data from OSM often involves converting from this XML into some other format.
There are a variety of mechanisms for downloading OSM data. The easiest ones address the challenges by providing a way to filter the tags you want, allowing you to specify the output format, and allowing you to specify a geographic bounding box for the requested data, so you don't retrieve too much.
One of the most user-friendly GUI-oriented ways that I have found for retrieving OSM data is a server at BBBike.org (http://extract.bbbike.org [1]). This little web-based tool allows you to draw a bounding box interactively and specify the output format you want. After a while, you receive an e-mail with a link to download your data.
In the walkthrough, however, we'll use the OSM download mechanism that is built directly into QGIS. Although this way is a little more advanced than the BBBike extract service, it is more immediate and gives you greater flexibility over the amount of data and the tags selected.
Examine the image below of Cayenne, French Guiana. You'll notice that the city has detailed building footprint polygons available. Let's suppose that we want to get a shapefile of these building footprints using QGIS.
Note that we have defined our three pieces of essential information to filter the OSM data we want:
Follow these steps to get the data using QGIS:
When doing this, be careful that you don't specify a bounding box larger than you need, or you could end up with an extraordinary amount of data.
The bounding coordinates must be supplied in WGS 1984 lat/lon format or the tool will not work. It may take a bit of detective work to figure out these coordinates before you launch QGIS.
cayenne.osm as shown above and click OK. Wait while your data is downloaded. At the time of this writing, the size was around 23 MB.
cayenne.osm.db SpatiaLite file. Be aware that the .db extension may not be visible in Windows Explorer, but if the file shows up in the file browser dialog, then you are okay.
cayenne_polygons
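The bounding coordinates that the steps above ask for can be estimated from a center point if you know roughly how large an area you need. The following is a minimal sketch, not part of the QGIS workflow: the function name bbox_around and the constant of about 111.32 km per degree of latitude are assumptions made for illustration, and the result is only an approximation that degrades near the poles.

```python
import math

# Rough length of one degree of latitude in kilometers (an approximation).
KM_PER_DEGREE_LAT = 111.32


def bbox_around(lat, lon, half_size_km):
    """Return an approximate WGS 1984 box (west, south, east, north).

    lat/lon is the center in decimal degrees; half_size_km is the distance
    from the center to each edge of the box. Hypothetical helper.
    """
    if not -90 < lat < 90:
        raise ValueError("latitude must be strictly between -90 and 90")
    dlat = half_size_km / KM_PER_DEGREE_LAT
    # A degree of longitude shrinks with the cosine of the latitude.
    dlon = half_size_km / (KM_PER_DEGREE_LAT * math.cos(math.radians(lat)))
    return (lon - dlon, lat - dlat, lon + dlon, lat + dlat)
```

For example, bbox_around(4.93, -52.30, 5.0) gives a box of roughly 10 km on a side around central Cayenne. Keeping the half size small is an easy way to avoid requesting more data than you need.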
Behind any data retrieval mechanism from OSM is a web service request. You can send these requests directly from your web browser or an automated program using an OSM query API. One of the most powerful of these APIs is called Overpass [2]. Try the following:
http://www.overpass-api.de/api/xapi_meta?*[building=yes][bbox=-52.35,4.88,-52.25,4.98]
Notice what this is requesting... It should look familiar.
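The request above has two parts: a tag filter and a bounding box given as west, south, east, north in decimal degrees. As a rough sketch of how such a URL could be assembled and sanity-checked in Python, consider the following. The function name build_xapi_url is hypothetical, and the endpoint is simply the one used in the URL above.

```python
# Endpoint copied from the request shown above.
XAPI_ENDPOINT = "http://www.overpass-api.de/api/xapi_meta"


def build_xapi_url(key, value, bbox):
    """Build an XAPI-style request URL for one tag inside a bounding box.

    bbox is (west, south, east, north) in WGS 1984 decimal degrees.
    Hypothetical helper for illustration only.
    """
    west, south, east, north = bbox
    # Catch swapped or out-of-range coordinates before sending a request.
    if not (-180 <= west < east <= 180):
        raise ValueError("expected -180 <= west < east <= 180")
    if not (-90 <= south < north <= 90):
        raise ValueError("expected -90 <= south < north <= 90")
    return f"{XAPI_ENDPOINT}?*[{key}={value}][bbox={west},{south},{east},{north}]"


cayenne_url = build_xapi_url("building", "yes", (-52.35, 4.88, -52.25, 4.98))
```

Here cayenne_url reproduces the Cayenne buildings request. Checking the coordinate order up front helps because latitude and longitude are easy to swap.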
You can use Python or other scripting languages to make these requests automatically. For example, here's how you could use Python to query OSM for all the farmers' markets in Philadelphia and save them to a .osm file. (You're not required to run this code).
import urllib.request

workspace = "C:\\data\\OSMdev\\"

# Query the Overpass API through its XAPI-style endpoint
url = "http://www.overpass-api.de/api/xapi_meta?*%5Bshop=farm%5D%5Bbbox=-75.29,39.86,-74.95,40.15%5D"
marketsXml = urllib.request.urlopen(url).read()  # the response arrives as bytes in Python 3

# Write the farmers' markets to a .osm file (binary mode, since we have bytes)
marketsPath = workspace + "markets.osm"
with open(marketsPath, "wb") as marketsFile:
    marketsFile.write(marketsXml)
For Python junkies: The above code uses a library called urllib, which can make web requests and read the responses. You just have to provide the URL for the request. The "[" and "]" characters are percent-encoded as %5B and %5D, respectively, because square brackets are reserved characters in URLs; otherwise the query has the same syntax as the one you issued above for Cayenne buildings. The resulting XML is then written to a file using the standard Python write method.
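You do not have to type the %5B and %5D sequences by hand. As a small sketch, the quote function in Python 3's urllib.parse can produce them for you. The function name encode_predicates is hypothetical, and the list of characters left unescaped ("*", "=" and ",") is a choice made here so the result matches the URL used above.

```python
from urllib.parse import quote, unquote


def encode_predicates(predicates):
    """Percent-encode the query part of an XAPI-style request.

    Square brackets become %5B and %5D, while "*", "=" and "," are left
    alone so the output looks like the URL in the script above.
    """
    return quote(predicates, safe="*=,")


encoded = encode_predicates("*[shop=farm][bbox=-75.29,39.86,-74.95,40.15]")
# unquote reverses the encoding, which is handy for checking a URL by eye.
decoded = unquote(encoded)
```

The value of encoded is the same query string that appears after the "?" in the farmers' markets request.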
A script like this might be useful if you wanted to update one or more datasets on a periodic basis. The script could be combined with GDAL processing to get the data into a format suitable for your web map. Recent versions of GDAL (1.10 and later) can read OSM XML and convert it to different formats, such as GeoJSON or shapefiles. (Be careful with shapefiles though, because GDAL plops most of the less common "other tags" into one field that gets cut off at 254 characters, a limitation of the shapefile format.)
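One way to combine the download script with GDAL is to call GDAL's ogr2ogr command-line utility from Python. The sketch below only assembles the command. It assumes GDAL is installed and on your path, and that you want the multipolygons layer, which is one of the layers GDAL's OSM driver exposes. The function name and the file names are hypothetical.

```python
def build_ogr2ogr_command(osm_path, out_path, layer="multipolygons",
                          driver="GeoJSON", where=None):
    """Return the argument list for converting one OSM layer with ogr2ogr.

    ogr2ogr expects the destination before the source, followed by the
    names of the layers to copy. Hypothetical helper for illustration.
    """
    command = ["ogr2ogr", "-f", driver]
    if where is not None:
        # Optional attribute filter, written as an OGR SQL WHERE clause.
        command += ["-where", where]
    command += [out_path, osm_path, layer]
    return command


command = build_ogr2ogr_command("markets.osm", "markets.geojson", layer="points")
```

To actually run the conversion you could pass the list to subprocess.run(command, check=True) from the standard library. Building the command as a list rather than a single string avoids quoting problems with Windows paths.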
As an exclamation point at the end of all this geekiness, play around with the graphical tool overpass turbo [3] for a few minutes. This gives you an interactive environment for querying OSM and seeing the results on the map. You can save any interesting result in popular formats, such as KML. This is helpful if you just want to make a one-off query to OSM for some particular feature type.
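Queries in overpass turbo are written in the Overpass query language (Overpass QL) rather than the URL syntax used earlier. As a hedged sketch, here is how an equivalent of the Cayenne buildings request might be generated in Python. The function name is hypothetical and the 60-second timeout is an arbitrary choice. Note that Overpass QL lists the bounding box as south, west, north, east, which differs from the order in the URL.

```python
def build_overpass_query(key, value, bbox, timeout_seconds=60):
    """Return Overpass QL text selecting ways with one tag inside a box.

    bbox is (west, south, east, north), the same order as the URL syntax;
    it is reordered below because Overpass QL wants south, west, north, east.
    Hypothetical helper for illustration.
    """
    west, south, east, north = bbox
    return (
        f"[out:xml][timeout:{timeout_seconds}];\n"
        f'way["{key}"="{value}"]({south},{west},{north},{east});\n'
        "out body;\n"   # output the matching ways with their tags
        ">;\n"          # recurse down to the nodes those ways reference
        "out skel qt;"  # output those nodes so the geometry can be drawn
    )


query = build_overpass_query("building", "yes", (-52.35, 4.88, -52.25, 4.98))
```

You could paste the text of query into the overpass turbo editor and run it to see the building outlines on the map.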
There are many circumstances and needs that can affect the way you retrieve data from OSM. Hopefully, this walkthrough has provided enough options that you can make an informed decision about how to best get the scope and scale of data you need. Now let's go to the lesson assignment where you'll get some experience with the other side of things: putting data into OSM.
Links
[1] http://extract.bbbike.org
[2] http://wiki.openstreetmap.org/wiki/Overpass_API
[3] http://overpass-turbo.eu/