Vector tile serving: Some introductory thoughts & hints for a fast way to serve vector tiles

I. Introduction

In the last few weeks I have played around with and tested some approaches to self-host (and style) vector tiles. In this blog entry I share some of my observations with you. Don't expect a fully designed tutorial; it's more a collection of starting points for your own experiments with serving vector tiles. There are several approaches to serving vector tiles: Mapbox [1] can be used, and GeoServer has a vector tile extension [2]. In my experiments I used tileserver-gl, an open source tile server developed by Klokantech [3]. Tileserver-gl is a great way to quickly serve vector tiles from your own web server, escaping the typical pay-per-view scenarios. Mapbox, for example, today's big player in serving web maps, uses a pay-per-map-view payment model, charging 0.50 US$ per 1,000 map views [4]. With tileserver-gl it is possible to self-host your vector map tiles and later embed them in your web map using e.g. Leaflet.js or OpenLayers. Tileserver-gl also has the great advantage that it supports GL styles, a JSON style specification that basically tells your map how it should look [5].

For those of you who have never heard about vector tiles, I recommend reading this entry of the OpenStreetMap Wiki: [6]. In a nutshell, vector tiles are a vector-based representation of geo-objects. The idea is rather old and appeared shortly after the first GIS systems; it was first used by the US Fish and Wildlife Service in 1975, in their software Wetlands Analytical Mapping System (WAMS) [7]. Vector tiles started to "revolutionize" the way map data is served on the web when Google introduced them to serve Google Maps in 2010 [7].

II. Prerequisites

If you plan to serve some vector tiles and would like to follow the suggestions of this blog post, you need a running Linux server with Docker installed. For my experiments I use a Linux V-Server from Strato [8], a Berlin-based company belonging to 1&1 (as far as I know). They offer some interesting and cheap virtual server packages; I recently switched from UnitedHoster to Strato. Of course, you can use your own server, or an Ubuntu instance from the Amazon AWS marketplace [9]. There are plenty of other hosting companies out there. For some testing, a starter package is normally sufficient; if bandwidth and speed matter, you have to spend more bucks for the fun.

III. Use OpenMapTiles to get OpenStreetMap (OSM) vector tiles

OpenMapTiles [10] allows downloading vector-tiled OSM data in the MBTiles format free of charge for smaller regions; for customized and bigger regions the service charges a fee, e.g. downloading the whole planet.osm in MBTiles format costs 2,048 US$ for business users (1,024 US$ for individual users such as freelancers).

IV. How can you create vector tiles of your own map data?

The makers of OpenMapTiles published a nice introduction on how to create vector tiles using the command line tool Tippecanoe, developed by Mapbox [11]. What isn't answered there is how to install Tippecanoe on a Linux system: first clone the Git repository, then change into the folder the repository was cloned to, compile the software and install it.

git clone https://github.com/mapbox/tippecanoe.git
cd tippecanoe
make -j
sudo make install

After the installation you need files in the GeoJSON format as input for Tippecanoe.

ogr2ogr easily converts between file formats, e.g. from the still widely used ESRI Shapefile to GeoJSON:

ogr2ogr -f GeoJSON your_data_in_4326.json -t_srs EPSG:4326 your_data.shp

Next, Tippecanoe converts the GeoJSON into the MBTiles format, by default covering zoom levels 0 to 14; the MBTiles file can later be served using a tile server.

tippecanoe -o your_data.mbtiles your_data_in_4326.json

What if you just need raster tiles for your web map?

In some cases vector tiles aren't necessary and raster tiles are all you need to quickly visualize something on a slippy map. A quick way to get raster tiles, without zoom-level (scale) dependent rendering of objects, would be: 1. style your data in QGIS, 2. export your styled map as a high-resolution, georeferenced image and 3. tile your data with gdal2tiles:

gdal2tiles.py -s EPSG:4326 -z 10-16 yourdata.png yourdata_xyz

gdal2tiles creates a folder structure with subfolders for each zoom level, depending on how many zoom levels you have defined. For quite a while I used the great software TileMill [12] to style geodata using CartoCSS and to create raster tiles. Unfortunately, active development of TileMill stopped a few years ago.

V. How to install and use tileserver.gl?

First the good news: tileserver-gl is available as a Docker container. In case you want to fight your way through installing the full stack instead of using Docker, you can install the server this way:

  1. Install node.js on your Ubuntu server: DigitalOcean [13] offers a nice introduction on how to install node.js. To use tileserver-gl, at least version 6 of node.js is necessary.
  2. Install tileserver-gl: for the installation I used this tutorial found on "Ralph's blog" (an interesting blog, by the way) [14].

Use Klokantech's Docker container

As mentioned above, it is not necessary to install the full stack; you can also use Klokantech's Docker container. If you haven't installed Docker yet, DigitalOcean also offers a great tutorial on how to install Docker on your Ubuntu server [15]. After installing Docker, start the container from the directory where your MBTiles are located with the command:

 sudo docker run --rm -it -v $(pwd):/data -p 8080:80 klokantech/tileserver-gl -c config.json

The map server's front page, with access to the styles and the data, can be opened via your server's IP and, by default, port 8080. Here you can see the front page of my test server: http://h2800220.stratoserver.net:8080/ The behavior of the map, e.g. the map style, is controlled in the style.json file, which has to be referenced in the configuration file (config.json).
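For orientation, a minimal config.json could look roughly like this. The style key "test", the data key "osm" and the file names are placeholders for your own files, not fixed names; check the tileserver-gl documentation for the full option set:

```json
{
  "options": {
    "paths": {
      "root": "",
      "styles": ".",
      "mbtiles": "."
    }
  },
  "styles": {
    "test": {
      "style": "style.json"
    }
  },
  "data": {
    "osm": {
      "mbtiles": "your_data.mbtiles"
    }
  }
}
```

With a file like this next to your MBTiles, the docker run command above picks it up via the -c config.json flag.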

VI. Style your map with Maputnik

Styles are controlled by changing the style.json file, which has to be referenced in config.json. How can you change this file to get your own style for a custom web map? Styles can be edited using Maputnik [16], an open source style editor for Mapbox styles. Maputnik can be used in the browser as an online editor, but it does not allow adding vector tile sources served over http, only https. On my test server I did not activate https, but the offline version of Maputnik also accepts http data sources. Just download Maputnik from https://github.com/maputnik/editor/releases/tag/v1.5.0 and extract it; then it can be used locally in a browser. I recommend starting with an existing style and adapting it to your needs. After modifying the style, export the style.json, upload it to your server, make sure it is referenced in config.json, and restart the Docker container so that the changes take effect. Just an observation: Maputnik looks and feels a lot like Mapbox Studio [17], and the usage is quite intuitive.
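To give an impression of what such a style.json contains, here is a heavily stripped-down fragment following the GL style specification [5]. The source name, the tile URL and the color are placeholders I made up for illustration:

```json
{
  "version": 8,
  "name": "my-test-style",
  "sources": {
    "openmaptiles": {
      "type": "vector",
      "url": "http://your-server:8080/data/osm.json"
    }
  },
  "layers": [
    {
      "id": "water",
      "type": "fill",
      "source": "openmaptiles",
      "source-layer": "water",
      "paint": { "fill-color": "#294b63" }
    }
  ]
}
```

A real style contains dozens of such layer objects, one per map feature class, which is exactly what Maputnik lets you edit visually.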

And finally I got a self-hosted vector map (of Managua, Nicaragua) with some edits of the nice Dark Matter style (just a product of playing around with the style a little): http://h2800220.stratoserver.net:8080/styles/test/#15/12.14186/-86.28059

In my next blog post I plan to write a little more about the GL style definition.

 

References

[1] https://www.mapbox.com/vector-tiles/

[2] https://docs.geoserver.org/stable/en/user/extensions/vectortiles/

[3] https://github.com/klokantech/tileserver-gl

[4] https://www.mapbox.com/pricing/?utm_source=chko&utm_medium=search&utm_content=MapboxPricePlan&utm_campaign=CHKO-PR01-BR-Mapbox-INT-Exact&gclid=CjwKCAiAlvnfBRA1EiwAVOEgfMMmTtJem8_gOUFz8gE5S3RshxEqdPOON_Z8OiCGzV6Ht1O2BmzI6xoCb5EQAvD_BwE

[5]  https://www.mapbox.com/mapbox-gl-js/style-spec/

[6] https://wiki.openstreetmap.org/wiki/Vector_tiles

[7] https://en.wikipedia.org/wiki/Vector_tiles#History

[8] https://www.strato.de/server/linux-vserver/

[10] https://openmaptiles.com/

[11] https://openmaptiles.org/docs/generate/custom-vector-from-shapefile-geojson/

[12] https://github.com/tilemill-project/tilemill

[13] https://www.digitalocean.com/community/tutorials/how-to-install-node-js-on-ubuntu-18-04

[14] https://golb.hplar.ch/2018/07/self-hosted-tile-server.html

[15] https://www.digitalocean.com/community/tutorials/how-to-install-and-use-docker-on-ubuntu-18-04

[16] https://maputnik.github.io/

[17] https://www.mapbox.com/studio/

 

 

Geocode a list of addresses in 5 minutes: a beginner's guide

Actually, I had first planned to show this little 5-minute guide at our last Maptime Berlin session (September 2018); unfortunately I could not make it, so I thought: let's write a little blog post about geocoding addresses. I will quickly explain what geocoding means, demonstrate how to geocode a list of addresses with the great MMQGIS plugin, and also give some hints as a starting point for more efficient address geocoding.

What is geocoding?

So what is geocoding? Imagine you have a list of addresses and you want to locate them on a map. Converting such an address into a geographic coordinate is called "geocoding". The other way round is called reverse geocoding: it converts a coordinate to a street address. Many geocoding services are out there: ESRI, Google, HERE, Bing and TomTom have their own, and a great open one, Nominatim, is part of the OpenStreetMap project. Some of them are really performant, but also cost a lot.

If you are interested in the principles behind geocoding and the underlying algorithms, I recommend the article by Goldberg et al.:

Goldberg, D. W., Wilson, J. P., & Knoblock, C. A. (2007). From text to geographic coordinates: the current state of geocoding. URISA-WASHINGTON DC-, 19(1), 33, available at: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.135.3589&rep=rep1&type=pdf#page=34

Very simplified, the core of geocoding algorithms is the fuzzy matching of two strings, with some distance calculation between them.

What are applications for geocoding addresses? You can connect virtually any street address with a geographic coordinate. Applications can be found e.g. in geo-marketing, to locate customers.

Where to get the addresses from? Companies such as Schober sell address data. The city you live in normally trades your address data and sells your registration to the German Post. Besides these companies, a great open address service exists: OpenAddresses, an open data address collection: http://openaddresses.io/ The service covers around 500 million addresses. For Berlin it offers good data with around 375,000 entries.

Geocode using the MMQGIS QGIS plugin

So let's dive into the practical part. To demonstrate how to geocode a short list of addresses, I downloaded the list of addresses of all polling stations for Potsdam's mayoral election, which takes place tomorrow (21st of September). The data was published on Potsdam's open data portal: https://opendata.potsdam.de/explore/dataset/wahlbezirke_wahllokale/

As download format I chose "csv", and what I got is a semicolon-separated CSV table. To use the data with the MMQGIS plugin, some data cleaning had to be done. You can use an editor (such as Atom, Gedit or Notepad++) and convert the semicolons to commas (needed by the plugin); just find and replace them. Street name and house number are stored in separate fields and have to be concatenated (e.g. in LibreOffice), otherwise MMQGIS cannot geocode down to the house number. Furthermore, a city field is recommended; otherwise the geocoder will search worldwide and match addresses with the same name. So finally an extract of the list would look like this:
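That cleaning step can also be sketched on the command line. The following awk call swaps the semicolons for commas, concatenates street and house number and adds a city column; the column layout and the sample rows are made up for illustration, not the real Potsdam export:

```shell
# Sample rows mimicking the semicolon-separated download (values invented)
cat > wahllokale.csv <<'EOF'
nr;strasse;hausnummer
1;Musterstrasse;12
2;Beispielweg;3a
EOF

# Swap ";" for ",", build "street number" and add a city column
awk -F';' 'BEGIN { OFS="," }
  NR == 1 { print "nr", "address", "city"; next }
  { print $1, $2 " " $3, "Potsdam" }' wahllokale.csv > wahllokale_clean.csv

cat wahllokale_clean.csv
# nr,address,city
# 1,Musterstrasse 12,Potsdam
# 2,Beispielweg 3a,Potsdam
```

The resulting comma-separated file with one combined address column is what MMQGIS expects as input.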


Next you have to start QGIS and install the MMQGIS plugin. MMQGIS is a really great plugin for vector data manipulation developed by Michael Minn: http://michaelminn.com/linux/mmqgis/

Once you installed the plugin, you can start the GUI with QGIS -> MMQGIS -> Geocode

Two geocoding services can be chosen: the proprietary Google service and Nominatim. The Google service requires an API key. According to their latest price plan, Google charges nothing for up to 1,000 requests; after that they charge 0.50 US cents per 1,000 requests, up to 100,000 requests daily.

Nominatim is open source, so for our demo I chose Nominatim from the pull-down menu. Watch out: geocoding via a GUI is very slow, so you have to be patient and wait a little while. In the end, Nominatim managed to find 103 of the 130 polling stations in Potsdam. Below you can see a screenshot of the result.

Some final remarks: Make it more efficient

Geocoding with MMQGIS is recommended when it has to be really quick and you just have a few addresses. For bigger lists, use a script-based approach. Suitable libraries are, for example, geopy: https://github.com/geopy/geopy For R I found this nice blog post describing how to geocode with R and providing a nice script: http://www.storybench.org/geocode-csv-addresses-r/ I have already used the script several times and it works just fine. I am sure there are more libraries out there that I am not aware of.

For constant, perhaps automated, geocoding of mass data, I recommend a quick and robust service such as Google, or your own Nominatim instance on a server. But how to set up a Nominatim instance? A tutorial can be found here: http://nominatim.org/release-docs/latest/admin/Installation/ Photon also comes to mind for speeding up geocoding tasks: Komoot, a Potsdam-based company that offers navigation for bikers and hikers, provides Photon, an alternative to Nominatim (built on top of Nominatim): https://github.com/komoot/photon

 

Playing around with open addresses

For fun I played around with OpenAddresses, the service I mentioned above, and downloaded the address list of Berlin (~8.7 MB). The list is accessible from the OpenAddresses website: https://s3.amazonaws.com/data.openaddresses.io/runs/491996/de/berlin.zip The CSV contains around 375k addresses of Berlin, and the data already has coordinates (this is way too much for MMQGIS). Just as a thought experiment: if I sent a flyer to each of these addresses, it would cost (according to Deutsche Post) 99 € per 1,000 flyers, so for about 37,125 € I could send a flyer to almost every household in Berlin. OK, long live spam advertising :-).

My book publication: Räumliche Analyse und Visualisierung von Mietpreisdaten - Geoinformatische Studie zur räumlichen Optimierung von Immobilienportalen (Spatial analysis and visualization of rental price data: a geoinformatics study on the spatial optimization of real estate portals)

My dissertation has been published as a book by Springer Spektrum. The publication can be found under the following link: http://www.springer.com/de/book/9783658177737#springer

The table of contents can be downloaded here: http://bit.ly/2p0Aj6F

A product flyer can be found here: http://bit.ly/2ovHeAM

 

About the book

Harald Schernthanner pursues the goal of creating, from a geoinformatics perspective, a conceptual basis for the spatial optimization of real estate portals. The author starts from the assumption that methods of spatial statistics and machine learning for rent price estimation are better suited to the spatial optimization of real estate portals than the hedonic regression methods used so far. He shows that the web-based rent price maps published by real estate portals do not reflect the actual spatial conditions of real estate markets. Alternative web-based forms of representation, such as "grid maps", are superior to the status quo of real estate portals' price maps and visualize the actual spatial distribution of real estate prices more appropriately.

Shell script to automate the download of rental data using WGET, JQ & Cron

Until recently I fetched rental data via a script based on R's jsonlite library, accessing Nestoria's API (a vertical search engine that bundles the data of several real estate portals). My first script had to be executed manually; in the next attempt I started to automate the downloading, but unluckily Nestoria blocked my IP. I admit, I downloaded data excessively, from a static IP, with a cron job that was always executed daily at the same time of day. This resulted in a 403 error (IP forbidden). So together with Nico (@bellackn) an alternative was figured out. Instead of jsonlite, our shell script uses WGET and makes use of the great JQ tool (a command-line JSON processor). Thanks, Nico, for the input and ideas.

Next a few of the most relevant lines of code are explained. The entire code can be seen and downloaded from Github: https://github.com/hatschito/Rental_data_download_shell_script_WGET_JQ

We use the -w 60 and --random-wait flags; this tells WGET to wait a randomized interval around the 60-second base between downloads, which makes the access pattern look less robotic to the server. The area of interest is also defined within the WGET call; the API accepts lat/long in decimal degrees or place names.

wget -w 60 --random-wait -qO- "http://api.nestoria.de/api?country=de&pretty=1&encoding=json&action=search_listings&place_name=$place&listing_type=rent&page=1"

After that, the first page is downloaded. The first page has to be cleaned up with the sed command (the UNIX stream editor). A while loop then downloads the remaining pages; the number of pages to download can be modified. We receive JSON files that have to be parsed into a geodata format.

While Loop:

echo -e "\nOrt: $place\nSeite 1 wird heruntergeladen."
sed '/application/ d' ./rentalprice-$datum.json > clr-rentalprice-$datum.json

i=1
# Insert the number of pages you want to download, here: 2 to 28.
# (Find out how many pages you need/Nestoria offers - the json with "locs" in the
# file name should have just one comma at the end of the file - lower the number
# according to the odd commas - e.g. for Potsdam, it's 28.)
# -----> (You'll also have to comment out the deletion of the locs-file further
# down the script in order to do so...)
while [ $i -le 25 ]
do
  printf "," >> ./rentalprice-locs-$datum.json
  i=$((i + 1))
done

Parse the JSON to CSV:

JQ, a command-line JSON processor, extracts the listings, and in2csv (part of csvkit) converts them to CSV:

jq .response.listings < rental_prices-$place-$datum.json | in2csv -f json > CSV-rental_prices.csv
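jq alone can also emit CSV via its @csv filter, without in2csv. A self-contained toy run; the field names are invented for illustration and do not match Nestoria's real schema:

```shell
# Toy listings array standing in for the API response (field names invented)
echo '[{"title":"Flat A","price":700},{"title":"Flat B","price":850}]' \
  | jq -r '.[] | [.title, .price] | @csv'
# "Flat A",700
# "Flat B",850
```

The -r flag outputs raw strings instead of JSON-encoded ones, so each array becomes one plain CSV row.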

In the following step the data is loaded into a short R script. R's sp library converts the CSV to a shapefile (actually we will skip this part in the next version - GDAL should manage the file conversion).
Back in the shell script, a timestamp with the current date and time is appended to the shapefile. After some cleaning at the end of the shell script, a cron job is finally created to schedule the daily data download. The cron job can also be set up via a GUI: https://wiki.ubuntuusers.de/GNOME_Schedule/
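For reference, the scheduled entry in the crontab could look roughly like this (path, script name and time of day are placeholders, edited via crontab -e):

```
# m h dom mon dow  command - run the download daily at 03:00 and keep a log
0 3 * * * /home/user/rental_download.sh >> /home/user/rental_download.log 2>&1
```

Randomizing the minute field a little, combined with WGET's --random-wait, helps avoid the fixed-daytime pattern that got my IP blocked in the first place.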

The resulting shapefiles are still stored file-based, but I plan to hook the script up to a PostgreSQL database that is already installed on my Linux V-Server.

Feel free to use our script if you are interested in downloading geocoded rental data for your analysis and area of interest. Any feedback or comments are appreciated.

How to get GRASS GIS 7.0.5 working on Mac OS Sierra?

For some image processing tasks (in particular, I needed the GRASS GIS r.report command), and inspired by Peter Löwe's recent blog post about the origins of GRASS GIS, I installed GRASS GIS 7.0.5 on my Mac running macOS Sierra. The plan was to use GRASS GIS within QGIS. The installation can be tricky 😃.

That's why I thought I'd provide a short tutorial gathering all the necessary installation steps, collected from different websites. This should help you have a smooth installation and avoid some possible pitfalls. The most important parts of the installation are well documented on http://grassmac.wikidot.com/. GRASS GIS for Mac can be downloaded here: http://grassmac.wikidot.com/downloads

Several changes and installations have to be made on your Mac running macOS Sierra to get GRASS GIS working:

I. The following frameworks have to be installed:

The frameworks are compiled by William Kyngesburye and can be downloaded from his website: http://www.kyngchaos.com/software:frameworks. Big thanks to William for porting and compiling QGIS and GRASS GIS for all GIS users! The frameworks can also be downloaded from the sources below and should be installed in the order given:

1. GDAL Complete 2.1 (2016-9-17 – 32bit) download
2. GDAL Complete 1.11 (2015-10-26 – 32bit) download
3. FreeType 2.4.12-1 download
4. cairo 1.12.2-1 (Install AFTER GDAL and FreeType) download
5. Numpy 1.8.0-1 download
6. MatPlotLib 1.3.0-3 (32bit) download
7. pandoc 1.13.1 download
8. PIL 1.1.7-4 download
10. Active States TclTk 8.5.7 (only for TclTk NVIZ in GRASS 6) download

II. macOS's System Integrity Protection feature has to be disabled:

The steps to disable System Integrity Protection were taken from http://grassmac.wikidot.com/ (note that disabling SIP lowers your Mac's security; consider re-enabling it with csrutil enable after the installation):

1. Restart your Mac in Recovery Mode. To do this, choose Restart from the Apple menu, and as soon as the screen turns black hold down Command + R on the keyboard until the Apple logo appears on your screen.

2. Select Terminal from the Utilities menu.

3. In the Terminal Window that opens type: csrutil disable
– Press the Return key.
– Choose Restart from the Apple menu.

III. Install Python 2.7 to avoid the well-known error: “Bad CPU type in executable”

This error is well-known and documented: http://bit.ly/2n6BFcl

As a workaround, I (re-)installed the latest version of Python 2.7, as documented by Sylvain Poulain. This worked for me as well.

IV. The Bash profile has to be altered

The following two lines have to be added to the bash profile, because GRASS GIS doesn't seem to get along with 64-bit architectures:

  1. Start up the Terminal.
  2. Type "cd ~/" to go to your home folder.
  3. Type "touch .bash_profile" to create a new file (if it does not exist yet).
  4. Edit .bash_profile with your favorite editor (or just type "open -e .bash_profile" to open it in TextEdit).
  5. Add the following two lines:
     export GRASS_PYTHON=/usr/bin/python2.7
     export GRASS_PYTHONWX=/usr/bin/pythonw2.7
  6. Type ". .bash_profile" to reload .bash_profile and apply the changes.

If everything works, GRASS GIS should start up and can be used from the command line or via its GUI.

V. Configure the GRASS applications folder in QGIS

The way I prefer to use GRASS is within QGIS's processing toolbox. To run GRASS GIS tools within QGIS, the GRASS GIS application folder has to be defined; this avoids the error below.

In my opinion it's very convenient to run GRASS algorithms within QGIS. Just define the path of your GRASS GIS installation via QGIS -> Processing -> Providers -> GRASS7 folder. In my case the path is /Applications/QGIS.app/Contents/MacOS/grass7.

After pointing to the GRASS GIS folder, I can finally use GRASS GIS the way I want, with access to more than 300 image processing algorithms within QGIS and more than 400 within the GRASS GUI!