Schernthanner

Dr. rer. nat. Harald Schernthanner

Web-scrape and geocode rental data

As a fun project, I look into rental data from time to time and run some analyses. But what good is even the fanciest analysis when your database is messy or, as happened in my case, the database ceases to exist?

My former source for rental data, the Nestoria API, "dried up" some months ago, as the service is no longer maintained.
So I was looking for a new way to download rental data and stumbled upon a really nice 2018 entry on the blog "Statis Quo", which explains how to scrape rental data with the Python library Beautiful Soup. That code had to be updated, because Immobilienscout24 has changed how it structures the data on its website. I rewrote it and soon had a nice, efficient way to scrape the rental portal's web pages and download the data as CSV. So the first part of the code is mainly from Statis Quo; I got it running again with my changes (which took me a nice afternoon, about three espressos and some sweat :-)).
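
To give an idea of the scrape-to-CSV step, here is a minimal sketch with Beautiful Soup. It is not the code from the repository: the HTML snippet, the class names (`listing`, `title`, `rent`, `address`) and the helper names `parse_listings` and `rows_to_csv` are made up for illustration and do not reflect the real markup of any rental portal.

```python
import csv
import io

# Hypothetical markup: these class names are invented for this example and
# are NOT the real structure of any rental portal.
SAMPLE_HTML = """
<ul>
  <li class="listing">
    <span class="title">Bright 2-room flat</span>
    <span class="rent">850 EUR</span>
    <span class="address">Musterstrasse 1, 10115 Berlin</span>
  </li>
  <li class="listing">
    <span class="title">Studio near the park</span>
    <span class="rent">620 EUR</span>
    <span class="address">Beispielweg 7, 14467 Potsdam</span>
  </li>
</ul>
"""

FIELDS = ["title", "rent", "address"]


def parse_listings(html):
    """Return one dict per listing found in the HTML string."""
    # Imported here so that rows_to_csv also works without bs4 installed.
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for item in soup.select("li.listing"):
        row = {}
        for field in FIELDS:
            tag = item.select_one(f"span.{field}")
            # A missing field becomes an empty string instead of raising.
            row[field] = tag.get_text(strip=True) if tag else ""
        rows.append(row)
    return rows


def rows_to_csv(rows):
    """Serialise the listing dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Calling `rows_to_csv(parse_listings(SAMPLE_HTML))` returns the CSV text, which can then be written to a file. In a real scraper the HTML would come from an HTTP request, and the selectors would have to be adapted whenever the portal changes its page structure.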


The scraping part was not enough for my needs, so I extended the code with a geocoding part. Geopy is used to geocode the records in the CSV with Nominatim, and the result is exported as a GeoPackage.
At the moment I use the code to scrape and geocode data daily. The data is later cleaned, filtered and imported into a PostgreSQL database, where it serves as the basis for several models.
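
The geocoding step could look roughly like the sketch below. Again, this is not the repository code but an illustration under assumptions: the column name `address`, the layer name `rentals` and the helper names `geocode_rows`, `make_nominatim_geocoder` and `write_geopackage` are all invented here, and GeoPandas is only my assumption for writing the GeoPackage. The geocoder is passed in as a plain function, so the row logic can be tried out without hitting the Nominatim server.

```python
def geocode_rows(rows, geocode):
    """Return copies of the rows with 'lat' and 'lon' added.

    `geocode` is any callable that takes an address string and returns an
    object with .latitude and .longitude, or None if nothing was found
    (this is how geopy's Nominatim.geocode behaves).
    """
    geocoded = []
    for row in rows:
        location = geocode(row["address"])  # 'address' column is an assumption
        geocoded.append({
            **row,
            "lat": location.latitude if location else None,
            "lon": location.longitude if location else None,
        })
    return geocoded


def make_nominatim_geocoder(user_agent):
    """Build a rate-limited Nominatim geocode function with geopy."""
    from geopy.extra.rate_limiter import RateLimiter
    from geopy.geocoders import Nominatim

    geolocator = Nominatim(user_agent=user_agent)
    # At most one request per second, to go easy on the public server.
    return RateLimiter(geolocator.geocode, min_delay_seconds=1)


def write_geopackage(rows, path, layer="rentals"):
    """Write the successfully geocoded rows to a GeoPackage (needs geopandas)."""
    import geopandas as gpd

    located = [r for r in rows if r["lat"] is not None]
    gdf = gpd.GeoDataFrame(
        located,
        geometry=gpd.points_from_xy(
            [r["lon"] for r in located], [r["lat"] for r in located]
        ),
        crs="EPSG:4326",  # Nominatim returns WGS84 coordinates
    )
    gdf.to_file(path, layer=layer, driver="GPKG")
```

A run would then chain the pieces, for example `write_geopackage(geocode_rows(rows, make_nominatim_geocoder("my-rental-scraper")), "rentals.gpkg")`, where the user agent string and the file name are placeholders. Rows that could not be geocoded keep `None` coordinates and are skipped on export.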

Feel free to use the Python script if you would also like to download some real estate data. The code is available on GitHub: https://github.com/hatschito/scrape_geocode_rental_data/
