Building Geospatial Visualisations In Python

There are many tools that you can use to draw geographic charts in Python. This short guide is meant to show you the fundamental concepts that apply to most of these tools.


  • The three most important things you need to plot a geographic map are:

    • Geometry - Describes the geographical boundaries of the location(s)

      • This is usually provided as a Shapefile or a GeoJSON file describing the shape of the locations using polygons
    • Values - The information to be displayed per location, e.g. population

      • Typically available as CSV or other structured data formats
    • A geo-visualisation library
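
To make the first two ingredients concrete, here is a minimal sketch that uses only the Python standard library. The region names, coordinates and numbers are made up for illustration; real boundaries would come from a Shapefile or GeoJSON file and real values from a CSV file.

```python
import csv
import io

# Geometry: a GeoJSON FeatureCollection with one polygon per location.
# (Illustrative unit squares, not real boundaries.)
geometry = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"location": "region_a"},
            "geometry": {
                "type": "Polygon",
                "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]]],
            },
        },
        {
            "type": "Feature",
            "properties": {"location": "region_b"},
            "geometry": {
                "type": "Polygon",
                "coordinates": [[[1, 0], [2, 0], [2, 1], [1, 1], [1, 0]]],
            },
        },
    ],
}

# Values: one row per location, as it might appear in a CSV file.
values_csv = "location,value\nregion_a,10\nregion_b,20\n"
values = {
    row["location"]: float(row["value"])
    for row in csv.DictReader(io.StringIO(values_csv))
}

# A visualisation library joins the two on a shared key (here: "location").
joined = [
    (feature["properties"]["location"], values[feature["properties"]["location"]])
    for feature in geometry["features"]
]
print(joined)
```

The shared key is what matters: every shape needs a name that also appears in the values table.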

Common Geo-Visualisation Libraries

  • This short guide only shows an example using Folium; however, the core concepts should carry over to the alternatives.
  • We will use the following datasets

Step 1. Preparing the data

  • Before we proceed, we need to install geopandas to be able to read geographic files in Python
pip install geopandas
  • Next we download the location data from the link provided above
# Download the zip file to the current directory

# Also download the life expectancy csv file

# Unzip the file

Note: These commands only run in a bash shell. Alternatively, prefix each command with ! to run it from a Jupyter code cell.
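
If neither a bash shell nor Jupyter is at hand, the unzip step can also be done from Python with the standard-library zipfile module. The following is only a sketch: the helper name unzip_to and the file names are hypothetical placeholders, and the demo builds a tiny throwaway archive so that it runs without the real download.

```python
import tempfile
import zipfile
from pathlib import Path


def unzip_to(zip_path, target_dir):
    """Extract every member of zip_path into target_dir and return their names."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target_dir)
        return archive.namelist()


# Demo on a throwaway archive (placeholder file names, not the real dataset).
with tempfile.TemporaryDirectory() as tmp:
    zip_path = Path(tmp) / "example.zip"
    with zipfile.ZipFile(zip_path, "w") as archive:
        archive.writestr("example/boundaries.txt", "placeholder")
    extracted = unzip_to(zip_path, Path(tmp) / "out")
    extracted_exists = (Path(tmp) / "out" / "example" / "boundaries.txt").exists()

print(extracted)
```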

  • Load the data into Python
import pandas as pd
import geopandas

# Load the life expectancy data into a normal Pandas dataframe
df = pd.read_csv('zaf_female_life_expectency_2011.csv')
print("Preview first 2 rows:", df.head(2), sep="\n")

# Load the geometry data into a GeoPandas dataframe
gdf = geopandas.read_file('zaf_state/ZAF_STATES.shp')
print("Preview first 2 rows:", gdf.head(2), sep="\n")
  • Notice that the province names are in the NAME_1 column of the geopandas dataframe, whereas the life expectancy dataset uses the location column
  • Let's change this so that both datasets use the name location for the provinces. In addition, we will discard all columns of the geopandas dataframe except location and geometry
zaf_boundries = gdf.rename(columns={'NAME_1': 'location'})[['location', 'geometry']]

# Also make sure we are using the correct CRS (coordinate reference system)
# Folium expects our data to use EPSG:4326, i.e. latitude/longitude
zaf_boundries = zaf_boundries.to_crs(epsg=4326)

# Finally make sure the locations in both datasets are lower case
# It is also recommended to replace dashes with forward slashes
zaf_boundries.location = zaf_boundries.location.apply(lambda s: s.lower().replace('-', '/'))
df.location = df.location.apply(lambda s: s.lower().replace('-', '/'))
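
A quick way to confirm that the two key columns really agree after this clean-up is to compare them as sets. The sketch below uses plain Python lists with illustrative names standing in for the two location columns; the helper normalise is hypothetical and simply mirrors the lambda above.

```python
def normalise(name):
    """Lower-case a name and replace dashes with forward slashes."""
    return name.lower().replace("-", "/")


# Illustrative stand-ins for the two 'location' columns.
geometry_names = ["KwaZulu-Natal", "Western Cape", "Gauteng"]
value_names = ["kwazulu-natal", "western cape", "Limpopo"]

geometry_keys = {normalise(n) for n in geometry_names}
value_keys = {normalise(n) for n in value_names}

# Keys found on one side only would end up as empty shapes or dropped rows.
missing_values = sorted(geometry_keys - value_keys)
missing_shapes = sorted(value_keys - geometry_keys)
print("No value for:", missing_values)
print("No shape for:", missing_shapes)
```

With the dataframes above, the same comparison could be made on set(zaf_boundries.location) and set(df.location).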
  • We finally have a matching key between the quantity we want to display (i.e. life expectancy) and the corresponding geometry (or shape) of each location
  • We will now pass this information directly to Folium to plot a choropleth map
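
A short aside on the coordinate reference system used in Step 1: EPSG:4326 stores positions as longitude/latitude in degrees, whereas EPSG:3857 (Web Mercator, the projection most web map tiles are drawn in) stores them in metres. The sketch below applies the standard spherical Web Mercator formula to the point (-28.4792, 24.6727), which this guide uses as the map centre, purely to show how different the numbers look. In practice to_crs handles such conversions; the function name here is hypothetical.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS84 semi-major axis, used by Web Mercator


def latlon_to_web_mercator(lat_deg, lon_deg):
    """Convert EPSG:4326 degrees to EPSG:3857 metres (spherical formula)."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


x, y = latlon_to_web_mercator(-28.4792, 24.6727)
print("EPSG:4326 -> lat=-28.4792, lon=24.6727")
print(f"EPSG:3857 -> x={x:.0f} m, y={y:.0f} m")
```

Mixing the two up typically places shapes far from where they belong, which is why the data is converted before plotting.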

Step 2. Creating a Choropleth map

import folium

# First save our geometry data as a GeoJSON file
zaf_boundries.to_file('zaf_boundries.geojson', driver='GeoJSON')

# The default location that will show on the map
zaf_coords = [-28.4792, 24.6727]

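map_obj = folium.Map(location=zaf_coords, zoom_start=6, crs='EPSG3857')

# Add the choropleth layer (assumes the GeoJSON file saved above and df as inputs)
folium.Choropleth(
    geo_data='zaf_boundries.geojson',
    data=df,
    columns=["location", "age"],
    key_on='feature.properties.location',  # json path to our linking key i.e. province (assumed)
    legend_name="Folium Choropleth starter example",
).add_to(map_obj)

# Use this if you have more than one quantity to display e.g. life expectancy, population
# (assumed here to be a layer control)
folium.LayerControl().add_to(map_obj)

# Display the map (in a Jupyter cell, evaluating the object renders it)
map_obj

# Alternatively save the map to an html file
map_obj.save('index.html')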
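
The key_on argument is a dotted path into each GeoJSON feature, and it must start with feature; this is how each boundary is matched to a row in the data. The sketch below imitates that lookup with the standard library only. resolve_key is a hypothetical helper, not part of Folium, and the feature is a made-up stand-in for one entry of the GeoJSON file.

```python
def resolve_key(feature, key_on):
    """Follow a dotted path such as 'feature.properties.location'."""
    parts = key_on.split(".")
    if parts[0] != "feature":
        raise ValueError("key_on is expected to start with 'feature'")
    value = feature
    for part in parts[1:]:
        value = value[part]
    return value


# Made-up feature shaped like one entry of a GeoJSON FeatureCollection.
feature = {
    "type": "Feature",
    "properties": {"location": "gauteng"},
    "geometry": {
        "type": "Polygon",
        "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 0]]],
    },
}

print(resolve_key(feature, "feature.properties.location"))
```

The value found at that path is compared with the first of the two entries in columns, so both must hold the same normalised names.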
  • That's all it takes. You can view the full script to reproduce this guide on GitHub

Closing Remarks

  • This was a brief guide to demonstrate how to visualise geographic data in Python using the Folium library.
  • To learn more, we encourage you to continue exploring more options through the hyperlinks provided.
  • Thanks for reading!

Written by @NdamuleloNemakh
Welcome to my coding diary, I hope you find it useful!