DCC Demo repository

1. Urban population density by built area

An urbanist interested in space logistics calculates a more accurate value for urban population density based on the total area of the built environment within a municipality. The total area of a city includes water, roads, sidewalks and so on, but in this case the researcher is interested only in the area occupied by buildings, i.e., the total ground-floor area of all buildings in the city. They combine data from PDOK (built area) and CBS (population size) to calculate population density per unit of built area. Because PDOK offers its data in .gml format, they must first convert it to GeoJSON before conducting the analysis with Python in a Jupyter notebook.
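The calculation described above can be sketched in a few lines. This is a minimal illustration and not the code from the repository: it uses only the Python standard library instead of geopandas, the function names `polygon_area` and `density_per_built_km2` are hypothetical, and the footprints and population figure are made-up values. It also assumes the building footprints are simple polygons in a projected coordinate system with metre units.

```python
def polygon_area(ring):
    """Area of a simple polygon via the shoelace formula.

    ring: list of (x, y) vertices, assumed to be in metres.
    """
    total = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]  # wrap around to close the ring
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def density_per_built_km2(population, footprints):
    """Residents per square kilometre of building footprint."""
    built_m2 = sum(polygon_area(ring) for ring in footprints)
    if built_m2 == 0:
        raise ValueError("no built area to divide by")
    return population * 1_000_000 / built_m2


# Made-up example data: two rectangular building footprints.
footprints = [
    [(0, 0), (100, 0), (100, 50), (0, 50)],       # 100 m x 50 m
    [(200, 0), (300, 0), (300, 100), (200, 100)],  # 100 m x 100 m
]
population = 300  # hypothetical population figure

built_area_m2 = sum(polygon_area(ring) for ring in footprints)
density = density_per_built_km2(population, footprints)
print(f"Built area: {built_area_m2:.0f} m2")
print(f"Density: {density:.0f} residents per km2 of built area")
```

Dividing by footprint area rather than total municipal area is the only difference from a conventional density figure, so the two values can be compared directly once both are expressed per square kilometre.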

Link to Repository: https://github.com/delft-dh/DCC-Demo-Population-Density-by-Built-Area

Concepts illustrated:

  • Geospatial file type conversion (GML to GeoJSON)

  • Extracting layers by conditional values

  • Calculating spatial area by polygon label

  • Managing packages (and conflicts) using conda environments

Tools demonstrated:

  • Python

  • GeoPandas for spatial area calculation

  • ogr2ogr for data conversion

  • Jupyter notebooks
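As a rough sketch of the GML to GeoJSON conversion step, the snippet below wraps an `ogr2ogr` call from Python. The file names `buildings.gml` and `buildings.geojson` and the helper names are hypothetical, and the repository may run the conversion differently (for example, directly from a shell). It assumes GDAL is installed so that `ogr2ogr` is on the PATH; note that `ogr2ogr` expects the destination file before the source file.

```python
import shutil
import subprocess


def build_ogr2ogr_command(src_gml, dst_geojson):
    """Build the argument list for converting a GML file to GeoJSON.

    ogr2ogr takes the output format, then the destination, then the source.
    """
    return ["ogr2ogr", "-f", "GeoJSON", dst_geojson, src_gml]


def convert_gml_to_geojson(src_gml, dst_geojson):
    """Run the conversion, failing early if ogr2ogr is not installed."""
    if shutil.which("ogr2ogr") is None:
        raise RuntimeError("ogr2ogr not found on PATH; install GDAL first")
    subprocess.run(build_ogr2ogr_command(src_gml, dst_geojson), check=True)


# Hypothetical file names, shown only to illustrate the argument order.
command = build_ogr2ogr_command("buildings.gml", "buildings.geojson")
print(" ".join(command))
```

Calling `convert_gml_to_geojson("buildings.gml", "buildings.geojson")` would then perform the conversion, provided the input file exists.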

2. WWII bombings heatmap

A historian studying infrastructure damaged in WWII visualises the volume of bombs that landed in countries around the world during the war, representing the data as a heatmap. All Python code is available in, and reproducible from, a Jupyter notebook.

Link to Repository: https://github.com/delft-dh/DCC-Demo-Heatmap-of-WWII-Bombings

Concepts illustrated:

  • Version control with Git

  • Data cleaning

  • Creating a custom heatmap

Tools demonstrated:

  • Python

  • GeoPandas (Python library for geospatial data)

  • Jupyter Notebooks

  • Git

Based on this tutorial from the Programming Historian, with a case study of WWII bombings.
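The data cleaning and heatmap steps listed above can be illustrated with a small standard-library sketch. It is not the notebook's code: the field names `lat` and `lon`, the helper names, and the sample records are all hypothetical, and the real demo uses geopandas. The idea is simply to drop rows with unusable coordinates and then count the remaining points per grid cell, which is the quantity a heatmap colours.

```python
import math
from collections import Counter


def clean_records(records):
    """Keep only rows whose coordinates parse as valid latitude/longitude."""
    cleaned = []
    for row in records:
        try:
            lat = float(row["lat"])
            lon = float(row["lon"])
        except (KeyError, TypeError, ValueError):
            continue  # missing or non-numeric coordinate
        # The range check also rejects NaN, since NaN comparisons are False.
        if -90 <= lat <= 90 and -180 <= lon <= 180:
            cleaned.append((lat, lon))
    return cleaned


def bin_points(points, cell_deg=1.0):
    """Count points per grid cell of cell_deg x cell_deg degrees."""
    counts = Counter()
    for lat, lon in points:
        cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        counts[cell] += 1
    return counts


# Made-up sample rows, including three that should be discarded.
records = [
    {"lat": "52.37", "lon": "4.89"},
    {"lat": "52.01", "lon": "4.36"},
    {"lat": "", "lon": "4.0"},      # empty latitude
    {"lat": "51.92", "lon": "4.48"},
    {"lat": "999", "lon": "4.0"},   # latitude out of range
    {"lat": None, "lon": "1"},      # missing latitude
]

points = clean_records(records)
heat = bin_points(points)
for cell, count in sorted(heat.items()):
    print(cell, count)
```

The resulting counts could be passed to any plotting library as a colour value per cell; a smaller `cell_deg` gives a finer but sparser heatmap.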

Ideas for Demo Repositories

These ideas were generated by members of the DDH community as potentially useful reproducible data analysis processes. DDH-specific examples do not yet exist, but references that could help in developing them have been included where possible. Any member of the community can take on the development of these challenges: it’s a great chance to build your coding skills and share with everyone!

Polygon Area Overlap

“Between the 10 datasets that are loaded, which two show the most, or least, similarity in spatial patterns?”

Challenge: Calculate the degree to which polygons in n spatial datasets overlap.
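One possible starting point for this challenge, offered as a sketch rather than a worked solution, is to rasterise each dataset onto a shared grid and compare the covered cells with the Jaccard index (intersection divided by union). Everything here is an assumption: the three toy datasets, the function names, the grid resolution, and the choice of Jaccard as the similarity measure. An exact approach would compute polygon intersections with a geometry library instead of approximating on a grid.

```python
from itertools import combinations


def point_in_polygon(x, y, ring):
    """Ray-casting test: is (x, y) inside the simple polygon ring?"""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal line at y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def covered_cells(polygons, bounds, cell):
    """Set of grid cells whose centre lies inside any polygon."""
    min_x, min_y, max_x, max_y = bounds
    cols = int((max_x - min_x) / cell)
    rows = int((max_y - min_y) / cell)
    cells = set()
    for r in range(rows):
        for c in range(cols):
            cx = min_x + (c + 0.5) * cell
            cy = min_y + (r + 0.5) * cell
            if any(point_in_polygon(cx, cy, p) for p in polygons):
                cells.add((r, c))
    return cells


def jaccard(a, b):
    """Intersection over union of two cell sets (0 = disjoint, 1 = same)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Three made-up datasets, each a list of polygons on the same extent.
datasets = {
    "a": [[(0, 0), (4, 0), (4, 4), (0, 4)]],
    "b": [[(2, 0), (6, 0), (6, 4), (2, 4)]],
    "c": [[(0, 0), (4, 0), (4, 2), (0, 2)]],
}
bounds = (0, 0, 6, 4)
cells = {name: covered_cells(polys, bounds, 1.0) for name, polys in datasets.items()}

# Score every pair of datasets, then pick the extremes.
scores = {
    (m, n): jaccard(cells[m], cells[n])
    for m, n in combinations(sorted(cells), 2)
}
most_similar = max(scores, key=scores.get)
least_similar = min(scores, key=scores.get)
print("most similar:", most_similar, "least similar:", least_similar)
```

The grid approach trades accuracy for simplicity: results depend on the cell size, so a finer grid gives a closer approximation at the cost of more point-in-polygon tests.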

Geotagging images for spatial analysis

Reference: