Datasets

This section covers the datasets required to run the course interactively. For archival reasons, all the datasets listed here are mirrored in the course repository, so if you have downloaded the course you already have a local copy of them.

Madrid

Airbnb properties

This dataset has been sourced from the course “Spatial Modelling for Data Scientists”. The file imported here corresponds to version v0.1.0.

This dataset contains a pre-processed set of properties advertised on the Airbnb website within the region of Madrid (Spain), together with house characteristics.

This dataset is licensed under the CC0 1.0 Universal Public Domain Dedication.

Airbnb neighbourhoods

This dataset has been sourced directly from the Inside Airbnb website. The file was imported on February 10th, 2021.

This dataset contains neighbourhood boundaries for the city of Madrid, as provided by Inside Airbnb.

This dataset is licensed under the CC0 1.0 Universal Public Domain Dedication.

Arturo

This dataset contains the street layout of Madrid together with habitability scores, where available, for individual street segments. The data originate from the Arturo Project, by 300,000Km/s, and the file available here is a slimmed-down version of the official street layout distributed by the project.

This dataset is licensed under the CC0 1.0 Universal Public Domain Dedication.

Sentinel 2 - 120m mosaic

This dataset contains four scenes for the region of Madrid (Spain) extracted from the Digital Twin Sandbox Sentinel-2 collection, by the SentinelHub. The scenes correspond to the following dates in 2019, one scene per date:

  • January 1st
  • April 1st
  • July 10th
  • November 17th

Each scene includes red, green, blue and near-infrared bands.

This dataset is licensed under the CC0 1.0 Universal Public Domain Dedication.

Sentinel 2 - 10m GHS composite

This dataset contains a scene for the region of Madrid (Spain) extracted from the GHS Composite S2, by the European Commission.

This dataset is licensed under the CC0 1.0 Universal Public Domain Dedication.

Cambodia

Pollution

This dataset contains a surface of \(NO_2\) measurements (tropospheric column) from Sentinel 5.

Friction surfaces

This dataset is an extraction of the following two data products by Weiss et al. (2020), distributed through the Malaria Atlas Project:

  • Global friction surface enumerating land-based travel walking-only speed without access to motorized transport for a nominal year 2019 (Minutes required to travel one metre)
  • Global friction surface enumerating land-based travel speed with access to motorized transport for a nominal year 2019 (Minutes required to travel one metre)

Each is provided in a separate file.
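Because both surfaces are expressed in minutes required to travel one metre, a short conversion can help in reading the values. The sketch below is only illustrative: the function names are hypothetical and the friction value is a made-up example chosen to match a typical walking pace, not a value read from the files.

```python
def friction_to_speed_kmh(minutes_per_metre: float) -> float:
    """Convert a friction value (minutes per metre) into a speed in km/h."""
    if minutes_per_metre <= 0:
        raise ValueError("friction must be a positive number")
    metres_per_minute = 1.0 / minutes_per_metre
    # 60 minutes per hour, 1000 metres per kilometre
    return metres_per_minute * 60.0 / 1000.0


def crossing_time_minutes(minutes_per_metre: float, distance_m: float) -> float:
    """Minutes needed to travel `distance_m` metres at a constant friction."""
    return minutes_per_metre * distance_m


# Illustrative value only (assumption): 0.012 min/m is roughly walking pace
example_friction = 0.012
speed = friction_to_speed_kmh(example_friction)
time_1km = crossing_time_minutes(example_friction, 1000.0)
print(f"speed: {speed:.1f} km/h, time to cover 1 km: {time_1km:.1f} minutes")
```

Lower friction values therefore mean faster travel, so for any given cell the motorized surface should generally hold values no larger than the walking-only one.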

Regional aggregates

This dataset relies on boundaries from the Humanitarian Data Exchange. The file is provided by the World Food Programme through the Humanitarian Data Exchange and was accessed on February 15th 2021.

Pollution and friction aggregated at Level 2 (municipality) administrative boundaries for Cambodia.

This dataset is licensed under the CC0 1.0 Universal Public Domain Dedication.

Cambodian cities

This dataset is an extract from the Urban Centre Database (UCDB), version 1.2, containing the centroids of Cambodian cities.

This dataset is licensed under the CC0 1.0 Universal Public Domain Dedication.