import momepy
import geopandas
import contextily
import xarray, rioxarray
import osmnx as ox
import numpy as np
import matplotlib.pyplot as plt
6 Transport costs
6.1 Packages and modules
= (
ox.settings.overpass_settings '[out:json][timeout:90][date:"2021-03-07T00:00:00Z"]'
)
6.2 Data
Assuming you have the file locally on the path ../data/
:
= geopandas.read_file("../data/arturo_streets.gpkg")
streets = geopandas.read_file("../data/madrid_abb.gpkg")
abbs = geopandas.read_file("../data/neighbourhoods.geojson") neis
6.3 pandana
graphs
import pandana
Before building the routing network, we convert to graph and back in momepy
to “clean” the network and ensure it complies with requirements for routing.
= momepy.nx_to_gdf( # Convert back to geo-table
nodes, edges # Convert to a clean NX graph
momepy.gdf_to_nx( ='True') # We "explode" to avoid multi-part rows
streets.explode(index_parts
)
)= nodes.set_index("nodeID") # Reindex nodes on ID nodes
Once we have nodes and edges “clean” from the graph representation, we can build a pandana.Network
object we will use for routing:
= pandana.Network(
streets_pdn
nodes.geometry.x,
nodes.geometry.y,"node_start"],
edges["node_end"],
edges["mm_len"]]
edges[[
)
streets_pdn
Generating contraction hierarchies with 8 threads.
Setting CH node vector of size 49985
Setting CH edge vector of size 66499
Range graph removed 444 edges of 132998
. 10% . 20% . 30% . 40% . 50% . 60% . 70% . 80% . 90% . 100%
<pandana.network.Network at 0x16584a850>
6.4 Shortest-path routing
How do I go from A to B?
For example, from the first Airbnb in the geo-table…
= abbs.loc[[0], :].to_crs(streets.crs) first
…to Puerta del Sol.
import geopy
= "gds4eco"
geopy.geocoders.options.default_user_agent = geopandas.tools.geocode(
sol "Puerta del Sol, Madrid", geopy.Nominatim
).to_crs(streets.crs) sol
geometry | address | |
---|---|---|
0 | POINT (440284.049 4474264.421) | Puerta del Sol, Barrio de los Austrias, Sol, C... |
First we snap locations to the network:
= streets_pdn.get_node_ids(
pt_nodes 0], sol.geometry.x.iloc[0]],
[first.geometry.x.iloc[0], sol.geometry.y.iloc[0]]
[first.geometry.y.iloc[
) pt_nodes
0 3071
1 35729
Name: node_id, dtype: int64
Then we can route the shortest path:
= streets_pdn.shortest_path(
route_nodes 0], pt_nodes[1]
pt_nodes[
) route_nodes
array([ 3071, 3476, 8268, 8266, 8267, 18695, 18693, 1432, 1430,
353, 8175, 8176, 18121, 17476, 16858, 14322, 16857, 17810,
44795, 41220, 41217, 41221, 41652, 18924, 18928, 48943, 18931,
21094, 21095, 23219, 15398, 15399, 15400, 47446, 47447, 23276,
47448, 23259, 23260, 23261, 27951, 27952, 27953, 48327, 11950,
11949, 11944, 19475, 19476, 27333, 30088, 43294, 11940, 11941,
11942, 48325, 37484, 48316, 15893, 15890, 15891, 29954, 25453,
7341, 34991, 23608, 28217, 21648, 21649, 21651, 39075, 25108,
25102, 25101, 25100, 48518, 47287, 34623, 31187, 29615, 48556,
22844, 48553, 48555, 40922, 40921, 40923, 48585, 46372, 46371,
46370, 45675, 45676, 38778, 38777, 19144, 20498, 20497, 20499,
47737, 42303, 42302, 35730, 35727, 35729])
With this information, we can build the route line manually.
The code to generate the route involves writing a function and is a bit more advanced than expected for this course. If this looks too complicated, do not despair.
from shapely.geometry import LineString
def route_nodes_to_line(nodes, network):
= network.nodes_df.loc[nodes, :]
pts = geopandas.GeoDataFrame(
s "src_node": [nodes[0]], "tgt_node": [nodes[1]]},
{=[LineString(pts.values)],
geometry=streets.crs
crs
)return s
We can calculate the route:
= route_nodes_to_line(route_nodes, streets_pdn) route
And we get it back as a geo-table (with one row):
route
src_node | tgt_node | geometry | |
---|---|---|---|
0 | 3071 | 3476 | LINESTRING (442606.507 4478714.516, 442597.100... |
Please note this builds a simplified line for the route, not one that is based on the original geometries.
= plt.subplots()
fig, ax
route.plot(=(9, 9),
figsize="red",
color=ax
ax
)
contextily.add_basemap(
ax, =route.crs,
crs=contextily.providers.CartoDB.Voyager,
source=14
zoom
)
plt.show()
But distance calculations are based on the original network). If we wanted to obtain the length of the route:
= streets_pdn.shortest_path_length(
route_len 0], pt_nodes[1]
pt_nodes[
)round(route_len / 1000, 3) # Dist in Km
5.458
Challenge:
- What is the network distance between CEMFI and Puerta del Sol?
- BONUS I: how much longer is it than if you could fly in a straight line?
- BONUS II: if one walks at a speed of 5 Km/h, how long does the walk take you?
6.5 Weighted routing
How do I go from A to B passing by the “best” buildings?
This is really an extension of standard routing that takes advantage of the flexibility of pandana.Network
objects.
Note that the route
we defined above, does not pass by the “best” buildings.
= route.total_bounds
bb
= plt.subplots()
fig, ax
streets.cx[0]: bb[2], bb[1]:bb[3]
bb[
].plot("average_quality", scheme="quantiles", ax=ax
)
="r", linewidth=2.5, ax=ax)
route.plot(color
"Mean Building Quality")
ax.set_title(
ax.set_axis_off()
plt.show()
The overall process to achieve this is the very similar; the main difference is, when we build the Network
object, to replace distance (mm_len
) with a measure that combines distance and building quality. Note that we want to maximise building quality, but the routing algorithms use a minimisation function. Hence, our composite index will need to reflect that.
The strategy is divided in the following steps:
- Re-scale distance between 0 and 1
- Build a measure inverse to building quality in the \([0, 1]\) range
- Generate a combined measure (
wdist
) by picking a weighting parameter - Build a new
Network
object that incorporateswdist
instead of distance - Compute route between the two points of interest
For 1., we can use the scaler in scikit-learn
:
from sklearn.preprocessing import minmax_scale
Then generate and attach to edges
a scaled version of mm_len
:
"scaled_dist"] = minmax_scale(edges["mm_len"]) edges[
We can compare distance with scaled distance. The correlation should be perfect, the scaling is only a change of scale or unit.
= plt.subplots()
fig, ax "mm_len", "scaled_dist", ax=ax)
edges.plot.scatter("Distance Vs Scaled Distance")
ax.set_title( plt.show()
We move on to 2., with a similar approach. We will use the negative of the building quality average (average_quality
):
"scaled_inv_bquality"] = minmax_scale(
edges[-edges["average_quality"]
)
And again, we can plot the relation between building quality and the scaled quality.
= plt.subplots()
fig, ax
edges.plot.scatter("average_quality", "scaled_inv_bquality", ax=ax
)"Quality Vs Inv. Scaled Quality")
ax.set_title( plt.show()
Taking 1. and 2. into 3. we can build wdist
. For this example, we will give each dimension the same weight (0.5), but this is at discretion of the researcher.
= 0.5
w "wdist"] = (
edges["scaled_dist"] * w +
edges["scaled_inv_bquality"] * (1-w)
edges[ )
Now we can recreate the Network
object based on our new measure (4.) and provide routing. Since it is the same process as with distance, we will do it all in one go:
# Build new graph object
= pandana.Network(
w_graph
nodes.geometry.x,
nodes.geometry.y,"node_start"],
edges["node_end"],
edges["wdist"]]
edges[[
)# Snap locations to their nearest node
= w_graph.get_node_ids(
pt_nodes 0], sol.geometry.x.iloc[0]],
[first.geometry.x.iloc[0], sol.geometry.y.iloc[0]]
[first.geometry.y.iloc[
)# Generate route
= w_graph.shortest_path(
w_route_nodes 0], pt_nodes[1]
pt_nodes[
)# Build LineString
= route_nodes_to_line(
w_route
w_route_nodes, w_graph )
Generating contraction hierarchies with 8 threads.
Setting CH node vector of size 49985
Setting CH edge vector of size 66499
Range graph removed 444 edges of 132998
. 10% . 20% . 30% . 40% . 50% . 60% . 70% . 80% . 90% . 100%
Now we are ready to display it on a map:
= plt.subplots()
fig, ax # Building quality
streets.plot("average_quality",
="quantiles",
scheme="magma",
cmap=0.5,
linewidth=(9, 9),
figsize=ax
ax
)# Shortest route
route.plot(="xkcd:orange red", linewidth=3, ax=ax, label="Shortest"
color
)# Weighted route
w_route.plot(="xkcd:easter green", linewidth=3, ax=ax, label="Weighted"
color
)# Styling
ax.set_axis_off()
plt.legend() plt.show()
Challenge:
1. Explore the differences in the output of weighted routing if you change the weight between distance and the additional constrain.
2. Recreate weighted routing using the linearity of street segments. How can you go from A to B avoiding long streets?
6.6 Proximity
What is the nearest internet cafe for Airbnb’s without WiFi?
First we identify Airbnb’s without WiFi:
= abbs.query(
no_wifi "WiFi == '0'"
).to_crs(streets.crs)
Then pull WiFi spots in Madrid from OpenStreetMap:
= ox.features_from_place(
icafes "Madrid, Spain", tags={"amenity": "internet_cafe"}
).to_crs(streets.crs).reset_index()
= plt.subplots()
fig, ax
no_wifi.plot(="red",
color=1,
markersize=0.5,
alpha="Airbnb no WiFi",
label=(9, 9),
figsize=ax
ax
)
icafes.plot(=ax, color="lime", label="Internet cafes"
ax
)
contextily.add_basemap(
ax, =no_wifi.crs,
crs=contextily.providers.CartoDB.Voyager
source
)
ax.set_axis_off()
plt.legend() plt.show()
The logic for this operation is the following:
- Add the points of interest (POIs, the internet cafes) to the network object (
streets_pdn
) - Find the nearest node to each POI
- Find the nearest node to each Airbnb without WiFi
- Connect each Airbnb to its nearest internet cafe
We can add the internet cafes to the network object (1.) with the set_pois
method. Note we set maxitems=1
because we are only going to query for the nearest cafe. This will make computations much faster.
streets_pdn.set_pois(="Internet cafes", # Our name for the layer in the `Network` object
category=1, # Use to count only nearest cafe
maxitems=100000, # 100km so everything is included
maxdist=icafes.geometry.x, # X coords of cafes
x_col=icafes.geometry.y, # Y coords of cafes
y_col )
Once the cafes are added to the network, we can find the nearest one to each node (2.). Note there are some nodes for which we can’t find a nearest cafe. These are related to disconnected parts of the network.
= streets_pdn.nearest_pois(
cafe2nnode 100000, # Max distance to look for
"Internet cafes", # POIs to look for
=1, # No. of POIs to include
num_pois=True # Store POI ID
include_poi_ids# Then add the internet cafee IDs and name
).join('osmid', 'name']],
icafes[[="poi1"
on# Rename the distance from node to cafe
).rename(={1: "dist2icafe"}
columns
) cafe2nnode.head()
dist2icafe | poi1 | osmid | name | |
---|---|---|---|---|
nodeID | ||||
0 | 5101.421875 | 9.0 | 3.770327e+09 | Silver Envíos 2 |
1 | 5190.265137 | 9.0 | 3.770327e+09 | Silver Envíos 2 |
2 | 5252.475098 | 9.0 | 3.770327e+09 | Silver Envíos 2 |
3 | 5095.101074 | 9.0 | 3.770327e+09 | Silver Envíos 2 |
4 | 5676.117188 | 9.0 | 3.770327e+09 | Silver Envíos 2 |
To make things easier down the line, we can link cafe2nnode
to the cafe IDs. And we can also link Airbnb’s to nodes (3.) following a similar approach as we have seen above:
= streets_pdn.get_node_ids(
abbs_nnode
no_wifi.geometry.x, no_wifi.geometry.y
) abbs_nnode.head()
26 8872
50 10905
62 41158
63 34257
221 32215
Name: node_id, dtype: int64
Finally, we can bring together both to find out what is the nearest internet cafe for each Airbnb (4.).
= no_wifi[
abb_icafe "geometry"] # Keep only geometries of ABBs w/o WiFi
[
].assign(=abbs_nnode # Attach to thse ABBs the nearest node in the network
nnode# Join to each ABB the nearest cafe using node IDs
).join(
cafe2nnode, ="nnode"
on
) abb_icafe.head()
geometry | nnode | dist2icafe | poi1 | osmid | name | |
---|---|---|---|---|---|---|
26 | POINT (443128.256 4483599.841) | 8872 | 4926.223145 | 9.0 | 3.770327e+09 | Silver Envíos 2 |
50 | POINT (441885.677 4475916.602) | 10905 | 1876.392944 | 19.0 | 6.922981e+09 | Locutorio |
62 | POINT (440439.640 4476480.771) | 41158 | 1164.812988 | 17.0 | 5.573414e+09 | NaN |
63 | POINT (438485.311 4471714.377) | 34257 | 1466.537964 | 5.0 | 2.304485e+09 | NaN |
221 | POINT (439941.104 4473117.914) | 32215 | 354.268005 | 15.0 | 5.412145e+09 | NaN |
Challenge: Calculate distances to nearest internet cafe for ABBs with WiFi. On average, which of the two groups (with and without WiFi) are closer to internet cafes?
6.7 Accessibility
This flips the previous question on its head and, instead of asking what is the nearest POI to a given point, along the network (irrespective of distance), it asks how many POIs can I access within a network-based distance radius?
= ox.features_from_place(
parks "Madrid, Spain", tags={"leisure": "park"}
).to_crs(streets.crs)
- For example, how many parks are within 500m(-euclidean) of an Airbnb?
We draw a radius of 500m around each AirBnb:
= geopandas.GeoDataFrame(
buffers =abbs.to_crs(
geometry
streets.crsbuffer(
).500
) )
Then intersect it with the location of parks, and count by buffer (ie. Airbnb):
= geopandas.sjoin(
park_count
parks, buffers
).groupby("index_right"
).size()
- How many parks are within 500m(-network) of an Airbnb?
We need to approach this as a calculation within the network. The logic of steps thus looks like:
- Use the aggregation module in
pandana
to count the number of parks within 500m of each node in the network - Extract the counts for the nodes nearest to Airbnb properties
- Assign park counts to each Airbnb
We can set up the aggregate engine (1.). This involves three steps:
- Obtain nearest node for each park
= streets_pdn.get_node_ids(
parks_nnode
parks.centroid.x, parks.centroid.y )
- Insert the parks’ nearest node through
set
so it can be “aggregated”
set(
streets_pdn.="Parks"
parks_nnode, name )
- “Aggregate” for a distance of 500m, effectively counting the number of parks within 500m of each node
= streets_pdn.aggregate(
parks_by_node =500, type="count", name="Parks"
distance
) parks_by_node.head()
nodeID
0 5.0
1 5.0
2 6.0
3 8.0
4 1.0
dtype: float64
At this point, we have the number of parks within 500m of every node in the network. To identify those that correspond to each Airbnb (3.), we first pull out the nearest nodes to each ABB:
= abbs.to_crs(streets.crs).geometry
abbs_xys = streets_pdn.get_node_ids(
abbs_nnode
abbs_xys.x, abbs_xys.y )
And use the list to assign the count of the nearest node to each Airbnb:
= abbs_nnode.map(
park_count_network
parks_by_node
) park_count_network.head()
0 4.0
1 9.0
2 5.0
3 0.0
4 12.0
Name: node_id, dtype: float64
- For which areas do both differ most?
We can compare the two counts above to explore to what extent the street layout is constraining access to nearby parks.
= geopandas.GeoDataFrame(
park_comp
{"Euclidean": park_count,
"Network": park_count_network
},=abbs.geometry,
geometry=abbs.crs
crs )
= plt.subplots()
fig, ax "Euclidean", "Network", ax=ax)
park_comp.plot.scatter(0, 0], [1, 1], color='red') #45-degree line
ax.axline([ plt.show()
Note there are a few cases where there are more network counts than Euclidean. These are due to the slight inaccuracies introduced by calculating network distances from nodes rather than the locations themselves.
Geographically:
= plt.subplots(1, 3, figsize=(15, 5))
fig, axs
# Euclidean count
abbs.to_crs(
streets.crs
).assign(=park_count
n_parks0).plot(
).fillna("n_parks",
="fisherjenkssampled",
scheme=0.5,
alpha=1,
markersize=True,
legend=axs[0]
ax
)
contextily.add_basemap(0],
axs[=streets.crs,
crs=contextily.providers.CartoDB.PositronNoLabels
source
)0].set_axis_off()
axs[0].set_title("Euclidean Distances")
axs[
# Count difference
= park_comp.query(
with_parks "(Network > 0) & (Euclidean > 0)"
)= 100 * (
count_diff "Euclidean"] -
with_parks["Network"]
with_parks[/ with_parks["Euclidean"]
)
abbs.to_crs(
streets.crs
).assign(=count_diff
n_parks
).dropna().plot("n_parks",
="fisherjenkssampled",
scheme=0.5,
alpha=1,
markersize=True,
legend=axs[1]
ax
)
contextily.add_basemap(1],
axs[=streets.crs,
crs=contextily.providers.CartoDB.PositronNoLabels
source
)1].set_axis_off()
axs[1].set_title("Count Difference (%)")
axs[
# Network count
abbs.to_crs(
streets.crs
).assign(=park_count_network
n_parks0).plot(
).fillna("n_parks",
="fisherjenkssampled",
scheme=0.5,
alpha=1,
markersize=True,
legend=axs[2]
ax
)
contextily.add_basemap(2],
axs[=streets.crs,
crs=contextily.providers.CartoDB.PositronNoLabels
source
)2].set_axis_off()
axs[2].set_title("Network Distances")
axs[
plt.show()
Challenge: Calculate accessibility to other ABBs from each ABB through the network. How many ABBs can you access within 500m of each ABB?
Note you will need to use the locations of ABBs both as the source and the target for routing in this case.
6.8 Next steps
If you found the content in this block useful, the following resources represent some suggestions on where to go next:
- The
pandana
tutorial and documentation are excellent places to get a more detailed and comprehensive view into the functionality of the library