street-network-models
Street network models and indicators for every urban area in the world
Science Score: 67.0%
This score indicates how likely this project is to be science-related, based on the following indicators:
- ✓ CITATION.cff file: found CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ✓ .zenodo.json file: found .zenodo.json file
- ✓ DOI references: found 1 DOI reference(s) in README
- ○ Academic publication links
- ✓ Committers with academic emails: 1 of 1 committers (100.0%) from academic institutions
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (12.2%) to scientific vocabulary
Repository
Street network models and indicators for every urban area in the world
Basic Info
Statistics
- Stars: 86
- Watchers: 3
- Forks: 11
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Urban Street Network Models and Indicators
This project uses OSMnx to model and analyze the street networks of every urban area in the world, then shares the results (models and indicators) in an open data repository in the Harvard Dataverse.
Citation
Boeing, G. 2025. Urban Science Beyond Samples: Updated Street Network Models and Indicators for Every Urban Area in the World. Working paper. https://github.com/gboeing/street-network-models
Computing environment
The following sections provide notes on reproducibility. Given the resource requirements, it's best to run the workflow in a high-performance computing cluster, but it's feasible to run it on a well-equipped personal computer.
System requirements:
- RAM/CPU: minimum of 32 GB for single-threaded execution (note: you'll have to edit config.json to set the CPU counts to 1). Recommended: 128 GB plus 24 CPU cores for multithreaded execution as parameterized in the config file.
- Disk space: 2 terabytes.
- OS: agnostic, but this workflow was developed and tested on Linux.
Runtime environment: create a new conda environment from the environment.yml file to install all the packages needed to run the workflow. Optionally, install a Jupyter kernel in it, like: python -m ipykernel install --user --name snm --display-name "Python (snm)".
Input data
Create a project data root folder with an inputs subfolder and place the unzipped input data in it. This project uses the Global Human Settlement Layer's urban centers dataset, specifically the Urban Centre Database 2025, to define the boundary polygons of the world's urban areas:
Mari Rivero, Ines; Melchiorri, Michele; Florio, Pietro; Schiavina, Marcello; Goch, Katarzyna; Politis, Panagiotis; Uhl, Johannes; Pesaresi, Martino; Maffenini, Luca; Sulis, Patrizia; Crippa, Monica; Guizzardi, Diego; Pisoni, Enrico; Belis, Claudio; Jacome Felix Oom, Duarte; Branco, Alfredo; Mwaniki, Dennis; Kochulem, Edwin; Githira, Daniel; Carioli, Alessandra; Ehrlich, Daniele; Tommasi, Pierpaolo; Kemper, Thomas; Dijkstra, Lewis (2024): GHS-UCDB R2024A - GHS Urban Centre Database 2025. European Commission, Joint Research Centre (JRC) [Dataset] doi: 10.2905/1a338be6-7eaf-480c-9664-3a8ade88cbcd PID: http://data.europa.eu/89h/1a338be6-7eaf-480c-9664-3a8ade88cbcd
Workflow
The workflow is organized into folders and scripts, as follows.
1. Construct models
1.1. Prep data
Load the GHS urban centers dataset, retain useful columns, save as a GeoPackage file.
1.2. Download cache
Uses OSMnx to download OSM raw data to a cache for subsequent parallel processing.
1.3. Create graphs
Use the cached OSM raw data to construct a MultiDiGraph of each street network. This can be done in parallel with multiprocessing by changing the cpus config setting. Saves each graph to disk as a GraphML file. Parameterized to get only drivable streets, retain all graph components, simplify the topology, and truncate by edge. Does this for every urban center's polygon boundary that meets the following conditions:
- is marked with a "high" quality control score
- has >1 km2 built-up area
- includes ≥3 nodes
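The inclusion test above can be sketched as a small predicate. The attribute names here are hypothetical; the real GHS-UCDB columns differ:

```python
def include_urban_center(row: dict, node_count: int) -> bool:
    """Check the three inclusion criteria for one urban center.

    `row` maps (hypothetical) GHS-UCDB attribute names to values;
    `node_count` is the number of nodes in the constructed graph.
    """
    return (
        row["quality_control"] == "high"  # marked with a "high" QC score
        and row["built_up_area_km2"] > 1  # >1 km2 built-up area
        and node_count >= 3               # graph includes at least 3 nodes
    )
```

Only urban centers passing all three checks get a model built for them.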
2. Attach elevation
This project uses three data sources for elevation:
- ASTERv3 GDEM at 30 meter resolution
- SRTMGL1 GDEM at 30 meter resolution with voids filled (version 3.0 global 1 arc second)
- Google Maps Elevation API
We use ASTER and SRTM to attach elevation data to each graph node in each model, then calculate edge grades. Both are public, free, open data. We use Google Maps elevation only as a validation dataset.
A few notes. A previous iteration of this project used CGIAR's post-processed SRTM v4.1, but CGIAR only provides 90 m resolution SRTM data. The Google billing scheme is changing in March 2025, which may render Google elevation data collection at this scale infeasible in the future without substantial funding. Historically, each billing account got a free $200 usage credit each month, and the price per HTTP request was $0.005, so you would get up to 200 / 0.005 = 40,000 free requests each month, within the usage limits of 512 locations per request and 6,000 requests per minute. URLs must be properly encoded to be valid and are limited to 16,384 characters for all web services. With three billing accounts, you could process this entire workflow for free once a month.
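The free-tier arithmetic above can be checked directly:

```python
def free_monthly_requests(credit_usd: float = 200.0,
                          price_per_request_usd: float = 0.005) -> int:
    """Requests covered by one billing account's monthly usage credit."""
    return round(credit_usd / price_per_request_usd)

LOCATIONS_PER_REQUEST = 512  # Google Elevation API per-request limit

requests_per_month = free_monthly_requests()                 # 40,000
locations_per_month = requests_per_month * LOCATIONS_PER_REQUEST
```

At 512 locations per request, one account covers about 20.5 million node elevations per month, which is why three accounts suffice for a monthly run of the whole workflow.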
2.1. ASTER and SRTM
2.1.1. Download ASTER
Download each ASTER DEM tif file (requires NASA EarthData login credentials).
2.1.2. Download SRTM
Download each SRTM DEM hgt file (requires NASA EarthData login credentials).
2.1.3. Build VRTs
Build two VRT virtual raster files (one for all the ASTER files and one for all the SRTM files) for subsequent querying.
2.1.4. Attach node elevations
Load each GraphML file saved in step 1.3 and add SRTM and ASTER elevation attributes to each node by querying the VRTs then resave the GraphML to disk.
2.2. Google Elevation
2.2.1. Cluster nodes
We want to send node coordinates to the elevation API in batches. But the batches need to consist of (approximately) adjacent nodes because the Google API uses a smoothing function to estimate elevation. If the nodes are from different parts of the planet (or at different elevations), this smoothing will result in very coarse-grained approximations of individual nodes' elevations. So, load all the node coordinates for each graph and spatially cluster them into equal-size clusters of 512 coordinates apiece, then save as a CSV file.
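A minimal sketch of the batching idea follows. The real workflow uses a proper equal-size spatial clustering step; this sort-then-chunk approximation only illustrates how coordinates end up grouped into batches of 512:

```python
def batch_coordinates(coords, batch_size=512):
    """Group (lat, lng) node coordinates into batches of ~batch_size.

    Simplified sketch: sort by a coarse lat/lng grid key so nearby
    points tend to land in the same batch, then chunk in order.
    """
    ordered = sorted(coords, key=lambda p: (round(p[0], 1), round(p[1], 1), p))
    return [ordered[i : i + batch_size] for i in range(0, len(ordered), batch_size)]
```

Keeping each batch spatially compact matters because Google's smoothing function estimates elevation from neighboring samples; a batch spanning distant locations would yield coarse estimates.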
2.2.2. Make URLs
Load the CSV file of node clusters and construct an API URL for each, with a key (requires 3 Google API keys).
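URL construction might look like the following sketch, assuming the standard Elevation API endpoint with pipe-delimited coordinates, percent-encoded via the standard library:

```python
from urllib.parse import quote

def make_elevation_url(batch, api_key):
    """Build a Google Elevation API URL for one batch of (lat, lng) pairs.

    Coordinates are pipe-delimited and percent-encoded; the finished
    URL must stay under the 16,384-character web-service limit.
    """
    locations = "|".join(f"{lat:.6f},{lng:.6f}" for lat, lng in batch)
    url = (
        "https://maps.googleapis.com/maps/api/elevation/json"
        f"?locations={quote(locations)}&key={api_key}"
    )
    assert len(url) <= 16384, "URL exceeds Google web-service length limit"
    return url
```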
2.2.3. Download Google elevations
Request each URL and save node ID and elevation to disk for all nodes.
2.2.4. Choose best elevation
Load each GraphML file and select either ASTER or SRTM to use as the official node elevation value, for each node, based on which is closer to the Google value (as a tie-breaker). Then calculate all edge grades and add as edge attributes. Re-save graph to disk as GraphML.
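The per-node selection and the edge-grade calculation reduce to two small functions (a simplified sketch; the real implementation operates on graph node/edge attributes):

```python
def choose_elevation(aster: float, srtm: float, google: float) -> float:
    """Pick ASTER or SRTM per node, whichever is closer to the Google value."""
    return aster if abs(aster - google) <= abs(srtm - google) else srtm

def edge_grade(elev_u: float, elev_v: float, length_m: float) -> float:
    """Rise over run between an edge's two endpoint node elevations."""
    return (elev_v - elev_u) / length_m
```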
3. Calculate stats
3.1. Calculate betweenness centrality
Load each GraphML file and calculate length-weighted node betweenness centrality for all nodes, using IGraph.
3.2. Calculate stats
Load each saved graph's GraphML file. Calculate each stat as described in the metadata file.
3.3. Merge stats
Merge the street network stats with the urban centers stats (from the GeoPackage file created in step 1.1). Save to disk with indicators named as described in the metadata file.
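The merge amounts to a join on a shared urban-center identifier. A minimal sketch using plain dicts (uc_id is a hypothetical key name; the real workflow joins attributes from the step 1.1 GeoPackage):

```python
def merge_stats(network_stats, urban_center_stats, key="uc_id"):
    """Join street network stats onto urban center stats by a shared ID.

    Both inputs are lists of dicts; network-stat fields take precedence
    on name collisions.
    """
    by_id = {row[key]: row for row in urban_center_stats}
    return [{**by_id.get(row[key], {}), **row} for row in network_stats]
```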
3.4. Create metadata
Create metadata files for the graphs (node/edge attributes) and stats.
4. Upload repository
4.1. Generate files
Save graphs to disk as GeoPackages and node/edge list files. Then ensure we have what we expect: verify that we have the same number of countries for each file type, the same number of gpkg, graphml, and node/edge list files, and that the same set of country/city names exists across the gpkg, graphml, and node/edge list files.
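The cross-file-type consistency check can be sketched with set comparisons:

```python
def verify_file_sets(gpkg_names, graphml_names, nodelist_names):
    """Check that the same set of country/city names exists across file types."""
    sets = [set(gpkg_names), set(graphml_names), set(nodelist_names)]
    if not (sets[0] == sets[1] == sets[2]):
        missing = (sets[0] | sets[1] | sets[2]) - (sets[0] & sets[1] & sets[2])
        raise ValueError(f"file types disagree on: {sorted(missing)}")
    return True
```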
4.2. Stage files
Compress and zip all model files (GeoPackages, GraphML, node/edge lists) into a staging area for upload to Dataverse.
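The staging step can be sketched with the standard library's zipfile module (file and folder names here are hypothetical):

```python
import zipfile
from pathlib import Path

def stage_zip(src_files, staging_dir, zip_name):
    """Compress a set of model files into one zip in the staging area."""
    staging = Path(staging_dir)
    staging.mkdir(parents=True, exist_ok=True)
    zip_path = staging / zip_name
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for f in src_files:
            zf.write(f, arcname=Path(f).name)
    return zip_path
```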
4.3. Upload to Dataverse
Upload to Dataverse using their v1 Native API. First log in and create an API key if you don't have an active one (they expire annually). If this is a revision to existing datasets, create a draft dataset revision on the Dataverse (edit dataset > metadata > change something > save). Otherwise, if this is the first upload ever, create a new Dataverse and new empty datasets within it, structured like:
- Global Urban Street Networks
- Global Urban Street Networks GeoPackages
- Global Urban Street Networks GraphML Files
- Global Urban Street Networks Node/Edge Lists
- Global Urban Street Networks Measures
- Global Urban Street Networks Metadata
Then run the script to upload all the repository files automatically to their respective datasets in the Dataverse (note: if this is a dataset revision, set delete_existing = True to first clear out all the carried-over files in the draft). Next, manually upload the indicators and metadata files to their respective datasets in the Dataverse. Finally, visit the Dataverse on the web to publish the draft.
Owner
- Name: Geoff Boeing
- Login: gboeing
- Kind: user
- Location: Los Angeles, California
- Company: University of Southern California
- Website: https://geoffboeing.com/
- Twitter: gboeing
- Repositories: 60
- Profile: https://github.com/gboeing
Urban planning professor at USC: urban analytics, street networks, rental markets, data science.
Citation (CITATION.cff)
cff-version: 1.2.0
title: "Urban Street Network Models and Measures"
authors:
- family-names: "Boeing"
given-names: "Geoff"
orcid: "https://orcid.org/0000-0003-1851-6411"
website: "https://geoffboeing.com"
url: "https://github.com/gboeing/street-network-models"
repository-code: "https://github.com/gboeing/street-network-models"
preferred-citation:
type: report
title: "Urban Science Beyond Samples: Updated Street Network Models and Indicators for Every Urban Area in the World"
authors:
- family-names: "Boeing"
given-names: "Geoff"
orcid: "https://orcid.org/0000-0003-1851-6411"
website: "https://geoffboeing.com"
year: 2025
url: "https://github.com/gboeing/street-network-models"
GitHub Events
Total
- Watch event: 9
- Delete event: 6
- Push event: 49
- Pull request event: 11
- Fork event: 2
- Create event: 8
Last Year
- Watch event: 9
- Delete event: 6
- Push event: 49
- Pull request event: 11
- Fork event: 2
- Create event: 8
Committers
Last synced: 10 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Geoff Boeing | b****g@u****u | 162 |
Committer Domains (Top 20 + Academic)
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 1
- Total pull requests: 14
- Average time to close issues: 4 days
- Average time to close pull requests: 5 days
- Total issue authors: 1
- Total pull request authors: 2
- Average comments per issue: 0.0
- Average comments per pull request: 0.0
- Merged pull requests: 12
- Bot issues: 0
- Bot pull requests: 1
Past Year
- Issues: 0
- Pull requests: 13
- Average time to close issues: N/A
- Average time to close pull requests: 5 days
- Issue authors: 0
- Pull request authors: 2
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 11
- Bot issues: 0
- Bot pull requests: 1
Top Authors
Issue Authors
- nlnathan5 (1)
Pull Request Authors
- gboeing (13)
- dependabot[bot] (1)
Top Labels
Issue Labels
Pull Request Labels
Dependencies
- actions/cache v4 composite
- actions/checkout v4 composite
- mamba-org/setup-micromamba v2 composite