Quantifying Barriers of Urban Mobility
https://github.com/pintergreg/quantifying-barriers-of-urban-mobility
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ✓ .zenodo.json file: found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (5.5%) to scientific vocabulary
Keywords
Repository
Quantifying Barriers of Urban Mobility
Basic Info
- Host: GitHub
- Owner: pintergreg
- License: bsd-2-clause
- Language: Python
- Default Branch: main
- Homepage: https://pintergreg.github.io/quantifying-barriers-of-urban-mobility/
- Size: 86.5 MB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 1
Topics
Metadata Files
README.md
code for Quantifying Barriers of Urban Mobility
This repository contains scripts to reproduce the results of the research paper: Quantifying Barriers of Urban Mobility
input data
As described in the paper, the raw mobile positioning data cannot be shared, but the networks constructed from the mobility data are available. The CSVs are edgelists that connect places in Budapest, hence the filenames, 'place connections'. The second part of each filename is the date interval, optionally followed by a variant identifier.
The public input files for
- the pre-pandemic interval: data/place_connections_2019-09-01_2020-02-29.csv
- the COVID-19 interval: data/place_connections_2020-11-01_2021-04-31.csv
- the Liberty Bridge case study
- 1-2 June 2019: data/place_connections_2019-06-01_2019-06-02.csv
- 8-9 June 2019: data/place_connections_2019-06-08_2019-06-09.csv
- 15-16 June 2019: data/place_connections_2019-06-15_2019-06-16.csv
- 22-23 June 2019: data/place_connections_2019-06-22_2019-06-23.csv
- 29-30 June 2019: data/place_connections_2019-06-29_2019-06-30.csv
- 6-7 July 2019: data/place_connections_2019-07-06_2019-07-07.csv
- 13-14 July 2019: data/place_connections_2019-07-13_2019-07-14.csv
- 20-21 July 2019: data/place_connections_2019-07-20_2019-07-21.csv
- 27-28 July 2019: data/place_connections_2019-07-27_2019-07-28.csv
- Budapest district groups vs. agglomeration sector setting
- the seven district groups of Budapest
- place_connections_2019-09-01_2020-02-29_eastern_pest_inner.csv
- place_connections_2019-09-01_2020-02-29_eastern_pest_outer.csv
- place_connections_2019-09-01_2020-02-29_inner_pest.csv
- place_connections_2019-09-01_2020-02-29_north_buda.csv
- place_connections_2019-09-01_2020-02-29_north_pest.csv
- place_connections_2019-09-01_2020-02-29_south_buda.csv
- place_connections_2019-09-01_2020-02-29_south_pest.csv
- the six sectors of the agglomeration
- place_connections_2019-09-01_2020-02-29_northern_sector.csv
- place_connections_2019-09-01_2020-02-29_eastern_sector.csv
- place_connections_2019-09-01_2020-02-29_south_eastern_sector.csv
- place_connections_2019-09-01_2020-02-29_southern_sector.csv
- place_connections_2019-09-01_2020-02-29_western_sector.csv
- place_connections_2019-09-01_2020-02-29_north_western_sector.csv
[!IMPORTANT] Each of the Liberty Bridge files has a downtown variant.
edgelist schema
- device_id: integer
- day: date in YYYY-MM-DD format
- source: block ID (integer) in the house_blocks.geojson
- target: block ID (integer) in the house_blocks.geojson
- weight: integer, edge weight describing how many times a device moved between block A and B in a day
- distance: integer (in kilometers), the Euclidean distance between the centroids of the source and target blocks
- practically unused
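For illustration, the documented schema can be loaded and aggregated with pandas. This is a sketch only; the two sample rows below are invented and do not come from the dataset.

```python
import io

import pandas as pd

# Hypothetical sample rows following the documented edgelist schema.
sample = io.StringIO(
    "device_id,day,source,target,weight,distance\n"
    "1,2019-09-01,10,12,3,2\n"
    "1,2019-09-01,12,10,1,2\n"
)

edges = pd.read_csv(sample, parse_dates=["day"])

# Total moves between each (source, target) block pair per day.
daily = edges.groupby(["day", "source", "target"], as_index=False)["weight"].sum()
print(daily)
```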
geographic data
The geographic data is extracted from OpenStreetMap.
- administrative
- natural
- infrastructure
- see below at barriers section
- miscellaneous
- hand-drawn shapefile of Budapest downtown
- used for the Liberty Bridge case study
<!-- - dictionary that associates the settlements to the sectors of the agglomeration
- defined by the Hungarian Central Statistical Office -->
barriers
- primary roads: data/barriers_mp.geojson - contains only motorways and primary roads according to OSM highway types, because trunks are not significant in Budapest
- secondary roads: data/barriers_mps.geojson - for continuity it also contains primary roads
- railways: data/barriers_railways.geojson
- river: data/barriers_river.geojson - the river Danube as a LineString
Roads are extracted from OpenStreetMap using OSMnx; the get_roads.py script was developed for this task.
The river is extracted with the get_rivers.py script.
generate barriers
The 'city enclosures' module is responsible for generating the barrier files. Detailed information about its arguments is available in its documentation.
- for primary roads
python get_barrier_polygons.py -a ../data/budapest.geojson -o output/barriers_mp --no-railway -s motorway primary --river ../data/duna.geojson -t 0 -b 0 - for secondary roads
python get_barrier_polygons.py -a ../data/budapest.geojson -o output/barriers_mps --no-railway -s motorway primary secondary --river ../data/duna.geojson -t 0 -b 0 - for the river
poetry run python get_barrier_polygons.py -a data/budapest.geojson -o output/barriers_river -t 0 -b 0 --street --no-railway --river data/duna.geojson<!-- - for the railways -->
[!IMPORTANT] This extracts data from OSM; if the road network changes upstream, the result can also change. The barrier files used for this study are included in this repository.
blocks
The blocks shapefile is just a special case of the roads, where every OSM highway type is taken into consideration.
python get_barrier_polygons.py -a ../data/budapest.geojson -o output/house_blocks_new --no-railway -s motorway trunk primary secondary tertiary unclassified residential --river ../data/duna.geojson
workflow
1. Detect communities
poetry run python src/place_network_louvain.py --observed-network data/#{input} --block data/house_blocks.geojson --community-dir place_communities/#{network}
2. Generate beeline trips
After the observed network, run with seed (--seed) values from 0 to 9.
For the two main networks (pre-pandemic and pandemic), the community detection is executed 10 times with seed values 0-9.
poetry run python src/generate_beeline_trips.py --observed observed --blocks data/house_blocks.geojson --network-dir output/network/
poetry run python src/generate_beeline_trips.py --observed observed --seed 0 --blocks data/house_blocks.geojson --network-dir output/network/
3. Calculate barrier crossings
Run this also for all the random networks, with --network values from seed00 to seed99.
python src/calculate_barrier_crossings.py --network observed --multithreading --roads output/roads/ --river data/duna_linestring.geojson --admin-data data
4. Convert to NetworkX edgelist
The 'place connection' CSVs are practically edgelists, but they still contain additional information about the edges. Some scripts are designed to work exclusively with NetworkX edgelists; the purpose of this script is to perform the conversion.
poetry run python src/convert_to_edgelist.py --input data/<INPUT> --output output/network/ --suffix <SUFFIX>
where <INPUT> is a place connection CSV and <SUFFIX> is an optional variant identifier.
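The conversion idea can be sketched as follows. This is an illustrative sketch, not the repository's convert_to_edgelist.py, and the sample rows are invented: the per-device, per-day records are collapsed into a plain weighted NetworkX edgelist.

```python
import io

import networkx as nx
import pandas as pd

# Hypothetical 'place connection' rows following the documented schema.
sample = io.StringIO(
    "device_id,day,source,target,weight,distance\n"
    "1,2019-09-01,10,12,3,2\n"
    "2,2019-09-02,10,12,2,2\n"
)
edges = pd.read_csv(sample)

# Drop the per-device/per-day detail: sum weights per block pair.
summed = edges.groupby(["source", "target"], as_index=False)["weight"].sum()

G = nx.from_pandas_edgelist(summed, "source", "target", edge_attr="weight")
# NetworkX gzips automatically when the path ends with .gz:
# nx.write_weighted_edgelist(G, "output/network/example.edgelist.gz")
print(G.edges(data=True))
```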
5. Generate beeline trips
As the original raw data contains no complete trajectories, the route between the origin and destination is assumed to be a straight line.
In this step the straight lines are generated as geometries (LineStrings) between the centroid of the source and target blocks.
poetry run python src/generate_beeline_trips.py --observed observed_<INPUT> --blocks data/house_blocks.geojson --network-dir output/network/
where <INPUT> is a NetworkX edgelist without extension. Note that the file is also gzipped (.edgelist.gz), but the previous step takes care of the compression.
In other words, the INPUT is actually the 'name' of the network.
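The beeline construction can be sketched with Shapely. The block polygons below are invented for illustration; the actual script reads them from house_blocks.geojson.

```python
from shapely.geometry import LineString, Polygon

# Hypothetical blocks keyed by block ID (stand-ins for house_blocks.geojson).
blocks = {
    10: Polygon([(0, 0), (1, 0), (1, 1), (0, 1)]),
    12: Polygon([(3, 0), (4, 0), (4, 1), (3, 1)]),
}

def beeline(source: int, target: int) -> LineString:
    """Straight line between the centroids of the source and target blocks."""
    return LineString([blocks[source].centroid, blocks[target].centroid])

trip = beeline(10, 12)
print(trip.length)  # Euclidean length of the straight-line trip
```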
6. Calculate barrier crossings
In this step the barrier crossings are calculated: if a trip (as a LineString) intersects a barrier geometry, it is counted as a crossing.

poetry run python src/calculate_barrier_crossings.py --network observed_<INPUT> --multithreading --pool 2
where the <INPUT> is the 'name' of the network, as before.
The --pool argument can be used to increase the parallelism in the processing, but note that it will increase the RAM usage as well.
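The crossing test itself can be illustrated with Shapely. The geometries below are toys, not the study's barriers:

```python
from shapely.geometry import LineString

# Toy barrier (e.g. a road or the river) and a beeline trip crossing it.
barrier = LineString([(2, -10), (2, 10)])
trip = LineString([(0.5, 0.5), (3.5, 0.5)])

# A trip counts as one crossing if it intersects the barrier geometry.
crossings = int(trip.intersects(barrier))
print(crossings)  # 1
```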
7. Calculate community crossings
Similar to the previous step, but calculates the community border crossings.

poetry run python src/calculate_community_crossings.py --network observed_<INPUT> --communities output/place_communities/<INPUT>/louvain --run-stop 10
where the <INPUT> is the 'name' of the network, as before.
Set the --run-stop argument to the number of times the Louvain community detection was executed; the script will calculate the community crossings for each execution.
Do not change it unless you know what you are doing.
8. Calculating the Barrier Crossing Ratio
In this step, the Barrier Crossing Ratio (BCR) is calculated, using the output of the previous two steps. BCR is defined as follows.
$$ BCR_{\gamma} = \frac{1}{n}\frac{\sum_{k=1}^{m} \mathrm{CB}(M_k^i, U)}{\sum_{k=1}^{m} \left(\mathrm{CB}(M_k^i, U) \times \mathrm{CC}(M_k^i, C_{\gamma})\right)} $$
where $m$ is the total number of mobility edges, $\mathrm{CB}$ is a binary function that evaluates to 1 if $M_k^i$, the $k^{th}$ mobility edge from block $i$, crosses an urban barrier $U$ and 0 otherwise, while the function $\mathrm{CC}$ takes the value of 1 if $M_k^i$ crosses mobility cluster borders, and $n$ is the number of Louvain iterations at resolution $\gamma$.
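A toy numeric illustration of the ratio, with invented indicator values and a single Louvain iteration:

```python
# CB and CC are the per-edge binary crossing indicators; values are invented.
cb = [1, 0, 1, 1]   # edge crosses an urban barrier
cc = [1, 0, 0, 1]   # edge crosses a mobility-cluster border
n = 1               # number of Louvain iterations at this resolution

# BCR = (1/n) * sum(CB) / sum(CB * CC)
bcr = (1 / n) * sum(cb) / sum(b * c for b, c in zip(cb, cc))
print(bcr)  # 3 / 2 = 1.5
```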
poetry run python src/null_model_obs_ratio.py --barrier-crossing output/barrier_crossing/observed_<INPUT> --community-crossing output/community_crossing/observed_<INPUT> --output output/obs_ratio/<INPUT>
where the <INPUT> is the 'name' of the network, as before.
[!IMPORTANT] The Barrier Crossing Ratio (BCR) output files are tracked in this repository so that they are available without executing the whole data processing pipeline.
pipelines
To aid and (partially) automate the computation workflow, some 'pipeline' scripts were developed.
Three of them are (currently) written in Ruby (requires at least 3.0); the fourth one is in Python (it can work within the predefined environment, see the details below).
- pipeline.rb: can be used to process a network
- pipeline_libertybridge.rb: specifically developed to process the networks for Liberty Bridge case study
- pipelineweekdayweekend.rb: specifically developed to process the networks for workday-holiday comparison networks
- pipeline_group.py: specifically developed to process the networks for district group vs. agglomeration comparisons
requirements
The project is managed by Poetry, the dependencies are listed in pyproject.toml.
To install the dependencies and set up the virtual environment, use the command poetry lock && poetry install.
[!WARNING] Executing the scripts described above will create partial result files of tens of gigabytes, because of the multiple executions of the Louvain community detection. Even using the pipeline scripts, calculating everything will take several hours.
[!WARNING] Although precise hardware resource requirements are not provided, please note that most of the data processing can be done using a laptop with an 11th Gen Intel® Core™ i5-1135G7 processor and 40 GB of RAM available. The only exception is the OLS calculation which requires much more RAM as the current implementation (gravity.jl) keeps every model in memory until the result table is generated.
license
- The code is licensed under BSD-2-Clause
- The observed networks (data/place_connections*.csv) are licensed under Open Data Commons Open Database License (ODbL)
- The documentation and figures are CC BY 4.0
- The shape files are from OpenStreetMap and licensed under the Open Data Commons Open Database License (ODbL)
debug
Development repository version: 788c3c6f33b42f442e4f7c2e0023c0b7551c3752
Please note that the development repository is not public; this is added only for debugging purposes.
Owner
- Name: Gergő Pintér
- Login: pintergreg
- Kind: user
- Location: Budapest, Hungary
- Twitter: pintergreg
- Repositories: 24
- Profile: https://github.com/pintergreg
data scientist, PhD | Research Fellow, Corvinus University of Budapest
Citation (CITATION.cff)
# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!
cff-version: 1.2.0
title: code and data for Quantifying Barriers of Urban Mobility
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Gergő
    family-names: Pintér
    email: gergo.pinter@uni-corvinus.hu
    orcid: 'https://orcid.org/0000-0003-4731-3816'
    affiliation: Corvinus University of Budapest
repository-code: 'https://github.com/pintergreg/quantifying-barriers-of-urban-mobility'
abstract: >-
  Software to reproduce the results of the research paper
  "Quantifying Barriers of Urban Mobility"
license: BSD-2-Clause
GitHub Events
Total
- Release event: 1
- Push event: 23
- Public event: 1
- Create event: 2
Last Year
- Release event: 1
- Push event: 23
- Public event: 1
- Create event: 2
Dependencies
- certifi 2024.12.14
- charset-normalizer 3.4.1
- contourpy 1.3.1
- cycler 0.12.1
- fonttools 4.55.3
- geopandas 1.0.1
- haversine 2.8.1
- idna 3.10
- joblib 1.4.2
- kiwisolver 1.4.8
- matplotlib 3.10.0
- matplotlib-scalebar 0.8.1
- more-itertools 10.5.0
- networkx 3.4.2
- numpy 2.2.1
- osmnx 2.0.1
- packaging 24.2
- pandas 2.2.3
- pillow 11.1.0
- pyarrow 18.0.0
- pyogrio 0.10.0
- pyparsing 3.2.1
- pyproj 3.7.0
- python-dateutil 2.9.0.post0
- pytz 2024.2
- pyyaml 6.0.2
- requests 2.32.3
- scikit-learn 1.6.1
- scipy 1.15.1
- seaborn 0.13.2
- shapely 2.0.6
- six 1.17.0
- threadpoolctl 3.5.0
- tzdata 2024.2
- urllib3 2.3.0
- PyYAML ~6.0
- geopandas ~1.0
- haversine ~2.8.0
- matplotlib ~3.10
- matplotlib-scalebar ~0.8.1
- more-itertools ~10.5
- networkx ~3.4
- numpy ~2.2
- osmnx ~2.0
- pandas ~2.2
- pyarrow ~18.0
- pyogrio ~0.10
- python ~3.13
- scikit-learn ~1.6
- seaborn ~0.13
- shapely ~2.0
- actions/checkout v4 composite
- peaceiris/actions-gh-pages v4 composite
- peaceiris/actions-hugo v3 composite