https://github.com/anerv/bikedna_analysis

Code for analyzing the results from running BikeDNA BIG (https://github.com/anerv/BikeDNA_BIG) on bicycle infrastructure data from Denmark.

Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
    Found 3 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (18.3%) to scientific vocabulary

Keywords

bicycle-infrastructure bicycle-network data-quality geospatial-data open-street-map sustainable-mobility urban-planning volunteered-geographic-information
Last synced: 5 months ago · JSON representation

Repository

Basic Info
  • Host: GitHub
  • Owner: anerv
  • License: agpl-3.0
  • Language: Jupyter Notebook
  • Default Branch: main
  • Homepage:
  • Size: 359 MB
Statistics
  • Stars: 1
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created almost 3 years ago · Last pushed about 2 years ago
Metadata Files
Readme License

README.md

Analysis of BikeDNA Denmark

This repository contains the code for analyzing the results from running BikeDNA, in a version adapted for large data sets, on nationwide data for Denmark, comparing data from OpenStreetMap (OSM) and GeoDanmark.

The analysis is exploratory and focuses on detecting spatial patterns in data quality: for example, correlations between administrative divisions and differences in data completeness, correlations between OSM tag quality and population density, and areas with large differences between the two data sources.

For a full reproducible setup with all input data, see DOI.

Workflow

The analysis is based on Jupyter notebooks. It therefore requires an installation of Python, including tools for Jupyter notebook.

0. Run BikeDNA

The first step is to successfully run BikeDNA BIG, performing both the intrinsic and the extrinsic analysis of the OSM and GeoDanmark data.

I. Installation

First clone this repository (recommended) to your local machine or download it.

To avoid cloning the history and larger branches with example data and plots, use:

    git clone -b main --single-branch https://github.com/anerv/bikedna_dk_analysis --depth 1

Create Python conda environment

To ensure that all packages needed for the analysis are installed, it is recommended to create and activate a new conda environment using the environment.yml:

    conda env create --file=environment.yml
    conda activate bikedna_analysis

If this fails, the environment can be created by running:

    conda config --prepend channels conda-forge
    conda create -n bikedna_analysis --strict-channel-priority geopandas pyarrow pandas folium pyyaml matplotlib contextily rasterio rioxarray jupyterlab ipykernel h3-py splot pysal plotly plotly_express
    conda activate bikedna_analysis

This method does not control the library versions and should be used as a last resort.
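Whichever route is used to create the environment, a quick import check can confirm that the main packages are available before opening the notebooks. The following is a minimal sketch, not part of the repository: the helper `missing_modules` is a hypothetical name, and the module list is an assumption derived from the package list above (`yaml` is the import name of pyyaml and `h3` that of h3-py).

```python
from importlib.util import find_spec

# Import names for a subset of the packages listed above (assumed, not exhaustive).
# "yaml" is the import name of pyyaml and "h3" the import name of h3-py.
MODULES = [
    "geopandas", "pandas", "pyarrow", "folium", "yaml", "matplotlib",
    "contextily", "rasterio", "rioxarray", "h3", "plotly",
]


def missing_modules(names):
    """Return the module names that cannot be found in the active environment."""
    return [name for name in names if find_spec(name) is None]


# An empty list means every listed module can be located.
print(missing_modules(MODULES))
```

Run this with the bikedna_analysis environment activated; any names printed point to packages that still need to be installed.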

The code for BikeDNA has been developed and tested using macOS 13.2.1.

Install package

The repository has been set up using the structure described in the Good Research Developer. Once the repository has been downloaded, navigate to the main folder in a terminal window and run the command:

    pip install -e .

Lastly, add the environment kernel to Jupyter via:

    python -m ipykernel install --user --name=bikedna_analysis

Run Jupyter Lab or Notebook with the kernel bikedna_analysis (Kernel > Change Kernel > bikedna_analysis).

II. Setup

Fill out the configuration file

To run the code, the configuration file config.yml must be filled out. The config.yml on the main branch contains settings such as the CRS and the name of the study area, which is used for folder structure setup, plot naming, and result labelling. The configuration file also specifies where to find the data and results from running BikeDNA (step 0).

Plot settings can be changed in scripts/settings/plotting.py.
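Before running the setup script, it can help to check that no required setting has been left blank. The sketch below is illustrative only: the key names `study_area`, `crs`, and `bikedna_results_path` are hypothetical stand-ins for the settings described above and are not taken from the repository's config.yml.

```python
# Hypothetical key names; replace them with the keys actually used in config.yml.
REQUIRED_KEYS = ("study_area", "crs", "bikedna_results_path")


def missing_config_keys(config, required=REQUIRED_KEYS):
    """Return the required keys that are absent or left empty in a config dict."""
    return [key for key in required if config.get(key) in (None, "")]


# Example with made-up values and one setting left out.
example_config = {"study_area": "denmark", "crs": "EPSG:25832"}
print(missing_config_keys(example_config))  # ['bikedna_results_path']
```

In practice the dictionary would come from parsing config.yml, for example with `yaml.safe_load` from the pyyaml package installed in the environment.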

Set up the folder structure & import data

Next, to create the required folder structure and copy the results from running BikeDNA, navigate to the main folder in a terminal window and run the Python file setup_folders_input_data.py:

    python setup_folders_input_data.py

This should return:

    ...
    Successfully created folder results/compare_analysis/
    Successfully created folder results/osm_analysis/
    Successfully created folder results/ref_analysis/
    ...

To validate that the results and data were successfully copied to this directory, check that the results folder now contains the subfolders reference and osm, with content matching the output of BikeDNA.
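The check described above can also be scripted. This is a minimal sketch that only tests whether the subfolders exist and are non-empty; it does not compare their content against the BikeDNA output. The helper name `missing_subfolders` is hypothetical, and the demonstration runs on a temporary directory rather than the real results folder.

```python
import tempfile
from pathlib import Path


def missing_subfolders(results_dir, expected=("osm", "reference")):
    """Return the expected subfolders that are absent or empty."""
    root = Path(results_dir)
    missing = []
    for name in expected:
        folder = root / name
        if not folder.is_dir() or not any(folder.iterdir()):
            missing.append(name)
    return missing


# Demonstration on a throwaway directory: "osm" has content, "reference" is empty.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "osm").mkdir()
    (Path(tmp) / "osm" / "placeholder.txt").write_text("demo")
    (Path(tmp) / "reference").mkdir()
    demo_missing = missing_subfolders(tmp)

print(demo_missing)  # ['reference']
```

From the main folder of the repository, `missing_subfolders("results")` should return an empty list once the setup script has copied the BikeDNA results.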

Provide data sets

In addition to the input data from BikeDNA, the analysis makes use of:

  • A dataset with municipal boundaries: municipalities.gpkg
  • A dataset with the total population in each municipality: muni_pop.csv
  • Population rasters with the local population density

These data sets are already provided as part of this repository for an analysis covering all of Denmark, using the default study area settings in the config.yml. If other datasets are to be used, once the folders have been created:

  • remove the existing data files
  • place the files municipalities.gpkg and muni_pop.csv in the folder data > municipalities > 'study_area' > raw
  • place the population rasters in the folder data > population > 'study_area' > raw
  • specify the name of the population rasters in config.yml
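The folder layout in the steps above can be expressed as paths and checked before running the notebooks. The sketch below assumes the layout exactly as described (data > municipalities > 'study_area' > raw and data > population > 'study_area' > raw); the function names and the study area name "denmark" are hypothetical.

```python
from pathlib import Path

MUNICIPAL_FILES = ("municipalities.gpkg", "muni_pop.csv")


def raw_data_dirs(study_area, data_root="data"):
    """Build the raw-data folders for a study area, following the layout above."""
    root = Path(data_root)
    return {
        "municipalities": root / "municipalities" / study_area / "raw",
        "population": root / "population" / study_area / "raw",
    }


def missing_municipal_files(study_area, data_root="data"):
    """Return the municipal input files not yet present in the raw folder."""
    raw = raw_data_dirs(study_area, data_root)["municipalities"]
    return [name for name in MUNICIPAL_FILES if not (raw / name).is_file()]


# "denmark" is a placeholder; use the study area name set in config.yml.
dirs = raw_data_dirs("denmark")
print(dirs["municipalities"].as_posix())  # data/municipalities/denmark/raw
print(dirs["population"].as_posix())      # data/population/denmark/raw
```

The population rasters are not checked here because their file names are set in config.yml rather than fixed.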

Warning The notebooks that use the municipal and population input data are currently hardcoded to use the datasets provided in this repository, with municipal boundaries for Denmark from Dataforsyningen, municipal population data from Statistics Denmark, and population rasters from the Global Human Settlement Layer (GHSL).

III. Analysis

Notebooks

All analysis notebooks are in the scripts folder.

Population

  • prepare_population_grid.ipynb: This notebook processes the population rasters and converts the data into H3 hexagons at the chosen resolutions.

OSM

  • municipal_analysis_OSM.ipynb: The notebook indexes the results of the intrinsic analysis of OSM by municipality and examines correlations between municipality and high/low data quality.
  • analyze_OSM_tags.ipynb: The notebook runs an analysis of spatial patterns in existing and missing tags in the OSM data.

GeoDanmark

  • municipal_analysis_reference.ipynb: The notebook indexes the results of the intrinsic analysis of the GeoDanmark data by municipality and examines correlations between municipality and high/low data quality.

Compare

  • extrinsic_analysis.ipynb: Looks at spatial patterns in differences between the two data sets, and contrasts the findings with areas of high and low population density.
  • municipal_comparison.ipynb: Compares the outcome of the notebooks looking at the quality and completeness at the municipal level.

Additionally, the scripts folder contains the notebook explore_spatia_weights_sensitivity.ipynb, used to explore how sensitive the analysis of spatial patterns in infrastructure density differences is to the definition of spatial weights.
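To illustrate why the definition of spatial weights matters, the sketch below computes global Moran's I for the same values under two contiguity definitions. It is a toy example in pure Python, not the notebook's method: the square 4x4 grid, the checkerboard values, and the binary (non-standardised) weights are all assumptions chosen for illustration, whereas the analysis itself works on H3 hexagons.

```python
def grid_neighbours(n_rows, n_cols, queen=False):
    """Map each cell index to its contiguous cells (rook, or queen if requested)."""
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if queen:
        steps += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    neighbours = {}
    for r in range(n_rows):
        for c in range(n_cols):
            neighbours[r * n_cols + c] = [
                (r + dr) * n_cols + (c + dc)
                for dr, dc in steps
                if 0 <= r + dr < n_rows and 0 <= c + dc < n_cols
            ]
    return neighbours


def morans_i(values, neighbours):
    """Global Moran's I with binary weights (1 for neighbours, 0 otherwise)."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(len(js) for js in neighbours.values())  # sum of all weights
    cross = sum(dev[i] * dev[j] for i, js in neighbours.items() for j in js)
    return (n / s0) * cross / sum(d * d for d in dev)


# Checkerboard pattern: every rook neighbour has the opposite value.
values = [(r + c) % 2 for r in range(4) for c in range(4)]
i_rook = morans_i(values, grid_neighbours(4, 4))
i_queen = morans_i(values, grid_neighbours(4, 4, queen=True))
print(round(i_rook, 3), round(i_queen, 3))  # -1.0 -0.143
```

With rook contiguity the pattern appears perfectly dispersed (I = -1), while adding the diagonal neighbours of queen contiguity pulls the statistic to about -0.143, even though the data are unchanged.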

Warning Most notebooks can be run independently, but both municipal_analysis_OSM.ipynb and municipal_analysis_reference.ipynb must be run before municipal_comparison.ipynb, and pop_grid.ipynb must be run before extrinsic_analysis.ipynb and analyze_OSM_tags.ipynb.
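The ordering constraints in the warning above can be written down as a small dependency graph and resolved with the standard library (Python 3.9 or later). This is a sketch for planning a run, not a script from the repository; the notebook names are copied verbatim from the warning, including pop_grid.ipynb.

```python
from graphlib import TopologicalSorter

# Each notebook maps to the notebooks that must be run before it.
DEPENDENCIES = {
    "municipal_comparison.ipynb": {
        "municipal_analysis_OSM.ipynb",
        "municipal_analysis_reference.ipynb",
    },
    "extrinsic_analysis.ipynb": {"pop_grid.ipynb"},
    "analyze_OSM_tags.ipynb": {"pop_grid.ipynb"},
}

# static_order() yields every notebook after all of its prerequisites.
run_order = list(TopologicalSorter(DEPENDENCIES).static_order())
for notebook in run_order:
    print(notebook)
```

Several valid orders exist, so the exact sequence printed may vary; the only guarantee is that each notebook appears after the ones it depends on.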

Results

The results folder contains the results from running BikeDNA, used as inputs in this analysis (in the folders results/compare, results/osm, and results/reference), and the outputs from running the analysis notebooks.

Output data and plots from the analysis of the BikeDNA outputs are stored in the _analysis folders:

  • municipal_analysis_OSM.ipynb & analyze_OSM_tags.ipynb: results/osm_analysis/'study_area'/
  • municipal_analysis_reference.ipynb: results/reference_analysis/'study_area'/
  • extrinsic_analysis.ipynb & municipal_comparison.ipynb: results/compare_analysis/'study_area'/

Since this is an exploratory analysis producing a large number of maps and figures, only selected plots are saved automatically.

Reproduce plots in QGIS

Most of the plots from the accompanying paper have been prepared in QGIS. To recreate the plots, run the Python script export_plot_data.py and open the QGIS project file illustrations.qgz. A few subsets of the data used in illustrations have been selected and exported manually and are specific to the analysis of the OSM and GeoDanmark data sets in Denmark. These data can be found in the qgis/data_manual folder.

Get in touch

Do you have any questions or feedback? Reach us at anev@itu.dk (Ane Rahbek Vierø) or anvy@itu.dk (Anastassia Vybornova).

Data & Licenses

Our code is free to use and repurpose under the AGPL 3.0 license.

The repository includes data from the following sources:

OpenStreetMap

OpenStreetMap contributors
License: Open Data Commons Open Database License

Downloaded spring 2023.

GeoDanmark

Contains data from GeoDanmark (retrieved spring 2022) © SDFI (Styrelsen for Dataforsyning og Infrastruktur)
License: GeoDanmark

Downloaded spring 2023.

Dataforsyningen

SDFI (Styrelsen for Dataforsyning og Infrastruktur)
License: Vilkår for brug af frie geografiske data

Downloaded spring 2023.

Statistics Denmark

Contains data from Statistics Denmark - https://statistikbanken.dk/folk1a

Downloaded spring 2023.

GHSL

Contains data from the European Commission's GHSL (Global Human Settlement Layer)

Schiavina M., Freire S., Carioli A., MacManus K. (2023): GHS-POP R2023A - GHS population grid multitemporal (1975-2030). European Commission, Joint Research Centre (JRC).

Downloaded fall 2022.

Credits

Supported by the Danish Road Directorate.

Owner

  • Name: Ane R V
  • Login: anerv
  • Kind: user
  • Location: Cph
  • Company: ITU

PhD student at NERDS, ITU Copenhagen Geospatial Data Science // Mobility // Urban Data

Committers

Last synced: about 2 years ago

All Time
  • Total Commits: 184
  • Total Committers: 2
  • Avg Commits per committer: 92.0
  • Development Distribution Score (DDS): 0.005
Past Year
  • Commits: 184
  • Committers: 2
  • Avg Commits per committer: 92.0
  • Development Distribution Score (DDS): 0.005
Top Committers
Name Email Commits
Ane Rahbek Vierø v****e@h****m 183
Ane R V 4****v 1

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 0
  • Total pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Total issue authors: 0
  • Total pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0