GTFS Segments

GTFS Segments: A Fast and Efficient Library to Generate Bus Stop Spacings - Published in JOSS (2024)

https://github.com/utel-uiuc/gtfs_segments

Science Score: 100.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 18 DOI reference(s) in README and JOSS metadata
  • Academic publication links
    Links to: sciencedirect.com, joss.theoj.org
  • Committers with academic emails
    1 of 3 committers (33.3%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
    Published in Journal of Open Source Software

Keywords

bus distribution gtfs-feed python stop transit transit-data

Keywords from Contributors

mesh

Scientific Fields

Political Science Social Sciences - 39% confidence
Last synced: 4 months ago

Repository

GTFS Segments: A fast and efficient library to generate bus stop spacings

Basic Info
Statistics
  • Stars: 44
  • Watchers: 1
  • Forks: 3
  • Open Issues: 3
  • Releases: 8
Topics
bus distribution gtfs-feed python stop transit transit-data
Created over 3 years ago · Last pushed 6 months ago
Metadata Files
Readme Contributing License Code of conduct Citation

README.md

GTFS Segments

A fast and efficient library to generate bus stop spacings

Table of Contents
  1. About The Project
  2. Getting Started
  3. Usage
  4. Roadmap
  5. Contributing
  6. License
  7. Contact
  8. Acknowledgments

About The Project

gtfs-segments is a Python (3.9+) package that represents GTFS data for buses in a concise tabular form using segments. The distribution of bus stop spacings can be viewed by generating histograms, and the spacings can be visualized at the network, route, or segment level. The segment data can be exported to well-known formats such as .csv or .geojson for further analysis. Additionally, the package provides commands to download the latest data from Mobility Data sources.

The package condenses the raw GTFS data by considering only the services offered on the busiest day (in the data). Our paper discusses how to interpret the different weightings for stop spacings and how the package condenses information. Usage of the package is detailed in the documentation. The stop spacings dataset, generated with this package and covering over 540 transit providers in the US, can be found on Harvard Dataverse.
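
To make the "busiest day" idea concrete, here is a minimal sketch in plain Python. It is not the package's implementation: the input pairs, the helper name busiest_day, and the tie-breaking rule are assumptions made for illustration, and the package may count services differently.

```python
from collections import Counter

# Hypothetical (service_date, trip_id) pairs, as might be derived from a
# feed's calendar and trips tables. All values are invented.
scheduled_trips = [
    ("2024-03-04", "t1"), ("2024-03-04", "t2"), ("2024-03-04", "t3"),
    ("2024-03-09", "t4"), ("2024-03-09", "t5"),
    ("2024-03-10", "t6"),
]


def busiest_day(trips):
    """Return the date with the most scheduled trips (earliest date wins ties)."""
    counts = Counter(date for date, _ in trips)
    # Highest count first, then earliest date, so the result is deterministic.
    return min(counts, key=lambda d: (-counts[d], d))


day = busiest_day(scheduled_trips)
# Keep only the trips that run on the chosen day.
trips_on_day = [trip for date, trip in scheduled_trips if date == day]
print(day, trips_on_day)
```

Restricting the analysis to one day keeps the traversal counts comparable across routes, at the cost of ignoring service that runs only on other days.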

Getting Started

Prerequisites

The major dependencies of this library are the following packages.

  • numpy
  • shapely
  • pandas
  • scipy
  • geopandas
  • matplotlib
  • contextily

The detailed list of package dependencies can be found in requirements.txt.

Installation

Option A

Use pip to install the package.

```sh
pip install gtfs-segments
```

ℹ️ Windows users may have to download and install Microsoft Visual C++ distributions. Follow these instructions.

📓 Google Colab: You can install and use gtfs-segments via Google Colab. Here is a tutorial to help you get started. Make a copy and get started with your work!

Option B

  1. Clone the repo

    ```sh
    git clone https://github.com/UTEL-UIUC/gtfs_segments.git
    ```

  2. Install geopandas using the following code. Read more here

    ```sh
    conda create -n geo_env -c conda-forge python=3.11 geopandas
    conda activate geo_env
    ```

  3. Install the gtfs_segments package

    ```sh
    cd gtfs_segments
    python setup.py install
    ```

Usage

ℹ️ For documentation, please refer to the Documentation

Import the package using

```python
import gtfs_segments
```

Get GTFS Files

Fetch all sources

```python
from gtfs_segments import fetch_gtfs_source

sources_df = fetch_gtfs_source()
sources_df.head()
```

Fetch source by name/provider/state

```python
from gtfs_segments import fetch_gtfs_source

sources_df = fetch_gtfs_source(place="Chicago")
sources_df
```

Automated Download

```python
from gtfs_segments import download_latest_data

download_latest_data(sources_df, "output_folder")
```

Manual Download

Download the GTFS .zip files from @transitfeeds or @mobility data.

Get GTFS Segments

```python
from gtfs_segments import get_gtfs_segments

segments_df = get_gtfs_segments("path_to_gtfs_zip_file")

# [Optional] Run in parallel using multiple CPU cores
segments_df = get_gtfs_segments("path_to_gtfs_zip_file", parallel=True)
```

Alternatively, filter to a specific agency by passing agency_id as a string, or to multiple agencies by passing a list such as ["SFMTA"].

```python
segments_df = get_gtfs_segments("path_to_gtfs_zip_file", agency_id="SFMTA")
segments_df
```

Table generated by gtfs-segments using data from San Francisco’s Muni system. Each row contains the following columns:

  1. segment_id: The segment's identifier, produced by gtfs-segments.
  2. stop_id1: The identifier of the segment's beginning stop. The identifier is the same one the agency has chosen in the stops.txt file of its GTFS package.
  3. stop_id2: The identifier of the segment's ending stop.
  4. route_id: The same route ID listed in the agency's routes.txt file.
  5. direction_id: The route's direction identifier.
  6. traversals: The number of times the indicated route traverses the segment during the "measurement interval." The "measurement interval" chosen is the busiest day in the GTFS schedule: the day which has the most bus services running.
  7. distance: The length of the bus segment in meters.
  8. geometry: The segment's LINESTRING (a format for encoding geographic paths) written in WGS84 (EPSG:4326) coordinates, that is, unprojected longitude-latitude pairs, as used in GTFS.
  9. traversal_time: The time (in seconds) that it takes for the bus to traverse the segment.
  10. speed: The speed of the bus (in kmph) while traversing the segment. Defaults to np.inf if traversal_time is zero.
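
As a worked illustration of the units above, the following sketch converts a segment's distance (meters) and traversal_time (seconds) into a speed in kmph, returning infinity when the time is zero. The function name segment_speed_kmph is hypothetical, and the package's own computation may differ in detail; math.inf is used here in place of np.inf so that the snippet needs only the standard library.

```python
import math


def segment_speed_kmph(distance_m, traversal_time_s):
    """Speed in km/h from meters and seconds; infinite when the time is zero."""
    if traversal_time_s == 0:
        # Zero scheduled time would otherwise divide by zero.
        return math.inf
    # Convert meters to kilometers and seconds to hours before dividing.
    return (distance_m / 1000) / (traversal_time_s / 3600)


print(segment_speed_kmph(300.0, 30.0))  # about 36 km/h
print(segment_speed_kmph(250.0, 0.0))   # inf
```

Infinite speeds usually signal two consecutive stops that share the same scheduled time, so it can be worth filtering them out before averaging.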

A row does not represent a single segment. Rather, each row maps to a combination of a segment, a route that includes that segment, and a direction. For instance, a segment included in eight routes will appear as eight rows, which carry the same information except for route_id and traversals (since some routes might traverse the segment more often than others). This choice enables filtering by route and preserves how many times each route traverses each segment during the measurement interval. The direction identifier is needed for very rare cases (mostly loops) in which a route visits the same two stops, in the same order, but in different directions.
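
Because a segment shared by several routes appears on several rows, network-level statistics usually need the rows collapsed first. The sketch below uses plain pandas on a toy table shaped like the one described above; every value is invented, and this is one possible way to aggregate rather than the package's own summary logic.

```python
import pandas as pd

# Toy rows in the same shape as the segments table; all values are invented.
segments_df = pd.DataFrame(
    {
        "segment_id": ["A-B-1", "A-B-1", "B-C-1"],
        "route_id": ["8", "14", "8"],
        "direction_id": [0, 0, 0],
        "traversals": [40, 10, 40],
        "distance": [300.0, 300.0, 500.0],
    }
)

# One row per physical segment: sum traversals over routes and directions.
unique_segments = segments_df.groupby("segment_id", as_index=False).agg(
    traversals=("traversals", "sum"),
    distance=("distance", "first"),
)

# The unweighted mean counts each segment once; the traversal-weighted mean
# counts each segment once per bus that passes over it.
mean_spacing = unique_segments["distance"].mean()
weighted_spacing = (
    (unique_segments["distance"] * unique_segments["traversals"]).sum()
    / unique_segments["traversals"].sum()
)

print(unique_segments)
print(mean_spacing, weighted_spacing)
```

The two means differ whenever busy segments are systematically shorter or longer than quiet ones, which is why the choice of weighting matters when reporting a single spacing figure.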

Visualize Spacings

Visualize stop spacings at the network, route, and segment levels, along with basemaps and stop locations.

ℹ️ For more information on visualization refer to the Visualization Tutorial

ℹ️ Alternatively, use view_spacings_interactive to view the stop spacings interactively.

```python
from gtfs_segments import view_spacings

view_spacings(segments_df, route=["8"], segment=["6364-3725-1"], basemap=True)
```

Heatmap

View the heatmap of stop spacings (with "distance" as the metric). Use diverging colormaps to highlight narrow and wide spacings. Set light_mode = False for dark mode.

```python
from gtfs_segments import view_heatmap

f = view_heatmap(segments_df, cmap="RdBu", light_mode=True)
```

```python
view_heatmap(segments_df, cmap="YlOrRd", interactive=True, light_mode=False)
```

Plot Distributions

```python
from gtfs_segments import plot_hist

plot_hist(segments_df, max_spacing=1200)
```

Optionally save figures using

```python
plot_hist(segments_df, file_path="spacings_hist.png", save_fig=True)
```

Summary Statistics

Get Network Summary Stats

```python
from gtfs_segments import summary_stats

summary_stats(segments_df, max_spacing=3000, export=True, file_path="summary.csv")
```

Get Route Summary Stats

```python
from gtfs_segments import get_route_stats, get_bus_feed

feed = get_bus_feed("path_to_gtfs.zip")
get_route_stats(feed)
```

Here each row contains the following columns:

  1. route: The route_id for the route of interest
  2. direction: The direction_id of the route
  3. route_length: The total length of the route. Units: Kilometers (Km)
  4. total time: The total scheduled time to travel the whole route. Units: Hours (Hr)
  5. headway: The average headway between consecutive buses for the route. A NaN indicates only 1 trip. Units: Hours (Hr)
  6. peak_buses: The 15-minute interval where the route has the maximum number of buses concurrently running.
  7. average_speed: The average speed of the bus along the route. Units: Kmph
  8. n_bus_avg: The average number of buses concurrently running
  9. bus_spacing: The average spacing (in distance) between consecutive buses. Units: Kilometers (Km)
  10. stop_spacing: The average distance between two consecutive stops. Units: Kilometers (Km)
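
When reading these columns side by side, it can help to recall the textbook relations between them under evenly spaced service. The sketch below is an interpretation aid only: the function name is hypothetical, the steady-state assumption is ours, and get_route_stats may compute its columns differently from a real schedule.

```python
def steady_state_route_metrics(route_length_km, total_time_hr, headway_hr):
    """Textbook relations between route metrics under evenly spaced service."""
    # Average speed over one end-to-end run of the route, in km/h.
    average_speed = route_length_km / total_time_hr
    # Buses on the route at once: one departs every headway for a full run time.
    n_buses = total_time_hr / headway_hr
    # Distance between consecutive buses, in km.
    bus_spacing = route_length_km / n_buses
    return average_speed, n_buses, bus_spacing


# Hypothetical route: 12 km long, 45 minutes end to end, a bus every 15 minutes.
speed, n_buses, spacing = steady_state_route_metrics(12.0, 0.75, 0.25)
print(speed, n_buses, spacing)
```

Under these assumptions bus_spacing also equals average_speed times headway, which gives a quick consistency check on a row of the table.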

Download Segments Data

Download the data as either .csv or .geojson.

```python
from gtfs_segments import export_segments

export_segments(segments_df, "filename", output_format="geojson")

# Get csv without geometry
export_segments(segments_df, "filename", output_format="csv", geometry=False)
```

Roadmap

  • [x] Add interactive visualization with folium
  • [x] Log trips that do not have shapes
  • [ ] Visualize catchment areas for stops

See the open issues for a full list of proposed features (and known issues).

License

Distributed under the MIT License. See LICENSE.txt for more information.

Citing gtfs-segments

If you use gtfs-segments in your research please use the following BibTeX entry:

```bibtex
@article{Devunuri_GTFS_Segments_A_2024,
  author = {Devunuri, Saipraneeth and Lehe, Lewis},
  doi = {10.21105/joss.06306},
  journal = {Journal of Open Source Software},
  month = mar,
  number = {95},
  pages = {6306},
  title = {{GTFS Segments: A Fast and Efficient Library to Generate Bus Stop Spacings}},
  url = {https://joss.theoj.org/papers/10.21105/joss.06306},
  volume = {9},
  year = {2024}
}
```

Alternatively, use the "Cite this repository" option.

Citing stop spacings paper

If you use the stop spacings paper in your research, please use the following BibTeX entry:

```bibtex
@article{Devunuri2024,
  title = {Bus Stop Spacing Statistics: {{Theory}} and Evidence},
  shorttitle = {Bus Stop Spacing Statistics},
  author = {Devunuri, Saipraneeth and Lehe, Lewis J. and Qiam, Shirin and Pandey, Ayush and Monzer, Dana},
  year = {2024},
  month = jan,
  journal = {Journal of Public Transportation},
  volume = {26},
  pages = {100083},
  issn = {1077-291X},
  doi = {10.1016/j.jpubtr.2024.100083},
  url = {https://www.sciencedirect.com/science/article/pii/S1077291X24000031},
  urldate = {2024-03-07},
  keywords = {Bus stop,GTFS,Public Transit,Stop Spacings,Transit Planning}
}
```

Citing stop spacings dataset

If you use the stop spacings dataset in your research, please use the following BibTeX entry:

```bibtex
@data{DVN/SFBIVU_2022,
  author = {Devunuri, Saipraneeth and Shirin Qiam and Lewis Lehe},
  publisher = {Harvard Dataverse},
  title = {{Bus Stop Spacings for Transit Providers in the US}},
  UNF = {UNF:6:zUgB0CGrPL27iqhKd/umRA==},
  year = {2022},
  version = {V1},
  doi = {10.7910/DVN/SFBIVU},
  url = {https://doi.org/10.7910/DVN/SFBIVU}
}
```

Contributing

Contributions are what make the open-source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

For more information refer to CONTRIBUTING.md

Contact

Saipraneeth Devunuri - @praneethDevunu1 - sd37@illinois.edu

Project Link: https://github.com/UTEL-UIUC/gtfs_segments

Acknowledgments

  • Parts of the code use the Partridge library
  • Do check out gtfs_functions which was an inspiration for this project
  • Shoutout to Mobility Data for compiling GTFS from around the globe and constantly maintaining them

Owner

  • Name: UTEL-UIUC
  • Login: UTEL-UIUC
  • Kind: organization

JOSS Publication

GTFS Segments: A Fast and Efficient Library to Generate Bus Stop Spacings
Published
March 19, 2024
Volume 9, Issue 95, Page 6306
Authors
Saipraneeth Devunuri ORCID
Department of Civil and Environmental Engineering, University of Illinois Urbana-Champaign
Lewis Lehe ORCID
Department of Civil and Environmental Engineering, University of Illinois Urbana-Champaign
Editor
Olivia Guest ORCID
Tags
GTFS Public Transit Stop Spacings Bus Stops

Citation (CITATION.cff)

cff-version: "1.2.0"
authors:
- family-names: Devunuri
  given-names: Saipraneeth
  orcid: "https://orcid.org/0000-0002-5911-4681"
- family-names: Lehe
  given-names: Lewis
  orcid: "https://orcid.org/0000-0001-8029-1706"
contact:
- family-names: Devunuri
  given-names: Saipraneeth
  orcid: "https://orcid.org/0000-0002-5911-4681"
doi: 10.5281/zenodo.10681151
message: If you use this software, please cite our article in the
  Journal of Open Source Software.
preferred-citation:
  authors:
  - family-names: Devunuri
    given-names: Saipraneeth
    orcid: "https://orcid.org/0000-0002-5911-4681"
  - family-names: Lehe
    given-names: Lewis
    orcid: "https://orcid.org/0000-0001-8029-1706"
  date-published: 2024-03-19
  doi: 10.21105/joss.06306
  issn: 2475-9066
  issue: 95
  journal: Journal of Open Source Software
  publisher:
    name: Open Journals
  start: 6306
  title: "GTFS Segments: A Fast and Efficient Library to Generate Bus
    Stop Spacings"
  type: article
  url: "https://joss.theoj.org/papers/10.21105/joss.06306"
  volume: 9
title: "GTFS Segments: A Fast and Efficient Library to Generate Bus Stop
  Spacings"

GitHub Events

Total
  • Issues event: 4
  • Watch event: 13
  • Delete event: 2
  • Issue comment event: 5
  • Push event: 5
  • Pull request event: 5
  • Create event: 2
Last Year
  • Issues event: 4
  • Watch event: 13
  • Delete event: 2
  • Issue comment event: 5
  • Push event: 5
  • Pull request event: 5
  • Create event: 2

Committers

Last synced: 5 months ago

All Time
  • Total Commits: 204
  • Total Committers: 3
  • Avg Commits per committer: 68.0
  • Development Distribution Score (DDS): 0.029
Past Year
  • Commits: 6
  • Committers: 2
  • Avg Commits per committer: 3.0
  • Development Distribution Score (DDS): 0.333
Top Committers
Name Email Commits
Praneeth Devunuri s****7@i****u 198
dependabot[bot] 4****] 5
lewis500 l****0@g****m 1

Issues and Pull Requests

Last synced: 4 months ago

All Time
  • Total issues: 6
  • Total pull requests: 13
  • Average time to close issues: about 1 month
  • Average time to close pull requests: about 8 hours
  • Total issue authors: 2
  • Total pull request authors: 2
  • Average comments per issue: 1.17
  • Average comments per pull request: 0.15
  • Merged pull requests: 11
  • Bot issues: 0
  • Bot pull requests: 7
Past Year
  • Issues: 4
  • Pull requests: 6
  • Average time to close issues: 3 days
  • Average time to close pull requests: about 5 hours
  • Issue authors: 1
  • Pull request authors: 2
  • Average comments per issue: 1.0
  • Average comments per pull request: 0.0
  • Merged pull requests: 4
  • Bot issues: 0
  • Bot pull requests: 4
Top Authors
Issue Authors
  • araichev (4)
  • praneethd7 (2)
Pull Request Authors
  • praneethd7 (9)
  • dependabot[bot] (8)
Top Labels
Issue Labels
enhancement (1)
Pull Request Labels
dependencies (8) python (2)

Packages

  • Total packages: 3
  • Total downloads:
    • pypi 258 last-month
  • Total dependent packages: 1
    (may contain duplicates)
  • Total dependent repositories: 0
    (may contain duplicates)
  • Total versions: 33
  • Total maintainers: 1
proxy.golang.org: github.com/utel-uiuc/gtfs_segments
  • Versions: 8
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 6.5%
Average: 6.7%
Dependent repos count: 7.0%
Last synced: 4 months ago
proxy.golang.org: github.com/UTEL-UIUC/gtfs_segments
  • Versions: 8
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 6.5%
Average: 6.7%
Dependent repos count: 7.0%
Last synced: 4 months ago
pypi.org: gtfs-segments

GTFS Segments: A fast and efficient library to generate bus stop spacings

  • Homepage: https://github.com/UTEL-UIUC/gtfs_segments
  • Documentation: https://gtfs-segments.readthedocs.io
  • License: MIT License Copyright (c) 2023, Saipraneeth Devunuri, Lewis Lehe Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
  • Latest release: 2.1.7
    published almost 2 years ago
  • Versions: 17
  • Dependent Packages: 1
  • Dependent Repositories: 0
  • Downloads: 258 Last month
Rankings
Dependent packages count: 6.6%
Stargazers count: 16.6%
Average: 19.3%
Downloads: 19.4%
Forks count: 23.2%
Dependent repos count: 30.6%
Maintainers (1)
Last synced: 4 months ago

Dependencies

requirements.txt pypi
  • Shapely ==1.8.2
  • contextily ==1.2.0
  • geopandas ==0.10.2
  • matplotlib ==3.5.1
  • numpy ==1.22.2
  • pandas ==1.4.1
  • partridge ==1.1.1
  • requests ==2.27.1
  • scipy ==1.8.0
  • setuptools ==62.4.0
  • utm ==0.7.0
.github/workflows/draft-pdf.yml actions
  • actions/checkout v3 composite
  • actions/upload-artifact v1 composite
  • openjournals/openjournals-draft-action master composite
.github/workflows/python-package.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v3 composite
.github/workflows/python-publish.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v3 composite
  • pypa/gh-action-pypi-publish 27b31702a0e7fc50959f5ad993c78deac1bdfc29 composite
docs/requirements.txt pypi
  • mkdocs *
  • mkdocs-jupyter *
  • mkdocs-material *
  • mkdocstrings *
  • mkdocstrings-python *
setup.py pypi
pyproject.toml pypi
  • charset_normalizer *
  • contextily *
  • faust-cchardet *
  • geopandas *
  • isoweek *
  • matplotlib *
  • numpy *
  • pandas *
  • requests *
  • scipy *
  • shapely *
  • utm *