infostop

Python package for detecting stops in trajectory data

https://github.com/ulfaslak/infostop

Science Score: 64.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Committers with academic emails
    2 of 8 committers (25.0%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.8%) to scientific vocabulary
Last synced: 6 months ago

Repository

Python package for detecting stops in trajectory data

Basic Info
  • Host: GitHub
  • Owner: ulfaslak
  • License: other
  • Language: Python
  • Default Branch: master
  • Homepage:
  • Size: 10.3 MB
Statistics
  • Stars: 64
  • Watchers: 6
  • Forks: 10
  • Open Issues: 12
  • Releases: 0
Created almost 7 years ago · Last pushed about 1 year ago
Metadata Files
Readme License Citation

README.md

Infostop

Python package for detecting stop locations in mobility data

This package implements the algorithm described in https://arxiv.org/pdf/2003.14370.pdf, for detecting stop locations in time-ordered location data.

Infostop is useful to anyone who wishes to detect stationary events in location coordinate streams. It is, thus, a framework to simplify dense and rich location time-series into sequences of events.

Usage

Given a location trace such as:

```Python
>>> data
array([[ 55.75259295, 12.34353885 ],
       [ 55.7525908 , 12.34353145 ],
       [ 55.7525876 , 12.3435386  ],
       ...,
       [ 63.40379175, 10.40477095 ],
       [ 63.4037841 , 10.40480265 ],
       [ 63.403787  , 10.4047871  ]])
```

Or with time information

```Python
>>> data
array([[ 55.75259295, 12.34353885, 1581401760 ],
       [ 55.7525908 , 12.34353145, 1581402760 ],
       [ 55.7525876 , 12.3435386 , 1581403760 ],
       ...,
       [ 63.40379175, 10.40477095, 1583401760 ],
       [ 63.4037841 , 10.40480265, 1583402760 ],
       [ 63.403787  , 10.4047871 , 1583403760 ]])
```

A stop location solution can be obtained using:

```Python
from infostop import Infostop

model = Infostop()
labels = model.fit_predict(data)
```

Alternatively, data can be a list of numpy arrays, in which case the list elements are assumed to be separate traces in the same space. In this multi-segment (or multi-user) case, Infostop finds stop locations that are shared by different segments.
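The labels returned by fit_predict can be collapsed into a sequence of stop events, which is the simplification described above. The sketch below is not part of Infostop: labels_to_intervals is a hypothetical helper, and it assumes one integer label per input row, with -1 marking non-stationary points. Check the label convention of your installed version before relying on it.

```python
def labels_to_intervals(labels, timestamps, trip_label=-1):
    """Collapse per-point labels into (label, start_time, end_time) events.

    Hypothetical helper, not an Infostop function. Assumes `trip_label`
    marks points that do not belong to any stop location.
    """
    intervals = []
    start = 0
    for i in range(1, len(labels) + 1):
        # Close the current run when the label changes or the data ends.
        if i == len(labels) or labels[i] != labels[start]:
            if labels[start] != trip_label:
                intervals.append(
                    (labels[start], timestamps[start], timestamps[i - 1])
                )
            start = i
    return intervals


# Made-up labels and Unix timestamps, spaced 1000 s apart for illustration.
labels = [0, 0, 0, -1, -1, 1, 1, 0]
timestamps = [1581401760 + 1000 * k for k in range(len(labels))]
events = labels_to_intervals(labels, timestamps)
```

Each tuple in events describes one visit: the stop location label and the first and last timestamp observed there. Revisits to the same location show up as separate tuples with the same label.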

Solutions can be plotted using:

```Python
from infostop import plotmap

folmap = plotmap(model)
folmap.m
```

Plotting this onto a map:

[Image: map of detected stop locations]

Advantages

  • Simplicity: At its core, the method works in two steps: (1) reducing the location trace to the medians of each stationary event, and (2) embedding the resulting locations into a network that connects locations within a user-defined distance of each other, then clustering that network.
  • Multi-trace support: Currently, no other libraries support clustering multiple traces at once to find global stop locations. Infostop does. The image above visualizes stop locations at a campus for a population of almost 1000 university students.
  • Flow based: Spatial clusters correspond to collections of location points that contain large amounts of flow when represented as a network. This enables the recovery of locations where traces slightly overlap.
  • Speed: First, the point space is reduced to the medians of stationary points (executed in a fast C++ module). Then spatially neighboring points are connected using a ball tree search, and finally the network is clustered using the C++ based Infomap program. For example, clustering 100,000 location points takes about a second.
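The two steps listed above can be illustrated with a minimal pure-Python sketch. It is not Infostop's implementation: the function names, the r1, r2 and min_size parameters, and the grouping rule (distance to the first point of the current group) are assumptions made for illustration. Step 2 uses a brute-force neighbor search and plain connected components as a simplified stand-in for the ball tree search and Infomap clustering that the package uses.

```python
import math
from statistics import median


def haversine_m(p, q):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000.0 * math.asin(math.sqrt(a))


def stationary_medians(trace, r1=10.0, min_size=2):
    """Step 1 (sketch): group consecutive points that stay within r1 meters
    of the group's first point, then reduce each group to its median."""
    groups, current = [], [trace[0]]
    for point in trace[1:]:
        if haversine_m(current[0], point) <= r1:
            current.append(point)
        else:
            groups.append(current)
            current = [point]
    groups.append(current)
    # Groups smaller than min_size are treated as movement and dropped.
    return [
        (median(p[0] for p in g), median(p[1] for p in g))
        for g in groups
        if len(g) >= min_size
    ]


def link_and_cluster(points, r2=10.0):
    """Step 2 (simplified): link points within r2 meters and label the
    connected components. This replaces Infomap with a simpler rule."""
    labels = [-1] * len(points)
    next_label = 0
    for start in range(len(points)):
        if labels[start] != -1:
            continue
        labels[start] = next_label
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(len(points)):
                if labels[j] == -1 and haversine_m(points[i], points[j]) <= r2:
                    labels[j] = next_label
                    stack.append(j)
        next_label += 1
    return labels


# Made-up trace: a stay, one point in transit, a second stay, a return visit.
trace = [
    (55.75259295, 12.34353885), (55.7525908, 12.34353145),
    (55.7525876, 12.3435386), (60.0, 11.0),
    (63.40379175, 10.40477095), (63.4037841, 10.40480265),
    (63.403787, 10.4047871), (55.75259, 12.343538),
    (55.7525905, 12.343537),
]
medians = stationary_medians(trace)
stop_labels = link_and_cluster(medians)
```

Here the trace reduces to three stationary events, and the first and third events receive the same label because their medians lie within r2 meters of each other. Connected components merge any chain of nearby points, whereas a flow-based method such as Infomap can separate adjacent places that are only weakly linked.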

Installation

pip install infostop

Development notes

We welcome contributions. Before you get started, you may want to read the notes below.

You should create a virtual environment. In your local infostop folder, do:

    $ make env

Install infostop into your virtual environment. Do this by running:

    (env) $ make install

This command will also delete any pre-existing installation of Infostop, so you will probably want to run it after each code update.

Run tests:

    (env) $ make test

Check test coverage:

    (env) $ make coverage
    (env) $ cd htmlcov
    (env) $ python -m http.server 8001

Then go to localhost:8001 in your browser to look at the coverage report.

Format code with black. We don't want to argue about code formatting. Please run black to apply standard formatting to your code before you make a pull request.

The Makefile implements a number of commands that are useful during development. Go ahead and execute make help to see descriptions of available commands, or inspect the file so you understand what's going on.

Convenient: create an ipykernel for the virtual environment. If you use Jupyter notebooks, you can install the virtual environment into Jupyter as a kernel. Run:

    (env) $ pip install ipykernel
    (env) $ python -m ipykernel install --user --name=infostop_env

This lets you select the virtual environment as a kernel in a Jupyter notebook.

Versioning and deployment to PyPI. If your update should trigger a version increment and package rerelease, please execute the increment_version.py script ONCE and tag your final commit. After committing, tag the commit by running something like:

    (env) $ git tag -a v1.0.11 -m "Infostop version 1.0.11"

Finally, push first the tags and then your commits:

    (env) $ git push --tags && git push

When merging a PR with a tagged commit, the PyPI deployment action is triggered, and the new version of Infostop becomes publicly available shortly thereafter.

Citation (CITATION.cff)

cff-version: 1.2.0
title: Infostop
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Ulf
    family-names: Aslak
    email: ulfaslak@gmail.com
  - given-names: Laura
    family-names: Alessandretti
    email: lauale@dtu.dk
    affiliation: Technical University of Denmark
identifiers:
  - type: doi
    value: 10.48550/arXiv.2003.14370
repository-code: 'https://github.com/ulfaslak/infostop'
abstract: >-
  Data-driven research in mobility has prospered in recent
  years, providing solutions to real-world challenges
  including forecasting epidemics and planning
  transportation. These advancements were facilitated by
  computational tools enabling the analysis of large-scale
  data-sets of digital traces. One of the challenges when
  pre-processing spatial trajectories is the so-called stop
  location detection, that entails the reduction of raw time
  series to sequences of destinations where an individual
  was stationary. The most widely adopted solution to this
  problem was proposed by Hariharan and Toyama (2004) and
  involves filtering out non-stationary measurements, then
  applying agglomerative clustering on the stationary
  points. This state-of-the-art solution, however, suffers
  of two limitations: (i) frequently visited places located
  very close (such as adjacent buildings) are likely to be
  merged into a unique location, due to inherent measurement
  noise, (ii) traces for multiple users can not be analysed
  simultaneously, thus the definition of destination is not
  shared across users. In this paper, we describe the
  Infostop algorithm that overcomes the limitations of the
  state-of-the-art solution by leveraging the flow-based
  network community detection algorithm Infomap. We test
  Infostop for a population of ∼1000 individuals with highly
  overlapping mobility. We show that the size of locations
  detected by Infostop saturates for increasing number of
  users and that time complexity grows slower than for
  previous solutions. We demonstrate that Infostop can be
  used to easily infer social meetings. Finally, we provide
  an open-source implementation of Infostop, written in
  Python and C++, that has a simple API and can be used both
  for labeling time-ordered coordinate sequences (GPS or
  otherwise), and unordered sets of spatial points.
preferred-citation:
  type: article
  authors:
  - family-names: "Aslak"
    given-names: "Ulf"
    orcid: "https://orcid.org/0000-0003-4704-3609"
  - family-names: "Alessandretti"
    given-names: "Laura"
    orcid: "https://orcid.org/0000-0001-6003-1165"
  journal: "arXiv preprint"
  doi: 10.48550/arXiv.2003.14370
  title: "Infostop: scalable stop-location detection in multi-user mobility data"
  year: 2020
  arxiv: "2003.14370"

GitHub Events

Total
  • Issues event: 2
  • Watch event: 2
  • Issue comment event: 3
  • Push event: 1
  • Pull request event: 2
  • Fork event: 2
Last Year
  • Issues event: 2
  • Watch event: 2
  • Issue comment event: 3
  • Push event: 1
  • Pull request event: 2
  • Fork event: 2

Committers

Last synced: about 1 year ago

All Time
  • Total Commits: 84
  • Total Committers: 8
  • Avg Commits per committer: 10.5
  • Development Distribution Score (DDS): 0.298
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
| Name | Email | Commits |
| --- | --- | --- |
| ulfaslak | u****n@g****m | 59 |
| Ulf | u****k@U****l | 7 |
| Alessandretti | 8****7@s****t | 6 |
| Ulf Aslak Jensen | u****e@h****k | 5 |
| Endre Mark Borza | e****a@g****m | 2 |
| lalessan | l****n | 2 |
| Laura Alessandretti | l****e@h****k | 2 |
| Ulf | u****k@u****e | 1 |
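The all-time figures above can be reproduced from the per-committer commit counts. The sketch below assumes the Development Distribution Score is defined as one minus the top committer's share of all commits. That definition is an assumption here, since this page does not state the formula.

```python
# Commit counts per committer, taken from the Top Committers listing.
commits = [59, 7, 6, 5, 2, 2, 2, 1]

total_commits = sum(commits)
avg_per_committer = total_commits / len(commits)

# Assumed definition: DDS = 1 - (commits by top committer / total commits).
dds = 1 - max(commits) / total_commits

print(total_commits, avg_per_committer, round(dds, 3))
```

Under that assumption the result matches the reported totals: 84 commits, 10.5 commits per committer, and a DDS of 0.298.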
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 19
  • Total pull requests: 10
  • Average time to close issues: about 1 month
  • Average time to close pull requests: about 24 hours
  • Total issue authors: 16
  • Total pull request authors: 6
  • Average comments per issue: 1.42
  • Average comments per pull request: 1.5
  • Merged pull requests: 8
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 2
  • Pull requests: 2
  • Average time to close issues: N/A
  • Average time to close pull requests: about 8 hours
  • Issue authors: 1
  • Pull request authors: 1
  • Average comments per issue: 0.0
  • Average comments per pull request: 1.5
  • Merged pull requests: 2
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • lalessan (2)
  • charlesdaigle (2)
  • katrinabrock (2)
  • carlomarxdk (1)
  • lcandeago (1)
  • sp794uk (1)
  • alexandradec (1)
  • jinzhuyu (1)
  • dkori (1)
  • ulfaslak (1)
  • NA-Dev (1)
  • janpiskur (1)
  • jGaboardi (1)
  • VRehnberg (1)
  • michelegirolami (1)
Pull Request Authors
  • lalessan (3)
  • urosjarc (2)
  • endremborza (2)
  • VRehnberg (1)
  • ulfaslak (1)
  • michielbakker (1)
Top Labels
Issue Labels
  • enhancement (2)
  • feature request (1)
Pull Request Labels

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 177 last-month
  • Total dependent packages: 1
  • Total dependent repositories: 2
  • Total versions: 20
  • Total maintainers: 1
pypi.org: infostop

Temporospatial clustering in Python. Well suited for mobility data.

  • Versions: 20
  • Dependent Packages: 1
  • Dependent Repositories: 2
  • Downloads: 177 Last month
Rankings
Dependent packages count: 4.7%
Stargazers count: 9.0%
Dependent repos count: 11.6%
Average: 12.5%
Forks count: 12.6%
Downloads: 24.5%
Maintainers (1)
Last synced: 7 months ago

Dependencies

docs/requirements.txt pypi
  • pip >=18.0
requirements.txt pypi
  • folium >=0.7.0
  • infomap ==1.0.6
  • numpy *
  • pybind11 >=2.4
  • scikit-learn *
  • tqdm *
setup.py pypi
  • folium >=0.7.0
  • infomap ==1.0.6
  • numpy *
  • pybind11 >=2.4
  • scikit-learn *
  • tqdm *
.github/workflows/deploy.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
  • pypa/gh-action-pypi-publish release/v1 composite