niimpy

Python module for analysis of behavioral data

https://github.com/digitraceslab/niimpy

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (16.6%) to scientific vocabulary
Last synced: 7 months ago

Repository

Python module for analysis of behavioral data

Basic Info
Statistics
  • Stars: 14
  • Watchers: 4
  • Forks: 8
  • Open Issues: 9
  • Releases: 4
Created almost 8 years ago · Last pushed 11 months ago
Metadata Files
Readme License Citation

README.md

Niimpy


What

Niimpy is a Python package for analyzing and quantifying behavioral data. It uses pandas to read data from disk and perform basic manipulations, provides exploratory data analysis functions, offers many high-level preprocessing functions for various types of data, and has functions for behavioral data analysis.

For Who

Niimpy is intended for researchers and data scientists analyzing digital behavioral data. Its purpose is to facilitate data analysis by providing a standardized, replicable workflow.

Why

Digital behavioral studies using personal digital devices typically produce rich, multi-sensor, longitudinal datasets of mixed data types. Analyzing such data requires domain knowledge in multiple fields, programming expertise, and software designed for the purpose. Currently, no standardized workflow or tools exist to analyze such datasets. The Niimpy package is specifically designed to analyze longitudinal, multimodal behavioral data. Niimpy is a user-friendly open-source package that can be easily expanded and adapted to specific research requirements. The toolbox facilitates the analysis phase by providing tools for data management, preprocessing, feature extraction, and visualization. More advanced analysis methods will be incorporated into the toolbox in the future.

How

The toolbox is divided into four layers by functionality: 1) reading, 2) preprocessing, 3) exploration, and 4) analysis. For more information about the layers, refer to the toolbox architecture chapter (:doc:architecture). The quickstart guide (:doc:quick-start) is a good place to start. More detailed demo Jupyter notebooks are provided in the user guide chapter (:doc:demo_notebooks/Exploration). Instructions for individual functions can be found in the API chapter (:doc:api/niimpy).

Installation

  • Only supports Python 3 (tested on 3.8 and above)

  • Niimpy installs like any other Python package:

pip install niimpy

  • It can also be installed manually:

pip install https://github.com/digitraceslab/niimpy/archive/master.zip
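
After installing, a quick check along the following lines can confirm that the interpreter meets the Python 3.8 minimum mentioned above and report which niimpy version, if any, is present. This is a minimal sketch using only the standard library; the helper name `check_environment` is made up for this example and is not part of niimpy.

```python
import sys
from importlib import metadata


def check_environment(package="niimpy", min_python=(3, 8)):
    """Return (python_ok, installed_version_or_None) for the given package."""
    python_ok = sys.version_info >= min_python
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        version = None  # the package is not installed in this environment
    return python_ok, version


python_ok, version = check_environment()
print("Python version supported:", python_ok)
print("niimpy version:", version or "not installed")
```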

Getting started with location data

All of the functions for reading, preprocessing, and feature extraction for location data are in location.py. Currently implemented features are:

  • dist_total: total distance a person traveled, in meters.
  • variance, log_variance: variance is defined as the sum of the variances of the latitudes and longitudes.
  • speed_average, speed_variance, and speed_max: statistics of speed (m/s). Speed, if not given, can be calculated by dividing the distance between two consecutive bins by their time difference.
  • n_bins: number of location bins that a user recorded in the dataset.
  • n_static: number of static points. Static points are defined as bins whose speed is lower than a threshold.
  • n_moving: number of moving points. Equivalent to n_bins - n_static.
  • n_home: number of static bins which are close to the person's home. Home is defined as the place most visited during nights. More formally, all the locations recorded between 12 AM and 6 AM are clustered, and the center of the largest cluster is assumed to be home.
  • max_dist_home: maximum distance from home.
  • n_sps: number of significant places. All of the static bins are clustered using the DBSCAN algorithm. Each cluster represents a Significant Place (SP) for a user.
  • n_rare: number of rarely visited places (referred to as outliers in DBSCAN).
  • n_transitions: number of transitions between significant places.
  • n_top1, n_top2, n_top3, n_top4, n_top5: number of bins in the Nth most visited cluster. In other words, n_top1 shows the number of times the person has visited the most frequently visited place.
  • entropy, normalized_entropy: entropy of time spent in clusters. Normalized entropy is the entropy divided by the number of clusters.
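
To make a few of these definitions concrete, here is a minimal standard-library sketch of dist_total, speed_max, n_static, n_moving, and the two entropy features. It is an illustration of the definitions in the list, not the code in location.py. The function names, the haversine distance, the choice to give the first bin a speed of 0, and the threshold value used in the example are all assumptions. Entropy is computed from bin counts, which equals time spent only when the bins are equally long.

```python
import math
from collections import Counter

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def movement_features(bins, speed_threshold):
    """bins: non-empty, time-sorted list of (unix_seconds, latitude, longitude)."""
    if not bins:
        raise ValueError("at least one location bin is required")
    dists = [
        haversine_m(la0, lo0, la1, lo1)
        for (_, la0, lo0), (_, la1, lo1) in zip(bins, bins[1:])
    ]
    # Assumption: the first bin has no predecessor, so its speed is set to 0.
    speeds = [0.0] + [
        d / (t1 - t0) for d, (t0, _, _), (t1, _, _) in zip(dists, bins, bins[1:])
    ]
    n_static = sum(s < speed_threshold for s in speeds)
    return {
        "dist_total": sum(dists),
        "speed_max": max(speeds),
        "n_bins": len(bins),
        "n_static": n_static,
        "n_moving": len(bins) - n_static,
    }


def place_entropy(labels):
    """labels: non-empty list with one cluster label per bin.

    Returns (entropy, normalized_entropy), where the normalized value is the
    entropy divided by the number of clusters, as described in the list above.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    entropy = sum((c / total) * math.log(total / c) for c in counts.values())
    return entropy, entropy / len(counts)


# Three 10-minute bins on the equator; the person moves only in the last step.
example_bins = [(0, 0.0, 0.0), (600, 0.0, 0.0), (1200, 0.0, 0.01)]
print(movement_features(example_bins, speed_threshold=0.5))
print(place_entropy([0, 0, 1, 1]))
```

The speed threshold is a required argument here because the list above does not state which value is used.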

Usage:

```python
import pandas as pd
import niimpy
import niimpy.location as nilo

CONTROL_PATH = "PATH/TO/CONTROL/DATA"
PATIENT_PATH = "PATH/TO/PATIENT/DATA"

# Read data of control and patients from database
location_control = niimpy.read_sqlite(CONTROL_PATH, table='AwareLocation', add_group='control', tz='Europe/Helsinki')
location_patient = niimpy.read_sqlite(PATIENT_PATH, table='AwareLocation', add_group='patient', tz='Europe/Helsinki')

# Concatenate the two dataframes to have one dataframe
location = pd.concat([location_control, location_patient])

# Remove low-quality and outlier locations
location = nilo.filter_location(location)

# Downsample locations (median filter). Bin size is 10 minutes.
location = niimpy.util.aggregate(location, freq='10min', method_numerical='median')
location = location.reset_index(0).dropna()

# Feature extraction
features = nilo.extract_features(
    lats=location['latitude'],
    lons=location['longitude'],
    users=location['user'],
    groups=location['group'],
    times=location.index,
    speeds=location['speed']
)
```
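
The downsampling step in the usage example replaces all samples within each 10-minute bin with their median. The sketch below illustrates that idea for a single user using only the standard library. It is not how `niimpy.util.aggregate` is implemented, and the function name `median_downsample` and the sample coordinates are invented for the example.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone
from statistics import median


def median_downsample(samples, bin_minutes=10):
    """samples: iterable of (timezone-aware datetime, latitude, longitude).

    Returns a time-sorted list of (bin_start, median_latitude, median_longitude).
    """
    bin_seconds = bin_minutes * 60
    buckets = defaultdict(list)
    for ts, lat, lon in samples:
        epoch = ts.timestamp()
        bin_start = epoch - (epoch % bin_seconds)  # floor to the bin boundary
        buckets[bin_start].append((lat, lon))
    result = []
    for bin_start in sorted(buckets):
        lats, lons = zip(*buckets[bin_start])
        start = datetime.fromtimestamp(bin_start, tz=timezone.utc)
        result.append((start, median(lats), median(lons)))
    return result


t0 = datetime(2023, 5, 22, 12, 0, tzinfo=timezone.utc)
samples = [
    (t0 + timedelta(minutes=1), 60.10, 24.90),
    (t0 + timedelta(minutes=4), 60.30, 24.70),
    (t0 + timedelta(minutes=9), 60.20, 24.80),
    (t0 + timedelta(minutes=12), 60.50, 24.60),
]
for row in median_downsample(samples):
    print(row)
```

The first three samples fall into the 12:00 bin and the last into the 12:10 bin, so four raw points become two. A median is less sensitive than a mean to a single bad GPS fix inside a bin.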

Documentation

Niimpy documentation is hosted at https://digitraceslab.github.io/niimpy/.

Development

This is a pretty typical Python project with code and documentation as you might expect.

requirements-dev.txt contains some basic dev requirements, including an editable dev install of niimpy itself (pip install -e).

Run tests with: pytest .

Documentation is built with Sphinx:

```
cd docs
make html
# output in _build/html/
```

Enable nbdime Jupyter notebook diff and merge via git with: nbdime config-git --enable

See also

  • To learn about pandas, see its documentation. It is not the most clearly written documentation you will find, but you should try starting with the "Package overview" and "10 minutes to pandas" sections.

  • Matplotlib is the standard Python plotting package, but Seaborn will produce nicer graphics by default. Hint: look for examples and copy them.

Owner

  • Name: Digital Traces Lab
  • Login: digitraceslab
  • Kind: organization

A project for collecting, processing, and analyzing data from wearable and consumer devices for measuring behavioral and physiological patterns

Citation (CITATION.cff)

cff-version: 1.2.0
title: 'Niimpy: Python module for analysis of behavioral data'
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Arsi
    family-names: Ikäheimonen
    email: arsi.ikaheimonen@aalto.fi
    affiliation: Aalto University
    orcid: 'https://orcid.org/0000-0002-1617-6911'
  - given-names: Ana Maria
    family-names: Hoys
    email: ana.trianahoyos@aalto.fi
    affiliation: Aalto University
  - given-names: Talayeh
    family-names: Aledavood
    email: talayeh.aledavood@aalto.fi
    affiliation: Aalto University
  - given-names: Nguyen
    family-names: Luong
    email: nguyen.luong@aalto.fi
    affiliation: Aalto University
  - given-names: Amirmohammad
    family-names: Ziaei
    email: amirmohammad.ziaeibideh@aalto.fi
    affiliation: Aalto University
  - given-names: 'Jarno '
    family-names: Rantaharju
    email: jarno.rantaharju@aalto.fi
    affiliation: Aalto University
  - given-names: Richard
    family-names: Darst
    email: richard.darst@aalto.fi
    affiliation: Aalto University
repository-code: 'https://github.com/digitraceslab/niimpy'
keywords:
  - >-
    Data analysis toolbox, Digital Behavioral Studies,
    Mobile Sensing, Python Package
license: CC-BY-4.0
version: 1.1.0
date-released: '2023-05-22'
preferred-citation:
  type: article
  title: 'Niimpy: Python module for analysis of behavioral data'
  authors:
  - given-names: Arsi
    family-names: Ikäheimonen
    email: arsi.ikaheimonen@aalto.fi
    affiliation: Aalto University
    orcid: 'https://orcid.org/0000-0002-1617-6911'
  - given-names: Ana Maria
    family-names: Hoys
    email: ana.trianahoyos@aalto.fi
    affiliation: Aalto University
  - given-names: Talayeh
    family-names: Aledavood
    email: talayeh.aledavood@aalto.fi
    affiliation: Aalto University
  - given-names: Nguyen
    family-names: Luong
    email: nguyen.luong@aalto.fi
    affiliation: Aalto University
  - given-names: Amirmohammad
    family-names: Ziaei
    email: amirmohammad.ziaeibideh@aalto.fi
    affiliation: Aalto University
  - given-names: 'Jarno '
    family-names: Rantaharju
    email: jarno.rantaharju@aalto.fi
    affiliation: Aalto University
  - given-names: Richard
    family-names: Darst
    email: richard.darst@aalto.fi
    affiliation: Aalto University
  abstract: >-
    Niimpy is a Python package for analyzing and quantifying
    behavioral data. It uses pandas to read data from disk
    and perform basic manipulations, provides exploratory data
    analysis functions, offers many high-level preprocessing
    functions for various types of data, and has functions for
    behavioral data analysis.
  doi: "10.1016/j.softx.2023.101472"
  journal: "SoftwareX"
  start: 101472 # First page number
  end: 101472 # Last page number
  title: "Niimpy: A toolbox for behavioral data analysis"
  volume: 23
  year: 2023
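
For LaTeX users, the preferred-citation fields above map naturally onto a BibTeX entry. The sketch below builds one from the title, journal, volume, article number, year, and DOI given in the file. The helper `to_bibtex`, the citation key `niimpy2023`, and the abbreviated author field ("and others") are choices made for this example and do not come from CITATION.cff.

```python
def to_bibtex(key, entry_type, fields):
    """Format a dict of fields as a BibTeX entry string."""
    lines = [f"@{entry_type}{{{key},"]
    for name, value in fields.items():
        lines.append(f"  {name} = {{{value}}},")
    lines.append("}")
    return "\n".join(lines)


entry = to_bibtex(
    "niimpy2023",  # hypothetical citation key
    "article",
    {
        "title": "Niimpy: A toolbox for behavioral data analysis",
        "author": "Ikäheimonen, Arsi and others",  # abbreviated on purpose
        "journal": "SoftwareX",
        "volume": 23,
        "pages": 101472,
        "year": 2023,
        "doi": "10.1016/j.softx.2023.101472",
    },
)
print(entry)
```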

GitHub Events

Total
  • Create event: 4
  • Release event: 1
  • Issues event: 17
  • Watch event: 3
  • Issue comment event: 15
  • Push event: 50
  • Pull request event: 10
Last Year
  • Create event: 4
  • Release event: 1
  • Issues event: 17
  • Watch event: 3
  • Issue comment event: 15
  • Push event: 50
  • Pull request event: 10

Committers

Last synced: about 3 years ago

All Time
  • Total Commits: 581
  • Total Committers: 18
  • Avg Commits per committer: 32.278
  • Development Distribution Score (DDS): 0.678
Top Committers
Name Email Commits
Arsi Ikäheimonen 6****n@u****m 187
Richard Darst r****d@z****t 146
AnaTomomi a****s@a****i 64
Your Name l****4@g****m 62
Rantaharju Jarno j****u@a****i 53
Ziaei Bideh Amirmohammad a****h@a****i 25
Nguyen Luong l****n@u****m 13
AmirMohammad Ziaei a****a@o****m 7
Aledavood Talayeh a****1@p****i 5
Aledavood Talayeh a****1@p****i 4
Arsi Ikäheimonen a****n@a****i 4
arc1909 7****9@u****m 3
talayeha p****4@y****m 3
Aledavood Talayeh a****1@p****i 1
Jarno Rantaharju r****r@g****m 1
Triana Hoyos Ana t****1@l****i 1
Jarno Rantaharju j****u@h****i 1
Triana Hoyos Ana t****1@l****i 1
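
The summary figures under "All Time" can be reproduced from the commit counts in the table above. The sketch below assumes that the Development Distribution Score is defined as one minus the top committer's share of all commits. That definition is an assumption, although it matches the reported value. Committers are counted per listed identity, so a person who appears under several names or emails is counted more than once.

```python
# Commit counts copied from the "Top Committers" table, in listed order.
commits = [187, 146, 64, 62, 53, 25, 13, 7, 5, 4, 4, 3, 3, 1, 1, 1, 1, 1]

total_commits = sum(commits)
total_committers = len(commits)
avg_commits = total_commits / total_committers

# Assumed definition: DDS = 1 - (commits by the top committer / total commits).
dds = 1 - max(commits) / total_commits

print(total_commits, total_committers)  # 581 18
print(round(avg_commits, 3))            # 32.278
print(round(dds, 3))                    # 0.678
```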

Issues and Pull Requests

Last synced: 7 months ago

All Time
  • Total issues: 100
  • Total pull requests: 76
  • Average time to close issues: 6 months
  • Average time to close pull requests: 16 days
  • Total issue authors: 10
  • Total pull request authors: 5
  • Average comments per issue: 1.08
  • Average comments per pull request: 0.66
  • Merged pull requests: 73
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 15
  • Pull requests: 14
  • Average time to close issues: 3 months
  • Average time to close pull requests: about 1 month
  • Issue authors: 3
  • Pull request authors: 1
  • Average comments per issue: 1.33
  • Average comments per pull request: 0.21
  • Merged pull requests: 12
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • rantahar (46)
  • ArgonSilicon (35)
  • amirzia (5)
  • AnaTomomi (4)
  • rkdarst (4)
  • talayeha (3)
  • dado93 (2)
  • AlirezaT99 (1)
  • arc1909 (1)
  • lnknguyen (1)
Pull Request Authors
  • rantahar (60)
  • ArgonSilicon (18)
  • AnaTomomi (7)
  • lnknguyen (4)
  • rkdarst (3)
Top Labels
Issue Labels
  • enhancement (16)
  • v2.0 (6)
  • question (5)
  • First Version (4)
  • v1.2 (3)
  • discuss (3)
  • documentation (3)
  • good first issue (2)
  • bug (2)
  • invalid (1)
  • help wanted (1)
  • duplicate (1)
  • wontfix (1)
Pull Request Labels

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 48 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 1
  • Total versions: 6
  • Total maintainers: 2
pypi.org: niimpy

Python module for analysis of behavioral data

  • Versions: 6
  • Dependent Packages: 0
  • Dependent Repositories: 1
  • Downloads: 48 Last month
Rankings
Dependent packages count: 10.1%
Forks count: 12.5%
Stargazers count: 16.1%
Dependent repos count: 21.5%
Average: 21.8%
Downloads: 48.7%
Maintainers (2)
Last synced: 8 months ago

Dependencies

requirements-dev.txt pypi
  • geopy * development
  • ipykernel * development
  • nbdime * development
  • nbsphinx * development
  • numpydoc * development
  • pytest * development
  • sphinx * development
  • sphinx_rtd_theme * development
requirements.txt pypi
  • coverage *
  • matplotlib *
  • numpy *
  • pandas *
  • plotly *
  • python-dateutil *
  • scikit-learn *
  • seaborn *
  • sklearn *
.github/workflows/codecov.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
  • codecov/codecov-action v2 composite
.github/workflows/docs.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
.github/workflows/install.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
.github/workflows/test.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite