repytah

repytah: An Open-Source Python Package for Building Aligned Hierarchies for Sequential Data - Published in JOSS (2023)

https://github.com/smith-tinkerlab/repytah

Science Score: 95.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 4 DOI reference(s) in README and JOSS metadata
  • Academic publication links
    Links to: zenodo.org
  • Committers with academic emails
    7 of 17 committers (41.2%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
    Published in Journal of Open Source Software

Keywords

mir music-information-retrieval

Scientific Fields

Mathematics Computer Science - 34% confidence
Last synced: 6 months ago

Repository

repytah is a python package that builds aligned hierarchies for sequential data streams where repetitions have meaning (like music)

Statistics
  • Stars: 2
  • Watchers: 1
  • Forks: 1
  • Open Issues: 2
  • Releases: 3
Topics
mir music-information-retrieval
Created over 6 years ago · Last pushed about 2 years ago
Metadata Files
Readme Contributing License Code of conduct Authors

README.md

repytah

A Python package that builds aligned hierarchies for sequential data streams.

Documentation

See our website for a complete reference manual and introductory tutorials.

The example tutorial walks through using the package from start to finish.

Summary

We introduce repytah, a Python package that constructs the aligned hierarchies representation, which contains all possible structure-based hierarchical decompositions of a finite-length piece of sequential data, aligned on a common time axis. This representation, introduced by Kinnaird (ISMIR 2016) with music-based data (such as musical recordings or scores) as the primary motivation, is intended for sequential data in which repetitions carry particular meaning (such as a verse, chorus, motif, or theme). Although the original motivation was finding structure in music-based data streams, nothing in the construction of aligned hierarchies limits repytah to music-based sequential data.

The repytah package builds these aligned hierarchies by first extracting repeated structures (of all meaningful lengths) from the self-dissimilarity matrix (SDM) for a piece of sequential data. repytah intentionally uses the SDM as its starting point: an SDM cannot be reverse-engineered back to the original signal, which allows researchers to collaborate on signals that are protected by copyright or subject to privacy considerations. This package is a Python translation of Kinnaird's original MATLAB code, with additional documentation and updates that leverage efficiencies in Python.
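
To make the SDM starting point concrete, here is a minimal sketch that builds one from a toy sequence of feature vectors. It uses cosine distance and plain Python lists purely for illustration; the function names, the toy data, and the choice of distance are assumptions made for this sketch, not repytah's own API or defaults.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def self_dissimilarity_matrix(features):
    """Compare every time step with every other time step."""
    n = len(features)
    return [
        [cosine_distance(features[i], features[j]) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical toy sequence with the pattern A B A B.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
sdm = self_dissimilarity_matrix(features)

for row in sdm:
    print(row)
```

Entries near zero away from the main diagonal mark time steps that resemble each other, which is the signal that repeat extraction looks for. Only these pairwise distances are kept, so the matrix can be shared without sharing the feature vectors themselves.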

Problems Addressed

Sequential data streams often have repeated elements that build on each other, creating hierarchies. Therefore, the goal of repytah is to extract these repetitions and their relationships to each other in order to form aligned hierarchies.
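
As a rough illustration of what extracting repetitions can look like, the sketch below scans the diagonals of an SDM for runs of near-zero entries and reports each run as a pair of matching segments. This is a simplified stand-in written for this page, not the algorithm repytah implements; the function name, the `threshold` and `min_length` parameters, and the toy label sequence are all hypothetical.

```python
def find_diagonal_repeats(sdm, threshold=0.05, min_length=2):
    """Return (first_start, second_start, length) for each diagonal run
    of entries at or below the threshold."""
    n = len(sdm)
    repeats = []
    for offset in range(1, n):  # diagonals above the main diagonal
        run_start = None
        # The extra final step (i == n - offset) closes a run that
        # reaches the end of the diagonal.
        for i in range(n - offset + 1):
            is_match = i < n - offset and sdm[i][i + offset] <= threshold
            if is_match and run_start is None:
                run_start = i
            elif not is_match and run_start is not None:
                length = i - run_start
                if length >= min_length:
                    repeats.append((run_start, run_start + offset, length))
                run_start = None
    return repeats


# Hypothetical toy sequence: the segment "ABC" occurs twice.
labels = "ABCABCD"
toy_sdm = [[0.0 if a == b else 1.0 for b in labels] for a in labels]

repeats = find_diagonal_repeats(toy_sdm)
print(repeats)
```

Each tuple says that the segment starting at `first_start` reappears at `second_start` for `length` time steps. Relating such repeats of different lengths to one another is what turns a flat list of repeats into a hierarchy.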

To learn more about aligned hierarchies, see the paper by Kinnaird (ISMIR 2016), which introduces aligned hierarchies in the context of music-based data streams.

Audience

Anyone working with sequential data in which repetitions have meaning will find repytah useful, including computational scientists, advanced undergraduate students, younger industry experts, and many others.

An example application of repytah is in Music Information Retrieval (MIR), the field at the intersection of music and computer science.

Installation

The latest stable release is available on PyPI, and you can install it by running:

```bash
pip install repytah
```

If you use Anaconda, you can install the package using conda-forge:

```bash
conda install -c conda-forge repytah
```

To build repytah from source, run `python setup.py build`. Then, to install it, run `python setup.py install`.

Alternatively, you can download or clone the repository and use pip to handle dependencies:

```bash
unzip repytah.zip
pip install -e repytah-main
```

or

```bash
git clone https://github.com/smith-tinkerlab/repytah.git
pip install -e repytah
```

Running `pip list` should now show repytah as an installed package:

```bash
repytah (0.x.x, /path/to/repytah)
```
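
The same check can be done from Python. The sketch below uses the standard library's `importlib.metadata` (Python 3.8 or later) to look up an installed distribution's version; the helper name `installed_version` is hypothetical and not part of repytah.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Prints the installed repytah version, or None if it is not installed.
print(installed_version("repytah"))
```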

Current and Future Work - Elements of the Package

  • Aligned Hierarchies - This is the fundamental output of the package, from which derivatives can be built. The aligned hierarchies for a given sequential data stream are the collection of all possible hierarchical structure decompositions, aligned on a common time axis. The package therefore offers all possible structure decompositions in one cohesive object.
    • Includes the walkthrough file example.py, which uses the supplied input.csv
  • Forthcoming Aligned sub-Hierarchies - (AsH) - These are derivatives of the aligned hierarchies and are described in Aligned sub-Hierarchies: a structure-based approach to the cover song task
  • Forthcoming Start-End and S_NL diagrams
  • Forthcoming SuPP and MaPP representations
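
To give a feel for what "aligned on a common time axis" means, the sketch below places the start positions of repeats of each length on one shared axis, with one row per repeat length. It is a deliberately simplified picture: the `align_repeats` name, the tuple layout, and the toy repeats are assumptions for illustration, and the sketch does not distinguish repeats of the same length that carry different content, so it should not be read as the data structure repytah returns.

```python
from collections import defaultdict


def align_repeats(repeats, sequence_length):
    """Map each repeat length to a 0/1 row marking where repeats of
    that length start on the shared time axis."""
    starts_by_length = defaultdict(set)
    for first_start, second_start, length in repeats:
        starts_by_length[length].update((first_start, second_start))

    rows = {}
    for length, starts in sorted(starts_by_length.items()):
        row = [0] * sequence_length
        for start in starts:
            row[start] = 1
        rows[length] = row
    return rows


# Hypothetical repeats as (first_start, second_start, length) tuples
# for a sequence of 8 time steps.
toy_repeats = [(0, 4, 4), (0, 2, 2), (4, 6, 2)]
aligned = align_repeats(toy_repeats, sequence_length=8)

for length, row in aligned.items():
    print(length, row)
```

Reading the rows top to bottom shows shorter repeats nested inside longer ones at the same time positions, which is the kind of relationship a hierarchy over repeats captures.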

MATLAB code

The original code for this project was written in MATLAB by Katherine M. Kinnaird. It can be found here.

Acknowledgements

This code was developed as part of Smith College's Summer Undergraduate Research Fellowship (SURF) from 2019 to 2022 and has been partially funded by Smith College's CFCD funding mechanism. Additionally, as Kinnaird is the Clare Boothe Luce Assistant Professor of Computer Science and Statistical & Data Sciences at Smith College, this work has also been partially supported by the Henry Luce Foundation's Clare Boothe Luce Program.

Additionally, we would like to acknowledge and give thanks to Brian McFee and the librosa team. We significantly referenced the Python package librosa in our development process.

Citing

Please cite repytah using the following:

C. Jia et al., repytah: A Python package that builds aligned hierarchies for sequential data streams. Python package version 0.1.2, 2023. [Online]. Available: https://github.com/smith-tinkerlab/repytah.

JOSS Publication

repytah: An Open-Source Python Package for Building Aligned Hierarchies for Sequential Data
Published
May 15, 2023
Volume 8, Issue 85, Page 5213
Authors
Chenhui Jia
Smith College, USA
Lizette Carpenter
Smith College, USA
Thu Tran
Smith College, USA
Amanda Y. Liu
Smith College, USA
Sasha Yeutseyeva
Smith College, USA
Marium Tapal
Smith College, USA
Yingke Wang
Columbia University, USA
Zoie Kexin Zhao
Smith College, USA
Jordan Moody
Smith College, USA
Denise Nava
Smith College, USA
Eleanor Donaher
Smith College, USA
Lillian Yushu Jiang
Smith College, USA
Ben Bruncati
Smith College, USA
Katherine M. Kinnaird
Smith College, USA
Editor
Mehmet Hakan Satman
Tags
Music Information Retrieval Structure representations Aligned Hierarchies Music Structure Analysis

GitHub Events

Total
  • Watch event: 2
Last Year
  • Watch event: 2

Committers

Last synced: 7 months ago

All Time
  • Total Commits: 1,098
  • Total Committers: 17
  • Avg Commits per committer: 64.588
  • Development Distribution Score (DDS): 0.789
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
| Name | Email | Commits |
| --- | --- | --- |
| Thu Tran | t****n@s****u | 232 |
| Chenhui-Jia | j****i@1****m | 177 |
| Jordan M | j****o@g****m | 148 |
| Lizette | l****r@s****u | 109 |
| Marium Tapal | m****l@g****m | 88 |
| aliu-12 | 7****2 | 71 |
| d-nava | 4****a | 61 |
| Katherine M. Kinnaird | k****d@s****u | 49 |
| sashayeu | s****a@s****u | 42 |
| Yingke Wang | b****4@s****u | 30 |
| zoiezhao | z****9@s****u | 27 |
| Sasha Yeutseyeva | s****a@g****m | 24 |
| edonaher | 4****r | 20 |
| kbruncati | 4****i | 9 |
| Mehmet Hakan Satman | m****n@g****m | 6 |
| Yingke Wang | 5****4 | 3 |
| Denise Nava | d****a@D****l | 2 |
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 2
  • Total pull requests: 100
  • Average time to close issues: about 1 month
  • Average time to close pull requests: 5 days
  • Total issue authors: 2
  • Total pull request authors: 11
  • Average comments per issue: 3.5
  • Average comments per pull request: 0.24
  • Merged pull requests: 89
  • Bot issues: 0
  • Bot pull requests: 4
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • mariumtapal (1)
  • Rocsg (1)
Pull Request Authors
  • bwang64 (25)
  • Chenhui-Jia (23)
  • aliu-12 (16)
  • thuntran (12)
  • kmkinnaird (11)
  • dependabot[bot] (5)
  • zoiezhao (4)
  • jbytecode (2)
  • sashayeu (1)
  • lcarpenter20 (1)
  • mariumtapal (1)
Top Labels
Issue Labels
Pull Request Labels
dependencies (5) python (5)

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 13 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 3
  • Total maintainers: 1
pypi.org: repytah

Python package for building Aligned Hierarchies for sequential data streams

  • Versions: 3
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 13 Last month
Rankings
Dependent packages count: 6.6%
Downloads: 23.8%
Average: 24.8%
Forks count: 30.5%
Dependent repos count: 30.6%
Stargazers count: 32.3%
Maintainers (1)
Last synced: 6 months ago

Dependencies

docs/requirements.txt pypi
  • nbsphinx ==0.8.6
  • readthedocs-sphinx-search ==0.1.0
  • sphinx ==3.4.3
  • sphinx_rtd_theme >=0.3.1
setup.py pypi
  • matplotlib *
  • numpy *
  • pandas *
  • scipy *
.github/workflows/check_repytah.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
.github/workflows/draft-pdf.yml actions
  • actions/checkout v3 composite
  • actions/upload-artifact v1 composite
  • openjournals/openjournals-draft-action master composite
pyproject.toml pypi
environment.yml pypi
  • opencv-python ==4.6.0.66