repytah
repytah: An Open-Source Python Package for Building Aligned Hierarchies for Sequential Data - Published in JOSS (2023)
Science Score: 95.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 4 DOI reference(s) in README and JOSS metadata
- ✓ Academic publication links: Links to: zenodo.org
- ✓ Committers with academic emails: 7 of 17 committers (41.2%) from academic institutions
- ○ Institutional organization owner
- ✓ JOSS paper metadata: Published in Journal of Open Source Software
Repository
repytah is a Python package that builds aligned hierarchies for sequential data streams where repetitions have meaning (like music).
Basic Info
- Host: GitHub
- Owner: smith-tinkerlab
- License: isc
- Language: Python
- Default Branch: main
- Homepage: https://repytah.readthedocs.io/en/latest/
- Size: 16.3 MB
Statistics
- Stars: 2
- Watchers: 1
- Forks: 1
- Open Issues: 2
- Releases: 3
README.md
A Python package that builds aligned hierarchies for sequential data streams.
Documentation
See our website for a complete reference manual and introductory tutorials.
This example tutorial walks through using the package from start to finish.
Summary
We introduce repytah, a Python package that constructs the aligned hierarchies representation, which contains all possible structure-based hierarchical decompositions for a finite-length piece of sequential data aligned on a common time axis. In particular, this representation, introduced by Kinnaird [@Kinnaird_ah] with music-based data (like musical recordings or scores) as the primary motivation, is intended for sequential data where repetitions have particular meaning (such as a verse, chorus, motif, or theme). Although the original motivation for the aligned hierarchies representation was finding structure in music-based data streams, nothing inherent in the construction of these representations limits repytah to music-based sequential data.
The repytah package builds these aligned hierarchies by first extracting repeated structures (of all meaningful lengths) from the self-dissimilarity matrix (SDM) for a piece of sequential data. repytah intentionally uses the SDM as the starting point for constructing the aligned hierarchies: an SDM cannot be reverse-engineered back to the original signal, which allows researchers to collaborate on signals that are protected either by copyright or under privacy considerations. This package is a Python translation of the original MATLAB code by Kinnaird [-@Kinnaird_code] with additional documentation, and the code has been updated to leverage efficiencies in Python.
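To make the idea of a self-dissimilarity matrix concrete, the following is a minimal pure-Python sketch, not repytah's own implementation. It assumes the sequence has already been turned into a list of feature vectors (one per time step) and that cosine dissimilarity is the distance measure; both the function names and the choice of measure are illustrative assumptions.

```python
import math


def cosine_dissimilarity(u, v):
    """Return 1 - cosine similarity of two equal-length vectors (hypothetical helper)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Assumption: treat an all-zero frame as maximally unrelated.
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def self_dissimilarity_matrix(frames):
    """Compare every time step with every other time step."""
    return [[cosine_dissimilarity(u, v) for v in frames] for u in frames]


# Toy sequence with the repeated pattern A, B, A, B.
frame_a = [1.0, 0.0, 0.0]
frame_b = [0.0, 1.0, 0.0]
toy_frames = [frame_a, frame_b, frame_a, frame_b]
toy_sdm = self_dissimilarity_matrix(toy_frames)

print(toy_sdm[0][2])  # 0.0: time steps 0 and 2 hold the same frame
print(toy_sdm[0][1])  # 1.0: time steps 0 and 1 are orthogonal
```

Only pairwise comparisons survive in the matrix, which is why sharing an SDM does not expose the underlying signal. A practical implementation would vectorize this computation, for example with numpy.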
Problems Addressed
Sequential data streams often have repeated elements that build on each other, creating hierarchies. Therefore, the goal of repytah is to extract these repetitions and their relationships to each other in order to form aligned hierarchies.
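As a rough illustration of what "extracting repetitions" can mean, the sketch below scans a small dissimilarity matrix for diagonal runs of near-zero entries. A run of length k starting at entry (i, j) indicates that the k time steps starting at i are repeated starting at j. This is a simplified stand-in written for this document, not repytah's search routine; the function name, the threshold, and the minimum length are all hypothetical.

```python
def find_diagonal_repeats(sdm, thresh=0.05, min_len=2):
    """Return (first_start, second_start, length) for each diagonal run.

    Only the diagonals above the main diagonal are scanned, so every
    repeated pair is reported once.
    """
    n = len(sdm)
    repeats = []
    for offset in range(1, n):
        run_start = None
        # One extra iteration past the end of the diagonal flushes an open run.
        for i in range(n - offset + 1):
            in_range = i < n - offset
            similar = in_range and sdm[i][i + offset] <= thresh
            if similar and run_start is None:
                run_start = i
            elif not similar and run_start is not None:
                length = i - run_start
                if length >= min_len:
                    repeats.append((run_start, run_start + offset, length))
                run_start = None
    return repeats


# Toy sequence A B C A B C: dissimilarity 0 for equal symbols, 1 otherwise.
symbols = list("ABCABC")
toy_sdm = [[0.0 if x == y else 1.0 for y in symbols] for x in symbols]

print(find_diagonal_repeats(toy_sdm))  # [(0, 3, 3)]
```

Here the single result says that the three steps starting at index 0 recur starting at index 3. Building the hierarchy then involves relating repeats of different lengths to one another, which this sketch does not attempt.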
To learn more about aligned hierarchies, see this paper by Kinnaird (ISMIR 2016) which introduces aligned hierarchies in the context of music-based data streams.
Audience
People working with sequential data where repetitions have meaning will find repytah useful, including computational scientists, advanced undergraduate students, younger industry experts, and many others.
An example application of repytah is in Music Information Retrieval (MIR), i.e., in the intersection of music and computer science.
Installation
The latest stable release is available on PyPI, and you can install it by running:
```bash
pip install repytah
```
If you use Anaconda, you can install the package using conda-forge:
```bash
conda install -c conda-forge repytah
```
To build repytah from source, run python setup.py build.
Then, to install repytah, run python setup.py install.
Alternatively, you can download or clone the repository and use pip to handle dependencies:
```bash
unzip repytah.zip
pip install -e repytah-main
```
or
```bash
git clone https://github.com/smith-tinkerlab/repytah.git
pip install -e repytah
```
After running pip list, you should see repytah listed as an installed package:
```bash
repytah (0.x.x, /path/to/repytah)
```
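The same check can be done from within Python. The snippet below is a small sketch that uses only the standard library's importlib.metadata; the helper name installed_version is made up for this example.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Prints the version string if repytah is installed, otherwise None.
print(installed_version("repytah"))
```

This reports what the active Python environment can see, which helps when several environments (for example, conda and system Python) coexist on one machine.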
Current and Future Work - Elements of the Package
- Aligned Hierarchies - This is the fundamental output of the package, from which derivatives can be built. The aligned hierarchies for a given sequential data stream are the collection of all possible hierarchical structure decompositions, aligned on a common time axis. To this end, we offer all possible structure decompositions in one cohesive object.
  - Includes walk-through file example.py using supplied input.csv
- Forthcoming: Aligned sub-Hierarchies (AsH) - These are derivatives of the aligned hierarchies and are described in "Aligned sub-Hierarchies: a structure-based approach to the cover song task"
- Forthcoming: Start-End and S_NL diagrams
- Forthcoming: SuPP and MaPP representations
MATLAB code
The original code for this project was written in MATLAB by Katherine M. Kinnaird. It can be found here.
Acknowledgements
This code was developed as part of Smith College's Summer Undergraduate Research Fellowship (SURF) from 2019 to 2022 and has been partially funded by Smith College's CFCD funding mechanism. Additionally, as Kinnaird is the Clare Boothe Luce Assistant Professor of Computer Science and Statistical & Data Sciences at Smith College, this work has also been partially supported by the Henry Luce Foundation's Clare Boothe Luce Program.
Additionally, we would like to acknowledge and give thanks to Brian McFee and the librosa team. We significantly referenced the Python package librosa in our development process.
Citing
Please cite repytah using the following:
C. Jia et al., repytah: A Python package that builds aligned hierarchies for sequential data streams. Python package version 0.1.2, 2023. [Online]. Available: https://github.com/smith-tinkerlab/repytah.
JOSS Publication
repytah: An Open-Source Python Package for Building Aligned Hierarchies for Sequential Data
Authors
Smith College, USA
Smith College, USA
Smith College, USA
Smith College, USA
Smith College, USA
Columbia University, USA
Smith College, USA
Smith College, USA
Smith College, USA
Smith College, USA
Smith College, USA
Smith College, USA
Tags
Music Information Retrieval, Structure representations, Aligned Hierarchies, Music Structure Analysis
GitHub Events
Total
- Watch event: 2
Last Year
- Watch event: 2
Committers
Last synced: 7 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Thu Tran | t****n@s****u | 232 |
| Chenhui-Jia | j****i@1****m | 177 |
| Jordan M | j****o@g****m | 148 |
| Lizette | l****r@s****u | 109 |
| Marium Tapal | m****l@g****m | 88 |
| aliu-12 | 7****2 | 71 |
| d-nava | 4****a | 61 |
| Katherine M. Kinnaird | k****d@s****u | 49 |
| sashayeu | s****a@s****u | 42 |
| Yingke Wang | b****4@s****u | 30 |
| zoiezhao | z****9@s****u | 27 |
| Sasha Yeutseyeva | s****a@g****m | 24 |
| edonaher | 4****r | 20 |
| kbruncati | 4****i | 9 |
| Mehmet Hakan Satman | m****n@g****m | 6 |
| Yingke Wang | 5****4 | 3 |
| Denise Nava | d****a@D****l | 2 |
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 2
- Total pull requests: 100
- Average time to close issues: about 1 month
- Average time to close pull requests: 5 days
- Total issue authors: 2
- Total pull request authors: 11
- Average comments per issue: 3.5
- Average comments per pull request: 0.24
- Merged pull requests: 89
- Bot issues: 0
- Bot pull requests: 4
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- mariumtapal (1)
- Rocsg (1)
Pull Request Authors
- bwang64 (25)
- Chenhui-Jia (23)
- aliu-12 (16)
- thuntran (12)
- kmkinnaird (11)
- dependabot[bot] (5)
- zoiezhao (4)
- jbytecode (2)
- sashayeu (1)
- lcarpenter20 (1)
- mariumtapal (1)
Top Labels
Issue Labels
Pull Request Labels
Packages
- Total packages: 1
- Total downloads:
  - pypi: 13 last month
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 3
- Total maintainers: 1
pypi.org: repytah
Python package for building Aligned Hierarchies for sequential data streams
- Homepage: https://github.com/smith-tinkerlab/repytah
- Documentation: https://repytah.readthedocs.io/
- License: ISC
- Latest release: 0.1.2 (published almost 3 years ago)
Rankings
Maintainers (1)
Dependencies
- nbsphinx ==0.8.6
- readthedocs-sphinx-search ==0.1.0
- sphinx ==3.4.3
- sphinx_rtd_theme >=0.3.1
- matplotlib *
- numpy *
- pandas *
- scipy *
- actions/checkout v3 composite
- actions/setup-python v4 composite
- actions/checkout v3 composite
- actions/upload-artifact v1 composite
- openjournals/openjournals-draft-action master composite
- opencv-python ==4.6.0.66

