https://github.com/bootphon/speech-map
Mean Average Precision over words or n-grams with speech features
Science Score: 49.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 2 DOI reference(s) in README
- ✓ Academic publication links: Links to: arxiv.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (10.0%) to scientific vocabulary
Repository
Statistics
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Mean Average Precision over words or n-grams with speech features
Compute the Mean Average Precision (MAP) with speech features.
This is the MAP@R from equation (3) of https://arxiv.org/abs/2003.08505.
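For readers unfamiliar with the metric, MAP@R can be illustrated as follows: for each query, R is the number of other samples sharing its label, the R nearest neighbours are retrieved, and the precision at each rank is summed only at ranks where the neighbour is a correct match, then divided by R. The sketch below is a plain-Python illustration, not the package's implementation. The function name `map_at_r`, the Euclidean distance, the brute-force search, and the choice to skip queries with no possible match are all assumptions made for the example.

```python
import math


def map_at_r(embeddings, labels):
    """Illustrative MAP@R with brute-force Euclidean nearest neighbours."""
    scores = []
    for q, (query, query_label) in enumerate(zip(embeddings, labels)):
        # R = number of other samples that share the query's label.
        r = sum(1 for i, lab in enumerate(labels) if i != q and lab == query_label)
        if r == 0:
            continue  # assumption of this sketch: skip queries with no match
        # Rank every other sample by its distance to the query.
        others = [i for i in range(len(labels)) if i != q]
        others.sort(key=lambda i: math.dist(query, embeddings[i]))
        hits, total = 0, 0.0
        for rank, i in enumerate(others[:r], start=1):
            if labels[i] == query_label:
                hits += 1
                total += hits / rank  # precision at this rank, counted only on a hit
        scores.append(total / r)
    return sum(scores) / len(scores)


toy_embeddings = [(0.0,), (1.0,), (3.0,), (1.6,)]
toy_labels = ["a", "a", "a", "b"]
print(map_at_r(toy_embeddings, toy_labels))
```

The brute-force ranking here is quadratic in the number of samples, which is why a dedicated k-NN backend matters on real evaluation sets.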
Installation
This package is available on PyPI:
```bash
pip install speech-map
```
It is much more efficient to use the Faiss backend for the k-NN, instead of the naive PyTorch backend. Since Faiss is not available on PyPI, you can install this package in a conda environment with your conda variant:
- CPU version:
  ```bash
  micromamba create -f environment-cpu.yaml
  ```
- GPU version:
  ```bash
  CONDA_OVERRIDE_CUDA=12.6 micromamba create -f environment-gpu.yaml
  ```
Usage
CLI
```
❯ python -m speech_map --help
usage: __main__.py [-h] [--pooling {MEAN,MAX,MIN,HAMMING}] [--frequency FREQUENCY] [--backend {FAISS,TORCH}] features jsonl

Mean Average Precision over n-grams / words with speech features

positional arguments:
  features              Path to the directory with pre-computed features
  jsonl                 Path to the JSONL file with annotations

options:
  -h, --help            show this help message and exit
  --pooling {MEAN,MAX,MIN,HAMMING}
                        Pooling (default: MEAN)
  --frequency FREQUENCY
                        Feature frequency in Hz (default: 50 Hz)
  --backend {FAISS,TORCH}
                        KNN (default: FAISS)
```
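To make the `--pooling` and `--frequency` options concrete, here is a rough sketch of how frame-level features might be reduced to one vector per annotated segment. It is not taken from the package: the function name `pool_segment`, the use of start and end times in seconds, and the rounding of times to frame indices are assumptions. `HAMMING` is left out because its exact weighting is not described here. The actual logic is in src/speech_map/core.py.

```python
def pool_segment(frames, start, end, frequency=50, pooling="MEAN"):
    """Pool the frames covering [start, end) seconds into a single vector.

    `frames` is a list of equal-length feature vectors, one per frame,
    sampled at `frequency` frames per second.
    """
    first = round(start * frequency)
    last = max(first + 1, round(end * frequency))  # keep at least one frame
    segment = frames[first:last]
    columns = list(zip(*segment))  # one tuple per feature dimension
    if pooling == "MEAN":
        return [sum(col) / len(col) for col in columns]
    if pooling == "MAX":
        return [max(col) for col in columns]
    if pooling == "MIN":
        return [min(col) for col in columns]
    raise ValueError(f"unsupported pooling in this sketch: {pooling}")


# Two seconds of 2-dimensional toy features at 50 Hz.
toy_frames = [[float(i), float(-i)] for i in range(100)]
print(pool_segment(toy_frames, 0.5, 1.0, frequency=50, pooling="MEAN"))
```

With a 50 Hz frame rate, a word spanning 0.5 s to 1.0 s covers frames 25 to 49 under these assumptions, so doubling `frequency` doubles the number of frames pooled for the same segment.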
Python API
You most probably need only two functions: build_embeddings_and_labels and mean_average_precision.
Use them like this:
```python
from speech_map import build_embeddings_and_labels, mean_average_precision

embeddings, labels = build_embeddings_and_labels(path_to_features, path_to_jsonl)
print(mean_average_precision(embeddings, labels))
```
In this example, path_to_features is a path to a directory containing features stored in individual PyTorch
tensor files, and path_to_jsonl is the path to the JSONL annotations file.
You can also use those functions in a more advanced setting like this:
```python
from speech_map import Pooling, build_embeddings_and_labels, mean_average_precision

embeddings, labels = build_embeddings_and_labels(
    path_to_features,
    path_to_jsonl,
    pooling=Pooling.MAX,
    frequency=100,
    feature_maker=my_model,
    file_extension=".wav",
)
print(mean_average_precision(embeddings, labels))
```
This is a minimal package, and you can easily go through the code in src/speech_map/core.py if you want to check the details.
Data
We distribute the word and n-gram annotations for the LibriSpeech evaluation subsets in the data directory. Decompress them with zstd.
We have not used the n-gram annotations recently; there are probably too many samples, and they would need some clever subsampling.
References
MAP for speech representations:
```bibtex
@inproceedings{carlin11_interspeech,
  title = {Rapid evaluation of speech representations for spoken term discovery},
  author = {Michael A. Carlin and Samuel Thomas and Aren Jansen and Hynek Hermansky},
  year = {2011},
  booktitle = {Interspeech 2011},
  pages = {821--824},
  doi = {10.21437/Interspeech.2011-304},
  issn = {2958-1796},
}
```
Data and original implementation:
```bibtex
@inproceedings{algayres20_interspeech,
  title = {Evaluating the Reliability of Acoustic Speech Embeddings},
  author = {Robin Algayres and Mohamed Salah Zaiem and Benoît Sagot and Emmanuel Dupoux},
  year = {2020},
  booktitle = {Interspeech 2020},
  pages = {4621--4625},
  doi = {10.21437/Interspeech.2020-2362},
  issn = {2958-1796},
}
```
Owner
- Name: CoML
- Login: bootphon
- Kind: organization
- Email: syntheticlearner@gmail.com
- Location: Paris, France
- Website: https://cognitive-ml.fr
- Repositories: 55
- Profile: https://github.com/bootphon
GitHub Events
Total
- Public event: 1
- Push event: 4
Last Year
- Public event: 1
- Push event: 4
Issues and Pull Requests
Last synced: 8 months ago
All Time
- Total issues: 0
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 0
- Total pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Dependencies
- actions/checkout v4 composite
- actions/upload-artifact v4 composite
- astral-sh/ruff-action v3 composite
- astral-sh/setup-uv v5 composite
- numpy ==1.26.4
- polars >=1.30.0
- torch >=2.6.0
- tqdm >=4.67.1
- appnope 0.1.4
- asttokens 3.0.0
- cffi 1.17.1
- colorama 0.4.6
- comm 0.2.2
- debugpy 1.8.14
- decorator 5.2.1
- executing 2.2.0
- filelock 3.18.0
- fsspec 2025.5.1
- ipykernel 6.29.5
- ipython 9.3.0
- ipython-pygments-lexers 1.1.1
- jedi 0.19.2
- jinja2 3.1.6
- jupyter-client 8.6.3
- jupyter-core 5.8.1
- markupsafe 3.0.2
- matplotlib-inline 0.1.7
- mpmath 1.3.0
- nest-asyncio 1.6.0
- networkx 3.5
- numpy 1.26.4
- nvidia-cublas-cu12 12.6.4.1
- nvidia-cuda-cupti-cu12 12.6.80
- nvidia-cuda-nvrtc-cu12 12.6.77
- nvidia-cuda-runtime-cu12 12.6.77
- nvidia-cudnn-cu12 9.5.1.17
- nvidia-cufft-cu12 11.3.0.4
- nvidia-cufile-cu12 1.11.1.6
- nvidia-curand-cu12 10.3.7.77
- nvidia-cusolver-cu12 11.7.1.2
- nvidia-cusparse-cu12 12.5.4.2
- nvidia-cusparselt-cu12 0.6.3
- nvidia-nccl-cu12 2.26.2
- nvidia-nvjitlink-cu12 12.6.85
- nvidia-nvtx-cu12 12.6.77
- packaging 25.0
- parso 0.8.4
- pexpect 4.9.0
- platformdirs 4.3.8
- polars 1.30.0
- prompt-toolkit 3.0.51
- psutil 7.0.0
- ptyprocess 0.7.0
- pure-eval 0.2.3
- pycparser 2.22
- pygments 2.19.1
- python-dateutil 2.9.0.post0
- pywin32 310
- pyzmq 26.4.0
- ruff 0.12.0
- setuptools 80.9.0
- six 1.17.0
- speech-map 0.1.0
- stack-data 0.6.3
- sympy 1.14.0
- torch 2.7.0
- tornado 6.5.1
- tqdm 4.67.1
- traitlets 5.14.3
- triton 3.3.0
- typing-extensions 4.13.2
- wcwidth 0.2.13