spdl

Scalable and Performant Data Loading

https://github.com/facebookresearch/spdl

Science Score: 64.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Committers with academic emails
    1 of 7 committers (14.3%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (5.6%) to scientific vocabulary

Keywords

dl ml
Last synced: 6 months ago

Repository

Statistics
  • Stars: 295
  • Watchers: 10
  • Forks: 15
  • Open Issues: 15
  • Releases: 11
Created over 1 year ago · Last pushed 6 months ago
Metadata Files
Readme Contributing License Code of conduct Citation Authors

README.md

SPDL

SPDL (Scalable and Performant Data Loading) is a library and project to explore the design of performant data loading.

It provides a flexible pipeline abstraction and a set of operations for processing array data.
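To make the idea of a pipeline abstraction concrete, here is a minimal sketch written with only the Python standard library. It is not SPDL's API: `run_pipeline`, `decode`, and `normalize` are hypothetical names chosen for illustration, and plain lists stand in for array data. The sketch assumes a pipeline means "apply a sequence of stages to each item from a source, concurrently, with a bounded number of items in flight".

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, Iterator, List


def run_pipeline(
    source: Iterable,
    stages: List[Callable],
    num_threads: int = 4,
    max_in_flight: int = 8,
) -> Iterator:
    """Apply `stages` in order to each item of `source` using worker threads.

    Hypothetical helper, not part of SPDL. Results are yielded in source
    order, and roughly `max_in_flight` items are queued or running at any
    time, which keeps memory use bounded.
    """

    def process(item):
        # Run every stage on one item, feeding each output to the next stage.
        for stage in stages:
            item = stage(item)
        return item

    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        pending = deque()
        for item in source:
            pending.append(pool.submit(process, item))
            # Once the window is full, wait for the oldest item before
            # submitting more work.
            if len(pending) >= max_in_flight:
                yield pending.popleft().result()
        # Drain whatever is still in flight after the source is exhausted.
        while pending:
            yield pending.popleft().result()


def decode(i: int) -> list:
    # Stand-in for an I/O-bound step such as reading and decoding a sample.
    return [i, i + 1, i + 2]


def normalize(xs: list) -> list:
    # Stand-in for an array operation: scale the values so they sum to 1.
    total = sum(xs)
    return [x / total for x in xs]


results = list(run_pipeline(range(1, 5), [decode, normalize]))
```

Here `results` holds four normalized lists, in the same order as the source. Threads suit stages that release the GIL (I/O, native decoding); the bounded window is what stops a fast source from outrunning slow stages.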

Documentation

Please check out the documentation.

License

SPDL is BSD 2-Clause licensed, as found in the LICENSE file.

Citation

If you find this project useful, please cite it using the following BibTeX entry.

@misc{hira2025scalableperformantdataloading,
      title={Scalable and Performant Data Loading},
      author={Moto Hira and Christian Puhrsch and Valentin Andrei and Roman Malinovskyy and Gael Le Lan and Abhinandan Krishnan and Joseph Cummings and Miguel Martin and Gokul Gunasekaran and Yuta Inoue and Alex J Turner and Raghuraman Krishnamoorthi},
      year={2025},
      eprint={2504.20067},
      archivePrefix={arXiv},
      primaryClass={cs.DC},
      url={https://arxiv.org/abs/2504.20067},
}

Owner

  • Name: Meta Research
  • Login: facebookresearch
  • Kind: organization
  • Location: Menlo Park, California

Citation (CITATION.cff)

@misc{hira2025scalableperformantdataloading,
      title={Scalable and Performant Data Loading}, 
      author={Moto Hira and Christian Puhrsch and Valentin Andrei and Roman Malinovskyy and Gael Le Lan and Abhinandan Krishnan and Joseph Cummings and Miguel Martin and Gokul Gunasekaran and Yuta Inoue and Alex J Turner and Raghuraman Krishnamoorthi},
      year={2025},
      eprint={2504.20067},
      archivePrefix={arXiv},
      primaryClass={cs.DC},
      url={https://arxiv.org/abs/2504.20067}, 
}

Committers

Last synced: 9 months ago

All Time
  • Total Commits: 515
  • Total Committers: 7
  • Avg Commits per committer: 73.571
  • Development Distribution Score (DDS): 0.049
Past Year
  • Commits: 515
  • Committers: 7
  • Avg Commits per committer: 73.571
  • Development Distribution Score (DDS): 0.049
Top Committers
Name                    Email           Commits
moto                    8****k          490
moto-meta               1****a          13
Facebook Community Bot  f****t          5
Victor                  4****n          4
Zeno Gantner            z****r          1
Richard Barnes          r****s@u****u   1
Ayushi Dalmia           a****4@g****m   1
Committer Domains (Top 20 + Academic)
umn.edu: 1

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 28
  • Total pull requests: 1,243
  • Average time to close issues: about 1 month
  • Average time to close pull requests: about 17 hours
  • Total issue authors: 5
  • Total pull request authors: 10
  • Average comments per issue: 0.68
  • Average comments per pull request: 1.0
  • Merged pull requests: 1,108
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 25
  • Pull requests: 1,112
  • Average time to close issues: 19 days
  • Average time to close pull requests: about 10 hours
  • Issue authors: 5
  • Pull request authors: 10
  • Average comments per issue: 0.68
  • Average comments per pull request: 0.92
  • Merged pull requests: 983
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • mthrok (24)
  • npuichigo (1)
  • yberreby (1)
  • nicolas-dufour (1)
  • elyxlz (1)
Pull Request Authors
  • mthrok (1,175)
  • moto-meta (40)
  • vbourgin (12)
  • facebook-github-bot (7)
  • alxmrs (2)
  • gregorpm (2)
  • zenogantner (2)
  • ayushidalmia (1)
  • r-barnes (1)
  • yit-b (1)
Top Labels
Issue Labels
  • CLA Signed (3)
  • help wanted (1)
Pull Request Labels
  • CLA Signed (1,144)
  • fb-exported (57)

Packages

  • Total packages: 3
  • Total downloads:
    • pypi 1,568 last-month
  • Total dependent packages: 0
    (may contain duplicates)
  • Total dependent repositories: 0
    (may contain duplicates)
  • Total versions: 37
  • Total maintainers: 1
pypi.org: spdl-core

SPDL: Scalable and Performant Data Loading.

  • Versions: 12
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 339 Last month
Rankings
Dependent packages count: 9.6%
Average: 31.8%
Dependent repos count: 53.9%
Maintainers (1)
Last synced: 6 months ago
pypi.org: spdl-io

SPDL plugin for loading media data into array format.

  • Versions: 11
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 855 Last month
Rankings
Dependent packages count: 9.6%
Average: 31.8%
Dependent repos count: 53.9%
Maintainers (1)
Last synced: 6 months ago
pypi.org: spdl

Scalable and Performant Data Loading

  • Versions: 14
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 374 Last month
Rankings
Dependent packages count: 9.6%
Average: 36.4%
Dependent repos count: 63.3%
Maintainers (1)
Last synced: 6 months ago

Dependencies

.github/workflows/_conda_cpu_build.yml actions
  • actions/checkout v4 composite
  • actions/upload-artifact v4 composite
  • conda-incubator/setup-miniconda v3 composite
.github/workflows/_conda_cpu_test.yml actions
  • actions/checkout v4 composite
  • actions/download-artifact v4 composite
  • conda-incubator/setup-miniconda v3 composite
.github/workflows/_conda_cuda_build.yml actions
  • actions/checkout v4 composite
  • actions/upload-artifact v4 composite
  • conda-incubator/setup-miniconda v3 composite
.github/workflows/_conda_cuda_test.yml actions
  • actions/checkout v4 composite
  • actions/download-artifact v4 composite
  • actions/setup-python v5 composite
  • conda-incubator/setup-miniconda v3 composite
.github/workflows/_wheel_cuda_build.yml actions
  • actions/checkout v4 composite
  • actions/upload-artifact v4 composite
.github/workflows/_wheel_cuda_test.yml actions
  • actions/checkout v4 composite
  • actions/download-artifact v4 composite
  • actions/setup-python v5 composite
.github/workflows/build_docs.yml actions
  • actions/checkout v4 composite
  • actions/checkout v3 composite
  • actions/download-artifact v4 composite
  • actions/upload-artifact v4 composite
  • conda-incubator/setup-miniconda v3 composite
.github/workflows/lint.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v3 composite
  • pre-commit/action v3.0.1 composite
.github/workflows/package_linux.yml actions
.github/workflows/package_macos.yml actions
packaging/Dockerfile docker
  • pytorch/manylinux2_28-builder cuda${CU_VERSION} build
docs/requirements.txt pypi
  • breathe *
  • exhale *
  • furo ==2024.7.18
  • sphinx ==7.4.7
  • sphinxcontrib-mermaid ==0.9.2
pyproject.toml pypi
setup.py pypi
  • numpy *