shpc-registry-cache
A cache of container executables (currently featuring the BioContainers) 🗃️
Science Score: 36.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references
- ✓ Academic publication links: links to zenodo.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (14.3%) to scientific vocabulary
Statistics
- Stars: 1
- Watchers: 2
- Forks: 2
- Open Issues: 0
- Releases: 35
Metadata Files
README.md
Shpc Registry Cache
This is a static cache of container executables discovered on the path. The cache is updated once a week (on Wednesday), and we store identifiers here, namespaced by OCI or Docker registry, starting from the repository root. Since we primarily cache the set of BioContainers, the main set is under quay.io. The executable counts derived from the cache are useful for research purposes, or for applied uses such as Singularity Registry HPC deriving an "ideal" set of entrypoints per container. The cache is generated via the container-executable-discovery action. For details about how the cache algorithm works, treat the action as the source of truth. A brief description is included below.
Singularity Registry HPC
As an example of how this cache is used, its entries populate the Singularity HPC Registry. At a high level, shpc-registry provides install configuration files for containers. Containers from Docker or other OCI registries are installed to an HPC system via module software, and to make this work well, we need to know their aliases. This is where data from the cache comes in! For this use case, this means we:
- From the executable cache here, identify a new container, C, that is not yet in the registry
- Create a set of global executable counts, G
- Define S as the subset of counts in G for the executables found in C
- Rank order S from least to greatest
- Include any entries in S that have a frequency < 10
- Include any entries in S that have any portion of the name matching the container identifier
- Beyond those, add the next 10 executables with the lowest frequencies, as long as each frequency is < 1,000
The frequencies are calculated across the cache here and are included in counts.json. This produces a container configuration file with a set of executables that are likely to be the most distinctive for that container, based on data from the cache.
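The selection steps above can be sketched in Python. This is a minimal illustration, not the code used by the action or by shpc: the function name `select_executables`, its parameters, and the assumption that the global counts are a plain mapping of executable name to frequency are all hypothetical. "Any portion of the name matching the container identifier" is interpreted here as the final component of the identifier (tag stripped) appearing inside the executable name, which is only one possible reading.

```python
def select_executables(container_id, executables, global_counts,
                       rare_below=10, extra=10, extra_below=1000):
    """Pick executables that are likely distinctive for one container.

    Hypothetical helper: the names, defaults, and matching rule are
    assumptions made for illustration.
    """
    # S: global counts restricted to the executables found in this container.
    counts = {name: global_counts.get(name, 0) for name in executables}

    # Rank from least to most frequent (ties broken by name for determinism).
    ranked = sorted(counts, key=lambda name: (counts[name], name))

    # Assumed matching rule: last path component of the identifier, no tag.
    tool = container_id.rsplit("/", 1)[-1].split(":", 1)[0].lower()

    # Keep rare executables and those whose name contains the tool name.
    selected = [
        name for name in ranked
        if counts[name] < rare_below or tool in name.lower()
    ]

    # Then add up to `extra` of the next least frequent ones below the cap.
    remaining = [
        name for name in ranked
        if name not in selected and counts[name] < extra_below
    ]
    return selected + remaining[:extra]


# Made-up counts, for illustration only.
example_counts = {"samtools": 40, "bgzip": 300, "ls": 8000,
                  "python": 5000, "ace2sam": 3}
print(select_executables(
    "quay.io/biocontainers/samtools:1.15",
    ["samtools", "bgzip", "ls", "ace2sam", "python"],
    example_counts,
))
# → ['ace2sam', 'samtools', 'bgzip']
```

With these made-up counts, very common executables such as `ls` and `python` are dropped because their frequencies exceed the upper cap, while `ace2sam` is kept for being rare and `samtools` for matching the container name.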
To learn more about Singularity Registry HPC you can:
- Read the documentation
- Browse the container module collection
Manual Update
To update manually, install the updater:
```bash
$ python -m pip install git+https://github.com/vsoch/pipelib@main
$ python -m pip install git+https://github.com/singularityhub/guts@main
$ python -m pip install git+https://github.com/singularityhub/singularity-hpc@main
```

```bash
$ git clone --depth 1 https://github.com/singularityhub/container-executable-discovery
$ cd container-executable-discovery/lib
$ pip install -e .
```
Then generate the biocontainers listing file:
```bash
$ pip install -r .github/scripts/dev-requirements.txt
$ python .github/scripts/get_biocontainers.py /tmp/biocontainers.txt
```
And then run the update!
```bash
$ container-discovery update-cache --root $(pwd) --repo-letter-prefix --namespace quay.io/biocontainers /tmp/biocontainers.txt
```
Running the update locally is sometimes useful when there are huge containers that cannot be extracted in a GitHub action.
Contribution
This registry showcases a container executable cache, and specifically includes over 8K containers from BioContainers. If you would like to add another source of container identifiers, contributions are very much welcome!
License
This code is licensed under the MPL 2.0 LICENSE.
Owner
- Name: Container Tools
- Login: singularityhub
- Kind: organization
- Website: https://singularityhub.github.io
- Repositories: 31
- Profile: https://github.com/singularityhub
open source container hosting registry, tools, and clients
GitHub Events
Total
- Release event: 9
- Push event: 59
- Create event: 9
Last Year
- Release event: 9
- Push event: 59
- Create event: 9
Issues and Pull Requests
Last synced: 11 months ago
All Time
- Total issues: 4
- Total pull requests: 12
- Average time to close issues: 16 days
- Average time to close pull requests: about 4 hours
- Total issue authors: 2
- Total pull request authors: 3
- Average comments per issue: 3.5
- Average comments per pull request: 0.33
- Merged pull requests: 9
- Bot issues: 0
- Bot pull requests: 6
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- marcodelapierre (3)
- vsoch (1)
Pull Request Authors
- github-actions[bot] (6)
- vsoch (5)
- marcodelapierre (1)
Dependencies
- actions/checkout v3 composite
- avakar/tag-and-release 8f4b627f03fe59381267d3925d39191e27f44236 composite
- actions/checkout v3 composite
- singularityhub/container-executable-discovery main composite
- beautifulsoup4 * development
- packaging * development
- pipelib * development
- requests * development
- singularity-hpc * development