https://github.com/clinical-genomics/housekeeper
File data orchestrator
Science Score: 26.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file (found)
- ✓ .zenodo.json file (found)
- ○ DOI references
- ○ Academic publication links
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (15.4%) to scientific vocabulary
Repository
File data orchestrator
Statistics
- Stars: 2
- Watchers: 10
- Forks: 0
- Open Issues: 2
- Releases: 79
README.md
Housekeeper
Store, tag, fetch, and archive files with ease 🗃
Housekeeper is a tool that aims to provide:
- a backend for storing versioned bundles of files
- different interfaces (Python, CLI, REST) for fetching files based on tags
- a way to backup and retrieve bundles from long-term storage
Installation
Housekeeper is written in Python 3.6+ and is available on the Python Package Index (PyPI).
```bash
poetry install
```
If you would like to install the latest development version:
```bash
git clone https://github.com/Clinical-Genomics/housekeeper
cd housekeeper
poetry install
```
Contributing
Housekeeper uses the GitHub flow branching model, as described in our development manual.
Documentation
Command line interface
Config file
Housekeeper supports a basic YAML config. The following options are supported:
```yaml
database: mysql+pymysql://userName:passWord@domain.com/database
root: /path/to/root/dir
```
The root option is used to store files within the Housekeeper context.
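To make the two options concrete, here is a minimal Python sketch of how such a config could be interpreted, using only the standard library. The `describe_config` helper and the `bundle_dir` layout are illustrative assumptions, not Housekeeper's actual code.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

# Values copied from the example config above.
config = {
    "database": "mysql+pymysql://userName:passWord@domain.com/database",
    "root": "/path/to/root/dir",
}


def describe_config(cfg: dict) -> dict:
    """Split the database URL into its parts (hypothetical helper)."""
    url = urlsplit(cfg["database"])
    # The scheme has the form "<dialect>+<driver>".
    dialect, _, driver = url.scheme.partition("+")
    return {
        "dialect": dialect,
        "driver": driver,
        "user": url.username,
        "host": url.hostname,
        "database": url.path.lstrip("/"),
        "root": PurePosixPath(cfg["root"]),
    }


info = describe_config(config)
# Assumed layout: the files of a bundle live somewhere below the root directory.
bundle_dir = info["root"] / "myBundle"
print(info["dialect"], info["driver"], info["host"], info["database"])
print(bundle_dir)
```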
Command: init
Set up (or reset) the database. This creates all the tables in the database. You can reset an existing database with the --reset option.
```bash
housekeeper --database "sqlite:///hk.sqlite3" init
Success! New tables: bundle, file, file_tag_link, tag, version
```
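As a rough illustration of what setting up and resetting the tables involves, the sketch below creates five tables with the names reported above in an in-memory SQLite database. Only the table names come from the output; every column, and the `init_db` and `list_tables` helpers, are assumptions made for this example and do not reflect Housekeeper's real schema.

```python
import sqlite3

# Table names match the `init` output; all columns are hypothetical.
SCHEMA = """
CREATE TABLE bundle (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
CREATE TABLE version (
    id INTEGER PRIMARY KEY,
    bundle_id INTEGER NOT NULL REFERENCES bundle(id),
    created_at TEXT NOT NULL
);
CREATE TABLE file (
    id INTEGER PRIMARY KEY,
    version_id INTEGER NOT NULL REFERENCES version(id),
    path TEXT NOT NULL
);
CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
CREATE TABLE file_tag_link (
    file_id INTEGER NOT NULL REFERENCES file(id),
    tag_id INTEGER NOT NULL REFERENCES tag(id),
    PRIMARY KEY (file_id, tag_id)
);
"""


def list_tables(conn: sqlite3.Connection) -> list:
    """Return the names of all tables, sorted alphabetically."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]


def init_db(conn: sqlite3.Connection, reset: bool = False) -> list:
    """Create the tables; with reset=True, drop existing tables first."""
    if reset:
        for name in list_tables(conn):
            conn.execute(f'DROP TABLE "{name}"')
    conn.executescript(SCHEMA)
    return list_tables(conn)


conn = sqlite3.connect(":memory:")
tables = init_db(conn)
print("Success! New tables:", ", ".join(tables))
tables_after_reset = init_db(conn, reset=True)
```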
Command: include
Include (hard-link) all files of an existing bundle version into Housekeeper and the root path.
```bash
housekeeper include myBundle
```
This only works if the bundle has a single version that can be "imported". To import a specific version of a bundle, use the --version option.
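For readers unfamiliar with hard-linking, the following sketch shows the general idea in Python: the file gains a second name under the root directory without its data being copied. The `include_file` function and the `root/bundle/version` directory layout are hypothetical and chosen only for illustration.

```python
import os
import tempfile
from pathlib import Path


def include_file(source: Path, root: Path, bundle: str, version: str) -> Path:
    """Hard-link `source` under root/bundle/version and return the new path."""
    target_dir = root / bundle / version
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / source.name
    os.link(source, target)  # second name for the same inode, no data copied
    return target


with tempfile.TemporaryDirectory() as tmp:
    workdir = Path(tmp)
    original = workdir / "sample.fastq"
    original.write_text("@read1\nACGT\n+\nFFFF\n")

    root = workdir / "root"
    linked = include_file(original, root, "myBundle", "2017-06-15")

    # Both paths point at the same file on disk.
    same_inode = os.path.samefile(original, linked)
    link_count = original.stat().st_nlink
    print(linked.relative_to(root), same_inode, link_count)
```

Hard links only work within a single filesystem, so in a setup like this the root directory has to live on the same volume as the files being included.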
Command: delete files
Delete files that are no longer on disk like this:
```bash
housekeeper delete files --tag fastq --notondisk
```
Remove all bam files before a certain date:
```bash
housekeeper delete files --tag bam --before 2017-06-15
```
Remove fastq files from a flowcell:
```bash
housekeeper delete files --tag fastq --tag H0HKKALXX
```
It always asks for confirmation unless you add --yes:
```bash
housekeeper delete files --bundle sillyfish --yes
```
If you provide neither --tag nor --bundle, which would amount to deleting everything, the command refuses to run.
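The filtering rules described above can be sketched in plain Python. This is an illustrative model, not Housekeeper's implementation: the `FileRecord` class and `select_for_deletion` function are hypothetical, and the sketch assumes that repeated `--tag` options must all match.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Iterable, List, Optional, Set


@dataclass
class FileRecord:
    """Hypothetical stand-in for a stored file."""
    path: str
    bundle: str
    created: date
    tags: Set[str] = field(default_factory=set)
    on_disk: bool = True


def select_for_deletion(
    files: Iterable[FileRecord],
    tags: Iterable[str] = (),
    bundle: Optional[str] = None,
    before: Optional[date] = None,
    not_on_disk: bool = False,
) -> List[FileRecord]:
    """Return the files matching every given filter."""
    wanted = set(tags)
    if not wanted and bundle is None:
        # Guard: no tag and no bundle would select every file.
        raise ValueError("refusing to delete everything: give a tag or a bundle")
    selected = []
    for record in files:
        if not wanted <= record.tags:
            continue  # assumed: all requested tags must be present
        if bundle is not None and record.bundle != bundle:
            continue
        if before is not None and record.created >= before:
            continue
        if not_on_disk and record.on_disk:
            continue
        selected.append(record)
    return selected


files = [
    FileRecord("a.fastq.gz", "sillyfish", date(2017, 5, 1), {"fastq", "H0HKKALXX"}),
    FileRecord("b.fastq.gz", "sillyfish", date(2017, 7, 1), {"fastq"}, on_disk=False),
    FileRecord("c.bam", "otherbundle", date(2017, 6, 1), {"bam"}),
]

missing = select_for_deletion(files, tags=["fastq"], not_on_disk=True)
old_bams = select_for_deletion(files, tags=["bam"], before=date(2017, 6, 15))
flowcell = select_for_deletion(files, tags=["fastq", "H0HKKALXX"])
whole_bundle = select_for_deletion(files, bundle="sillyfish")
print([f.path for f in missing], [f.path for f in old_bams])
```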
Owner
- Name: Clinical Genomics
- Login: Clinical-Genomics
- Kind: organization
- Location: Stockholm, Sweden
- Website: https://clinical-genomics.github.io
- Repositories: 67
- Profile: https://github.com/Clinical-Genomics
GitHub Events
Total
- Create event: 10
- Release event: 5
- Issues event: 4
- Delete event: 5
- Issue comment event: 32
- Push event: 22
- Pull request review comment event: 2
- Pull request review event: 6
- Pull request event: 11
Last Year
- Create event: 10
- Release event: 5
- Issues event: 4
- Delete event: 5
- Issue comment event: 32
- Push event: 22
- Pull request review comment event: 2
- Pull request review event: 6
- Pull request event: 11
Committers
Last synced: almost 3 years ago
All Time
- Total Commits: 472
- Total Committers: 16
- Avg Commits per committer: 29.5
- Development Distribution Score (DDS): 0.286
Top Committers
| Name | Email | Commits |
|---|---|---|
| Robin Andeer | r****r@g****m | 337 |
| Kenny Billiau | K****u@s****e | 60 |
| Måns Magnusson | m****n@s****e | 22 |
| Kenny Billiau | k****u@s****e | 16 |
| Patrik Grenfeldt | p****t@s****e | 8 |
| Barry Stokman | b****n@s****e | 6 |
| Clinical Genomics Bot | c****m@g****m | 6 |
| Sebastian Diaz | j****a@s****e | 3 |
| Henrik Stranneheim | h****m@s****e | 3 |
| Sebastian Allard | s****d@s****e | 3 |
| barrystokman | b****n@g****m | 2 |
| Barry Stokman | 2****n@u****m | 2 |
| hiseq clinical | h****l@c****e | 1 |
| Sebastian Allard | s****s@g****m | 1 |
| Vincent Janvid | v****d@s****e | 1 |
| Mikael Laaksonen | m****n@s****e | 1 |
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 61
- Total pull requests: 130
- Average time to close issues: 12 months
- Average time to close pull requests: about 1 month
- Total issue authors: 23
- Total pull request authors: 15
- Average comments per issue: 2.3
- Average comments per pull request: 2.38
- Merged pull requests: 109
- Bot issues: 0
- Bot pull requests: 4
Past Year
- Issues: 0
- Pull requests: 13
- Average time to close issues: N/A
- Average time to close pull requests: 4 days
- Issue authors: 0
- Pull request authors: 4
- Average comments per issue: 0
- Average comments per pull request: 2.69
- Merged pull requests: 12
- Bot issues: 0
- Bot pull requests: 1
Top Authors
Issue Authors
- seallard (14)
- moonso (7)
- diitaz93 (5)
- robinandeer (4)
- Vince-janv (3)
- emiliaol (3)
- islean (3)
- henrikstranneheim (2)
- barrystokman (2)
- emmser (2)
- ChrOertlin (2)
- annaengstrom (2)
- Mropat (2)
- moahaegglund (2)
- karlnyr (1)
Pull Request Authors
- henrikstranneheim (40)
- seallard (37)
- islean (16)
- moonso (11)
- Vince-janv (10)
- ingkebil (9)
- patrikgrenfeldt (5)
- diitaz93 (4)
- dependabot[bot] (3)
- ChrOertlin (2)
- mikaell (2)
- pbiology (1)
- Mropat (1)
- beatrizsavinhas (1)
- barrystokman (1)
Packages
- Total packages: 1
- Total downloads: 1,826 last month (PyPI)
- Total dependent packages: 1
- Total dependent repositories: 2
- Total versions: 108
- Total maintainers: 4
pypi.org: housekeeper
File data orchestrator
- Documentation: https://housekeeper.readthedocs.io/
- License: MIT
- Latest release: 4.13.13 (published 10 months ago)
Dependencies
- pytest * development
- pytest-mock * development
- Alchy *
- Click <7
- SQLAlchemy *
- coloredlogs *
- marshmallow *
- pyyaml *
- rich *
- actions/checkout v2.6.0 composite
- actions/setup-python v2 composite
- docker/build-push-action v3.2.0 composite
- pypa/gh-action-pypi-publish master composite
- actions/checkout v2.6.0 composite
- actions/setup-python v4.3.1 composite
- actions/checkout v2.6.0 composite
- actions/setup-python v4.3.1 composite
- samuelmeuli/lint-action v1 composite
- python 3.7-slim build