asperasragetter
AsperaSRAgetter provides an easy way to download sequencing data from ENA by using Aspera.
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (13.4%) to scientific vocabulary
Repository
Statistics
- Stars: 3
- Watchers: 1
- Forks: 2
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
AsperaSRAgetter
AsperaSRAgetter provides an easy way to download sequencing data (fastq.gz format) from European Nucleotide Archive (ENA) by using Aspera.
Installation
AsperaSRAgetter is distributed on PyPI and can be installed with pip. It depends on Aspera-CLI to retrieve sequencing data from ENA; installing Aspera-CLI with Conda is recommended.
```shell
# You may create a new environment for AsperaSRAgetter, but this is optional
conda create -n AsperaSRAgetter python=3.10
conda activate AsperaSRAgetter

# Install AsperaSRAgetter using pip
pip install AsperaSRAgetter

# Install Aspera-CLI using conda
conda install -c hcc aspera-cli
```
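Before the first download it can help to confirm that both command-line tools are reachable. The following is a minimal sketch and not part of AsperaSRAgetter itself: the helper name `missing_tools` is hypothetical, and it assumes that the two installs above place `sragetter` and `ascp` on the `PATH` of the active environment.

```python
import shutil


def missing_tools(tools=("sragetter", "ascp"), which=shutil.which):
    """Return the names in `tools` that cannot be found on PATH.

    `which` is injectable so the check can be exercised without
    touching the real environment.
    """
    return [name for name in tools if which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("sragetter and ascp are both available.")
```

If either name is reported as missing, re-activating the Conda environment is usually the first thing to try.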
Workflow
AsperaSRAgetter first queries the ENA filereport API for the file report of the corresponding fastq.gz files. Second, it resolves the MD5 hash value and FTP URL of each fastq.gz file from the report. Last, it passes each FTP URL to the Aspera transfer command ascp to download the fastq.gz file.
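The three steps can be sketched roughly as follows. This is an illustration under stated assumptions, not the package's actual code: the function names are hypothetical, the report is assumed to be tab-separated with `fastq_ftp` and `fastq_md5` columns whose values are separated by semicolons, the sample report is made up, and the `era-fasp@fasp.sra.ebi.ac.uk` source and the `ascp` flags follow the pattern commonly shown for ENA downloads rather than anything stated in this README.

```python
import csv
import io

# Made-up report in the assumed filereport layout (not real ENA data).
SAMPLE_REPORT = (
    "run_accession\tfastq_ftp\tfastq_md5\n"
    "RUN0001\t"
    "ftp.sra.ebi.ac.uk/vol1/fastq/RUN/RUN0001/RUN0001_1.fastq.gz;"
    "ftp.sra.ebi.ac.uk/vol1/fastq/RUN/RUN0001/RUN0001_2.fastq.gz\t"
    "00000000000000000000000000000000;ffffffffffffffffffffffffffffffff\n"
)


def parse_filereport(tsv_text):
    """Return (ftp_url, md5) pairs from a tab-separated file report."""
    pairs = []
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        urls = [u for u in row["fastq_ftp"].split(";") if u]
        md5s = [m for m in row["fastq_md5"].split(";") if m]
        pairs.extend(zip(urls, md5s))
    return pairs


def build_ascp_command(ftp_url, ssh_key, outdir):
    """Turn an FTP URL into an ascp argument list (assumed flags)."""
    # Drop an optional scheme, then split "host/path" at the first slash.
    without_scheme = ftp_url.split("://", 1)[-1]
    _host, _, path = without_scheme.partition("/")
    source = f"era-fasp@fasp.sra.ebi.ac.uk:{path}"
    return ["ascp", "-QT", "-l", "300m", "-P", "33001",
            "-i", ssh_key, source, outdir]


if __name__ == "__main__":
    for url, md5 in parse_filereport(SAMPLE_REPORT):
        print(md5, " ".join(build_ascp_command(url, "key.openssh", "out")))
```

Keeping the parsing and the command construction as pure functions means both can be checked without network access or an Aspera installation.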
The file reports are stored as a .tsv table that serves as a record of the download.
The MD5 hash values of all files are saved in a .md5 file, which users can use to verify the integrity of the downloaded files.
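One way to run that integrity check from Python is sketched below. The layout of the .md5 file is not described here, so this sketch assumes the common `md5sum` convention of one `<hash> <filename>` pair per line; the function names are hypothetical and the code would need adjusting if the real file is laid out differently.

```python
import hashlib
from pathlib import Path


def file_md5(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5_file(md5_path, data_dir):
    """Return {filename: bool} for each entry in an md5sum-style file.

    A file counts as verified only if it exists in `data_dir` and its
    digest matches the recorded one. Lines that do not contain exactly
    two whitespace-separated fields are skipped.
    """
    results = {}
    for line in Path(md5_path).read_text().splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue
        expected, name = parts
        name = name.lstrip("*")  # md5sum marks binary mode with '*'
        target = Path(data_dir) / name
        results[name] = target.is_file() and file_md5(target) == expected.lower()
    return results
```

Any entry that maps to `False` points at a file that is missing or was corrupted in transfer and should be downloaded again.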
Usage
The command name of AsperaSRAgetter is sragetter. It accepts either a single SRA accession or a TXT file containing multiple accessions (see the usage example below). Note that users must provide the path to the public key authentication file of Aspera-CLI (normally ENVIRONMENTPATH/etc/asperaweb_id_dsa.openssh).
```bash
usage: sragetter [-h] [-v] [-acc ACCESSION | -f FILE] -ssh SSH_KEY -o OUTDIR

options:
  -h, --help            show this help message and exit
  -v, --version         Show SRAdownloader version number and exit
  -acc ACCESSION, --accession ACCESSION
                        SRA data accession
  -f FILE, --file FILE  TXT file with multiple SRA accessions
  -ssh SSH_KEY, --ssh-key SSH_KEY
                        Public key authentication file provided by Aspera
                        command line client download package as the
                        'asperaweb_id_dsa.openssh' file
  -o OUTDIR, --outdir OUTDIR
                        Path to store the downloaded SRA data

# Usage
# Download with one accession:
$ sragetter --accession sra_accession --ssh-key ssh_key_path.openssh --outdir outdir_path

# Download with TXT file containing multiple accessions:
$ sragetter --file sra_accessions.txt --ssh-key ssh_key_path.openssh --outdir outdir_path
```
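When the download is one step in a larger Python script, the same command line can be assembled programmatically. The sketch below relies only on the long options listed in the help text above (`--accession`, `--file`, `--ssh-key`, `--outdir`); the function names are hypothetical, and invoking the CLI through `subprocess` is an assumption about how one might script it, not a documented Python API of the package.

```python
import subprocess


def sragetter_command(ssh_key, outdir, accession=None, accession_file=None):
    """Build the sragetter argument list for one accession or one TXT file.

    The help text shows the two inputs as mutually exclusive, so exactly
    one of `accession` and `accession_file` must be given.
    """
    if (accession is None) == (accession_file is None):
        raise ValueError("give exactly one of accession or accession_file")
    cmd = ["sragetter"]
    if accession is not None:
        cmd += ["--accession", str(accession)]
    else:
        cmd += ["--file", str(accession_file)]
    cmd += ["--ssh-key", str(ssh_key), "--outdir", str(outdir)]
    return cmd


def run_sragetter(**kwargs):
    """Run sragetter and raise CalledProcessError on a non-zero exit."""
    return subprocess.run(sragetter_command(**kwargs), check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when paths contain spaces.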
Contact
If you have any questions about using AsperaSRAgetter, feel free to open an issue or contact me at jirunjia@gmail.com.
Owner
- Name: RunjiaJi
- Login: RunJiaJi
- Kind: user
- Location: BeiJing, China
- Company: IGGCAS
- Repositories: 13
- Profile: https://github.com/RunJiaJi
Citation (CITATION.cff)
cff-version: 1.2.0
message: If you use this software, please cite it as below.
authors:
  - family-names: Ji
    given-names: Runjia
    orcid: "https://orcid.org/0000-0001-7316-9251"
title: "AsperaSRAgetter: a python package to download sequencing data (fastq.gz format) from European Nucleotide Archive (ENA) by using Aspera-CLI."
version: 2.1
date-released: 2023-3-12
url: https://github.com/RunJiaJi/AsperaSRAgetter
GitHub Events
Total
- Fork event: 1
Last Year
- Fork event: 1
Issues and Pull Requests
Last synced: 11 months ago
All Time
- Total issues: 0
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 0
- Total pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Packages
- Total packages: 1
- Total downloads: 27 last month (pypi)
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 4
- Total maintainers: 1
pypi.org: asperasragetter
AsperaSRAgetter provides an easy way to download sequencing data from ENA by using Aspera.
- Homepage: https://github.com/RunJiaJi/AsperaSRAgetter
- Documentation: https://asperasragetter.readthedocs.io/
- License: MIT License
- Latest release: 2.2 (published almost 2 years ago)
Dependencies
- pandas *
- requests *