Science Score: 10.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ○ codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ✓ Committers with academic emails: 1 of 2 committers (50.0%) from academic institutions
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (13.2%) to scientific vocabulary
Keywords
Repository
python Guide aligned Sequences
Basic Info
Statistics
- Stars: 1
- Watchers: 11
- Forks: 0
- Open Issues: 1
- Releases: 0
Topics
Metadata Files
README.md
pyGaS
python Guide aligned Sequences
Docker and Singularity
There are pre-built images containing this codebase on quay.io. When pulling an image you must specify
the version; there is no `latest` tag.
The docker images are known to work correctly after import into a singularity image.
Command example
The code is intended to be used as an API rather than through the command line; however, limited command-line use is possible.
```bash
pygas run -t examples/targets.txt.gz -q examples/queries.txt.gz -o your_result.tsv
```
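For wrapping code that wants to call the command line rather than the API, the invocation above can be driven from Python. This is a minimal sketch, assuming `pygas` is installed and on the `PATH`; the helper names `build_pygas_command` and `run_pygas` are hypothetical and not part of pygas, and only the `-t`, `-q` and `-o` options shown in the command example are used.

```python
import subprocess


def build_pygas_command(targets, queries, output):
    """Build the argument list for the `pygas run` example shown in this README.

    Hypothetical helper: only the -t/-q/-o options from the README are used.
    """
    return ["pygas", "run", "-t", str(targets), "-q", str(queries), "-o", str(output)]


def run_pygas(targets, queries, output):
    """Run pygas and raise CalledProcessError if it exits with a non-zero status."""
    subprocess.run(build_pygas_command(targets, queries, output), check=True)
    return output
```

Passing the arguments as a list avoids shell quoting problems when file paths contain spaces.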
Inputs
- `queries.txt` - A unique list of sequences (for performance reasons), one per line
  - De-duplication could be handled internally, but memory use is a consideration
  - Matching sequences back to real input data and related information is the responsibility of the wrapping code
- `targets.txt` - One target sequence per line
  - Reverse complement is handled automatically, see output format.
  - Targets need to be unique during mapping; expand them out for things like dual guide permutations in your application
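Because de-duplication and mapping results back to the original data are left to the wrapping code, a sketch of that step may be useful. It is not part of pygas: `unique_with_index` and `write_lines` are hypothetical helpers, and the example reads are made up. The same approach could be applied to a target list to make it unique.

```python
import gzip
from collections import defaultdict


def unique_with_index(sequences):
    """Return (unique, positions).

    unique: sequences in first-seen order with duplicates removed.
    positions: maps each sequence to the input indexes where it occurred,
    so results for a unique query can be matched back to the source records.
    """
    positions = defaultdict(list)
    for idx, seq in enumerate(sequences):
        positions[seq.strip()].append(idx)
    return list(positions), dict(positions)


def write_lines(path, lines):
    """Write one sequence per line, gzip-compressed when the path ends in .gz."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "wt") as handle:
        for line in lines:
            handle.write(line + "\n")


# Illustrative input only
reads = ["ACGT", "TTGA", "ACGT", "GGCC", "TTGA"]
queries, query_positions = unique_with_index(reads)
```

Here `queries` would be written out with `write_lines("queries.txt.gz", queries)`, and `query_positions` kept by the application to expand each result back to every original record.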
Output format
Very simple text output of values that are available in API:
```text
query reversed tid tpos cigar seq md repeat_2-7...
AAAAATCGCTGCTACAGGT False 48566 1 AAAAATCGCTGCTACAGGT M19 19
CTGGTCTCGCACCCCAGGC False 65601 1 CTGGTCTCGCACCCCAGGC M19 18T
GGCGCGGTACTTGCCCAGA False 34773 1 GGCGCGGTACTTGCCCAGA S1M18 18
AAAAAAAAAAAAAAAAAAA False 0 1 AAAAAAAAAAAAAAAAAAA M19 19 True 1 1 TTTTTTTTTTTTTTTTTTT M19 19
...
```
Where:
| Column | Description | Interpretation |
|------------|------------------------------------------|------------------------------------------------------------------|
| query | Original query sequence | |
| reversed | Read was reversed to match the target | following fields are based on this orientation |
| t_id | ID of target mapped to | 0-based numbering in order targets passed |
| t_pos | Start position within target sequence | 1-based |
| seq | Query in mapped orientation | Corresponds to cigar and md orientation |
| cigar | cigar string for use in SAM like files | For details see the SAM specification |
| md | MD string for use in SAM like files | For details see the SAM optional field specification |
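The output rows can be read back into structured records with a few lines of Python. The sketch below is not part of the pygas API. It assumes whitespace-separated fields, and that each hit occupies six fields in the order seen in the example rows (reversed, t_id, t_pos, seq, cigar, md), repeated for additional hits. It also assumes the operation-first CIGAR style shown above (`M19`, `S1M18`). All names are hypothetical.

```python
import re
from typing import NamedTuple

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")
_CIGAR_OP = re.compile(r"([A-Z=])(\d+)")
FIELDS_PER_HIT = 6  # reversed, t_id, t_pos, seq, cigar, md (assumed from the example rows)


class Hit(NamedTuple):
    reversed: bool
    t_id: int   # 0-based target index
    t_pos: int  # 1-based start within the target
    seq: str
    cigar: str
    md: str


def reverse_complement(seq):
    """Complement each base, then reverse the sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def parse_cigar(cigar):
    """Split an operation-first CIGAR such as 'S1M18' into [('S', 1), ('M', 18)]."""
    ops = [(op, int(length)) for op, length in _CIGAR_OP.findall(cigar)]
    if "".join(f"{op}{length}" for op, length in ops) != cigar:
        raise ValueError(f"unrecognised cigar: {cigar!r}")
    return ops


def parse_line(line):
    """Return (query, [Hit, ...]) for one output row."""
    fields = line.split()
    query, rest = fields[0], fields[1:]
    if len(rest) % FIELDS_PER_HIT:
        raise ValueError(f"expected groups of {FIELDS_PER_HIT} fields per hit")
    hits = []
    for start in range(0, len(rest), FIELDS_PER_HIT):
        rev, t_id, t_pos, seq, cigar, md = rest[start:start + FIELDS_PER_HIT]
        hits.append(Hit(rev == "True", int(t_id), int(t_pos), seq, cigar, md))
    return query, hits
```

For a hit with `reversed` set to `True`, `seq` should equal `reverse_complement(query)`, which gives wrapping code a cheap consistency check on each row.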
Development
Install
```bash
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
python3 setup.py develop
# see later
pre-commit install
# remember to update requirements
pip freeze | grep -v virtualenv > requirements.txt
```
Testing
There are 4 layers to testing and standards:
- Local `venv` testing
- Local `pre-commit` hooks
- Tests embedded in `docker build`
- CI tests
Local venv testing
```bash
/tests/scripts/run_unit_tests.sh
```
Local pre-commit hooks
This project additionally uses git pre-commit hooks via the pre-commit tool. These are concerned
with file formats and standards, not the actual execution of code. See ./.pre-commit-config.yaml.
Docker testing
The Docker build includes the unit tests, but removes many of the libraries before the final build stage. This is mainly for CI tests.
CI tests
CI includes 2 additional tests, each based on the 2 datasets in the ./examples directory.
Updating licence headers
Please use skywalking-eyes.
Expected workflow:
- Check state before modifying `.licenserc.yaml`:
  - `docker run -it --rm -v $(pwd):/github/workspace apache/skywalking-eyes header check`
  - You should get some 'valid' here, those without a header as 'invalid'
- Modify `.licenserc.yaml`
- Apply the changes:
  - `docker run -it --rm -v $(pwd):/github/workspace apache/skywalking-eyes header fix`
- Add/commit changes
This is executed in the CI pipeline.
DO NOT edit the header in the files; instead, modify the date component of `content` in `.licenserc.yaml`. The exceptions are:
- `README.md`
- `pygas/matrix.pyc`
  - You will need to manually update, but the checks will accept it once updated

If you need to make more extensive changes to the license, carefully test that the pattern is functional.
LICENSE
```
Copyright (c) 2021

Author: CASM/Cancer IT cgphelp@sanger.ac.uk

This file is part of pygas.

This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.

You should have received a copy of the GNU Affero General Public License along with this program. If not, see https://www.gnu.org/licenses/.

- The usage of a range of years within a copyright statement contained within this distribution should be interpreted as being equivalent to a list of years including the first and last year specified and all consecutive years between them. For example, a copyright statement that reads ‘Copyright (c) 2005, 2007- 2009, 2011-2012’ should be interpreted as being identical to a statement that reads ‘Copyright (c) 2005, 2007, 2008, 2009, 2011, 2012’ and a copyright statement that reads ‘Copyright (c) 2005-2012’ should be interpreted as being identical to a statement that reads ‘Copyright (c) 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012’.
```
Owner
- Name: CASM IT
- Login: cancerit
- Kind: organization
- Email: cgpit@sanger.ac.uk
- Location: Hinxton, Cambridge, UK
- Website: http://www.sanger.ac.uk/science/programmes/cancer-genetics-and-genomics
- Repositories: 89
- Profile: https://github.com/cancerit
CASM IT provides bioinformatic support for the Cancer, Ageing and Somatic Mutation group at the Wellcome Sanger Institute.
Committers
Last synced: almost 3 years ago
All Time
- Total Commits: 18
- Total Committers: 2
- Avg Commits per committer: 9.0
- Development Distribution Score (DDS): 0.056
Top Committers
| Name | Email | Commits |
|---|---|---|
| Keiran Raine | k****2@s****k | 17 |
| Keiran Raine | k****e@u****m | 1 |
Issues and Pull Requests
Last synced: 8 months ago
All Time
- Total issues: 0
- Total pull requests: 4
- Average time to close issues: N/A
- Average time to close pull requests: 4 minutes
- Total issue authors: 0
- Total pull request authors: 2
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 1
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 1
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
- keiranmraine (3)
- superjw (2)
Dependencies
- Cython ==0.29.23
- PyYAML ==5.4.1
- appdirs ==1.4.4
- attrs ==21.2.0
- cfgv ==3.2.0
- click ==8.0.0
- click-option-group ==0.5.3
- coverage ==5.5
- distlib ==0.3.1
- filelock ==3.0.12
- identify ==2.2.4
- iniconfig ==1.1.1
- nodeenv ==1.6.0
- packaging ==20.9
- pluggy ==0.13.1
- pre-commit ==2.12.1
- py ==1.10.0
- pyparsing ==2.4.7
- pytest ==6.2.4
- pytest-cov ==2.12.0
- six ==1.16.0
- toml ==0.10.2