gtf-ops
Filtering GENCODE or ENSEMBL annotation in GTF format. Annotating missing regions in iCount genomic segmentation.
Science Score: 54.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: Links to zenodo.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (5.9%) to scientific vocabulary
Repository
Filtering GENCODE or ENSEMBL annotation in GTF format. Annotating missing regions in iCount genomic segmentation.
Basic Info
- Host: GitHub
- Owner: ulelab
- License: gpl-3.0
- Language: Python
- Default Branch: main
- Size: 49.2 MB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 1
Metadata Files
README.md
gtf-ops
FilterGtf module: Filters GENCODE or ENSEMBL annotation in GTF format by the tag "basic" (first step) and by "transcript_support_level" (second step).
🔴 Important note: Currently, these tags are only available for Human (Hs) and Mouse (Mm) annotations.
ResolveUnannotated module: Annotates missing regions in iCount genomic segmentation as "genic_other".
Features
The gtf-ops package contains two functions, which can be run as scripts, that complement iCount segmentation.
FilterGtf.py filters GENCODE or ENSEMBL genomic annotation in GTF format. It can be used prior to running iCount segment to improve genome-level segmentation by removing lower-confidence transcripts and, where genes overlap, favoring transcripts of full protein-coding genes over ncRNA.
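The two filtering steps can be illustrated with a minimal sketch. This is not the FilterGtf.py implementation: the function names `parse_attributes` and `keep_record`, the `max_tsl` threshold, and the choice to drop transcripts with a support level of "NA" are all assumptions made for illustration, and the preference for protein-coding genes over overlapping ncRNA is not covered here.

```python
import re

# Matches `key "value";` or `key value;` pairs in the 9th GTF column.
_ATTR_RE = re.compile(r'(\S+) "?([^";]+)"?;')


def parse_attributes(field):
    """Map attribute keys to lists of values (keys such as `tag` can repeat)."""
    attrs = {}
    for key, value in _ATTR_RE.findall(field):
        attrs.setdefault(key, []).append(value)
    return attrs


def keep_record(line, max_tsl=3):
    """Hypothetical filter: keep 'basic' records whose support level <= max_tsl."""
    if line.startswith("#"):
        return True  # header and comment lines pass through unchanged
    cols = line.rstrip("\n").split("\t")
    if cols[2] == "gene":
        return True  # gene records carry no transcript-level tags
    attrs = parse_attributes(cols[8])
    # Step 1: require the "basic" tag.
    if "basic" not in attrs.get("tag", []):
        return False
    # Step 2: require a numeric transcript support level within the threshold.
    # The value may be "NA" or carry trailing text, so only the first token is used.
    tsl = attrs.get("transcript_support_level", ["NA"])[0].split()[0]
    return tsl.isdigit() and int(tsl) <= max_tsl
```

Applied line by line to a GTF file, `keep_record` would retain comment lines, gene records, and transcript-level records that pass both steps; the actual thresholds used by the package are described in its own README.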
ResolveUnannotated.py annotates genome segments that are left unannotated by iCount segmentation. Missing annotations occur when a region overlaps a gene in the GTF annotation but the relevant transcripts were removed during filtering. Such a region is not annotated as "intergenic", because it overlaps a gene, nor can it be assigned to any other region type (5'UTR, 3'UTR, CDS, intron or ncRNA), because the relevant transcripts are missing.
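The idea of locating such gaps can be sketched as interval subtraction: the parts of a gene interval that no segment covers are the candidates for the "genic_other" label. The helper `uncovered` below is hypothetical and is not taken from ResolveUnannotated.py; it assumes half-open `(start, end)` intervals on a single chromosome and strand.

```python
def uncovered(gene, segments):
    """Return the sub-intervals of `gene` that no interval in `segments` covers.

    All intervals are half-open (start, end) tuples on one chromosome and strand.
    """
    start, end = gene
    gaps = []
    cursor = start  # leftmost position not yet known to be covered
    for seg_start, seg_end in sorted(segments):
        if seg_end <= cursor:
            continue  # segment lies entirely before the cursor
        if seg_start >= end:
            break  # this and all later segments start past the gene
        if seg_start > cursor:
            gaps.append((cursor, seg_start))
        cursor = max(cursor, seg_end)
    if cursor < end:
        gaps.append((cursor, end))
    return gaps


# Illustrative coordinates only: a gene with two annotated segments inside it.
gene = (100, 500)
segments = [(100, 200), (350, 400)]
labelled = [(s, e, "genic_other") for s, e in uncovered(gene, segments)]
```

Here `labelled` holds the two uncovered stretches, (200, 350) and (400, 500), each tagged "genic_other".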
Scripts can be run via a command-line interface; more details are given in the respective README files.
Owner
- Name: Ulelab
- Login: ulelab
- Kind: organization
- Location: London
- Repositories: 10
- Profile: https://github.com/ulelab
Citation (CITATION.cff)
cff-version: 0.0.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Kuret"
    given-names: "Klara"
    orcid: "https://orcid.org/0000-0002-8445-8080"
title: "gtf-ops"
version: 0.0.0
doi: https://doi.org/10.5281/zenodo.8386577
date-released: 2022-06-10
url: "https://github.com/ulelab/gtf-ops"
GitHub Events
Total
- Delete event: 1
- Push event: 4
Last Year
- Delete event: 1
- Push event: 4
Dependencies
- pandas
- plumbum
- pybedtools
- python 3.7.*