Repository
horizontal pileup
Basic Info
- Host: GitHub
- Owner: brentp
- License: apache-2.0
- Language: C
- Default Branch: master
- Size: 72.3 KB
Statistics
- Stars: 16
- Watchers: 4
- Forks: 3
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
hileup is an early-stage version of a pileup engine.
It aims to provide an interface that is:
+ easy-to-use
+ fast from an interpreted language (like python).
It is currently intended for accessing targeted sites (e.g. < 100K sites), rather than sweeping across every site in the genome.
There is a version in Nim, one in C, and a Cython wrapper around the C version for Python.
Python
The python version, which takes a pysam AlignmentFile object, looks like:
```Python
import pysam
import chileup

bam = pysam.AlignmentFile("tests/three.bam", "rb")

# setting track_xxx to False will speed up the hileup, as less copying and data access is required.
config = chileup.Config(tags=[], track_read_names=True, track_reads=True,
                        track_base_qualities=True, track_mapping_qualities=True,
                        exclude_flags=pysam.FQCFAIL | pysam.FSECONDARY | pysam.FSUPPLEMENTARY | pysam.FDUP,
                        min_base_quality=10, min_mapping_quality=10)

# We can ignore all reads with 'C' at this base (for example to get variant-only reads)
h = chileup.pileup(bam, "1", 1585270, config, 'C')

print(h.bases) # 'tt'
print(h.read_names) # [b'A00227:74:HCWC7DSXX:1:1269:13449:13855', b'A00227:74:HCWC7DSXX:1:2426:7157:15483']
print(h.bqs) # [37, 37] numpy array that is a view into underlying data.
print(h.mqs) # [60, 60] numpy view.
print(h.deletions) # [(0, 8) (1, 8)] numpy view

# NOTE: if you're needing this, it might be simpler to use pysam pileup.
reads = h.reads(bam.header) # [

# the insertions and deletions have a .index property that can be used
# to access the read-names, tags, etc that are associated with the indel event.
for ins in h.insertions: # copy of the data.
    print(h.read_names[ins.index], h.tags[ins.index], ins.sequence, ins.len)

print('tags:', h.tags) # copy.
```
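The flag mask passed to `chileup.Config` above is a bitwise OR of SAM flag bits. As a rough illustration of how such a mask behaves, the sketch below uses the bit values from the SAM specification directly instead of importing pysam. The helper `is_excluded` is hypothetical and only shows the idea; it is not a description of how hileup filters reads internally.

```python
# SAM flag bits as defined in the SAM specification.
FSECONDARY = 0x100      # secondary alignment
FQCFAIL = 0x200         # read failed quality checks
FDUP = 0x400            # PCR or optical duplicate
FSUPPLEMENTARY = 0x800  # supplementary alignment

# Same combination as in the config example above.
EXCLUDE = FQCFAIL | FSECONDARY | FSUPPLEMENTARY | FDUP


def is_excluded(read_flag, exclude_mask=EXCLUDE):
    """Hypothetical helper: True if the read has any excluded bit set."""
    return (read_flag & exclude_mask) != 0


print(hex(EXCLUDE))
print(is_excluded(99))    # paired, proper pair, mate reverse, first in pair
print(is_excluded(1123))  # the same read with the duplicate bit also set
```

A read is skipped as soon as one of the masked bits is set, so adding more flags to the mask can only make the pileup stricter.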
To build: python setup.py build_ext -i
To install: python setup.py install
Because it minimizes operations in python, it is quite fast (for python).
NOTE that strand information is encoded by case for python (lower case == reverse strand).
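Since strand is carried by letter case, per-strand counts can be recovered from the bases string alone. The following is a minimal sketch that operates on a plain Python string such as the `'tt'` shown earlier; `strand_counts` is a hypothetical helper, not part of chileup, and it assumes that characters without case (if any appear) should simply be ignored.

```python
from collections import Counter


def strand_counts(bases):
    """Split a case-encoded base string into per-strand base counts.

    Assumes upper case == forward strand and lower case == reverse strand.
    Characters that are neither upper nor lower case are skipped.
    """
    forward = Counter(b for b in bases if b.isupper())
    reverse = Counter(b.upper() for b in bases if b.islower())
    return forward, reverse


fwd, rev = strand_counts("tt")
print(dict(fwd), dict(rev))
```

Reverse-strand bases are upper-cased before counting so that both counters share the same keys and can be compared directly.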
C
The C version should be transparent to anyone familiar with htslib. The signature is:
C
hile *hileup(htsFile *htf, bam_hdr_t *hdr, hts_idx_t *idx, char *chrom, int position, hile_config_t *cfg);
where hile_config_t is a simple struct that sets the minimum mapping and base qualities and whether to
track read names, base qualities, etc.
```C
htsFile *htf = hts_open("tests/three.bam", "rb");
int start = 1585270;
bam_hdr_t *hdr = sam_hdr_read(htf);
hts_idx_t *idx = sam_index_load(htf, "tests/three.bam");
hile_config_t cfg = hile_init_config();
cfg.track_base_qualities = true;
cfg.track_mapping_qualities = true;
cfg.track_read_names = true;
// track the cell-barcode so we can get per-cell pileup!!
cfg.tags[0] = 'C';
cfg.tags[1] = 'B';

hile* h = hileup(htf, hdr, idx, "1", start, &cfg);

// print the position followed by the base from each read
fprintf(stderr, "%s:%d ", "1", start);
for(int i=0; i < h->n; i++){
    fprintf(stderr, "%c", (char)h->bases[i].base);
}
// base qualities are printed as Phred+33 characters
if(cfg.track_base_qualities) {
    fprintf(stderr, " ");
    for(int i=0; i < h->n; i++){
        fprintf(stderr, "%c", (char)(h->bqs[i] + 33));
    }
}
// print the tracked tag (here the cell barcode) for each read
if(cfg.tags[0] != 0) {
    fprintf(stderr, " ");
    for(int i=0; i < h->n; i++){
        fprintf(stderr, "%d:%s ", i, h->tags[i]);
    }
}
fprintf(stderr, "\n");
hile_destroy(h);
bam_hdr_destroy(hdr);
hts_idx_destroy(idx);
hts_close(htf);
```
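The C example prints each base quality as a character by adding 33, which is the usual Phred+33 encoding. For readers working from Python, the same conversion can be sketched on a plain list of integers (for instance the `[37, 37]` values shown in the Python example). The function names here are hypothetical and not part of hileup or chileup.

```python
def phred33(qualities):
    """Encode integer base qualities as a Phred+33 ASCII string."""
    return "".join(chr(int(q) + 33) for q in qualities)


def from_phred33(encoded):
    """Decode a Phred+33 ASCII string back to integer qualities."""
    return [ord(c) - 33 for c in encoded]


encoded = phred33([37, 37])
print(encoded)
print(from_phred33(encoded))
```

The `int(q)` call is there so that the sketch also accepts numpy integer types without changes.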
Owner
- Name: Brent Pedersen
- Login: brentp
- Kind: user
- Location: Oregon, USA
- Twitter: brent_p
- Repositories: 220
- Profile: https://github.com/brentp
Doing genomics