pybbi

Python bindings to UCSC BigWig and BigBed library

https://github.com/nvictus/pybbi

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 1 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.4%) to scientific vocabulary

Keywords

bbi bigbed bigwig bioinformatics cython genomics kent numpy python

Keywords from Contributors

interactive marimo widgets imaging evolutionary-algorithms gym-environment mesh interpretability sequences generic
Last synced: 4 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: nvictus
  • License: mit
  • Language: C
  • Default Branch: master
  • Homepage:
  • Size: 34.4 MB
Statistics
  • Stars: 33
  • Watchers: 8
  • Forks: 4
  • Open Issues: 2
  • Releases: 15
Topics
bbi bigbed bigwig bioinformatics cython genomics kent numpy python
Created over 9 years ago · Last pushed 9 months ago
Metadata Files
Readme License Citation

README.md

pybbi

A Python interface, built with Cython, to Jim Kent's Big Binary Indexed (BBI) file library [1] from the UCSC Genome Browser source tree.

This provides read-level access to local and remote bigWig and bigBed files but no write capabilities. The main feature is fast retrieval of range queries into numpy arrays.

Installation

Wheels for pybbi are available on PyPI for Python 3.8, 3.9, 3.10, and 3.11 on Linux (x86_64 and aarch64) and Mac OSX (x86_64/Intel). Apple Silicon (arm64) wheels will be made available once M1 runners are available in GitHub Actions.

$ pip install pybbi

API

The bbi.open function returns a BBIFile object.

bbi.open(path) -> BBIFile

path can be a local file path (bigWig or bigBed) or a URL. BBIFile objects are context managers and can be used in a with statement to clean up resources without calling BBIFile.close().

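```python
import bbi

# Open a bigWig file and fetch the signal over chr21:1,000,000-2,000,000,
# summarized into 40 bins. The file is closed when the block exits.
with bbi.open('bigWigExample.bw') as f:
    x = f.fetch('chr21', 1000000, 2000000, bins=40)
```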

Introspection

BBIFile.is_bigwig -> bool
BBIFile.is_bigbed -> bool
BBIFile.chromsizes -> OrderedDict
BBIFile.zooms -> list
BBIFile.info -> dict
BBIFile.schema -> dict
BBIFile.read_autosql() -> str

Note: BBIFile.schema['dtypes'] provides numpy data types for the fields in a bigWig or bigBed (matched from the autoSql definition).
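As a minimal sketch of how the introspection attributes above might be combined, the hypothetical helper below (`describe` is not part of pybbi) summarizes any object exposing `is_bigwig`, `is_bigbed`, `chromsizes`, and `zooms`. It only assumes that `chromsizes` maps chromosome names to lengths and that `zooms` is a list with one entry per zoom level.

```python
def describe(f):
    """Summarize an open BBIFile-like object from its introspection attributes.

    `describe` is an illustrative helper, not a pybbi function.
    """
    if f.is_bigwig:
        kind = "bigWig"
    elif f.is_bigbed:
        kind = "bigBed"
    else:
        kind = "unknown"

    sizes = f.chromsizes  # assumed: mapping of chromosome name -> length in bp
    return {
        "kind": kind,
        "n_chroms": len(sizes),
        "total_bp": sum(sizes.values()),
        "longest": max(sizes, key=sizes.get) if sizes else None,
        "n_zooms": len(f.zooms),  # only the number of zoom levels is used
    }
```

With a real file, this would be called as `describe(f)` inside a `with bbi.open(path) as f:` block.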

Interval output

The actual interval records in a bigWig or bigBed can be retrieved as a pandas dataframe or as an iterator over records as tuples. The pandas output is parsed according to the file's schema.

BBIFile.fetch_intervals(chrom, start, end) -> pandas.DataFrame
BBIFile.fetch_intervals(chrom, start, end, iterator=True) -> interval iterator

Summary bin records at each zoom level are also accessible.

BBIFile.fetch_summaries(chrom, start, end, zoom) -> pandas.DataFrame
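The iterator form of `fetch_intervals` can be consumed without building a DataFrame. The sketch below is a hypothetical helper (`covered_bases` is not part of pybbi) that counts how many bases of a query range are covered by interval records. It assumes each record tuple begins with `(chrom, start, end)`; check the file's schema before relying on that field order.

```python
def covered_bases(f, chrom, start, end):
    """Count bases of [start, end) covered by interval records.

    Each record is clipped to the query range. Overlapping records
    (possible in bigBed files) are counted once per record.
    """
    total = 0
    for record in f.fetch_intervals(chrom, start, end, iterator=True):
        # Assumption: the tuple starts with (chrom, start, end);
        # any remaining fields are ignored here.
        _, rec_start, rec_end = record[:3]
        clipped = min(int(rec_end), end) - max(int(rec_start), start)
        total += max(0, clipped)
    return total
```

Because the records are streamed one at a time, this style may be preferable to the DataFrame output when only an aggregate is needed over a large range.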

Array output

Retrieve quantitative signal as an array. The signal of a bigWig file is obtained from its "value" field. The signal of a bigBed file is obtained from the genomic coverage of its intervals.

For a single range query:

BBIFile.fetch(chrom, start, end, [bins [, missing [, oob [, summary]]]]) -> 1D numpy array

To produce a stacked heatmap from a list of (1) equal-length intervals or (2) arbitrary-length intervals with bins specified:

BBIFile.stackup(chroms, starts, ends, [bins [, missing [, oob [, summary]]]]) -> 2D numpy array

  • Summary querying is supported by specifying the number of bins for coarsening. The summary statistic can be one of 'mean', 'min', 'max', 'cov', 'std', or 'sum' (default = 'mean'). Intervals need not have the same length, in which case the data from each interval will be interpolated to the same number of bins (e.g., gene bodies).

  • Missing data can be filled with a custom fill value, missing (default = 0).

  • Out-of-bounds ranges (i.e., start less than zero or end greater than the chromosome length) are permitted because of their utility, e.g., for generating vertical heatmap stacks centered at specific genomic features. A separate custom fill value, oob, can be provided for out-of-bounds positions (default = NaN).
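The options above can be combined to build a heatmap stack centered on genomic features. The following is a minimal sketch under stated assumptions: `stack_around` is a hypothetical helper (not part of pybbi), it assumes `stackup` accepts plain Python lists as array-like input (convert to numpy arrays if it does not), and it passes `bins`, `missing`, `oob`, and `summary` as keyword arguments.

```python
def stack_around(f, chrom, centers, flank, bins=None):
    """Fetch equal-length windows of +/- `flank` bp around each center.

    Returns whatever `f.stackup` returns (a 2D array for a real BBIFile),
    with one row per center.
    """
    chroms = [chrom] * len(centers)
    # Starts may be negative and ends may exceed the chromosome length:
    # out-of-bounds positions are filled with `oob` rather than rejected.
    starts = [c - flank for c in centers]
    ends = [c + flank for c in centers]

    kwargs = {"missing": 0.0, "oob": float("nan")}
    if bins is not None:
        # Coarsen each window to `bins` values using the mean statistic.
        kwargs["bins"] = bins
        kwargs["summary"] = "mean"
    return f.stackup(chroms, starts, ends, **kwargs)
```

Keeping `oob` as NaN while filling `missing` with 0 makes it possible to tell positions that lie beyond a chromosome end apart from in-bounds positions with no data.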

Function API

The original function-based API is still available:

bbi.is_bbi(path: str) -> bool
bbi.is_bigwig(path: str) -> bool
bbi.is_bigbed(path: str) -> bool
bbi.chromsizes(path: str) -> OrderedDict
bbi.zooms(path: str) -> list
bbi.info(path: str) -> dict
bbi.fetch_intervals(path: str, chrom: str, start: int, end: int, iterator: bool) -> Union[Iterable, pd.DataFrame]
bbi.fetch(path: str, chrom: str, start: int, end: int, [bins: int [, missing: float [, oob: float [, summary: str]]]]) -> np.array[1, 'float64']
bbi.stackup(path: str, chroms: np.array, starts: np.array, ends: np.array, [bins: int [, missing: float [, oob: float [, summary: str]]]]) -> np.array[2, 'float64']

See the docstrings for complete documentation.

Related projects

  • libBigWig: Alternative C library for bigWig and bigBed files by Devon Ryan
  • pyBigWig: Python bindings for libBigWig by the same author
  • bw-python: Alternative Python wrapper to libBigWig by Brent Pedersen
  • bx-python: Python bioinformatics library from James Taylor's group that includes tools for bbi files.

This library provides bindings to the reference UCSC bbi library code. Check out @dpryan79's libBigWig for an alternative and dedicated C library for big binary files. pyBigWig also provides numpy-based retrieval and bigBed support.

References

[1]: http://bioinformatics.oxfordjournals.org/content/26/17/2204.full

From source

If wheels for your platform or Python version aren't available or you want to develop, you'll need to install pybbi from source. The source distribution on PyPI ships with (slightly modified) kent utils source, which will compile before the extension module is built.

Requires:

  • Platform: Linux or Darwin (Windows Subsystem for Linux seems to work too)
  • pthreads, zlib, libpng, openssl, make, pkg-config
  • Python 3.6+
  • numpy and cython

For example, on a fresh Ubuntu instance, you'll need build-essential, make, pkg-config, zlib1g-dev, libssl-dev, libpng16-dev.

On a CentOS/RedHat (rpm) system, you'll need gcc, make, pkg-config, zlib-devel, openssl-devel, libpng-devel.

On a Mac, you'll need Xcode and to brew install pkg-config openssl libpng.

For development, clone the repo and install in editable mode:

$ git clone https://github.com/nvictus/pybbi.git
$ cd pybbi
$ pip install -e .

You can use the ARCH environment variable to specify a target architecture or ARCHFLAGS on a Mac.

Notes

Unfortunately, Kent's C source is not well-behaved library code, as it is littered with error calls that call exit(). pybbi will catch and pre-empt common input errors, but if somehow an internal error does get raised, it will terminate your interpreter instance.
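One way to protect a long-running session from such a termination is to run the query in a child process, so that an `exit()` in the C code ends only the child. This is a general isolation technique, not something pybbi provides. In the sketch below, `run_isolated` and `FETCH_CHILD` are hypothetical names, and results are passed back as JSON on standard output.

```python
import json
import subprocess
import sys

# Child program (hypothetical): opens the file with pybbi and prints the
# binned signal as a JSON list. It only runs inside the child process.
FETCH_CHILD = """
import json, sys
import bbi
path, chrom, start, end, bins = json.loads(sys.argv[1])
with bbi.open(path) as f:
    x = f.fetch(chrom, start, end, bins=bins)
json.dump(x.tolist(), sys.stdout)
"""


def run_isolated(code, args, timeout=60):
    """Run `code` in a separate Python process and return its JSON output.

    `args` is serialized to JSON and passed as the child's sys.argv[1].
    A non-zero exit status (for example, from an internal exit() call)
    is reported as a RuntimeError instead of killing this interpreter.
    """
    proc = subprocess.run(
        [sys.executable, "-c", code, json.dumps(args)],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if proc.returncode != 0:
        raise RuntimeError(
            f"child exited with code {proc.returncode}: {proc.stderr.strip()}"
        )
    return json.loads(proc.stdout)
```

For example, `run_isolated(FETCH_CHILD, ['bigWigExample.bw', 'chr21', 1000000, 2000000, 40])` would perform the fetch from the API section in a child process. The cost is process startup and serialization overhead on every call, so this is best reserved for untrusted inputs.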

Owner

  • Name: Nezar Abdennur
  • Login: nvictus
  • Kind: user
  • Location: Greater Boston Area
  • Company: UMass Chan Medical School

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: Abdennur
    given-names: Nezar
    orcid: https://orcid.org/0000-0001-5814-0864
title: "pybbi"
identifiers:
  - type: doi
    value: 10.5281/zenodo.10382980

GitHub Events

Total
  • Create event: 8
  • Issues event: 1
  • Release event: 1
  • Watch event: 4
  • Delete event: 8
  • Issue comment event: 5
  • Push event: 16
  • Pull request event: 18
  • Fork event: 1
Last Year
  • Create event: 8
  • Issues event: 1
  • Release event: 1
  • Watch event: 4
  • Delete event: 8
  • Issue comment event: 5
  • Push event: 16
  • Pull request event: 18
  • Fork event: 1

Committers

Last synced: 8 months ago

All Time
  • Total Commits: 180
  • Total Committers: 5
  • Avg Commits per committer: 36.0
  • Development Distribution Score (DDS): 0.094
Past Year
  • Commits: 15
  • Committers: 3
  • Avg Commits per committer: 5.0
  • Development Distribution Score (DDS): 0.467
Top Committers
Name Email Commits
Nezar Abdennur n****r@g****m 163
dependabot[bot] 4****] 13
Peter Kerpedjiev p****v@g****m 2
Mark Keller 7****k 1
Benjamin A. Beasley c****e@m****t 1

Issues and Pull Requests

Last synced: 4 months ago

All Time
  • Total issues: 17
  • Total pull requests: 40
  • Average time to close issues: 8 months
  • Average time to close pull requests: about 1 month
  • Total issue authors: 14
  • Total pull request authors: 5
  • Average comments per issue: 3.59
  • Average comments per pull request: 0.38
  • Merged pull requests: 27
  • Bot issues: 0
  • Bot pull requests: 25
Past Year
  • Issues: 0
  • Pull requests: 11
  • Average time to close issues: N/A
  • Average time to close pull requests: about 1 month
  • Issue authors: 0
  • Pull request authors: 2
  • Average comments per issue: 0
  • Average comments per pull request: 0.45
  • Merged pull requests: 4
  • Bot issues: 0
  • Bot pull requests: 9
Top Authors
Issue Authors
  • nvictus (3)
  • pkerpedjiev (2)
  • Phlya (1)
  • mildewey (1)
  • hzaumsq (1)
  • maxwellsh (1)
  • smitkadvani (1)
  • bskubi (1)
  • TkhiienLok (1)
  • dependabot[bot] (1)
  • MikeWazoWski123 (1)
  • Snehal1894 (1)
  • gfudenberg (1)
  • LeoWelter (1)
  • gibcus (1)
Pull Request Authors
  • dependabot[bot] (45)
  • nvictus (10)
  • pkerpedjiev (2)
  • musicinmybrain (2)
  • keller-mark (1)
Top Labels
Issue Labels
question (3) dependencies (1)
Pull Request Labels
dependencies (45)

Packages

  • Total packages: 2
  • Total downloads:
    • pypi 1,803 last-month
  • Total dependent packages: 2
    (may contain duplicates)
  • Total dependent repositories: 9
    (may contain duplicates)
  • Total versions: 30
  • Total maintainers: 1
pypi.org: pybbi

Python bindings to the UCSC source for Big Binary Indexed (bigWig/bigBed) files.

  • Versions: 15
  • Dependent Packages: 2
  • Dependent Repositories: 9
  • Downloads: 1,803 Last month
Rankings
Dependent packages count: 3.1%
Dependent repos count: 4.9%
Average: 5.1%
Downloads: 7.3%
Maintainers (1)
Last synced: 4 months ago
proxy.golang.org: github.com/nvictus/pybbi
  • Versions: 15
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 5.5%
Average: 5.7%
Dependent repos count: 5.9%
Last synced: 5 months ago

Dependencies

setup.py pypi
  • numpy *
.github/workflows/buildwheels.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v2 composite
  • actions/upload-artifact v3 composite
  • docker/setup-qemu-action v1 composite
  • pypa/cibuildwheel v2.12.0 composite
.github/workflows/publish.yml actions
  • actions/checkout v3 composite
  • actions/download-artifact v3 composite
  • actions/setup-python v2 composite
  • actions/upload-artifact v3 composite
  • docker/setup-qemu-action v1 composite
  • pypa/cibuildwheel v2.12.0 composite
  • pypa/gh-action-pypi-publish v1.5.0 composite
pyproject.toml pypi
  • numpy *
.github/workflows/ci.yml actions
  • actions/checkout v4 composite
  • actions/setup-python v5 composite