pybedtools

Python wrapper -- and more -- for BEDTools (bioinformatics tools for "genome arithmetic")

https://github.com/daler/pybedtools

Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
    7 of 49 committers (14.3%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.1%) to scientific vocabulary

Keywords from Contributors

bioinformatics genomics dna closember qt phylogenetics protein wx tk gtk
Last synced: 10 months ago

Repository

Statistics
  • Stars: 319
  • Watchers: 15
  • Forks: 106
  • Open Issues: 18
  • Releases: 6
Created about 16 years ago · Last pushed over 1 year ago

README.rst

Overview
--------

.. image:: https://badge.fury.io/py/pybedtools.svg?style=flat
    :target: https://badge.fury.io/py/pybedtools

.. image:: https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg
    :target: https://bioconda.github.io

The BEDTools suite of programs is widely used for genomic interval
manipulation, or "genome algebra".  `pybedtools` wraps and extends BEDTools
and offers feature-level manipulations from within Python.

See full online documentation, including installation instructions, at
https://daler.github.io/pybedtools/.

The GitHub repo is at https://github.com/daler/pybedtools.

Why `pybedtools`?
-----------------

Here is an example to get the names of genes that are <5 kb away from
intergenic SNPs:

.. code-block:: python

    from pybedtools import BedTool

    snps = BedTool('snps.bed.gz')  # [1]
    genes = BedTool('hg19.gff')    # [1]

    intergenic_snps = snps.subtract(genes)                       # [2]
    nearby = genes.closest(intergenic_snps, d=True, stream=True) # [2, 3]

    for gene in nearby:             # [4]
        if int(gene[-1]) < 5000:    # [4]
            print(gene.name)        # [4]

Useful features shown here include:

* `[1]` support for all BEDTools-supported formats (here, gzipped BED and GFF);
* `[2]` wrapping of all BEDTools programs and arguments (here, `subtract` and
  `closest`, passing the `-d` flag to `closest`);
* `[3]` streaming results (like Unix pipes, here specified by `stream=True`);
* `[4]` iterating over results while accessing feature data by index or by
  attribute (here, `[-1]` and `.name`).
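
To make the logic of the example concrete, here is a dependency-free sketch of
the same three steps (drop SNPs that overlap a gene, find each gene's closest
remaining SNP, apply the 5 kb cutoff) on a few made-up intervals.  It is only
an illustration: the `Interval` tuple, the helper names, and the sample
coordinates are hypothetical, the brute-force search is far slower than
BEDTools, and the gap-based distance used here is an assumption that may not
match the exact convention of `closest -d`.

```python
from collections import namedtuple

# Hypothetical stand-in for a BED/GFF feature; half-open [start, end) coordinates.
Interval = namedtuple("Interval", ["chrom", "start", "end", "name"])


def overlaps(a, b):
    """True if two half-open intervals on the same chromosome share a base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def gap(a, b):
    """Bases separating two intervals on one chromosome (0 if they overlap)."""
    return max(0, a.start - b.end, b.start - a.end)


def genes_near_intergenic_snps(genes, snps, max_dist=5000):
    """Names of genes whose closest intergenic SNP is under max_dist away."""
    # Step 1: keep only SNPs that overlap no gene (the "subtract" step).
    intergenic = [s for s in snps if not any(overlaps(s, g) for g in genes)]
    names = []
    for g in genes:
        # Step 2: distances to intergenic SNPs on the same chromosome.
        dists = [gap(g, s) for s in intergenic if s.chrom == g.chrom]
        # Step 3: apply the distance cutoff to the closest one.
        if dists and min(dists) < max_dist:
            names.append(g.name)
    return names


# Made-up sample data, not taken from any real annotation.
genes = [
    Interval("chr1", 1000, 2000, "geneA"),
    Interval("chr1", 50000, 51000, "geneB"),
]
snps = [
    Interval("chr1", 1500, 1501, "snp_in_geneA"),    # overlaps geneA, so dropped
    Interval("chr1", 3000, 3001, "snp_near_geneA"),  # 1000 bases past geneA
    Interval("chr2", 100, 101, "snp_other_chrom"),   # different chromosome
]
print(genes_near_intergenic_snps(genes, snps))
```

Only `geneA` is reported: its nearest intergenic SNP is 1000 bases away, while
the nearest one to `geneB` is far beyond the cutoff.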

In contrast, here is the same analysis as a shell script.  Note that it
requires knowledge of Perl, bash, and awk.  The run time is identical to that
of the `pybedtools` version above:

.. code-block:: bash

    snps=snps.bed.gz
    genes=hg19.gff
    intergenic_snps=/tmp/intergenic_snps

    snp_fields=`zcat $snps | awk '(NR == 2){print NF; exit;}'`
    gene_fields=9
    distance_field=$(($gene_fields + $snp_fields + 1))

    intersectBed -a $snps -b $genes -v > $intergenic_snps

    closestBed -a $genes -b $intergenic_snps -d \
    | awk '($'$distance_field' < 5000){print $9;}' \
    | perl -ne 'm/(?:ID|Name|gene_id)=(.*?);/; print "$1\n"'

    rm $intergenic_snps
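
For readers less familiar with Perl, the final stage of the pipeline is meant
to pull an identifier out of the GFF attributes column.  A rough Python
equivalent might look like the sketch below; the helper name `gene_label` and
the sample attribute strings are hypothetical, and it assumes attributes
written as `key=value;` pairs.

```python
import re

# Match the first ID=, Name= or gene_id= key and capture its value up to ";".
ATTR_RE = re.compile(r"(?:ID|Name|gene_id)=(.*?);")


def gene_label(attributes):
    """Return the first ID/Name/gene_id value in an attributes string, or None."""
    match = ATTR_RE.search(attributes)
    return match.group(1) if match else None


# Made-up attribute strings for illustration.
print(gene_label("ID=gene0001;Name=abc1;biotype=protein_coding"))
print(gene_label("Note=no identifier here;"))
```

Unlike the one-liner, this version returns `None` when no identifier is found
instead of printing an empty line.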

See the "Shell script comparison" section in the docs for more details on
this comparison, or keep reading the full documentation at
https://daler.github.io/pybedtools.

Owner

  • Name: Ryan Dale
  • Login: daler
  • Kind: user
  • Location: Bethesda, MD
  • Company: National Institutes of Health (NIH), National Institute of Child Health and Human Development (NICHD)

GitHub Events

Total
  • Create event: 7
  • Release event: 2
  • Issues event: 10
  • Watch event: 15
  • Delete event: 4
  • Issue comment event: 28
  • Push event: 22
  • Pull request review comment event: 9
  • Pull request review event: 12
  • Pull request event: 13
  • Fork event: 6
Last Year
  • Create event: 7
  • Release event: 2
  • Issues event: 10
  • Watch event: 15
  • Delete event: 4
  • Issue comment event: 28
  • Push event: 22
  • Pull request review comment event: 9
  • Pull request review event: 12
  • Pull request event: 13
  • Fork event: 6

Committers

Last synced: about 1 year ago

All Time
  • Total Commits: 1,695
  • Total Committers: 49
  • Avg Commits per committer: 34.592
  • Development Distribution Score (DDS): 0.119
Past Year
  • Commits: 41
  • Committers: 5
  • Avg Commits per committer: 8.2
  • Development Distribution Score (DDS): 0.415
Top Committers
Name                 Email          Commits
daler                d****r@n****v  1,494
Brent Pedersen       b****e@g****m  85
Saulius Lukauskas    s****3@i****k  11
Libor Morkovsky      l****k@g****m  10
Andrew Robbins       a****w@r****e  7
duartemolha          d****a@g****m  7
Rob Beagrie          r****b@b****m  6
Steffen Möller       s****r@g****e  5
naumenko.sa          e****b@g****m  5
Matt Stone           m****e@c****u  5
gpratt               g****t@g****m  5
Jake Biesinger       j****r@g****m  4
drechsel             A****l@m****e  3
Simon van Heeringen  s****n@g****m  3
PanosFirmpas         p****s@g****m  3
David Cain           d****n@g****m  3
first last           e****e         2
Blaise Li            b****t@n****g  2
Saket Choudhary      s****c@g****m  2
Yunfei Guo           g****8@o****m  2
valentynbez          w****z@g****m  2
Andre Rendeiro       a****o@g****m  2
Tim Gates            t****s@i****m  1
Stephen J Bush       m****s@g****m  1
Saulius Lukauskas    l****s@g****m  1
PeterRobots          1****s         1
gshiba               g****e@b****m  1
root                 r****t@d****v  1
root                 r****t@l****)  1
Olga Botvinnik       o****k@g****m  1
and 19 more...
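
The all-time figures above can be cross-checked with a few lines of
arithmetic. The average is simply total commits divided by total committers.
The DDS formula used below (1 minus the top committer's share of commits) is
an assumption, since this page does not define the score, but it reproduces
the listed all-time value.

```python
# Figures copied from the all-time committer statistics above.
total_commits = 1695
total_committers = 49
top_committer_commits = 1494  # daler

avg_commits = total_commits / total_committers

# Assumed definition: DDS = 1 - (top committer's commits / total commits).
dds = 1 - top_committer_commits / total_commits

print(round(avg_commits, 3))  # 34.592
print(round(dds, 3))          # 0.119
```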

Issues and Pull Requests

Last synced: 10 months ago

All Time
  • Total issues: 88
  • Total pull requests: 50
  • Average time to close issues: 5 months
  • Average time to close pull requests: about 2 months
  • Total issue authors: 81
  • Total pull request authors: 23
  • Average comments per issue: 2.34
  • Average comments per pull request: 1.32
  • Merged pull requests: 45
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 5
  • Pull requests: 10
  • Average time to close issues: about 2 months
  • Average time to close pull requests: 7 days
  • Issue authors: 4
  • Pull request authors: 4
  • Average comments per issue: 1.2
  • Average comments per pull request: 2.8
  • Merged pull requests: 9
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • daler (3)
  • blaiseli (3)
  • TheChymera (2)
  • smoe (2)
  • stefanor (2)
  • hannes-brt (1)
  • Accompany0313 (1)
  • roni-fultheim (1)
  • xuebingjie1990 (1)
  • mgalardini (1)
  • kako-f (1)
  • YuanfengZhang (1)
  • mpj5142 (1)
  • woutervh (1)
  • andreforesight (1)
Pull Request Authors
  • daler (27)
  • mr-c (2)
  • blaiseli (2)
  • DavidCain (2)
  • valentynbez (2)
  • cameronraysmith (2)
  • theAeon (2)
  • duartemolha (2)
  • liyao001 (2)
  • yunfeiguo (2)
  • afg1 (1)
  • mgperry (1)
  • JureZmrzlikar (1)
  • hyandell (1)
  • simonvh (1)
Top Labels
Issue Labels
0.9.1 (1)
Pull Request Labels

Packages

  • Total packages: 2
  • Total downloads:
    • pypi 91,542 last-month
  • Total docker downloads: 166,588
  • Total dependent packages: 71
    (may contain duplicates)
  • Total dependent repositories: 360
    (may contain duplicates)
  • Total versions: 59
  • Total maintainers: 1
pypi.org: pybedtools

Wrapper around BEDTools for bioinformatics work

  • Versions: 34
  • Dependent Packages: 71
  • Dependent Repositories: 360
  • Downloads: 91,542 Last month
  • Docker Downloads: 166,588
Rankings
Dependent packages count: 0.3%
Dependent repos count: 0.8%
Docker downloads count: 0.9%
Downloads: 1.0%
Average: 1.9%
Stargazers count: 3.7%
Forks count: 4.6%
Maintainers (1)
Last synced: 10 months ago
proxy.golang.org: github.com/daler/pybedtools
  • Versions: 25
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 9.0%
Average: 9.6%
Dependent repos count: 10.2%
Last synced: 10 months ago

Dependencies

dev-requirements.txt pypi
  • cython * development
  • matplotlib * development
  • numpydoc * development
  • pandas * development
  • pysam * development
  • pyyaml * development
  • sphinx * development
optional-requirements.txt pypi
  • bedtools *
  • genomepy >=0.8
  • matplotlib *
  • ucsc-bedgraphtobigwig *
  • ucsc-bigwigtobedgraph *
  • ucsc-wigtobigwig *
requirements.txt pypi
  • numpy *
  • pandas *
  • pysam *
  • six *
setup.py pypi
  • pysam *
  • six *
test-requirements.txt pypi
  • numpydoc * test
  • pathlib * test
  • psutil * test
  • pytest * test
  • pyyaml * test
  • sphinx * test
.github/workflows/main.yml actions
  • actions/checkout v2 composite
  • actions/upload-artifact v2 composite
docker/pbt-test-py2/Dockerfile docker
  • ubuntu 14.04 build
docker/pbt-test-py3/Dockerfile docker
  • ubuntu 14.04 build