Science Score: 77.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 5 DOI reference(s) in README
- ✓ Academic publication links: Links to zenodo.org
- ✓ Committers with academic emails: 1 of 21 committers (4.8%) from academic institutions
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (14.5%) to scientific vocabulary
Repository
Genomic interval operations on Pandas DataFrames
Statistics
- Stars: 184
- Watchers: 10
- Forks: 34
- Open Issues: 32
- Releases: 30
Metadata Files
README.md
Bioframe: Operations on Genomic Interval Dataframes

Bioframe enables flexible and scalable operations on genomic interval dataframes in Python.
Bioframe is built directly on top of Pandas. Bioframe provides:
- A variety of genomic interval operations that work directly on dataframes.
- Operations for special classes of genomic intervals, including chromosome arms and fixed-size bins.
- Conveniences for diverse tabular genomic data formats and loading genome assembly summary information.
Read the documentation, including the guide, as well as the publication for more information.
Bioframe is an Affiliated Project of NumFOCUS.
Installation
Bioframe is available on PyPI and bioconda:
```sh
pip install bioframe
```
Contributing
Interested in contributing to bioframe? That's great! To get started, check out the contributing guide. Discussions about the project roadmap take place on the Open2C Slack and regular developer meetings scheduled there. Anyone can join and participate!
Interval operations
Key genomic interval operations in bioframe include:
- overlap: Find pairs of overlapping genomic intervals between two dataframes.
- closest: For every interval in a dataframe, find the closest intervals in a second dataframe.
- cluster: Group overlapping intervals in a dataframe into clusters.
- complement: Find genomic intervals that are not covered by any interval from a dataframe.
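As a rough illustration of what cluster and complement compute, here is a plain-Python sketch on a single chromosome. It is not bioframe's implementation and does not use dataframes. The function names `cluster_ids` and `complement`, the `(start, end)` tuple representation, the half-open coordinate convention, and the choice to group intervals that merely touch are all assumptions made for this sketch.

```python
from typing import List, Tuple

# (start, end) on one chromosome; assumed half-open, i.e. start <= x < end.
Interval = Tuple[int, int]


def cluster_ids(intervals: List[Interval]) -> List[int]:
    """Give each interval a cluster id; overlapping or touching intervals share one."""
    order = sorted(range(len(intervals)), key=lambda i: intervals[i])
    ids = [0] * len(intervals)
    current_id = -1
    current_end = None
    for i in order:
        start, end = intervals[i]
        if current_end is None or start > current_end:
            # This interval begins past the running cluster: open a new cluster.
            current_id += 1
            current_end = end
        else:
            # Still inside the running cluster: extend its right edge if needed.
            current_end = max(current_end, end)
        ids[i] = current_id
    return ids


def complement(intervals: List[Interval], lo: int, hi: int) -> List[Interval]:
    """Return the gaps in [lo, hi) not covered by any interval.

    Assumes every interval lies within [lo, hi).
    """
    gaps = []
    cursor = lo
    for start, end in sorted(intervals):
        if start > cursor:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < hi:
        gaps.append((cursor, hi))
    return gaps


example = [(10, 12), (1, 5), (3, 8)]
print(cluster_ids(example))        # [1, 0, 0]
print(complement(example, 0, 15))  # [(0, 1), (8, 10), (12, 15)]
```

Both functions sort once and then sweep left to right, which is the usual way to reason about these operations; the library itself works on dataframe columns and across chromosomes.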
Bioframe additionally has functions that are frequently used for genomic interval operations and can be expressed as combinations of these core operations and dataframe operations, including: coverage, expand, merge, select, and subtract.
To overlap two dataframes, call:
```python
import bioframe as bf
bf.overlap(df1, df2)
```
For these two input dataframes, with intervals all on the same chromosome:

overlap will return the following interval pairs as overlaps:

To merge all overlapping intervals in a dataframe, call:
```python
import bioframe as bf
bf.merge(df1)
```
For this input dataframe, with intervals all on the same chromosome:

merge will return a new dataframe with these merged intervals:

See the guide for visualizations of other interval operations in bioframe.
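The semantics of overlap and merge described above can also be sketched in plain Python. This is a minimal illustration, not how bioframe implements either operation. The helper names `overlap_pairs` and `merge_intervals`, the example coordinates, and the half-open convention (where intervals that only touch do not overlap but are merged) are assumptions of the sketch.

```python
from typing import List, Tuple

# (start, end) on one chromosome; assumed half-open, i.e. start <= x < end.
Interval = Tuple[int, int]


def overlap_pairs(a: List[Interval], b: List[Interval]) -> List[Tuple[int, int]]:
    """Return index pairs (i, j) where a[i] and b[j] share at least one position."""
    pairs = []
    for i, (s1, e1) in enumerate(a):
        for j, (s2, e2) in enumerate(b):
            # Two half-open intervals overlap when each starts before the other ends.
            if s1 < e2 and s2 < e1:
                pairs.append((i, j))
    return pairs


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Collapse overlapping or touching intervals into their union."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Extends the last merged interval.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Hypothetical intervals, all on the same chromosome.
first = [(1, 5), (8, 10)]
second = [(4, 8), (10, 11)]
print(overlap_pairs(first, second))                          # [(0, 0)]
print(merge_intervals([(8, 10), (1, 5), (3, 8), (12, 14)]))  # [(1, 10), (12, 14)]
```

The pairwise loop is quadratic and only meant to make the overlap condition explicit; it says nothing about the performance characteristics of the library.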
File I/O
Bioframe includes utilities for reading genomic file formats into dataframes and vice versa. One handy function is read_table, which mirrors pandas's read_csv/read_table but provides a schema argument to populate column names for common tabular file formats.
```python
import bioframe

jaspar_url = 'http://expdata.cmmt.ubc.ca/JASPAR/downloads/UCSC_tracks/2022/hg38/MA0139.1.tsv.gz'
ctcf_motif_calls = bioframe.read_table(jaspar_url, schema='jaspar', skiprows=1)
```
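To make the idea of a schema argument concrete, the following standard-library sketch shows one way a schema name could be mapped to column names when reading a headerless tab-delimited file. The `SCHEMAS` dictionary, its column names, and the function `read_table_sketch` are hypothetical and are not taken from bioframe's source; the real function returns a pandas dataframe rather than a list of dictionaries.

```python
import csv
from typing import Dict, List

# Hypothetical schema registry for this sketch only; the column names are
# assumptions and are not copied from bioframe's own schema definitions.
SCHEMAS: Dict[str, List[str]] = {
    "bed3": ["chrom", "start", "end"],
    "bed4": ["chrom", "start", "end", "name"],
}


def read_table_sketch(text: str, schema: str, skiprows: int = 0) -> List[Dict[str, str]]:
    """Parse tab-delimited text, naming the columns according to a schema."""
    names = SCHEMAS[schema]
    lines = text.splitlines()[skiprows:]
    reader = csv.reader(lines, delimiter="\t")
    # Values stay as strings here; a dataframe-based reader would infer dtypes.
    return [dict(zip(names, row)) for row in reader if row]


sample = "track name=example\nchr1\t10\t20\nchr2\t5\t15\n"
rows = read_table_sketch(sample, schema="bed3", skiprows=1)
print(rows[0])  # {'chrom': 'chr1', 'start': '10', 'end': '20'}
```

The point of the design is that callers name a format once instead of spelling out the column list for every file they load.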
Tutorials
See this jupyter notebook for an example of how to assign TF motifs to ChIP-seq peaks using bioframe.
Citing
If you use bioframe in your work, please cite:
```bibtex
@article{bioframe_2024,
  author = {Open2C and Abdennur, Nezar and Fudenberg, Geoffrey and Flyamer, Ilya M and Galitsyna, Aleksandra A and Goloborodko, Anton and Imakaev, Maxim and Venev, Sergey},
  doi = {10.1093/bioinformatics/btae088},
  journal = {Bioinformatics},
  title = {{Bioframe: Operations on Genomic Intervals in Pandas Dataframes}},
  year = {2024}
}
```
Owner
- Name: Open Chromosome Collective
- Login: open2c
- Kind: organization
- Email: open.chromosome.collective@gmail.com
- Website: https://open2c.github.io/
- Repositories: 14
- Profile: https://github.com/open2c
Citation (CITATION.cff)
cff-version: 1.2.0
type: software
title: bioframe
license: MIT
repository-code: 'https://github.com/open2c/bioframe'
message: >-
  If you use this software, please cite it using the
  metadata from this file.
authors:
  - given-names: Nezar
    family-names: Abdennur
    orcid: 'https://orcid.org/0000-0001-5814-0864'
  - given-names: Geoffrey
    family-names: Fudenberg
    orcid: "https://orcid.org/0000-0001-5905-6517"
  - given-names: Ilya
    family-names: Flyamer
    orcid: "https://orcid.org/0000-0002-4892-4208"
  - given-names: Aleksandra
    family-names: Galitsyna
    orcid: "https://orcid.org/0000-0001-8969-5694"
  - given-names: Anton
    family-names: Goloborodko
    orcid: "https://orcid.org/0000-0002-2210-8616"
  - given-names: Maxim
    family-names: Imakaev
    orcid: "https://orcid.org/0000-0002-5320-2728"
  - given-names: Sergey
    family-names: Venev
    orcid: "https://orcid.org/0000-0002-1507-7460"
abstract: >-
  Bioframe is a library to enable flexible and performant
  operations on genomic interval data frames in Python.
keywords:
  - bioinformatics
  - genomics
  - ranges
  - intervals
  - dataframes
  - pandas
  - numpy
  - Python
identifiers:
  - type: doi
    value: 10.5281/zenodo.3897573
    description: Zenodo
  - type: doi
    value: 10.1101/2022.02.16.480748
    description: bioRxiv preprint
  - type: doi
    value: 10.1093/bioinformatics/btae088
    description: Publication
preferred-citation:
  type: article
  title: "Bioframe: Operations on Genomic Intervals in Pandas Dataframes"
  authors:
    - family-names: Open2C
    - given-names: Nezar
      family-names: Abdennur
      orcid: 'https://orcid.org/0000-0001-5814-0864'
    - given-names: Geoffrey
      family-names: Fudenberg
      orcid: "https://orcid.org/0000-0001-5905-6517"
    - given-names: Ilya
      family-names: Flyamer
      name-suffix: M
      orcid: "https://orcid.org/0000-0002-4892-4208"
    - given-names: Aleksandra
      family-names: Galitsyna
      name-suffix: A
      orcid: "https://orcid.org/0000-0001-8969-5694"
    - given-names: Anton
      family-names: Goloborodko
      orcid: "https://orcid.org/0000-0002-2210-8616"
    - given-names: Maxim
      family-names: Imakaev
      orcid: "https://orcid.org/0000-0002-5320-2728"
    - given-names: Sergey
      family-names: Venev
      orcid: "https://orcid.org/0000-0002-1507-7460"
  journal: Bioinformatics
  year: 2024
  url: "https://doi.org/10.1093/bioinformatics/btae088"
  doi: "10.1093/bioinformatics/btae088"
GitHub Events
Total
- Create event: 2
- Release event: 2
- Issues event: 2
- Watch event: 13
- Delete event: 2
- Issue comment event: 9
- Push event: 62
- Pull request event: 17
- Fork event: 10
Last Year
- Create event: 2
- Release event: 2
- Issues event: 2
- Watch event: 13
- Delete event: 2
- Issue comment event: 9
- Push event: 62
- Pull request event: 17
- Fork event: 10
Committers
Last synced: 8 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Nezar Abdennur | n****r@g****m | 257 |
| Anton Goloborodko | g****n@g****m | 115 |
| gfudenberg | g****g@g****m | 88 |
| Geoff Fudenberg | g****g@L****l | 56 |
| agalitsyna | a****a@g****m | 30 |
| Sergey Venev | s****y@g****m | 11 |
| mimakaev | m****v@g****m | 8 |
| Phlya | f****r@g****m | 7 |
| pre-commit-ci[bot] | 6****] | 6 |
| Sameer Abraham | s****0@g****m | 5 |
| Félix Raimundo | g****s@g****m | 3 |
| Nilesh Patra | n****h@n****o | 3 |
| dependabot[bot] | 4****] | 3 |
| luisdiaz1997 | l****3@h****m | 2 |
| smit kadvani | s****i@g****m | 2 |
| George Spracklin | g****n@g****m | 1 |
| Gökçen Eraslan | e****n@g****m | 1 |
| Harshit | h****6@g****m | 1 |
| Isaac Virshup | i****p@g****m | 1 |
| aafkevandenberg | a****g@g****m | 1 |
| Thomas Reimonn | t****n@u****u | 1 |
Issues and Pull Requests
Last synced: 4 months ago
All Time
- Total issues: 84
- Total pull requests: 114
- Average time to close issues: 8 months
- Average time to close pull requests: 23 days
- Total issue authors: 26
- Total pull request authors: 20
- Average comments per issue: 2.33
- Average comments per pull request: 0.81
- Merged pull requests: 94
- Bot issues: 0
- Bot pull requests: 16
Past Year
- Issues: 1
- Pull requests: 12
- Average time to close issues: N/A
- Average time to close pull requests: about 2 months
- Issue authors: 1
- Pull request authors: 5
- Average comments per issue: 0.0
- Average comments per pull request: 0.17
- Merged pull requests: 2
- Bot issues: 0
- Bot pull requests: 3
Top Authors
Issue Authors
- golobor (12)
- sergpolly (10)
- gfudenberg (10)
- Phlya (9)
- nvictus (7)
- ivirshup (5)
- penguinpee (4)
- agalitsyna (3)
- WANGchuang715 (2)
- vbchavali (2)
- mimakaev (2)
- endrebak (2)
- skytguuu (2)
- benjaminbauer (2)
- marade (1)
Pull Request Authors
- nvictus (45)
- gfudenberg (17)
- pre-commit-ci[bot] (12)
- dependabot[bot] (7)
- agalitsyna (6)
- Manas-7854 (6)
- gamazeps (6)
- smitkadvani (4)
- sergpolly (3)
- Samia35-2973 (2)
- harshit148 (2)
- milandvijay (2)
- Phlya (2)
- golobor (2)
- emdann (1)
Packages
- Total packages: 14
- Total downloads:
  - pypi 6,644 last-month
- Total docker downloads: 354
- Total dependent packages: 8 (may contain duplicates)
- Total dependent repositories: 5 (may contain duplicates)
- Total versions: 74
- Total maintainers: 4
pypi.org: bioframe
Operations and utilities for Genomic Interval Dataframes.
- Documentation: https://bioframe.readthedocs.io/
- License: MIT
- Latest release: 0.8.0 (published 9 months ago)
Rankings
Maintainers (3)
alpine-edge: py3-bioframe-doc
Pandas utilities for tab-delimited and other genomic data files (documentation)
- Homepage: https://github.com/open2c/bioframe
- License: MIT
- Latest release: 0.8.0-r0 (published 9 months ago)
Rankings
Maintainers (1)
alpine-v3.22: py3-bioframe-pyc
Precompiled Python bytecode for py3-bioframe
- Homepage: https://github.com/open2c/bioframe
- License: MIT
- Latest release: 0.8.0-r0 (published 9 months ago)
Rankings
Maintainers (1)
alpine-v3.22: py3-bioframe
Pandas utilities for tab-delimited and other genomic data files
- Homepage: https://github.com/open2c/bioframe
- License: MIT
- Latest release: 0.8.0-r0 (published 9 months ago)
Rankings
Maintainers (1)
alpine-edge: py3-bioframe
Pandas utilities for tab-delimited and other genomic data files
- Homepage: https://github.com/open2c/bioframe
- License: MIT
- Latest release: 0.8.0-r0 (published 9 months ago)
Rankings
Maintainers (1)
alpine-edge: py3-bioframe-pyc
Precompiled Python bytecode for py3-bioframe
- Homepage: https://github.com/open2c/bioframe
- License: MIT
- Latest release: 0.8.0-r0 (published 9 months ago)
Rankings
Maintainers (1)
alpine-v3.21: py3-bioframe-pyc
Precompiled Python bytecode for py3-bioframe
- Homepage: https://github.com/open2c/bioframe
- License: MIT
- Latest release: 0.7.2-r1 (published over 1 year ago)
Rankings
Maintainers (1)
alpine-v3.21: py3-bioframe-doc
Pandas utilities for tab-delimited and other genomic data files (documentation)
- Homepage: https://github.com/open2c/bioframe
- License: MIT
- Latest release: 0.7.2-r1 (published over 1 year ago)
Rankings
Maintainers (1)
alpine-v3.22: py3-bioframe-doc
Pandas utilities for tab-delimited and other genomic data files (documentation)
- Homepage: https://github.com/open2c/bioframe
- License: MIT
- Latest release: 0.8.0-r0 (published 9 months ago)
Rankings
Maintainers (1)
alpine-v3.21: py3-bioframe
Pandas utilities for tab-delimited and other genomic data files
- Homepage: https://github.com/open2c/bioframe
- License: MIT
- Latest release: 0.7.2-r1 (published over 1 year ago)
Rankings
Maintainers (1)
alpine-v3.19: py3-bioframe
Pandas utilities for tab-delimited and other genomic data files
- Homepage: https://github.com/open2c/bioframe
- License: MIT
- Latest release: 0.5.1-r0 (published about 2 years ago)
Rankings
Maintainers (1)
alpine-v3.19: py3-bioframe-pyc
Precompiled Python bytecode for py3-bioframe
- Homepage: https://github.com/open2c/bioframe
- License: MIT
- Latest release: 0.5.1-r0 (published about 2 years ago)
Rankings
alpine-v3.20: py3-bioframe-pyc
Precompiled Python bytecode for py3-bioframe
- Homepage: https://github.com/open2c/bioframe
- License: MIT
- Latest release: 0.6.4-r1 (published almost 2 years ago)
Rankings
Maintainers (1)
alpine-v3.20: py3-bioframe
Pandas utilities for tab-delimited and other genomic data files
- Homepage: https://github.com/open2c/bioframe
- License: MIT
- Latest release: 0.6.4-r1 (published almost 2 years ago)
Rankings
Maintainers (1)
Dependencies
- actions/checkout v3 composite
- actions/setup-python v4 composite
- actions/checkout v4 composite
- actions/setup-python v4 composite
- pypa/gh-action-pypi-publish release/v1 composite
- matplotlib *
- numpy >=1.10
- pandas >=1.3
- pyyaml *
- requests *
- typing-extensions python_version<'3.9'