https://github.com/shitohana/BSXplorer
Comprehensive tool for visualizing genome-wide cytosine data.
Repository
Basic Info
- Host: GitHub
- Owner: shitohana
- License: mit
- Language: Python
- Default Branch: master
- Homepage: https://shitohana.github.io/BSXplorer/
- Size: 14.2 MB
Statistics
- Stars: 8
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 4
Metadata Files
README.md
BSXplorer
Analytical framework for BS-seq data comparison and visualization. BSXplorer facilitates efficient methylation data mining, contrasting and visualization, making it an easy-to-use package that is highly useful for epigenetic research.
For the Python API reference manual and tutorials, visit https://shitohana.github.io/BSXplorer.
How to cite
If you use our package in your research, please consider citing our paper.
Yuditskiy, K., Bezdvornykh, I., Kazantseva, A. et al. BSXplorer: analytical framework for exploratory analysis of BS-seq data. BMC Bioinformatics 25, 96 (2024). https://doi.org/10.1186/s12859-024-05722-9
Installation
To install the latest stable version:
commandline
pip install bsxplorer
If you want to install the prerelease version (dev branch):
commandline
pip install git+https://github.com/shitohana/BSXplorer.git@dev
Usage
Our aim in this project was to create a powerful and flexible tool for exploratory analysis of BS-Seq data obtained from non-model organisms (BSXplorer works for model organisms as well). That is why BSXplorer is implemented as a Python package. Its modular structure, together with an easy-to-use and configurable API, makes it straightforward to integrate and scale across a wide range of bioinformatics projects.
Although BSXplorer is available as a console application, consider using it as a Python package to fully utilize its potential. Detailed documentation is available at https://shitohana.github.io/BSXplorer.
API usage
python
import bsxplorer as bsx
Basic usage
The main classes in BSXplorer are Genome, Metagene and MetageneFiles.
The Genome class is used for reading and filtering genomic annotation data.
python
genome = bsx.Genome.from_gff("path/to/annotation.gff")
Here the genome was created with the .from_gff constructor. To read a custom
annotation format (TSV file), use .from_custom and specify column indexes (0-based).
Once the annotation file has been read, a methylation report can be processed with the Metagene
class (or MetageneFiles for multiple reports).
python
metagene = bsx.Metagene.from_bismark(
"path/to/report.txt",
genome=genome.gene_body(min_length=0, flank_length=2000),
up_windows=100, body_windows=200, down_windows=100
)
Here we have read the methylation report file. Methylation data has been read only
for gene bodies and their flanking regions (genome.gene_body(min_length=0, flank_length=2000)), at a
resolution of 200 windows for the gene body (body_windows=200) and 100 windows for each flanking
region (up_windows=100, down_windows=100).
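To make the window parameters more concrete, the following standard-library sketch shows one way a genomic position could be assigned to an upstream, body or downstream window. It illustrates the binning idea only, under simple assumptions (a plus-strand gene, half-open coordinates, equal-width windows). The function `window_index` and the example coordinates are hypothetical and are not part of the BSXplorer API.

```python
def window_index(pos, start, end, flank_length=2000,
                 up_windows=100, body_windows=200, down_windows=100):
    """Return the 0-based window index for `pos`, or None if it is outside the region.

    Windows are numbered upstream -> body -> downstream. Assumes a plus-strand
    gene occupying [start, end) with `flank_length` bp flanks on both sides.
    """
    if start - flank_length <= pos < start:
        # Upstream flank: indices 0 .. up_windows - 1
        return (pos - (start - flank_length)) * up_windows // flank_length
    if start <= pos < end:
        # Gene body: scaled to a fixed number of windows regardless of gene length
        return up_windows + (pos - start) * body_windows // (end - start)
    if end <= pos < end + flank_length:
        # Downstream flank: the last `down_windows` indices
        return up_windows + body_windows + (pos - end) * down_windows // flank_length
    return None


# Hypothetical gene at 10000..12000 with 2000 bp flanks
print(window_index(8000, 10000, 12000))   # first upstream window
print(window_index(10000, 10000, 12000))  # first body window
print(window_index(13999, 10000, 12000))  # last downstream window
```

Scaling every gene body to the same number of windows is what allows genes of different lengths to be averaged into a single metagene profile.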
Now we can generate visualizations.
python
filtered = metagene.filter(context="CG")
filtered.line_plot().draw_mpl()
filtered.heat_map().draw_mpl()


BSXplorer can generate plots with two plotting libraries: matplotlib and Plotly.
The _mpl suffix in method names stands for matplotlib and the _plotly suffix for Plotly.
Clusterisation
BSXplorer allows for the discovery of gene modules characterised by similar methylation patterns.
Once the data has been filtered by methylation context and strand,
one can use the .cluster() method. The resulting object contains an
ordered list of clustered genes and their visualisation in the form of a heatmap.

Chromosome methylation levels
BSXplorer allows a user to visualize the overall methylation levels of chromosomes using the corresponding ChrLevels object:
python
levels = bsx.ChrLevels.from_bismark("path/to/report.txt", chr_min_length=10**6, window_length=10**6)
levels.draw_mpl(smooth=5)

As with Metagene, the methylation data can be filtered to display only the methylation context of interest.
python
levels.filter(context="CG").draw_mpl(smooth=5)
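The idea behind per-window chromosome levels and smoothing can be sketched with the standard library. The sketch assumes that `window_length` defines fixed-size bins along the chromosome and that `smooth` corresponds to a moving average over neighbouring windows. Both are assumptions about the behaviour, and the helper names `window_means` and `moving_average` are hypothetical rather than BSXplorer functions.

```python
from collections import defaultdict


def window_means(positions, levels, window_length):
    """Average methylation level per fixed-size window along one chromosome.

    Windows without any cytosines are skipped in the returned list.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for pos, level in zip(positions, levels):
        window = pos // window_length
        sums[window] += level
        counts[window] += 1
    return [sums[w] / counts[w] for w in sorted(counts)]


def moving_average(values, size):
    """Centered moving average; the averaging window shrinks at the edges."""
    half = size // 2
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Toy data: four cytosines, 100 bp windows
means = window_means([5, 15, 25, 120], [0.0, 1.0, 0.5, 0.25], window_length=100)
print(means)
print(moving_average(means, size=3))
```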

Gene body methylation
BSXplorer allows for the categorization of regions based on their methylation level and density. This is done by assuming that cytosine methylation levels follow a binomial distribution, as explained in Takuno and Gaut's research (please refer to [1, 2] https://doi.org/10.1073/pnas.1215380110 for details). The genes are then divided into three categories, BM (body-methylated), IM (intermediately-methylated) and UM (under-methylated), based on their methylation levels in the CG context using the following formula.
$$ \text{BM}:\ CG<P_{CG};\ \ CHG/CHH>1-P_{CG} $$
$$ \text{IM}:\ P_{CG}\le CG<1-P_{CG};\ \ CHG/CHH>1-P_{CG} $$
$$ \text{UM}:\ CG/CHG/CHH>1-P_{CG} $$
The same rationale may be applied to other methylation contexts, as BSXplorer can produce $P_{CHG}$ and $P_{CHH}$ for CHG sites and CHH sites, respectively.
[1] Takuno S, Gaut BS. Body-Methylated Genes in Arabidopsis thaliana Are Functionally Important and Evolve Slowly. Mol Biol Evol. 2012;29:219–27.
[2] Takuno S, Gaut BS. Gene body methylation is conserved between plant orthologs and is of evolutionary consequence. Proc Natl Acad Sci. 2013;110:1797–802.
```python
# Calculate p-values for cytosine methylation via a binomial test
binom_data = bsx.BinomialData.from_report(
    "path/to/report.txt",
    report_type="bismark"
)
# The binomial data object can now be used to calculate p-values
# for methylation of genomic regions
region_stats = binom_data.region_pvalue(genome.gene_body(), methylation_pvalue=.01)
# The .categorise method returns a tuple of three DataFrames
# for BM, IM and UM genes respectively
bm, im, um = region_stats.categorise(context="CG", p_value=.05)
# Now we can create a MetageneFiles object to visualize the methylation pattern
# of the categorised groups
cat_metagene = bsx.MetageneFiles([
    metagene.filter(context="CG", genome=bm),
    metagene.filter(context="CG", genome=im),
    metagene.filter(context="CG", genome=um),
], labels=["BM", "IM", "UM"])
# And plot it (flanks are 2000 bp, matching flank_length=2000 above)
tick_labels = ["-2000bp", "TSS", "", "TES", "+2000bp"]
cat_metagene.line_plot().draw_mpl(tick_labels=tick_labels)
```
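The binomial test that this section relies on can be illustrated without any dependencies. The sketch below assumes a one-sided upper-tail test: given a background methylation rate, it computes the probability of observing at least the given number of methylated cytosines in a region. The function name `binomial_tail` and the example counts are illustrative assumptions; this is not BSXplorer's implementation.

```python
from math import comb


def binomial_tail(n_methylated, n_total, p_background):
    """P(X >= n_methylated) for X ~ Binomial(n_total, p_background)."""
    return sum(
        comb(n_total, i) * p_background**i * (1 - p_background)**(n_total - i)
        for i in range(n_methylated, n_total + 1)
    )


# Hypothetical region: 8 of 10 CG cytosines methylated, background rate 0.2
p_cg = binomial_tail(8, 10, 0.2)
print(f"P_CG = {p_cg:.3e}")
```

A small tail probability means the region carries more methylation than the background rate would explain, which is the signal used to separate body-methylated genes from the rest.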
Different organisms analysis
Start by importing genome annotation data for the species of interest.
python
arath_genes = bsx.Genome.from_gff("arath_genome.gff").gene_body(min_length=0)
bradi_genes = bsx.Genome.from_gff("bradi_genome.gff").gene_body(min_length=0)
mouse_genes = bsx.Genome.from_gff("musmu_genome.gff").gene_body(min_length=0)
Next, read in cytosine reports for each sample separately:
```python
window_kwargs = dict(up_windows=200, body_windows=400, down_windows=200)

arath_metagene = bsx.Metagene.from_bismark("arath_example.txt", arath_genes, **window_kwargs)
bradi_metagene = bsx.Metagene.from_bismark("bradi_example.txt", bradi_genes, **window_kwargs)
musmu_metagene = bsx.Metagene.from_bismark("musmu_example.txt", mouse_genes, **window_kwargs)
```
To perform comparative analysis, initialize the bsx.MetageneFiles
class with a list of the metagene objects, and explicitly provide a label
for every organism.
Next, apply methylation context and strand filters to the resulting MetageneFiles object (called files here):
python
filtered = files.filter("CG", "+")
Then construct a set of line plots to guide comparative analysis of methylation patterns in different species:
python
filtered.line_plot(smooth=50).draw_mpl()

The line plot representation may be further supplemented by a heatmap:
python
filtered.heat_map(100, 100).draw_mpl()

To examine and highlight differences in methylation patterns between organisms, summary statistics are available in graphical form.
python
filtered.box_plot(violin=True).draw_mpl()
filtered.box_plot().draw_mpl()

Enrichment of DMRs
BSXplorer offers functionality to align one set of regions over another. Regions can
be read either with Genome or initialized directly with
polars functionality (https://docs.pola.rs/api/python/stable/reference/api/polars.read_csv.html).
In the latter case the DataFrame needs to have chr, start and end columns.
To align regions (e.g. to define DMR positions relative to genes) or to test the enrichment of regions at these
genomic features against the genome background, use Enrichment.
```python
# If you want to perform an ENRICHMENT, and not only plot
# the density of metagene coverage, you NEED to use the .raw() method
# for the genome DataFrame.
genes = bsx.Genome.from_gff("path/to/annot.gff").raw()
dmr = bsx.Genome.from_custom(
    "path/to/dmr.txt",
    chr_col=0,    # These column indexes are configurable
    start_col=1,
    end_col=2
).all()

enrichment = bsx.Enrichment(dmr, genes, flank_length=2000).enrich()
```
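Conceptually, aligning one set of regions over another comes down to interval overlap. The following standard-library sketch finds which regions fall within a gene or its flanks. It is a simplified illustration that assumes half-open (chr, start, end) tuples and uses a quadratic scan; the function `overlaps_with_flanks` and the coordinates are hypothetical and do not reflect how Enrichment is implemented.

```python
def overlaps_with_flanks(regions, genes, flank_length=2000):
    """Return (region, gene) pairs where the region overlaps the gene or its flanks.

    Both inputs are lists of (chr, start, end) tuples with half-open coordinates.
    """
    hits = []
    for r_chr, r_start, r_end in regions:
        for g_chr, g_start, g_end in genes:
            if r_chr != g_chr:
                continue
            # Extend the gene by the flank length on both sides
            lo = g_start - flank_length
            hi = g_end + flank_length
            if r_start < hi and r_end > lo:
                hits.append(((r_chr, r_start, r_end), (g_chr, g_start, g_end)))
    return hits


genes = [("chr1", 10000, 12000)]
dmrs = [("chr1", 8500, 8700), ("chr1", 20000, 20100), ("chr2", 10500, 10600)]
print(overlaps_with_flanks(dmrs, genes))
```

For genome-scale inputs a sorted sweep or an interval tree would replace the nested loop, but the overlap condition stays the same.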
Enrichment.enrich returns EnrichmentResult, which stores enrichment
statistics and coordinates of regions which have aligned with
genomic features. The metagene coverage with regions
can be plotted via EnrichmentResult.plot_density_mpl method.
python
fig = enrichment.plot_density_mpl(
    tick_labels=["-2000bp", "TSS", "Gene body", "TES", "+2000bp"],
)
Enrichment statistics can be accessed with EnrichmentResult.enrich_stats
or plotted with EnrichmentResult.plot_enrich_mpl:
python
enrichment.plot_enrich_mpl()
Other functionality
For other functionality, such as methylation report conversion, BAM conversion and BAM statistics, please refer to the documentation.
Console usage
BSXplorer can be used in console mode to generate complex HTML reports (see example here), run many analyses at once, or convert BAM files to methylation reports. For detailed command descriptions and examples, please refer to the documentation.
What's new
Since publication we have released Version 1.1.0.
Major changes
- Added new classes for unified reading of methylation reports (UniversalReader, UniversalReplicatesReader). Now any supported report type can be converted into another.
- Added support for processing BAM files (BAMReader). BAM files can either be converted to a methylation report (faster than with native methods), or used to calculate methylation statistics such as methylation entropy, epipolymorphism or PDR.
- Added a method for aligning one set of regions along another (e.g. DMRs along genes): Enrichment. Regions can not only be aligned, but the coverage of the metagene by DMRs can also be visualized.
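For readers unfamiliar with the read-level statistics mentioned above, here is a standard-library sketch based on their commonly used definitions. PDR is the fraction of reads carrying both methylated and unmethylated sites, epipolymorphism is one minus the sum of squared pattern frequencies, and methylation entropy is the Shannon entropy of pattern frequencies divided by the number of sites. The input encoding and the function `read_level_stats` are assumptions made for illustration, and BSXplorer's own calculations may differ in detail.

```python
from collections import Counter
from math import log2


def read_level_stats(patterns):
    """Compute (PDR, epipolymorphism, methylation entropy) for one window.

    `patterns` holds one equal-length string per read covering the same sites,
    with 'M' for a methylated and 'U' for an unmethylated cytosine.
    """
    n_reads = len(patterns)
    n_sites = len(patterns[0])
    freqs = [count / n_reads for count in Counter(patterns).values()]

    # A read is discordant if it mixes methylated and unmethylated sites
    pdr = sum(1 for p in patterns if "M" in p and "U" in p) / n_reads
    epipolymorphism = 1 - sum(f * f for f in freqs)
    entropy = -sum(f * log2(f) for f in freqs) / n_sites
    return pdr, epipolymorphism, entropy


# Toy window: four reads over four sites
print(read_level_stats(["MMMM", "MMMM", "UUUU", "MUMU"]))
```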
Other improvements
- Any plot data now can be retrieved by corresponding method.
- Fixes to the plotting API.
- Fixes to Category report.
- Added console command for processing BAM files.
Owner
- Name: shitohana
- Login: shitohana
- Kind: user
- Repositories: 1
- Profile: https://github.com/shitohana
GitHub Events
Total
- Watch event: 2
- Delete event: 6
- Push event: 17
- Create event: 6
Last Year
- Watch event: 2
- Delete event: 6
- Push event: 17
- Create event: 6
Committers
Last synced: over 1 year ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| shitohana | k****y@g****m | 119 |
| shitohana | 4****a | 22 |
Issues and Pull Requests
Last synced: 7 months ago
All Time
- Total issues: 6
- Total pull requests: 12
- Average time to close issues: 19 days
- Average time to close pull requests: about 8 hours
- Total issue authors: 3
- Total pull request authors: 1
- Average comments per issue: 3.83
- Average comments per pull request: 0.0
- Merged pull requests: 11
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 2
- Average time to close issues: N/A
- Average time to close pull requests: 2 days
- Issue authors: 0
- Pull request authors: 1
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- komais (3)
- shitohana (1)
Pull Request Authors
- shitohana (6)
Top Labels
Issue Labels
Pull Request Labels
Packages
- Total packages: 1
- Total downloads:
  - pypi: 60 last month
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 10
- Total maintainers: 1
pypi.org: bsxplorer
Analytical framework for BS-seq data comparison and visualization
- Homepage: https://github.com/shitohana/BSXplorer
- Documentation: https://shitohana.github.io/BSXplorer/
- License: MIT License
- Latest release: 1.1.0 (published over 1 year ago)