advanced_boxplot_viz

Python code for creating boxplots with statistical significance bars. Supports multiple p-value corrections (Bonferroni, FDR), customizable interquartile ranges, and color palettes. Suited to biomarker analysis, statistical comparisons, and scientific research.

https://github.com/mafaves/advanced_boxplot_viz

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: Found CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
○ DOI references
○ Academic publication links
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (11.3%) to scientific vocabulary

Last synced: 10 months ago

Repository

Basic Info

Host: GitHub
Owner: mafaves
License: MIT
Language: Jupyter Notebook
Default Branch: main
Homepage:
Size: 2.11 MB

Statistics

Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Releases: 1

Created over 1 year ago · Last pushed about 1 year ago

Metadata Files

Readme License Citation

README.md

Customizable boxplots with statistical significance bars 📊

Overview

This repository provides a Python tool for creating customizable, publication-ready boxplots with statistical significance bars. It supports multiple p-value correction methods (Bonferroni, FDR), customizable interquartile ranges, and color palettes for clearer visualization of statistical comparisons.

It now also selects the statistical test automatically, based on group count and normality:
- For 2 groups: t-test (if normal) or Mann-Whitney U (if not)
- For 3+ groups: ANOVA (if all groups normal) or Kruskal-Wallis (if not), with pairwise post-hoc tests

Features

📌 Statistical significance bars with corrected p-values

🎨 Customizable boxplots (colors, labels, and layouts)

🔬 Supports multiple p-value corrections (Bonferroni, FDR, etc.) via statsmodels.stats.multitest.multipletests

📊 Ideal for biomarker analysis and research visualization

Statistical Methods

p-value correction: The function uses multipletests from the statsmodels.stats.multitest module for multiple testing correction. To change the correction method, pass one of the method names that multipletests implements (for example "bonferroni" or "fdr_bh").
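As a rough illustration of what the correction step does, the sketch below calls multipletests directly on a handful of made-up p-values and compares two methods. The p-values are hypothetical, and this is not the package's internal code, only the underlying statsmodels call.

```python
from statsmodels.stats.multitest import multipletests

# Hypothetical raw p-values from four pairwise comparisons (illustrative only).
raw_pvalues = [0.001, 0.02, 0.03, 0.20]

results = {}
for method in ("bonferroni", "fdr_bh"):
    # multipletests returns (reject flags, corrected p-values, alphacSidak, alphacBonf).
    reject, corrected, _, _ = multipletests(raw_pvalues, alpha=0.05, method=method)
    results[method] = (reject, corrected)
    print(method, corrected, reject)
```

Bonferroni multiplies each p-value by the number of tests, while "fdr_bh" (Benjamini-Hochberg) controls the false discovery rate and is usually less conservative.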
Statistical tests:
- 2 groups:
  - If both groups are normal (Shapiro-Wilk test for n < 50, Anderson-Darling test for n > 50), a t-test (ttest_ind) is performed.
  - If not, a Mann-Whitney U test (mannwhitneyu) is used.
- 3 or more groups:
  - If all groups are normal, an ANOVA (f_oneway) is performed.
  - If not, a Kruskal-Wallis test (kruskal) is used.
  - For pairwise comparisons, a t-test or Mann-Whitney U test is chosen based on the normality of each pair.
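The selection rules above can be sketched with scipy.stats alone. This is a minimal illustration, not the package's implementation: the helper names is_normal and choose_test are hypothetical, the 5% significance level is an assumption, and samples of exactly 50 values are sent to the Anderson-Darling branch here, which the rules above leave unspecified. Pairwise post-hoc comparisons are left out for brevity.

```python
import numpy as np
from scipy import stats


def is_normal(values):
    """Return True if normality is not rejected at the 5% level (assumed level)."""
    values = np.asarray(values, dtype=float)
    if len(values) < 50:
        # Small samples: Shapiro-Wilk test.
        return stats.shapiro(values).pvalue > 0.05
    # Larger samples: Anderson-Darling test. Compare the statistic with the
    # critical value tabulated for the 5% significance level.
    result = stats.anderson(values, dist="norm")
    idx = list(result.significance_level).index(5.0)
    return result.statistic < result.critical_values[idx]


def choose_test(groups):
    """Run the test implied by group count and normality; return (name, p-value)."""
    all_normal = all(is_normal(g) for g in groups)
    if len(groups) == 2:
        if all_normal:
            return "t-test", stats.ttest_ind(*groups).pvalue
        return "Mann-Whitney U", stats.mannwhitneyu(*groups, alternative="two-sided").pvalue
    if all_normal:
        return "ANOVA", stats.f_oneway(*groups).pvalue
    return "Kruskal-Wallis", stats.kruskal(*groups).pvalue


# Illustrative data: two roughly normal groups and one skewed group.
rng = np.random.default_rng(0)
control = rng.normal(10, 1, 30)
disease = rng.normal(12, 1, 30)
skewed = rng.exponential(2.0, 30)

print(choose_test([control, disease]))
print(choose_test([control, disease, skewed]))
```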

Installation

pip install git+https://github.com/mafaves/advanced_boxplot_viz.git

Usage

An example of usage can be seen in the Jupyter notebook ./example/run.ipynb

Parameters explained

df (DataFrame): The input dataset containing biomarker values.

group_col (str): The column in df that defines groups for comparison.

biomarker_list (list of str): List of biomarker column names to analyze.

palette (dict): A dictionary mapping group labels to colors.

subplots_x (int, default=1): Number of rows in the subplot grid.

subplots_y (int, default=2): Number of columns in the subplot grid.

fig_size (tuple, default=(10,6)): Figure size in inches.

xtick_labels (list, default=["Control", "Disease"]): Labels for the x-axis.

image_name (str, default="plot.png"): Filename for saving the generated plot.

barheightfactor (float, default=0.05): Height factor for significance bars.

bartipsfactor (float, default=0.01): Reduction factor for bar tips.

ytopfactor (float, default=0.1): Scaling factor to adjust y-axis top margin.

yrangefactor (float, default=0.15): Scaling factor to adjust y-axis range.

asterisk_factor (float, default=0.02): Offset factor for asterisk positioning.

title (bool, default=True): Whether to display titles for each biomarker plot.

biomarkertitlenames (dict, optional): Custom titles for biomarkers.

y_labels (bool, default=True): Whether to display y-axis labels.

biomarkerylabel_names (dict, optional): Custom y-axis labels for biomarkers.

correctionmethod (str, default="fdr_bh"): Method for p-value correction (e.g., "bonferroni", "fdr_bh").

iqr_min (float, optional): Lower bound for interquartile range filtering.

iqr_max (float, optional): Upper bound for interquartile range filtering.

jitter_size (float, optional): Size of jitter points in the strip plot.

alpha (float, optional): Transparency level for strip plot points.

showfliers (bool, optional): Whether to display outliers in the boxplot.

omnibus results: Results of the omnibus test, reported when there are more than 2 groups.
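To give a feel for how the bar-related factors could act on a plot, here is a small matplotlib sketch that draws one significance bar above two boxes. It is an assumption-laden illustration, not the package's code: the helper names (p_to_asterisks, add_significance_bar), their keyword names, the asterisk thresholds, and the way each factor scales the y-axis range are all hypothetical choices made for this example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np


def p_to_asterisks(p):
    """Map a p-value to a conventional asterisk label (assumed thresholds)."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return "ns"


def add_significance_bar(ax, x1, x2, y_base, p, bar_height_factor=0.05,
                         bar_tips_factor=0.01, asterisk_factor=0.02):
    """Draw a bracket between x1 and x2 above y_base and label it; return bar height."""
    y_min, y_max = ax.get_ylim()
    y_range = y_max - y_min
    bar_y = y_base + bar_height_factor * y_range   # height of the horizontal bar
    tip_y = bar_y - bar_tips_factor * y_range      # lower end of the short tips
    ax.plot([x1, x1, x2, x2], [tip_y, bar_y, bar_y, tip_y], color="black", linewidth=1)
    ax.text((x1 + x2) / 2, bar_y + asterisk_factor * y_range,
            p_to_asterisks(p), ha="center", va="bottom")
    return bar_y


# Illustrative data for two groups.
rng = np.random.default_rng(0)
control = rng.normal(10, 1, 30)
disease = rng.normal(12, 1, 30)

fig, ax = plt.subplots(figsize=(4, 4))
ax.boxplot([control, disease], positions=[1, 2])
ax.set_xticks([1, 2])
ax.set_xticklabels(["Control", "Disease"])

top = max(control.max(), disease.max())
bar_y = add_significance_bar(ax, 1, 2, top, p=0.003)  # hypothetical corrected p-value

# Leave head room above the bar so the label is not clipped.
y_min, y_max = ax.get_ylim()
ax.set_ylim(top=bar_y + 0.15 * (y_max - y_min))
fig.savefig("plot.png")
```

Expressing every offset as a fraction of the current y-axis range keeps the bar and label proportionate across biomarkers measured on very different scales.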

Output example

boxplot

Acknowledgment

This code is adapted from the original implementation in "Boxplots with Significance Bars".

Owner

Name: Marcos Aguilella
Login: mafaves
Kind: user

Website: https://www.linkedin.com/in/marcos-aguilella-bioinformatician/?locale=en_US
Repositories: 1
Profile: https://github.com/mafaves

👨‍💻 I am passionate about biostatistics and artificial intelligence applied to the field of biotechnology and medicine

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Aguilella Fabregat"
  given-names: "Marcos"
  orcid: "https://orcid.org/0009-0009-2248-8127"
title: "Advanced boxplot visualization"
DOI: 10.5281/zenodo.14857137
version: 1.0.0
date-released: 2025-02-10
url: "https://github.com/mafaves/advanced_boxplot_viz"

GitHub Events

Total

Push event: 48
Create event: 3

Last Year

Push event: 48
Create event: 3

Dependencies

setup.py pypi

List *
matplotlib *
numpy *
pandas *
scipy *
seaborn *
statsmodels *
