advanced_boxplot_viz
Python code for creating boxplots with statistical significance bars. Supports multiple p-value corrections (Bonferroni, FDR), customizable interquartile ranges, and color palettes. Suited to biomarker analysis, statistical comparisons, and scientific research.
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (11.3%) to scientific vocabulary
Repository
Basic Info
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 1
Metadata Files
README.md
Customizable boxplots with statistical significance bars 📊
Overview
This repository provides a Python tool for creating customizable, publication-ready boxplots with statistical significance bars. It supports multiple p-value correction methods (Bonferroni, FDR), customizable interquartile ranges, and color palettes for clearer visualization of statistical comparisons.
The statistical test is selected automatically based on the number of groups and on normality:
- For 2 groups: t-test (if normal) or Mann-Whitney U (if not)
- For 3+ groups: ANOVA (if all groups normal) or Kruskal-Wallis (if not), with pairwise post-hoc tests
Features
📌 Statistical significance bars with corrected p-values
🎨 Customizable boxplots (colors, labels, and layouts)
🔬 Supports multiple p-value corrections (Bonferroni, FDR, etc.) through multipletests from statsmodels.stats.multitest
📊 Ideal for biomarker analysis and research visualization
Statistical Methods
- p-value correction: The function uses the multipletests method from the statsmodels.stats.multitest library for multiple testing correction. To modify the correction method, ensure that it is one of the methods implemented in multipletests.
- Statistical tests:
  - 2 groups:
    - If both groups are normal (Shapiro-Wilk test for n < 50, Anderson test for n > 50), a t-test (ttest_ind) is performed.
    - If not, a Mann-Whitney U test (mannwhitneyu) is used.
  - 3 or more groups:
    - If all groups are normal, an ANOVA (f_oneway) is performed.
    - If not, a Kruskal-Wallis test (kruskal) is used.
    - For pairwise comparisons, a t-test or Mann-Whitney U test is chosen based on normality for each pair.
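The selection rules above can be sketched with scipy.stats as follows. This is a minimal illustration, not the repository's code: the helper names is_normal and choose_test, the 0.05 threshold, and the handling of exactly n = 50 (sent to the Anderson branch here) are all assumptions.

```python
import numpy as np
from scipy import stats

ALPHA = 0.05  # assumed significance level for the normality checks


def is_normal(values):
    """Hypothetical helper: Shapiro-Wilk for n < 50, Anderson-Darling otherwise."""
    values = np.asarray(values, dtype=float)
    if len(values) < 50:
        return bool(stats.shapiro(values).pvalue > ALPHA)
    result = stats.anderson(values, dist="norm")
    # Anderson-Darling reports critical values; use the one at the 5% level.
    idx = int(np.argmin(np.abs(np.asarray(result.significance_level) - 5.0)))
    return bool(result.statistic < result.critical_values[idx])


def choose_test(groups):
    """Hypothetical helper: return (test_name, p_value) for a list of samples."""
    if len(groups) < 2:
        raise ValueError("at least two groups are required")
    all_normal = all(is_normal(g) for g in groups)
    if len(groups) == 2:
        if all_normal:
            name, res = "ttest_ind", stats.ttest_ind(*groups)
        else:
            name, res = "mannwhitneyu", stats.mannwhitneyu(
                *groups, alternative="two-sided"
            )
    elif all_normal:
        name, res = "f_oneway", stats.f_oneway(*groups)
    else:
        name, res = "kruskal", stats.kruskal(*groups)
    return name, float(res.pvalue)


# Deterministic demo data: evenly spaced normal quantiles look normal,
# and their exponential is strongly right-skewed.
q = stats.norm.ppf(np.linspace(0.02, 0.98, 30))
normal_a, normal_b, normal_c = q, q + 1.0, 2.0 * q + 0.5
skewed = np.exp(2.0 * q)

print(choose_test([normal_a, normal_b])[0])
print(choose_test([normal_a, skewed])[0])
print(choose_test([normal_a, normal_b, normal_c])[0])
print(choose_test([normal_a, normal_b, skewed])[0])
```

For three or more groups, the same two-group branch could be reused on each pair to obtain the pairwise post-hoc p-values.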
Installation
pip install git+https://github.com/mafaves/advanced_boxplot_viz.git
Usage
An example of usage can be seen in the Jupyter notebook ./example/run.ipynb
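As a rough guide to the inputs described under "Parameters explained", the sketch below builds a small DataFrame, a group column name, a biomarker list, and a palette. The column names and values are made up for illustration, and the plotting function itself is not called here because its exact name is only shown in the notebook.

```python
import pandas as pd

# Illustrative data only; column names and values are invented.
df = pd.DataFrame({
    "group": ["Control"] * 4 + ["Disease"] * 4,
    "IL6": [1.2, 0.9, 1.5, 1.1, 2.8, 3.1, 2.4, 3.6],
    "CRP": [0.4, 0.6, 0.5, 0.7, 1.9, 2.2, 1.4, 2.0],
})

group_col = "group"                # column that defines the groups
biomarker_list = ["IL6", "CRP"]    # numeric columns to plot
palette = {"Control": "#4C72B0", "Disease": "#DD8452"}  # group label -> color

# Sanity checks worth running before plotting.
missing = [c for c in [group_col, *biomarker_list] if c not in df.columns]
unmapped = set(df[group_col].unique()) - set(palette)
print("missing columns:", missing)
print("groups without a color:", unmapped)
```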
Parameters explained
df (DataFrame): The input dataset containing biomarker values.
group_col (str): The column in df that defines groups for comparison.
biomarker_list (list of str): List of biomarker column names to analyze.
palette (dict): A dictionary mapping group labels to colors.
subplots_x (int, default=1): Number of rows in the subplot grid.
subplots_y (int, default=2): Number of columns in the subplot grid.
fig_size (tuple, default=(10,6)): Figure size in inches.
xtick_labels (list, default=["Control", "Disease"]): Labels for the x-axis.
image_name (str, default="plot.png"): Filename for saving the generated plot.
barheightfactor (float, default=0.05): Height factor for significance bars.
bartipsfactor (float, default=0.01): Reduction factor for bar tips.
ytopfactor (float, default=0.1): Scaling factor to adjust y-axis top margin.
yrangefactor (float, default=0.15): Scaling factor to adjust y-axis range.
asterisk_factor (float, default=0.02): Offset factor for asterisk positioning.
title (bool, default=True): Whether to display titles for each biomarker plot.
biomarkertitlenames (dict, optional): Custom titles for biomarkers.
y_labels (bool, default=True): Whether to display y-axis labels.
biomarkerylabel_names (dict, optional): Custom y-axis labels for biomarkers.
correctionmethod (str, default="fdr_bh"): Method for p-value correction (e.g., "bonferroni", "fdr_bh").
iqr_min (float, optional): Lower bound for interquartile range filtering.
iqr_max (float, optional): Upper bound for interquartile range filtering.
jitter_size (float, optional): Size of jitter points in the strip plot.
alpha (float, optional): Transparency level for strip plot points.
showfliers (bool, optional): Whether to display outliers in the boxplot.
omnibus results: Results of the omnibus test, reported when the number of groups is greater than 2.
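To show what the correction method changes in practice, the sketch below applies multipletests from statsmodels.stats.multitest to the same raw p-values with "bonferroni" and "fdr_bh". The three raw p-values are invented for illustration, and how the repository passes its own p-values to multipletests is not shown here.

```python
import numpy as np
from statsmodels.stats.multitest import multipletests

# Invented raw p-values, e.g. from three pairwise comparisons.
raw_p = [0.01, 0.04, 0.03]

corrected = {}
for method in ("bonferroni", "fdr_bh"):
    # multipletests returns (reject, corrected p-values, Sidak alpha, Bonferroni alpha).
    reject, p_adj, _, _ = multipletests(raw_p, alpha=0.05, method=method)
    corrected[method] = p_adj
    print(method, np.round(p_adj, 3), reject)
```

With these inputs, Bonferroni multiplies each p-value by the number of tests (0.03, 0.12, 0.09), so only the first comparison stays below 0.05, while Benjamini-Hochberg gives 0.03, 0.04, 0.04 and keeps all three significant.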
Output example

Acknowledgment
This code is modified from the original implementation found at Boxplots with Significance Bars
Owner
- Name: Marcos Aguilella
- Login: mafaves
- Kind: user
- Website: https://www.linkedin.com/in/marcos-aguilella-bioinformatician/?locale=en_US
- Repositories: 1
- Profile: https://github.com/mafaves
👨‍💻 I am passionate about biostatistics and artificial intelligence applied to the field of biotechnology and medicine
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Aguiella Fabregat"
    given-names: "Marcos"
    orcid: "https://orcid.org/0009-0009-2248-8127"
title: "Advanced boxplot visualization"
DOI: 10.5281/zenodo.14857137
version: 1.0.0
date-released: 2025-02-10
url: "https://github.com/mafaves/advanced_boxplot_viz"
GitHub Events
Total
- Push event: 48
- Create event: 3
Last Year
- Push event: 48
- Create event: 3
Dependencies
- List *
- matplotlib *
- numpy *
- pandas *
- scipy *
- seaborn *
- statsmodels *