Science Score: 57.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 8 DOI reference(s) in README
- ✓ Academic publication links: Links to: zenodo.org
- ○ Academic email domains
- ✓ Institutional organization owner: Organization hsu-hpc has institutional domain (www.hsu-hh.de)
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (14.3%) to scientific vocabulary
Keywords
Repository
Batch-Effect Reduction Trees
Basic Info
- Host: GitHub
- Owner: HSU-HPC
- License: gpl-3.0
- Language: R
- Default Branch: main
- Homepage: https://link.springer.com/article/10.1038/s41467-025-62237-4
- Size: 174 KB
Statistics
- Stars: 3
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 3
Topics
Metadata Files
README.md
BERT: Batch-Effect Reduction Trees
Data from high-throughput technologies that assess global patterns of biomolecules (omic data) are often afflicted with missing values and measurement-specific biases (batch-effects) that hinder the quantitative comparison of independently acquired datasets. This repository provides the BERT algorithm, a high-performance method for data integration of incomplete omic profiles.
[!IMPORTANT] This repository is primarily intended for development purposes. For typical users, BERT is provided via Bioconductor. Note that repository badges refer to the release version of BERT, which may be multiple commits behind the source code provided here. The latest CI/CD results for BERT may be obtained here.
[!WARNING] The R package provided here is neither affiliated with nor related to Bidirectional Encoder Representations from Transformers as published by Devlin et al in 2019 (arXiv:1810.04805).
Installation
[!TIP] It is recommended to install BERT via Bioconductor as described here.
For development purposes, the BERT package can be installed directly from this repository using devtools.
```R
if (!require("devtools", quietly = TRUE))
    install.packages("devtools")
if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install(c('S4Vectors', 'S4Arrays', 'XVector', 'genefilter', 'SparseArray'))
devtools::install_github('HSU-HPC/BERT')
```
Please compare the installed version of R to the required version for Bioconductor and install all build dependencies if compilation from source is required for your target[^1].
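The version comparison mentioned above can be done from the R console. The following is a minimal sketch that assumes the minimum of R 4.3.0 listed under "Dependencies" on this page still applies; the current Bioconductor release may require a newer R version, so treat the threshold as a placeholder.

```R
# Minimum R version taken from the dependency list on this page
# (assumption: the current Bioconductor release may demand a newer one)
required_version <- "4.3.0"

# getRversion() returns the running R version as a comparable version object
installed_version <- getRversion()
meets_requirement <- installed_version >= required_version

message(
    "Installed R version: ", as.character(installed_version),
    if (meets_requirement) " (meets the listed minimum)" else " (older than the listed minimum)"
)
```

If the check reports an older version, updating R before running `BiocManager::install()` avoids installing packages for a mismatched Bioconductor release.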
Usage
The BERT library is designed to offer high user friendliness whilst providing maximum flexibility. The following example demonstrates how to use the software on a simulated dataset with batch-effects and missing values:
```R
# import library
library(BERT)
# simulate dataset with 10% missing values
dataset_raw <- generate_dataset(features=60, batches=10, samplesperbatch=10, mvstmt=0.1, classes=2)
# apply BERT with default arguments
dataset_corrected <- BERT(dataset_raw)
```
[!TIP] A detailed explanation of all available parameters, their default values and optimal configurations for typical scenarios can be found in the Bioconductor vignette.
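To make the terms "batch-effects" and "missing values" concrete, the sketch below builds a toy dataset in base R and removes an additive batch offset by centring each feature within each batch. This is a deliberately naive stand-in for illustration only, not the BERT algorithm; all variable names and the offset of 3 are assumptions chosen for the example.

```R
set.seed(1)

n_per_batch <- 10
n_features  <- 5

# Two batches measure the same features; batch 2 carries an additive offset of +3
batch  <- rep(c(1, 2), each = n_per_batch)
values <- matrix(rnorm(2 * n_per_batch * n_features), ncol = n_features)
values[batch == 2, ] <- values[batch == 2, ] + 3

# Remove 10% of all values at random to mimic incomplete profiles
missing <- sample(length(values), size = round(0.1 * length(values)))
values[missing] <- NA

# Naive correction: centre each feature within each batch, ignoring missing values
centred <- values
for (b in unique(batch)) {
    rows <- batch == b
    batch_means <- colMeans(values[rows, ], na.rm = TRUE)
    centred[rows, ] <- sweep(values[rows, ], 2, batch_means)
}

# Difference between the batch means before and after centring
gap_before <- mean(values[batch == 2, ], na.rm = TRUE) - mean(values[batch == 1, ], na.rm = TRUE)
gap_after  <- mean(centred[batch == 2, ], na.rm = TRUE) - mean(centred[batch == 1, ], na.rm = TRUE)
```

Here `gap_before` reflects the simulated offset, while `gap_after` is zero up to floating-point error. The missing entries remain missing after centring, so the correction does not impute values.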
Support
Users may ask for assistance via the Bioconductor support site. Bug reports may be filed via the Issues tab of this repository. For confidential or security-related problems, please send an email to ju [dot] neumann [at] uke [dot] de or philipp [dot] neumann [at] desy [dot] de.
[!WARNING] As of October 2025, this repository is no longer actively maintained.
License
This code is published under the GPLv3.0 License.
References
Citations make research visible. If you use BERT for your research, please cite the following publication:
- Schumann, Y., Gocke, A., Neumann, J.E. Computational Methods for Data Integration and Imputation of Missing Values in Omics Datasets. PROTEOMICS, Wiley (2024-12). https://doi.org/10.1002/pmic.202400100
- Schumann, Y., Schlumbohm, S., Neumann, J.E. et al. High performance data integration for large-scale analyses of incomplete Omic profiles using Batch-Effect Reduction Trees (BERT). Nat Commun 16, 7104 (2025). https://doi.org/10.1038/s41467-025-62237-4
[^1]: On Ubuntu 24.04, a complete list of dependencies would be: wget, curl, build-essential, libssl-dev, libcurl4-openssl-dev, pkg-config, git, ca-certificates, libxml2, libxml2-dev, gnupg, software-properties-common, libfontconfig1-dev, libharfbuzz-dev, libfribidi-dev, libfreetype6-dev, libpng-dev, libtiff5-dev, libjpeg-dev
Owner
- Name: Chair for High Performance Computing
- Login: HSU-HPC
- Kind: organization
- Email: philipp.neumann@hsu-hh.de
- Location: Hamburg, Germany
- Website: https://www.hsu-hh.de/hpc/en/
- Repositories: 4
- Profile: https://github.com/HSU-HPC
GitHub Events
Total
- Release event: 2
- Watch event: 1
- Push event: 13
- Pull request event: 1
- Create event: 2
Last Year
- Release event: 2
- Watch event: 1
- Push event: 13
- Pull request event: 1
- Create event: 2
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 0
- Total pull requests: 1
- Average time to close issues: N/A
- Average time to close pull requests: less than a minute
- Total issue authors: 0
- Total pull request authors: 1
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 1
- Average time to close issues: N/A
- Average time to close pull requests: less than a minute
- Issue authors: 0
- Pull request authors: 1
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
- deryannis (2)
Top Labels
Issue Labels
Pull Request Labels
Packages
- Total packages: 1
- Total downloads:
  - bioconductor: 3,532 total
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 3
- Total maintainers: 1
bioconductor.org: BERT
High Performance Data Integration for Large-Scale Analyses of Incomplete Omic Profiles Using Batch-Effect Reduction Trees (BERT)
- Homepage: https://github.com/HSU-HPC/BERT/
- Documentation: https://bioconductor.org/packages/release/bioc/vignettes/BERT/inst/doc/BERT.pdf
- License: GPL-3
- Latest release: 1.4.0 (published 10 months ago)
Rankings
Maintainers (1)
Dependencies
- R >= 4.3.0 depends
- Rmpi * enhances
- doMPI * enhances
- BiocStyle * imports
- SummarizedExperiment * imports
- cluster * imports
- comprehenr * imports
- doParallel >= 1.0.17 imports
- foreach >= 1.5.2 imports
- invgamma * imports
- iterators >= 1.0.14 imports
- janitor >= 2.2.0 imports
- limma >= 3.46.0 imports
- logging >= 0.10 imports
- methods * imports
- parallel * imports
- sva >= 3.38.0 imports
- utils * imports
- knitr * suggests
- rmarkdown * suggests
- testthat >= 3.0.0 suggests