https://github.com/bin-cao/bgolearn

[Materials & Design 2024 | npj Computational Materials 2024] Official implementation of Bgolearn


Science Score: 59.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 3 DOI reference(s) in README
  • Academic publication links
    Links to: sciencedirect.com, wiley.com, nature.com, science.org, acs.org
  • Committers with academic emails
    1 of 1 committers (100.0%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (16.3%) to scientific vocabulary

Keywords

active-learning adaptive-learning augmented-expected-improvement bayesian-global-optimization bgolearn entropy-based-approach expected-improvement knowledge-gradient least-confidence margin-sampling material-design materials materials-informatics materials-science mlmd opportunity-cost predictive-entropy-search probability-of-improvement trail-path upper-confidence-bound
Last synced: 5 months ago

Repository

[Materials & Design 2024 | npj Computational Materials 2024] Official implementation of Bgolearn

Basic Info
  • Host: GitHub
  • Owner: Bin-Cao
  • License: MIT
  • Language: Jupyter Notebook
  • Default Branch: main
  • Homepage: http://bgolearn.caobin.asia/
  • Size: 420 MB
Statistics
  • Stars: 99
  • Watchers: 5
  • Forks: 16
  • Open Issues: 0
  • Releases: 4
Topics
active-learning adaptive-learning augmented-expected-improvement bayesian-global-optimization bgolearn entropy-based-approach expected-improvement knowledge-gradient least-confidence margin-sampling material-design materials materials-informatics materials-science mlmd opportunity-cost predictive-entropy-search probability-of-improvement trail-path upper-confidence-bound
Created over 3 years ago · Last pushed 7 months ago
Metadata Files
Readme Funding License

README.md

Bgolearn

Report | Homepage | BgoFace UI

Please star this project to support open-source development! For questions or collaboration, contact: Dr. Bin Cao (bcao686@connect.hkust-gz.edu.cn)

Usage Statistics (pepy)


Overview

Bgolearn is a lightweight and extensible Python package for Bayesian global optimization, built for accelerating materials discovery and design. It provides out-of-the-box support for regression and classification tasks, implements various acquisition strategies, and offers a seamless pipeline for virtual screening, active learning, and multi-objective optimization.

  • Official PyPI: `pip install Bgolearn`
  • Code tutorial (BiliBili): Watch here
  • Colab Demo: Run it online


Download Statistics

* MultiBgolearn

Key Features

One-Line Installation

```bash
pip install Bgolearn
```

Update to Latest Version

```bash
pip install --upgrade Bgolearn
```

Quick Check

```bash
pip show Bgolearn
```


Getting Started

```python
import Bgolearn.BGOsampling as BGOS
import pandas as pd

# Load characterized dataset
data = pd.read_csv('data.csv')
x = data.iloc[:, :-1]  # features
y = data.iloc[:, -1]   # response

# Load virtual samples
vs = pd.read_csv('virtual_data.csv')

# Instantiate and fit the model
Bgolearn = BGOS.Bgolearn()
Mymodel = Bgolearn.fit(data_matrix=x, Measured_response=y, virtual_samples=vs)

# Recommend candidates using Expected Improvement
Mymodel.EI()
```


Multi-Objective Optimization

Install the extension toolkit:

```bash
pip install BgoKit
```

```python
from BgoKit import ToolKit

# vs: virtual samples; score1, score2: per-objective scores for each sample
Model = ToolKit.MultiOpt(vs, [score1, score2])
Model.BiSearch()
Model.plot_distribution()
```

See detailed demo: Multi-objective Example


Supported Algorithms

For Regression

  • Expected Improvement (EI)
  • Augmented Expected Improvement (AEI)
  • Expected Quantile Improvement (EQI)
  • Upper Confidence Bound (UCB)
  • Probability of Improvement (PI)
  • Predictive Entropy Search (PES)
  • Knowledge Gradient (KG)
  • Reinterpolation EI (REI)
  • Expected Improvement with Plugin
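As an illustration of the most common of these criteria (a minimal NumPy/SciPy sketch of the textbook EI formula for minimization, not Bgolearn's internal code):

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mean, std, best_y, minimize=True):
    """Closed-form Expected Improvement under a Gaussian predictive distribution.

    mean, std : predicted mean and standard deviation at each candidate
    best_y    : best observed response so far
    """
    imp = (best_y - mean) if minimize else (mean - best_y)
    z = imp / np.maximum(std, 1e-12)  # guard against zero predictive std
    # EI = imp * Phi(z) + std * phi(z)
    return imp * norm.cdf(z) + std * norm.pdf(z)
```

With `mean` equal to the current best and `std = 1`, EI reduces to the standard normal density at zero; candidates with lower predicted mean (for minimization) or higher uncertainty score higher.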

For Classification

  • Least Confidence
  • Margin Sampling
  • Entropy-based approach
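The three classification criteria are standard uncertainty-sampling scores; a self-contained sketch (again illustrative, not Bgolearn's internal code) on a matrix of predicted class probabilities:

```python
import numpy as np

def least_confidence(probs):
    # Uncertainty = 1 - probability of the most likely class
    return 1.0 - probs.max(axis=1)

def margin_sampling(probs):
    # Margin between the two most likely classes; smaller = more uncertain
    part = np.sort(probs, axis=1)
    return part[:, -1] - part[:, -2]

def predictive_entropy(probs):
    # Shannon entropy of the predictive distribution; larger = more uncertain
    return -(probs * np.log(probs + 1e-12)).sum(axis=1)
```

A uniform prediction maximizes all three uncertainty measures, while a confident one (e.g. 0.9/0.1) minimizes them.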

User Interface

The graphical frontend of Bgolearn is developed as BgoFace, providing no-code access to its backend algorithms.


Technical Innovations

Rich Bayesian Acquisition Functions

Supports a broad range of acquisition strategies (EI, UCB, KG, PES, etc.) for both single- and multi-objective optimization. Works well with the sparse, high-dimensional datasets common in materials science.

Multi-Objective Expansion

Use BgoKit and MultiBgolearn to implement Pareto optimization across multiple target properties (e.g., strength & ductility), enabling parallel evaluation across virtual samples.
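The core of Pareto optimization is the non-dominated filter over candidate property vectors. Independently of the BgoKit/MultiBgolearn APIs, a minimal sketch (`pareto_mask` is a hypothetical helper name):

```python
import numpy as np

def pareto_mask(scores):
    """Boolean mask of the Pareto front.

    scores : (n_samples, n_objectives) array, larger is better on every objective
    """
    scores = np.asarray(scores, dtype=float)
    mask = np.ones(len(scores), dtype=bool)
    for i in range(len(scores)):
        if not mask[i]:
            continue  # a dominated point cannot dominate anything new
        # points strictly dominated by point i drop out of the front
        dominated = np.all(scores <= scores[i], axis=1) & np.any(scores < scores[i], axis=1)
        mask &= ~dominated
    return mask
```

For strength/ductility-style trade-offs, the surviving points are exactly those for which no other candidate is at least as good on both objectives and strictly better on one.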

Integrated Active Learning

Incorporates adaptive sampling in an active learning loop (experiment → prediction → update) to accelerate optimization with fewer experiments.
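One iteration of such a loop can be sketched generically: fit a surrogate on the measured data, score the virtual candidates with an acquisition function, and return the index of the next sample to measure. This uses a lower-confidence-bound acquisition for minimization; `select_next`, `fit_predict`, and `kappa` are illustrative names, not Bgolearn's API:

```python
import numpy as np

def select_next(fit_predict, X_train, y_train, X_pool, kappa=2.0):
    """One active-learning iteration under a user-supplied surrogate.

    fit_predict : callable (X_train, y_train, X_pool) -> (mean, std) on the pool
    kappa       : exploration weight; 0 = pure exploitation of the mean
    """
    mean, std = fit_predict(X_train, y_train, X_pool)
    lcb = mean - kappa * std  # lower confidence bound (minimization)
    return int(np.argmin(lcb))
```

The chosen candidate is measured, appended to the training set, and the loop repeats until the budget is exhausted or the target property is reached.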


Academic Impact

2025

  1. Nano Letters: Self-Driving Laboratory under UHV Link

  2. Small: ML-Engineered Nanozyme System for Anti-Tumor Therapy Link

  3. Computational Materials Science: Mg-Ca-Zn Alloy Optimization Link

  4. Measurement: Foaming Agent Optimization in EPB Shield Construction Link

  5. Intelligent Computing: Metasurface Design via Bayesian Learning Link

2024

  1. Materials & Design: Lead-Free Solder Alloys via Active Learning Link

  2. npj Computational Materials: MLMD Platform with Bgolearn Backend Link


License

Released under the MIT License. Free for academic and commercial use. Please cite relevant publications if used in research.


Contributing & Collaboration

We welcome community contributions and research collaborations:

  • Submit issues for bug reports, ideas, or suggestions
  • Submit pull requests for code contributions
  • Contact Bin Cao (bcao686@connect.hkust-gz.edu.cn) for collaborations

```python
Signature:
Bgolearn.fit(
    data_matrix,
    Measured_response,
    virtual_samples,
    Mission='Regression',
    Classifier='GaussianProcess',
    noise_std=None,
    Kriging_model=None,
    opt_num=1,
    min_search=True,
    CVtest=False,
    Dynamic_W=False,
    seed=42,
)

================================================================

:param data_matrix: data matrix of the training dataset, X.

:param Measured_response: response of the training dataset, y.

:param virtual_samples: designed virtual samples.

:param Mission: str, default 'Regression', the mission of optimization. Mission = 'Regression' or 'Classification'

:param Classifier: used when Mission == 'Classification'. If the user doesn't supply one, Bgolearn calls a pre-set classifier; default Classifier = 'GaussianProcess', i.e., Gaussian Process Classifier. Five classifiers are pre-set in Bgolearn:
    'GaussianProcess' --> Gaussian Process Classifier (default)
    'LogisticRegression' --> Logistic Regression
    'NaiveBayes' --> Naive Bayes Classifier
    'SVM' --> Support Vector Machine Classifier
    'RandomForest' --> Random Forest Classifier

:param noise_std: float or ndarray of shape (n_samples,), default=None. Value added to the diagonal of the kernel matrix during fitting. This can prevent a potential numerical issue during fitting by ensuring that the calculated values form a positive definite matrix. It can also be interpreted as the variance of additional Gaussian measurement noise on the training observations.

    if noise_std is None, a noise value will be estimated by maximum likelihood
    on the training dataset.

:param Kriging_model: str or callable, default None. Kriging_model = 'SVM', 'RF', 'AdaB', or 'MLP' selects a pre-set machine learning model: Support Vector Machine (SVM), Random Forest (RF), AdaBoost (AdaB), or Multi-Layer Perceptron (MLP); for these, the estimated uncertainty is determined by bootstrap sampling. Alternatively, pass a user-defined callable Kriging model exposing a fit_pre attribute. If the user doesn't supply one, Bgolearn calls a pre-set Kriging model.
    attribute: input -> xtrain, ytrain, xtest ; output -> predicted mean and std of xtest

    e.g. (take GaussianProcessRegressor in sklearn):
    class Kriging_model(object):
        def fit_pre(self, xtrain, ytrain, xtest):
            # instantiate the model
            kernel = RBF()
            model = GaussianProcessRegressor(kernel=kernel).fit(xtrain, ytrain)
            # define the attribute's outputs
            mean, std = model.predict(xtest, return_std=True)
            return mean, std

    e.g. (MultiModels estimations):
    class Kriging_model(object):
        def fit_pre(self, xtrain, ytrain, xtest):
            # instantiate the models
            # model_1, model_2, model_3 can be changed to any ML models you desire
            pre_1 = SVR(C=10).fit(xtrain, ytrain).predict(xtest)  # model_1
            pre_2 = SVR(C=50).fit(xtrain, ytrain).predict(xtest)  # model_2
            pre_3 = SVR(C=80).fit(xtrain, ytrain).predict(xtest)  # model_3
            # define the attribute's outputs
            stacked_array = np.vstack((pre_1, pre_2, pre_3))
            means = np.mean(stacked_array, axis=0)
            stds = np.sqrt(np.var(stacked_array, axis=0))
            return means, stds

:param opt_num: the number of recommended candidates for next iteration, default 1.

:param min_search: default True -> searching the global minimum ; False -> searching the global maximum.

:param CVtest: 'LOOCV' or an int, default False (skip the test). If CVtest = 'LOOCV', leave-one-out cross-validation is applied; if CVtest is an int, e.g., CVtest = 10, 10-fold cross-validation is applied.

:return: 1: array; potential of each candidate. 2: array/float; recommended candidate(s).

File: ~/miniconda3/lib/python3.9/site-packages/Bgolearn/BGOsampling.py
Type: method
```
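The docstring above notes that non-GP surrogates ('SVM', 'RF', 'AdaB', 'MLP') obtain their predictive uncertainty from bootstrap sampling. A generic sketch of that idea (`bootstrap_mean_std`, the RandomForestRegressor surrogate, and `n_boot` are illustrative choices, not Bgolearn's actual internals):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.utils import resample

def bootstrap_mean_std(xtrain, ytrain, xtest, n_boot=20, seed=0):
    """Estimate predictive mean and std by refitting on bootstrap resamples."""
    rng = np.random.RandomState(seed)
    preds = []
    for _ in range(n_boot):
        # draw a bootstrap resample of the training data and refit
        xb, yb = resample(xtrain, ytrain, random_state=rng)
        model = RandomForestRegressor(n_estimators=50, random_state=0).fit(xb, yb)
        preds.append(model.predict(xtest))
    preds = np.asarray(preds)
    # spread across bootstrap refits serves as the uncertainty estimate
    return preds.mean(axis=0), preds.std(axis=0)
```

The returned `(mean, std)` pair matches the `fit_pre` contract described above, so a wrapper of this shape could serve as a user-defined Kriging model.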

Owner

  • Name: 曹斌 | Bin CAO
  • Login: Bin-Cao
  • Kind: user
  • Location: Shanghai
  • Company: Shanghai University

Machine learning | Materials Informatics | Mechanics

GitHub Events

Total
  • Issues event: 4
  • Watch event: 24
  • Issue comment event: 11
  • Push event: 8
  • Fork event: 2
Last Year
  • Issues event: 4
  • Watch event: 24
  • Issue comment event: 11
  • Push event: 8
  • Fork event: 2

Committers

Last synced: almost 3 years ago

All Time
  • Total Commits: 78
  • Total Committers: 1
  • Avg Commits per committer: 78.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Bin Cao b****o@s****n 78
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 3
  • Total pull requests: 1
  • Average time to close issues: 7 months
  • Average time to close pull requests: about 17 hours
  • Total issue authors: 2
  • Total pull request authors: 1
  • Average comments per issue: 4.67
  • Average comments per pull request: 0.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 2
  • Pull requests: 0
  • Average time to close issues: 28 minutes
  • Average time to close pull requests: N/A
  • Issue authors: 1
  • Pull request authors: 0
  • Average comments per issue: 1.5
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • biebersong (2)
  • zifengdexiatian (1)
Pull Request Authors
  • Asachoo (1)
Top Labels
Issue Labels
Noise input (1)
Pull Request Labels

Packages

  • Total packages: 4
  • Total downloads:
    • pypi 529 last-month
  • Total dependent packages: 0
    (may contain duplicates)
  • Total dependent repositories: 0
    (may contain duplicates)
  • Total versions: 53
  • Total maintainers: 1
pypi.org: bgolearn

A Bayesian global optimization package for material design

  • Versions: 36
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 431 Last month
Rankings
Dependent packages count: 6.6%
Stargazers count: 12.9%
Forks count: 14.5%
Average: 17.0%
Downloads: 20.3%
Dependent repos count: 30.6%
Maintainers (1)
Last synced: 6 months ago
pypi.org: bgokit

A tool package for Bgolearn

  • Versions: 6
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 68 Last month
Rankings
Stargazers count: 9.8%
Dependent packages count: 10.1%
Downloads: 11.9%
Forks count: 11.9%
Average: 22.2%
Dependent repos count: 67.1%
Maintainers (1)
Last synced: 6 months ago
pypi.org: vsgenerator

Dynamic Virtual Space generation neural Network.

  • Versions: 4
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 15 Last month
Rankings
Dependent packages count: 9.9%
Average: 37.5%
Dependent repos count: 65.2%
Maintainers (1)
Last synced: 6 months ago
pypi.org: wpemphase

Crystallographic Phase Identifier of Convolutional self-Attention Neural Network

  • Versions: 7
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 15 Last month
Rankings
Dependent packages count: 10.0%
Average: 37.8%
Dependent repos count: 65.7%
Maintainers (1)
Last synced: 6 months ago