https://github.com/bin-cao/bgolearn

[Materials & Design 2024 | npj Computational Materials 2024] Official implementation of Bgolearn


Science Score: 59.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 3 DOI reference(s) in README
  • Academic publication links
    Links to: sciencedirect.com, wiley.com, nature.com, science.org, acs.org
  • Committers with academic emails
    1 of 1 committers (100.0%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (16.3%) to scientific vocabulary

Keywords

active-learning adaptive-learning augmented-expected-improvement bayesian-global-optimization bgolearn entropy-based-approach expected-improvement knowledge-gradient least-confidence margin-sampling material-design materials materials-informatics materials-science mlmd opportunity-cost predictive-entropy-search probability-of-improvement trail-path upper-confidence-bound
Last synced: 5 months ago

Repository

[Materials & Design 2024 | npj Computational Materials 2024] Official implementation of Bgolearn

Basic Info
  • Host: GitHub
  • Owner: Bin-Cao
  • License: MIT
  • Language: Jupyter Notebook
  • Default Branch: main
  • Homepage: http://bgolearn.caobin.asia/
  • Size: 420 MB
Statistics
  • Stars: 99
  • Watchers: 5
  • Forks: 16
  • Open Issues: 0
  • Releases: 4
Topics
active-learning adaptive-learning augmented-expected-improvement bayesian-global-optimization bgolearn entropy-based-approach expected-improvement knowledge-gradient least-confidence margin-sampling material-design materials materials-informatics materials-science mlmd opportunity-cost predictive-entropy-search probability-of-improvement trail-path upper-confidence-bound
Created over 3 years ago · Last pushed 7 months ago
Metadata Files
Readme Funding License

README.md

Bgolearn

Report | Homepage | BgoFace UI

Please star this project to support open-source development! For questions or collaboration, contact: Dr. Bin Cao (bcao686@connect.hkust-gz.edu.cn)

Usage Statistics (pepy)


Overview

Bgolearn is a lightweight and extensible Python package for Bayesian global optimization, built for accelerating materials discovery and design. It provides out-of-the-box support for regression and classification tasks, implements various acquisition strategies, and offers a seamless pipeline for virtual screening, active learning, and multi-objective optimization.

  • Official PyPI: `pip install Bgolearn`
  • Code tutorial (BiliBili): Watch here
  • Colab Demo: Run it online


Download Statistics

* MultiBgolearn

Key Features

One-Line Installation

```bash
pip install Bgolearn
```

Update to Latest Version

```bash
pip install --upgrade Bgolearn
```

Quick Check

```bash
pip show Bgolearn
```


Getting Started

```python
import Bgolearn.BGOsampling as BGOS
import pandas as pd

# Load characterized dataset
data = pd.read_csv('data.csv')
x = data.iloc[:, :-1]  # features
y = data.iloc[:, -1]   # response

# Load virtual samples
vs = pd.read_csv('virtual_data.csv')

# Instantiate and fit the model
Bgolearn = BGOS.Bgolearn()
Mymodel = Bgolearn.fit(data_matrix=x, Measured_response=y, virtual_samples=vs)

# Recommend candidates using Expected Improvement
Mymodel.EI()
```


Multi-Objective Optimization

Install the extension toolkit:

```bash
pip install BgoKit
```

```python
from BgoKit import ToolKit

# vs: virtual samples; score1, score2: per-objective scores for each sample
Model = ToolKit.MultiOpt(vs, [score1, score2])
Model.BiSearch()
Model.plot_distribution()
```

See detailed demo: Multi-objective Example


Supported Algorithms

For Regression

  • Expected Improvement (EI)
  • Augmented Expected Improvement (AEI)
  • Expected Quantile Improvement (EQI)
  • Upper Confidence Bound (UCB)
  • Probability of Improvement (PI)
  • Predictive Entropy Search (PES)
  • Knowledge Gradient (KG)
  • Reinterpolation EI (REI)
  • Expected Improvement with Plugin
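As an illustration of the most common of these criteria (a minimal NumPy/SciPy sketch of the textbook EI formula for minimization, not Bgolearn's internal code):

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mean, std, best_y, minimize=True):
    """Closed-form Expected Improvement under a Gaussian predictive distribution.

    mean, std : predicted mean and standard deviation at each candidate
    best_y    : best observed response so far
    """
    imp = (best_y - mean) if minimize else (mean - best_y)
    z = imp / np.maximum(std, 1e-12)  # guard against zero predictive std
    # EI = imp * Phi(z) + std * phi(z)
    return imp * norm.cdf(z) + std * norm.pdf(z)
```

With `mean` equal to the current best and `std = 1`, EI reduces to the standard normal density at zero; candidates with lower predicted mean (for minimization) or higher uncertainty score higher.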

For Classification

  • Least Confidence
  • Margin Sampling
  • Entropy-based approach
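The three classification criteria are standard uncertainty-sampling scores; a self-contained sketch (again illustrative, not Bgolearn's internal code) on a matrix of predicted class probabilities:

```python
import numpy as np

def least_confidence(probs):
    # Uncertainty = 1 - probability of the most likely class
    return 1.0 - probs.max(axis=1)

def margin_sampling(probs):
    # Margin between the two most likely classes; smaller = more uncertain
    part = np.sort(probs, axis=1)
    return part[:, -1] - part[:, -2]

def predictive_entropy(probs):
    # Shannon entropy of the predictive distribution; larger = more uncertain
    return -(probs * np.log(probs + 1e-12)).sum(axis=1)
```

A uniform prediction maximizes all three uncertainty measures, while a confident one (e.g. 0.9/0.1) minimizes them.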

User Interface

The graphical frontend of Bgolearn is developed as BgoFace, providing no-code access to its backend algorithms.


Technical Innovations

Rich Bayesian Acquisition Functions

Supports a broad range of acquisition strategies (EI, UCB, KG, PES, etc.) for both single- and multi-objective optimization. Works well with the sparse, high-dimensional datasets common in materials science.

Multi-Objective Expansion

Use BgoKit and MultiBgolearn to implement Pareto optimization across multiple target properties (e.g., strength & ductility), enabling parallel evaluation across virtual samples.
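The core of Pareto optimization is the non-dominated filter over candidate property vectors. Independently of the BgoKit/MultiBgolearn APIs, a minimal sketch (`pareto_mask` is a hypothetical helper name):

```python
import numpy as np

def pareto_mask(scores):
    """Boolean mask of the Pareto front.

    scores : (n_samples, n_objectives) array, larger is better on every objective
    """
    scores = np.asarray(scores, dtype=float)
    mask = np.ones(len(scores), dtype=bool)
    for i in range(len(scores)):
        if not mask[i]:
            continue  # a dominated point cannot dominate anything new
        # points strictly dominated by point i drop out of the front
        dominated = np.all(scores <= scores[i], axis=1) & np.any(scores < scores[i], axis=1)
        mask &= ~dominated
    return mask
```

For strength/ductility-style trade-offs, the surviving points are exactly those for which no other candidate is at least as good on both objectives and strictly better on one.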

Integrated Active Learning

Incorporates adaptive sampling in an active learning loop (experiment → prediction → update) to accelerate optimization with fewer experiments.
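One iteration of such a loop can be sketched generically: fit a surrogate on the measured data, score the virtual candidates with an acquisition function, and return the index of the next sample to measure. This uses a lower-confidence-bound acquisition for minimization; `select_next`, `fit_predict`, and `kappa` are illustrative names, not Bgolearn's API:

```python
import numpy as np

def select_next(fit_predict, X_train, y_train, X_pool, kappa=2.0):
    """One active-learning iteration under a user-supplied surrogate.

    fit_predict : callable (X_train, y_train, X_pool) -> (mean, std) on the pool
    kappa       : exploration weight; 0 = pure exploitation of the mean
    """
    mean, std = fit_predict(X_train, y_train, X_pool)
    lcb = mean - kappa * std  # lower confidence bound (minimization)
    return int(np.argmin(lcb))
```

The chosen candidate is measured, appended to the training set, and the loop repeats until the budget is exhausted or the target property is reached.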


Academic Impact

2025

  1. Nano Letters: Self-Driving Laboratory under UHV Link

  2. Small: ML-Engineered Nanozyme System for Anti-Tumor Therapy Link

  3. Computational Materials Science: Mg-Ca-Zn Alloy Optimization Link

  4. Measurement: Foaming Agent Optimization in EPB Shield Construction Link

  5. Intelligent Computing: Metasurface Design via Bayesian Learning Link

2024

  1. Materials & Design: Lead-Free Solder Alloys via Active Learning Link

  2. npj Computational Materials: MLMD Platform with Bgolearn Backend Link


License

Released under the MIT License. Free for academic and commercial use. Please cite relevant publications if used in research.


Contributing & Collaboration

We welcome community contributions and research collaborations:

  • Submit issues for bug reports, ideas, or suggestions
  • Submit pull requests for code contributions
  • Contact Bin Cao (bcao686@connect.hkust-gz.edu.cn) for collaborations

```python
Signature:
Bgolearn.fit(
    data_matrix,
    Measured_response,
    virtual_samples,
    Mission='Regression',
    Classifier='GaussianProcess',
    noise_std=None,
    Kriging_model=None,
    opt_num=1,
    min_search=True,
    CVtest=False,
    Dynamic_W=False,
    seed=42,
)

================================================================

:param data_matrix: data matrix of the training dataset, X.

:param Measured_response: response of the training dataset, y.

:param virtual_samples: designed virtual samples.

:param Mission: str, default 'Regression', the mission of optimization. Mission = 'Regression' or 'Classification'

:param Classifier: used when Mission == 'Classification'. If the user doesn't supply one, Bgolearn calls a pre-set classifier; default Classifier = 'GaussianProcess', i.e., Gaussian Process Classifier. Five classifiers are pre-set in Bgolearn:
    'GaussianProcess' --> Gaussian Process Classifier (default)
    'LogisticRegression' --> Logistic Regression
    'NaiveBayes' --> Naive Bayes Classifier
    'SVM' --> Support Vector Machine Classifier
    'RandomForest' --> Random Forest Classifier

:param noise_std: float or ndarray of shape (n_samples,), default=None. Value added to the diagonal of the kernel matrix during fitting. This can prevent a potential numerical issue during fitting by ensuring that the calculated values form a positive definite matrix. It can also be interpreted as the variance of additional Gaussian measurement noise on the training observations.

    if noise_std is None, a noise value will be estimated by maximum likelihood
    on the training dataset.

:param Kriging_model: str or callable, default None. Kriging_model = 'SVM', 'RF', 'AdaB', or 'MLP' selects a pre-set machine learning model: Support Vector Machine (SVM), Random Forest (RF), AdaBoost (AdaB), or Multi-Layer Perceptron (MLP); for these, the estimated uncertainty is determined by bootstrap sampling. Alternatively, pass a user-defined callable Kriging model exposing a fit_pre attribute. If the user doesn't supply one, Bgolearn calls a pre-set Kriging model.
    attribute: input -> xtrain, ytrain, xtest ; output -> predicted mean and std of xtest

    e.g. (take GaussianProcessRegressor in sklearn):
    class Kriging_model(object):
        def fit_pre(self, xtrain, ytrain, xtest):
            # instantiate the model
            kernel = RBF()
            model = GaussianProcessRegressor(kernel=kernel).fit(xtrain, ytrain)
            # define the attribute's outputs
            mean, std = model.predict(xtest, return_std=True)
            return mean, std

    e.g. (MultiModels estimations):
    class Kriging_model(object):
        def fit_pre(self, xtrain, ytrain, xtest):
            # instantiate the models
            # model_1, model_2, model_3 can be changed to any ML models you desire
            pre_1 = SVR(C=10).fit(xtrain, ytrain).predict(xtest)  # model_1
            pre_2 = SVR(C=50).fit(xtrain, ytrain).predict(xtest)  # model_2
            pre_3 = SVR(C=80).fit(xtrain, ytrain).predict(xtest)  # model_3
            # define the attribute's outputs
            stacked_array = np.vstack((pre_1, pre_2, pre_3))
            means = np.mean(stacked_array, axis=0)
            stds = np.sqrt(np.var(stacked_array, axis=0))
            return means, stds

:param opt_num: the number of recommended candidates for next iteration, default 1.

:param min_search: default True -> searching the global minimum ; False -> searching the global maximum.

:param CVtest: 'LOOCV' or an int, default False (skip the test). If CVtest = 'LOOCV', leave-one-out cross-validation is applied; if CVtest is an int, e.g., CVtest = 10, 10-fold cross-validation is applied.

:return: 1: array; potential of each candidate. 2: array/float; recommended candidate(s).

File: ~/miniconda3/lib/python3.9/site-packages/Bgolearn/BGOsampling.py
Type: method
```
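The docstring above notes that non-GP surrogates ('SVM', 'RF', 'AdaB', 'MLP') obtain their predictive uncertainty from bootstrap sampling. A generic sketch of that idea (`bootstrap_mean_std`, the RandomForestRegressor surrogate, and `n_boot` are illustrative choices, not Bgolearn's actual internals):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.utils import resample

def bootstrap_mean_std(xtrain, ytrain, xtest, n_boot=20, seed=0):
    """Estimate predictive mean and std by refitting on bootstrap resamples."""
    rng = np.random.RandomState(seed)
    preds = []
    for _ in range(n_boot):
        # draw a bootstrap resample of the training data and refit
        xb, yb = resample(xtrain, ytrain, random_state=rng)
        model = RandomForestRegressor(n_estimators=50, random_state=0).fit(xb, yb)
        preds.append(model.predict(xtest))
    preds = np.asarray(preds)
    # spread across bootstrap refits serves as the uncertainty estimate
    return preds.mean(axis=0), preds.std(axis=0)
```

The returned `(mean, std)` pair matches the `fit_pre` contract described above, so a wrapper of this shape could serve as a user-defined Kriging model.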

Owner

  • Name: 曹斌 | Bin CAO
  • Login: Bin-Cao
  • Kind: user
  • Location: Shanghai
  • Company: Shanghai University

Machine learning | Materials Informatics | Mechanics

GitHub Events

Total
  • Issues event: 4
  • Watch event: 24
  • Issue comment event: 11
  • Push event: 8
  • Fork event: 2
Last Year
  • Issues event: 4
  • Watch event: 24
  • Issue comment event: 11
  • Push event: 8
  • Fork event: 2

Committers

Last synced: almost 3 years ago

All Time
  • Total Commits: 78
  • Total Committers: 1
  • Avg Commits per committer: 78.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Bin Cao b****o@s****n 78
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 3
  • Total pull requests: 1
  • Average time to close issues: 7 months
  • Average time to close pull requests: about 17 hours
  • Total issue authors: 2
  • Total pull request authors: 1
  • Average comments per issue: 4.67
  • Average comments per pull request: 0.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 2
  • Pull requests: 0
  • Average time to close issues: 28 minutes
  • Average time to close pull requests: N/A
  • Issue authors: 1
  • Pull request authors: 0
  • Average comments per issue: 1.5
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • biebersong (2)
  • zifengdexiatian (1)
Pull Request Authors
  • Asachoo (1)
Top Labels
Issue Labels
Noise input (1)
Pull Request Labels

Packages

  • Total packages: 4
  • Total downloads:
    • pypi 529 last-month
  • Total dependent packages: 0
    (may contain duplicates)
  • Total dependent repositories: 0
    (may contain duplicates)
  • Total versions: 53
  • Total maintainers: 1
pypi.org: bgolearn

A Bayesian global optimization package for material design

  • Versions: 36
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 431 Last month
Rankings
Dependent packages count: 6.6%
Stargazers count: 12.9%
Forks count: 14.5%
Average: 17.0%
Downloads: 20.3%
Dependent repos count: 30.6%
Maintainers (1)
Last synced: 6 months ago
pypi.org: bgokit

A tool package for Bgolearn

  • Versions: 6
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 68 Last month
Rankings
Stargazers count: 9.8%
Dependent packages count: 10.1%
Downloads: 11.9%
Forks count: 11.9%
Average: 22.2%
Dependent repos count: 67.1%
Maintainers (1)
Last synced: 6 months ago
pypi.org: vsgenerator

Dynamic Virtual Space generation neural Network.

  • Versions: 4
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 15 Last month
Rankings
Dependent packages count: 9.9%
Average: 37.5%
Dependent repos count: 65.2%
Maintainers (1)
Last synced: 6 months ago
pypi.org: wpemphase

Crystallographic Phase Identifier of Convolutional self-Attention Neural Network

  • Versions: 7
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 15 Last month
Rankings
Dependent packages count: 10.0%
Average: 37.8%
Dependent repos count: 65.7%
Maintainers (1)
Last synced: 6 months ago