Science Score: 36.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ○ .zenodo.json file
- ✓ DOI references: Found 6 DOI reference(s) in README
- ✓ Academic publication links: Links to arxiv.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (11.2%) to scientific vocabulary
Repository
ai for photocatalysis
Basic Info
- Host: GitHub
- Owner: AtrCheema
- Language: Python
- Default Branch: master
- Size: 343 KB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
readme.md
Code for the paper "Machine learning analysis to interpret the effect of the photocatalytic reaction rate constant (k) of semiconductor-based photocatalysts on dye removal" in the Journal of Hazardous Materials.
AI for Photocatalysis
In this study we performed data-driven modeling of the photocatalysis process. The objective was to build a machine learning (ML) model that predicts the first-order rate constant k from the experimental conditions (time, solution pH, light intensity, light source distance, dye concentration loading), the elemental composition of the catalyst (C, Fe, Al, Ni, Mo, S, Bi, Ag, Pd, Pt), the physicochemical properties of the catalyst (volume, surface area, pore size, pore volume) and the parameters of the pollutant (solubility, molecular weight, H-bond acceptor and donor counts). The dataset consists of 1527 samples and 32 features, collected by experimentation. It was divided into a training set of 1068 samples (70%) and a test set of 459 samples (30%).
In the first notebook, Exploratory Data Analysis, we performed exploratory data analysis. We then compared the available (over 30) machine learning algorithms in Experiments, training each one on the training set and evaluating it on the test set, to get an idea of which ML algorithm would suit our problem best. After that, we performed feature selection using various feature selection methods in the Feature Selection notebook. The final features were selected using the Boruta-SHAP method.
After selecting the algorithm and features, we performed hyperparameter optimization using k-fold cross-validation in hyperparameter optimization. Then we built and trained our model on the training set and checked its prediction performance on the test set. Plots analyzing the prediction performance and the prediction errors were also produced here.
After that, we interpreted the machine learning model using several post-hoc interpretation methods: SHAP, Partial Dependence Plots and Accumulated Local Effects. Finally, we checked the robustness of our model by quantifying the uncertainty in its predictions.
For this purpose we used conformal analysis, applying several conformal analysis methods to assess the robustness of our model.
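To make the last step more concrete, the following is a minimal sketch of split conformal prediction for a regression model. It is not the code from the notebooks: the data are synthetic (only the sample and feature counts and the 70/30 split are taken from the text), the model is a plain least-squares fit standing in for the tuned model, and the 75/25 fit/calibration split and the 90% target coverage are assumptions chosen for illustration.

```python
import math

import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the real data: 1527 samples, 32 features (sizes from the text).
n_samples, n_features = 1527, 32
X = rng.normal(size=(n_samples, n_features))
true_w = rng.normal(size=n_features)
y = X @ true_w + rng.normal(scale=0.5, size=n_samples)

# 70/30 train/test split, giving 1068 training and 459 test samples.
perm = rng.permutation(n_samples)
n_train = int(0.7 * n_samples)
train_idx, test_idx = perm[:n_train], perm[n_train:]

# Hold out part of the training set for calibration (75/25 is an assumption).
n_fit = int(0.75 * n_train)
fit_idx, cal_idx = train_idx[:n_fit], train_idx[n_fit:]


def add_bias(features):
    """Append a constant column so the linear model has an intercept."""
    return np.hstack([features, np.ones((len(features), 1))])


# Stand-in model: ordinary least squares fitted on the fitting subset only.
coef, *_ = np.linalg.lstsq(add_bias(X[fit_idx]), y[fit_idx], rcond=None)


def predict(features):
    return add_bias(features) @ coef


# Conformity scores: absolute residuals on the calibration subset.
alpha = 0.1  # target miscoverage, i.e. roughly 90% intervals (assumption)
scores = np.sort(np.abs(y[cal_idx] - predict(X[cal_idx])))
n_cal = len(scores)

# Finite-sample corrected quantile: the ceil((n + 1)(1 - alpha))-th smallest score.
rank = math.ceil((n_cal + 1) * (1 - alpha))
q_hat = scores[min(rank, n_cal) - 1]

# Symmetric prediction intervals on the test set and their empirical coverage.
y_pred = predict(X[test_idx])
lower, upper = y_pred - q_hat, y_pred + q_hat
coverage = float(np.mean((y[test_idx] >= lower) & (y[test_idx] <= upper)))

print(f"train={len(train_idx)}, test={len(test_idx)}, calibration={n_cal}")
print(f"interval half-width={q_hat:.3f}, empirical coverage={coverage:.3f}")
```

The empirical coverage on the test set should land near the 90% target. The gap between the two, and the width of the intervals, is the kind of quantity a conformal analysis reports.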
Reproducibility
The results presented in these notebooks are completely (~100%) reproducible, provided you use the same computational environment that was used to create them. The names and versions of the Python packages used in this project are given in the requirements.txt file. The exact versions of some of these packages are also printed at the start of each notebook. Install these packages, preferably in a new conda environment. Then copy all the code from the utils notebook into a utils.py file and save it in the same directory/folder as the other Python scripts. The data file is expected to be in the data folder. These steps can be summarized as follows:
git clone https://github.com/AtrCheema/weil101.git
cd weil101
pip install -r docs/requirements.txt
make html
Online reproducible examples running on Read the Docs are at https://weil101.readthedocs.io/en/latest/ .
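Because reproducibility depends on matching package versions, it can help to compare the installed versions against the pinned ones before running the notebooks. The sketch below is an illustration and is not part of the repository: the helper name check_versions is hypothetical, and the pins are a subset copied from the dependency list of this project.

```python
from importlib.metadata import PackageNotFoundError, version

# Subset of the pinned versions from the project's dependency list.
PINNED = {
    "numpy": "1.23.5",
    "scikit-learn": "1.2.2",
    "scipy": "1.10.1",
    "shap": "0.41.0",
    "optuna": "3.1.0",
    "matplotlib": "3.7.1",
}


def check_versions(pins):
    """Return {package: (expected, installed or None, matches)} for each pin."""
    report = {}
    for name, expected in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (expected, installed, installed == expected)
    return report


for name, (expected, installed, ok) in check_versions(PINNED).items():
    status = "OK" if ok else "MISMATCH"
    print(f"{name}: expected {expected}, found {installed} [{status}]")
```

Any line marked MISMATCH points to a package whose version differs from the one used to produce the published results.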
Owner
- Name: Ather Abbas
- Login: AtrCheema
- Kind: user
- Location: South Korea
- Company: Environmental Modeling and Monitoring Lab, UNIST
- Repositories: 7
- Profile: https://github.com/AtrCheema
GitHub Events
Total
- Push event: 2
- Create event: 2
Last Year
- Push event: 2
- Create event: 2
Dependencies
- BorutaShap *
- catboost *
- crepes ==0.1.0
- easy_mpl *
- h5py *
- lightgbm *
- mapie ==0.6.4
- matplotlib ==3.7.1
- nbsphinx *
- numpy ==1.23.5
- optuna ==3.1.0
- scikit-learn ==1.2.2
- scikit-optimize ==0.9.0
- scipy ==1.10.1
- seaborn *
- shap ==0.41.0
- sphinx <7
- sphinx-prompt *
- sphinx_copybutton *
- sphinx_gallery *
- sphinx_issues *
- sphinx_rtd_theme *
- sphinx_toggleprompt *
- xgboost *
- catboost *
- easy_mpl *
- h5py *
- lightgbm *
- scikit-optimize *
- seaborn *
- shap *
- xgboost *