https://github.com/bobleesj/saf-caf-performance

This repository contains model performance on crystal structure classification for binary compounds, derived from 1,400 .cif files using features generated with SAF and CAF.

Basic Info
  • Host: GitHub
  • Owner: bobleesj
  • Language: Python
  • Default Branch: main
  • Size: 37 MB
Statistics
  • Stars: 1
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created over 1 year ago · Last pushed about 1 year ago

README.md

SAF CAF classification performance

This repository contains model performance on crystal structure classification for binary compounds, derived from 1,400 .cif files using features generated with SAF and CAF.

How to reproduce

```bash
# Download the repository
git clone https://github.com/bobleesj/saf-caf-performance

# Enter the folder
cd saf-caf-performance
```

Install packages listed in requirements.txt:

`pip install -r requirements.txt`

Or you may install all packages at once:

`pip install matplotlib scikit-learn pandas CBFV numpy xgboost`

To reproduce results

Run `python main.py`:

```
imac@imacs-iMac digitial-discovery % python main.py

Processing outputs/CAF/features_binary.csv with 133 features (1/7).
(1/4) Running SVM model...
(2/4) Running PLS_DA n=2...
(3/4) Running PLS_DA model with the best n...
(4/4) Running XGBoost model...
===========Elapsed time: 8.30 seconds===========

...

Processing outputs/CBFV/oliynyk.csv with 308 features (7/7).
(1/4) Running SVM model...
(2/4) Running PLS_DA n=2...
(3/4) Running PLS_DA model with the best n...
(4/4) Running XGBoost model...
===========Elapsed time: 12.88 seconds===========
imac@imacs-iMac digitial-discovery %
```

Check the outputs folder for ML reports, plots, etc.

To reproduce Figures 8 and 9, run `python figure_8_case_study.py` and `python figure_9_case_study.py`.

Result

The combined SAF+CAF features classify the crystal structures of intermetallic binary compounds well.

The PLS-DA plot with two components (n=2) for crystal structure classification is saved under `outputs/SAF_CAF/PLS_DA_plot`.

  • Compositional features were created using CAF. Example: `outputs/CAF/features_binary.csv`
  • Structural features were created using SAF. Example: `outputs/SAF/binary_features.csv`

To customize for your data

  1. Place a features.csv file in the data folder. It should have a "Structure" column, from which we'll extract all "y" values.
  2. Place a CSV file with features in a subdirectory within outputs. Example: outputs/SAF_CAF/binary_features.csv
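Before running the pipeline on your own data, it can help to check that both files are where the steps above expect them. The following is a rough sketch using only the standard library. The helper names (`read_structure_labels`, `find_feature_files`, `count_columns`) and the sample rows are hypothetical and are not part of this repository; the script builds a temporary folder so it runs anywhere.

```python
import csv
import tempfile
from pathlib import Path


def read_structure_labels(features_csv):
    """Return the 'Structure' column of a CSV file as a list of labels (y)."""
    with open(features_csv, newline="") as handle:
        reader = csv.DictReader(handle)
        if "Structure" not in (reader.fieldnames or []):
            raise ValueError(f"{features_csv} has no 'Structure' column")
        return [row["Structure"] for row in reader]


def find_feature_files(outputs_dir):
    """List CSV files one level below outputs_dir, sorted by path."""
    return sorted(Path(outputs_dir).glob("*/*.csv"))


def count_columns(csv_path):
    """Count the columns in the header row of a CSV file."""
    with open(csv_path, newline="") as handle:
        return len(next(csv.reader(handle)))


# Throwaway layout with made-up rows, for demonstration only.
root = Path(tempfile.mkdtemp())
(root / "data").mkdir()
(root / "outputs" / "SAF_CAF").mkdir(parents=True)
(root / "data" / "features.csv").write_text(
    "Formula,Structure\nAB,CsCl\nCD,NaCl\nEF,CsCl\n"
)
(root / "outputs" / "SAF_CAF" / "binary_features.csv").write_text(
    "feature_1,feature_2\n0.1,1.0\n0.2,2.0\n0.3,3.0\n"
)

y = read_structure_labels(root / "data" / "features.csv")
feature_files = find_feature_files(root / "outputs")
print(f"{len(y)} labels, {len(set(y))} distinct structures")
for index, path in enumerate(feature_files, start=1):
    print(f"{path.relative_to(root)}: {count_columns(path)} columns "
          f"({index}/{len(feature_files)})")
```

To check a real project, point `root` at the repository folder instead of the temporary directory.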

To format the code

To automatically format Python code and organize imports:

`black -l 79 . && isort .`

To generate features with CBFV

Run the following command:

`python featurizer.py`

Questions?

For help with generating structural data using SAF, contact Bob at sl5400@columbia.edu.

Owner

  • Name: Sangjoon Bob Lee
  • Login: bobleesj
  • Kind: user
  • Location: New York, NY
  • Company: Columbia University

1st-Year MS Materials Science and Engineering at Columbia Engineering, Department of Applied Physics and Applied Mathematics

Dependencies

requirements.txt pypi
  • CBFV *
  • matplotlib *
  • numpy *
  • pandas *
  • scikit-learn *
  • xgboost *