dftbonddependency

Repository to calculate bond-based corrections to reaction energies from low-level DFT

https://github.com/chemsurajit/dftbonddependency

Science Score: 57.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 6 DOI reference(s) in README
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.8%) to scientific vocabulary

Keywords

cheminformatics dft error-correction linear-regression reaction-energy

Repository

Basic Info
  • Host: GitHub
  • Owner: chemsurajit
  • License: mit
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 127 KB
Statistics
  • Stars: 1
  • Watchers: 2
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created about 4 years ago · Last pushed over 3 years ago
Metadata Files
Readme License Citation

README.md

DFTbondDependency

Repository for bond dependency paper

Dependencies:

1) RDKit
2) xyz2mol
3) pandas
4) csv
5) NumPy
6) requests
7) statsmodels
8) scikit-learn (sklearn)

To install the xyz2mol package, visit: https://github.com/jensengroup/xyz2mol.git

Apart from csv, which is part of the Python standard library, all the other packages can be installed with conda, pip, etc.

If this repository is used, please cite us:

1. The citation for the code can be downloaded by clicking "Cite this repository" in the right side panel.
2. The preprint is available at: https://doi.org/10.26434/chemrxiv-2022-9prf3

Description of the Data.

The data are not publicly available yet. They will be made publicly available upon acceptance of our paper in a peer-reviewed journal. The preprint can be downloaded from: https://doi.org/10.26434/chemrxiv-2022-9prf3

Description of the codes.

The repository contains Python scripts for the calculations described in the paper: https://doi.org/10.26434/chemrxiv-2022-9prf3

1) get_data.py: This script downloads data from the DTU Data website. The public link to the website will be available upon acceptance of the paper. The script downloads either all the files from the database (if the -all/--all option is given) or only the xyz files and the log files of the energy calculations.

2) makemoleculebondencsv.py: This script makes a csv file containing the energy values, list of bonds, SMILES string, chemical formula, etc. from the xyz files and log files downloaded with the get_data.py script.

3) makereactionids.py: This script creates a csv file with only two columns: 'reactantindex' and 'pdtindex'. The values are the indices of the molecules in the csv file made by the makemoleculebondencsv.py script.

4) processreactionconversionjobs.sh: This is a bash script to create the final csv file containing all the information related to the reactions. It takes the following inputs: the csv file containing the molecular data (created by the script makemoleculebondencsv.py), the csv file containing the indices of the reactants and products (created by the script makereactionids.py), the G4MP2 energies of the molecules as a csv file (with index and energy), the path of the Python script makereactionsparallel.py, the number of nodes to be used, and the number of processors per node. It first splits the csv file containing the indices of the reactants and products according to the number of nodes and saves each part in a json file named Noden.json, with n running from 1 to the number of nodes used.
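
The splitting step described in item 4 can be sketched in Python as follows. This is only an illustration, not the bash script itself: the function names `split_reactions` and `write_node_files`, the toy csv content, and the layout of each json file (a list of reactant/product index pairs) are assumptions. Only the file naming Node1.json, Node2.json, ... and the column names come from the description above.

```python
import csv
import io
import json
from pathlib import Path


def split_reactions(rows, n_nodes):
    """Split a list of reaction rows into n_nodes contiguous, nearly equal chunks."""
    base, extra = divmod(len(rows), n_nodes)
    chunks, start = [], 0
    for node in range(n_nodes):
        # The first `extra` nodes receive one additional reaction each.
        size = base + (1 if node < extra else 0)
        chunks.append(rows[start:start + size])
        start += size
    return chunks


def write_node_files(rows, n_nodes, out_dir="."):
    """Write chunk k to Node<k>.json (k starts at 1) and return the paths.

    The json layout (a plain list of index pairs) is an assumption.
    """
    paths = []
    for k, chunk in enumerate(split_reactions(rows, n_nodes), start=1):
        path = Path(out_dir) / f"Node{k}.json"
        path.write_text(json.dumps(chunk))
        paths.append(path)
    return paths


# Toy stand-in for the csv file produced by makereactionids.py.
toy_csv = "reactantindex,pdtindex\n0,1\n0,2\n1,2\n3,4\n5,6\n"
reaction_rows = [
    {"reactantindex": int(r["reactantindex"]), "pdtindex": int(r["pdtindex"])}
    for r in csv.DictReader(io.StringIO(toy_csv))
]
```

Calling `write_node_files(reaction_rows, n_nodes=2)` on the toy data would create Node1.json with three reactions and Node2.json with two.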

5) makereactionsparallel.py: This script takes the csv file containing the indices for the reactions, the csv file containing all the data of the molecules, the csv file containing the G4MP2 energies, the number of processors, and the json file containing the indices from the csv file with the columns "reactantindex" and "pdtindex".
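
The kind of per-reaction record such a script could assemble is sketched below. Everything here is an assumption made for illustration: the in-memory `molecules` table, the function name `reaction_row`, the output keys, the toy energies (arbitrary units), and the sign convention error = DFT minus G4MP2. The sketch is serial; the same function could be mapped over the reactions by a pool of worker processes.

```python
from collections import Counter

# Hypothetical molecule table keyed by molecule index. The bond counts are
# those of ethanol (0) and dimethyl ether (1); the energies are toy numbers.
molecules = {
    0: {"dft": -155.00, "g4mp2": -154.80,
        "bonds": Counter({"C-C": 1, "C-O": 1, "O-H": 1, "C-H": 5})},
    1: {"dft": -154.98, "g4mp2": -154.79,
        "bonds": Counter({"C-O": 2, "C-H": 6})},
}


def reaction_row(reactant_index, pdt_index, table):
    """Return reaction energies, DFT error, and bond changes for one reaction."""
    reactant, product = table[reactant_index], table[pdt_index]
    bond_types = sorted(set(reactant["bonds"]) | set(product["bonds"]))
    # Counter returns 0 for bond types that a molecule does not contain.
    bond_change = {
        b: product["bonds"][b] - reactant["bonds"][b] for b in bond_types
    }
    dft_rxn = product["dft"] - reactant["dft"]
    ref_rxn = product["g4mp2"] - reactant["g4mp2"]
    return {
        "reactantindex": reactant_index,
        "pdtindex": pdt_index,
        "dft_rxn": dft_rxn,
        "g4mp2_rxn": ref_rxn,
        "error": dft_rxn - ref_rxn,  # assumed convention: DFT minus reference
        "bond_change": bond_change,
    }


row = reaction_row(0, 1, molecules)
```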

6) submit.sh: An example submit script to run makereactionsparallel.py on a single node with multiple processors. It is called from the script processreactionconversion_jobs.sh. It is written for the Slurm scheduler.

7) detectcorrelation.py: This script detects correlation between the variables (bonds). It takes the location of the directory with the csv files containing all the reaction data (created by the processreactionconversionjobs.sh script). By default, it randomly selects 10% of the total data to detect correlation.
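
A minimal sketch of the idea in item 7, assuming that "correlation" means the Pearson correlation coefficient between bond-change columns. The function names, the toy rows, and the fixed random seed are illustrative and not taken from the script. Only the 10% default sampling fraction comes from the description above.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (nan if one is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def correlation_on_sample(rows, columns, fraction=0.1, seed=0):
    """Correlate every pair of columns on a random sample of the rows."""
    rng = random.Random(seed)
    k = min(len(rows), max(2, round(fraction * len(rows))))
    sample = rng.sample(rows, k)
    return {
        (a, b): pearson([r[a] for r in sample], [r[b] for r in sample])
        for i, a in enumerate(columns)
        for b in columns[i + 1:]
    }


# Toy bond-change rows in which "O-H" is always the negative of "C-H".
toy_rows = [
    {"C-C": (7 * i) % 5 - 2, "C-H": i % 3 - 1, "O-H": -(i % 3 - 1)}
    for i in range(100)
]
bond_columns = ["C-C", "C-H", "O-H"]
sampled = correlation_on_sample(toy_rows, bond_columns)
```

A pair of columns with a coefficient close to +1 or -1 carries nearly the same information, which matters for the regression in the next step.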

8) dolinearregression.py: This script performs the linear regression between the bond changes and the DFT error in the reaction energies. It takes as arguments the location of the directory with the reaction data files and the names of the DFT functionals.
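
The regression in item 8 can be illustrated with an ordinary least-squares fit of the DFT error against the bond changes. The sketch below makes several assumptions: there is no intercept term, the function names are hypothetical, and the toy data are generated from made-up coefficients. It uses only the standard library to stay self-contained; in practice a library routine from one of the listed dependencies would be the natural choice.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: columns are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_bond_coefficients(bond_changes, errors):
    """Least-squares fit of errors ~ sum_k c_k * bond_change_k (no intercept)."""
    n_feat = len(bond_changes[0])
    # Normal equations: (X^T X) c = X^T y
    xtx = [
        [sum(row[i] * row[j] for row in bond_changes) for j in range(n_feat)]
        for i in range(n_feat)
    ]
    xty = [
        sum(row[i] * e for row, e in zip(bond_changes, errors))
        for i in range(n_feat)
    ]
    return solve_linear_system(xtx, xty)


# Toy data: two bond types, errors generated from coefficients 0.5 and -1.0.
toy_bond_changes = [[1, 0], [0, 1], [1, 1], [2, -1], [-1, 2]]
toy_errors = [0.5 * a - 1.0 * b for a, b in toy_bond_changes]
coefficients = fit_bond_coefficients(toy_bond_changes, toy_errors)
```

If two bond-change columns are perfectly correlated, the normal equations become singular and the solver raises an error.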

9) correctreactionenergy.py: This script calculates the reaction energy and the correction to it for a given DFT functional. It takes as input the log files of the reactants (with the option -r), the log files of the products (with the option -p), and the name of the DFT functional, and prints out the reaction energy for the DFT functional, the correction, and the corrected reaction energy.
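
How fitted coefficients could be applied to a single reaction is sketched below. The function name, the coefficient values, the energies (arbitrary units), and the sign convention are all assumptions: the predicted error is taken as DFT minus reference, so the correction is its negative and is added to the DFT reaction energy. Bond types without a coefficient are assumed to contribute nothing.

```python
def corrected_reaction_energy(e_reactants, e_products, bond_change, coefficients):
    """Return (DFT reaction energy, correction, corrected reaction energy)."""
    e_rxn = sum(e_products) - sum(e_reactants)
    # Predicted DFT error from the bond changes (assumed: DFT minus reference).
    predicted_error = sum(
        coefficients.get(bond, 0.0) * change
        for bond, change in bond_change.items()
    )
    correction = -predicted_error
    return e_rxn, correction, e_rxn + correction


# Hypothetical per-bond coefficients for one functional (toy numbers).
toy_coefficients = {"C-C": 0.002, "C-H": 0.001, "C-O": 0.004, "O-H": -0.003}
# Bond changes for ethanol -> dimethyl ether.
toy_bond_change = {"C-C": -1, "C-H": 1, "C-O": 1, "O-H": -1}

e_rxn, correction, e_corrected = corrected_reaction_energy(
    e_reactants=[-155.00],
    e_products=[-154.98],
    bond_change=toy_bond_change,
    coefficients=toy_coefficients,
)
```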

10) CITATION.cff: This file provides citation data for this repository in BibTeX or APA format.

How to run:

The help message for each of the files (except submit.sh) can be obtained by running the corresponding script with -h.

The steps described in the paper can be followed by running the scripts in the following sequence:

1. get_data.py
2. makemoleculebondencsv.py
3. makereactionids.py
4. processreactionconversionjobs.sh, makereactionsparallel.py, submit.sh
5. detectcorrelation.py
6. dolinear_regression.py

License:

All the scripts in this repository are covered under the MIT license terms (LICENSE.txt).

Owner

  • Login: chemsurajit
  • Kind: user

Citation (CITATION.cff)

cff-version: 1.0.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Nandi"
  given-names: "Surajit"
  orcid: "https://orcid.org/0000-0002-7105-2209"
title: "DFTbondDependency"
version: 1.0.0
doi: ""
date-released: 2022-06-07
url: "https://github.com/chemsurajit/DFTbondDependency"
