gaml

Genetic Algorithm Machine Learning (GAML) software package for automated force field parameterization.

https://github.com/orlandoacevedo/gaml

Science Score: 33.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
    Links to: acs.org
  • Committers with academic emails
    1 of 2 committers (50.0%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.8%) to scientific vocabulary

Keywords

chemistry force-field machine-learning solvent
Last synced: 9 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: orlandoacevedo
  • License: mit
  • Language: Python
  • Default Branch: master
  • Size: 1.59 MB
Statistics
  • Stars: 15
  • Watchers: 1
  • Forks: 5
  • Open Issues: 0
  • Releases: 0
Topics
chemistry force-field machine-learning solvent
Created over 7 years ago · Last pushed almost 5 years ago
Metadata Files
Readme License

README.md

Genetic Algorithm Machine Learning (GAML)

Xiang Zhong and Orlando Acevedo*, University of Miami

This machine-learning-based software package automates the creation of force field (FF) parameters for molecular dynamics (MD) or Monte Carlo (MC) simulations. The current build emphasizes atomic charge development for solvent simulations using a genetic algorithm crossover/average/mutation method. GAML outputs GROMACS-formatted files in the OPLS-AA formalism for use in MD simulations. By default, the FF parameters are validated against user-supplied free energies of hydration (ΔGhyd), liquid densities, and heats of vaporization (ΔHvap). Additional condensed-phase physical properties are available (or under development) for training, including heat capacity, viscosity, self-diffusivity, dipoles, surface tension, and solubility.
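As a rough illustration of the crossover/average/mutation idea, the sketch below applies the three operators to plain lists of atomic charges. It is not GAML's implementation: the function names, the single-point crossover, and the uniform mutation step are assumptions made for illustration. Only the 7:2:1 operator ratio is taken from the `--ratio` default listed under Usage.

```python
import random

def crossover(a, b, rng):
    """Single-point crossover: head of `a` joined to tail of `b`."""
    point = rng.randrange(1, len(a))  # needs at least two charges per set
    return a[:point] + b[point:]

def average(a, b):
    """Element-wise mean of two parent charge sets."""
    return [(x + y) / 2 for x, y in zip(a, b)]

def mutate(a, rng, step=0.05):
    """Perturb one randomly chosen charge by at most +/- step (assumed value)."""
    child = list(a)
    i = rng.randrange(len(child))
    child[i] += rng.uniform(-step, step)
    return child

def next_generation(parents, size, ratio=(7, 2, 1), seed=None):
    """Build `size` children, picking an operator with weights `ratio`."""
    rng = random.Random(seed)
    children = []
    for _ in range(size):
        op = rng.choices(("crossover", "average", "mutation"), weights=ratio)[0]
        a, b = rng.sample(parents, 2)  # needs at least two parents
        if op == "crossover":
            children.append(crossover(a, b, rng))
        elif op == "average":
            children.append(average(a, b))
        else:
            children.append(mutate(a, rng))
    return children

# Hypothetical parent charge sets for a three-atom molecule
parents = [[0.1, -0.2, 0.1], [0.3, -0.4, 0.1], [0.2, -0.3, 0.1]]
children = next_generation(parents, size=10, seed=1)
```

Children produced this way do not automatically keep the total charge fixed, so a real workflow would re-apply the charge constraints before evaluating them.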

Requirements

Download

git clone https://github.com/orlandoacevedo/GAML.git

Installation

pip[3] install gaml

Or install from the source code

python[3] setup.py install

Usage

For helpful information, use

gaml

Or

gaml -h

Or, for sub-commands

gaml [command] -h

Option 1, use settingfile.txt

Parameters                                    Comments
============================================  =====================================
command = charge_gen_range                    # command to execute, required
charge_path = BPYR_BF4_charge_collection.txt  # input file path, required
atomnm = 24                                   # the processed atom number, required
percent = 0.8                                 # optional, default is 0.8
stepsize = 0.01                               # optional, default is 0.01
nmround = 3                                   # optional, default is 3
fname = ChargeGenRange                        # optional, default is ChargeGenRange

The templates for the settingfile.txt can be found in the sample/ directory.
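To inspect or pre-check a settings file outside of GAML, a small parser is enough. The sketch below assumes the `key = value  # comment` layout shown above. The function name `parse_settings` is hypothetical, and this is not the parser GAML uses internally.

```python
def parse_settings(text):
    """Parse `key = value  # comment` lines into a dict of strings."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop the trailing comment
        if "=" not in line:
            continue                         # skip blank and header lines
        key, value = line.split("=", 1)
        key = key.strip()
        if not key:
            continue                         # skip ruler lines such as "====="
        settings[key] = value.strip()
    return settings

sample = """\
command = charge_gen_range                    # command to execute, required
charge_path = BPYR_BF4_charge_collection.txt  # input file path, required
atomnm = 24                                   # the processed atom number, required
percent = 0.8                                 # optional, default is 0.8
"""
settings = parse_settings(sample)
```

All values are kept as strings here. Converting `atomnm` to an integer or `percent` to a float is left to the caller.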

Option 2, use the command line

```
Usage:

gaml
    charge_gen_range
    charge_gen_scheme
    file_gen_gaussian
    file_gen_gromacstop
    file_gen_mdpotential
    file_gen_scripts
    fss_analysis
    GAML
    GAML_autotrain

gaml charge_gen_range

-f, --charge_path           input charge file path
-i, --atomnm                total number of atoms in a single system
-p, --percent               range from 0.0 ~ 1.0, default is 0.8
-t, --stepsize              default is 0.01
-nr, --nmround              number of decimal places for rounding, default is 3
-o, --fname                 output file name, default is ChargeRange

gaml charge_gen_scheme

-f, --charge_path           input charge file
-sl, --symmetry_list        list of chemically equivalent atoms, index starting from 1
-ol, --offset_list          two offsets used to satisfy the charge constraint
--offset_nm                 number of loops for the offsets
--cl, --counter_list        force the total charge of this group to zero
-tc, --total_charge         default is 1.0
-nz, --bool_nozero          force that no zero charges are generated, default is True
-nu, --bool_neutral         whether the final calculated value is scaled from 1, default is True
-q, --bool_limit            force charge sign, either positive or negative, default is None
-nr, --nmround              number of decimal places for rounding, default is 2
-b, --in_keyword            marker for the start of the data in the input file
-nm, --gennm                number of output files, default is 5
-lim, --threshold           threshold for charge value generation
-o, --fname                 output file name, default is ChargeRandomGen

gaml file_gen_gaussian

-ftop, --toppath            GROMACS topology file
-f, --file_path             GROMACS output pdb/gro file
-sr, --select_range         selection range in Angstrom, default is 10
-bs, --basis_set            Gaussian definition, default is # HF/6-31G(d) Pop=CHelpG
-cs, --charge_spin          Gaussian definition, default is 0 1
-nm, --gennm                number of output files, default is 5
-o, --fname                 output file name, default is GaussInput

gaml file_gen_gromacstop

-f, --charge_path           input charge file
-ftop, --toppath            GROMACS topology file
-sl, --symmetry_list        Python-style list of chemically equivalent atoms
-res, --reschoose           choose residue, default is ALL
-b, --in_keyword            marker for the start of the data in the input file
-e, --cut_keyword           marker for the end of the data in the input file
-nm, --gennm                number of output files, default is 5
-o, --fname                 output file name, default is GromacsTopfile

gaml GAML

-f, --file_path             input MD file
-fc, --charge_path          input charge file
-sl, --symmetry_list        list of chemically equivalent atoms, index starting from 1
-ol, --offset_list          two offsets used to satisfy the charge constraint
--offset_nm                 number of loops for the offsets
--cl, --counter_list        force the total charge of this group to zero
-tc, --total_charge         default is 0.0
-nz, --bool_nozero          force that no zero charges are generated, default is True
-nu, --bool_neutral         whether the final calculated value is scaled from 1, default is True
-q, --bool_limit            force charge sign, either positive or negative, default is None
-nr, --nmround              number of decimal places for rounding, default is 2
-nm, --gennm                number of output files, default is 5
-lim, --threshold           threshold for charge value generation
-d, --error_tolerance       default is 0.8
-ex, --charge_extend_by     amount by which to mutate the charge range bounds, default is 0.3
-ro, --ratio                ratio of crossover to average to mutation, default is 7:2:1
-abs, --bool_abscomp        use absolute value or not
-e, --cut_keyword           marker for the end of the data in the input file
-o, --fname                 output file name, default is ChargeGen

gaml fss_analysis

-f, --file_path             input file to analyze
-t, --stepsize              default is 0.01
-d, --error_tolerance       default is 0.28
-abs, --bool_abscomp        use the absolute value or not, default is False
-p, --percent               range from 0.0 ~ 1.0, default is 0.95
-e, --cut_keyword           marker for the end of the data in the input file, default is MAE
-tl, --atomtype_list        corresponding atom types, note the character '#' is not supported
-pn, --pallette_nm          number of palettes used to plot the graph, default is 50
-cm, --color_map            compatible with Matplotlib modules, default is rainbow
-o, --fname                 output file name, default is FSSPlot

gaml file_gen_mdpotential

-f, --file_path             MD simulation result file
-s, --chargefile            input charge file
-lv, --literature_value     corresponding literature value
-i, --atomnm                total number of molecules in liquid phase, default is 500
--MAE                       mean-absolute-value, default is 0.05
--temperature               unit in Kelvin
--block                     marker used to split the file for processing, default is COUNT
--bool_gas                  gas phase calculation, default is False
-kw, --kwlist               MD result keyword list, default is Density
-o, --fname                 output file name, default is MDProcess

gaml file_gen_scripts

-n, --number                which script to choose, sequenced by -a
-a, --available             show available built-in scripts

gaml GAML_autotrain

-f, --file_path             auto training parameters all-in-one file
--bashinterfile             user defined Bash interface file

```
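When many runs are scripted, it can help to assemble the command line in Python instead of typing it by hand. The sketch below only builds and prints the argument list for the charge range step, using the flags listed above. The helper name is hypothetical, and the sub-command spelling `charge_gen_range` is taken from the settingfile.txt example and assumed to be what the CLI accepts.

```python
import shlex

def build_charge_gen_range_cmd(charge_path, atomnm, percent=0.8,
                               stepsize=0.01, nmround=3, fname=None):
    """Return the argument list for a charge range run (nothing is executed)."""
    cmd = [
        "gaml", "charge_gen_range",
        "-f", str(charge_path),   # input charge file path
        "-i", str(atomnm),        # total number of atoms
        "-p", str(percent),
        "-t", str(stepsize),
        "-nr", str(nmround),
    ]
    if fname is not None:
        cmd += ["-o", str(fname)]  # otherwise the tool's default name is used
    return cmd

cmd = build_charge_gen_range_cmd("BPYR_BF4_charge_collection.txt", 24)
print(shlex.join(cmd))
# With gaml installed, the command could be run via subprocess.run(cmd, check=True).
```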

Notes

A test for a 1-butylpyridinium-based ionic liquid can be found under the sample/ directory.

The OPLS-AA parameters for 86 conventional solvents optimized by GAML can be found under the Solvents/ directory. Files formatted for GROMACS.

Some features worth mentioning:

  • Customized selection range for Coulombic interactions with PBC removal
  • Two offsets as well as chemical equivalence considerations for random charge generation
  • The crossover/average/mutation method
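The interplay between chemical equivalence and a fixed total charge can be sketched in a few lines. The example below is a simplified illustration, not GAML's algorithm: it uses a single offset atom instead of two, the function name and the charge ranges are hypothetical, and it assumes every atom appears in exactly one symmetry group with the offset atom in a group of its own.

```python
import random

def random_charges(ranges, symmetry, total_charge=0.0, offset=1,
                   nmround=2, seed=None):
    """Draw one charge per atom; `symmetry` and `offset` use 1-based indices.

    Atoms in the same symmetry group share one value. The `offset` atom is
    not drawn: it absorbs the remainder so the charges sum to `total_charge`.
    """
    rng = random.Random(seed)
    charges = [None] * len(ranges)
    for group in symmetry:
        if offset in group:
            continue
        lo, hi = ranges[group[0] - 1]          # range of the group's first atom
        value = round(rng.uniform(lo, hi), nmround)
        for idx in group:
            charges[idx - 1] = value
    rest = sum(c for c in charges if c is not None)
    charges[offset - 1] = round(total_charge - rest, nmround)
    return charges

# Hypothetical four-atom system: atoms 2 and 3 are chemically equivalent
ranges = [(-0.5, 0.5), (0.0, 0.3), (0.0, 0.3), (-0.8, -0.2)]
symmetry = [[1], [2, 3], [4]]
charges = random_charges(ranges, symmetry, total_charge=1.0, offset=1, seed=3)
```

The offset atom's charge can fall outside its own range, so a practical generator would reject such a draw and try again.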

References

Zhong, X.; Velez, C.; Acevedo, O. "Partial Charges Optimized by Genetic Algorithms for Deep Eutectic Solvent Simulations" J. Chem. Theory Comput., 2021, 17, 3078-3087. doi:10.1021/acs.jctc.1c00047

About

Contributing Authors: Xiang Zhong and Orlando Acevedo*

Funding: Gratitude is expressed to the National Science Foundation (CHE-1562205) for the support of this research.

Software License: GAML. Genetic Algorithm Machine Learning (GAML) software package. Copyright (C) 2021 Orlando Acevedo

Owner

  • Name: Orlando Acevedo
  • Login: orlandoacevedo
  • Kind: user
  • Location: United States
  • Company: University of Miami

Professor of Chemistry at the University of Miami

GitHub Events

Total
  • Watch event: 1
Last Year
  • Watch event: 1

Committers

Last synced: over 2 years ago

All Time
  • Total Commits: 32
  • Total Committers: 2
  • Avg Commits per committer: 16.0
  • Development Distribution Score (DDS): 0.406
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Xiang Zhong z****7@g****m 19
Orlando Acevedo o****o@m****u 13
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

Last synced: 11 months ago

All Time
  • Total issues: 0
  • Total pull requests: 10
  • Average time to close issues: N/A
  • Average time to close pull requests: 8 days
  • Total issue authors: 0
  • Total pull request authors: 2
  • Average comments per issue: 0
  • Average comments per pull request: 0.1
  • Merged pull requests: 10
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
  • zhongxiang117 (9)
  • orlandoacevedo (1)
Top Labels
Issue Labels
Pull Request Labels

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 11 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 1
  • Total versions: 3
  • Total maintainers: 1
pypi.org: gaml

Genetic Algorithm Machine Learning

  • Versions: 3
  • Dependent Packages: 0
  • Dependent Repositories: 1
  • Downloads: 11 Last month
Rankings
Dependent packages count: 10.0%
Forks count: 14.2%
Stargazers count: 17.1%
Dependent repos count: 21.7%
Average: 22.2%
Downloads: 47.8%
Maintainers (1)
Last synced: 10 months ago

Dependencies

setup.py pypi
  • matplotlib *