https://github.com/aspuru-guzik-group/group-selfies
Science Score: 36.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: Links to arxiv.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (7.3%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: aspuru-guzik-group
- License: apache-2.0
- Language: Jupyter Notebook
- Default Branch: main
- Size: 10.1 MB
Statistics
- Stars: 60
- Watchers: 6
- Forks: 14
- Open Issues: 5
- Releases: 0
Metadata Files
README.md
Group SELFIES
https://arxiv.org/abs/2211.13322
Installation
Python >= 3.8 is required.
RDKit is required to use this package. Once it is installed, clone this repository using
```bash
git clone https://github.com/aspuru-guzik-group/group-selfies
```
and run
```bash
pip install .
```
in the cloned folder.
Introduction
Group SELFIES extends SELFIES with the ability to represent groups with single tokens. This improves interpretability, compactness, and performance in generative models.
[Figure: Encoding]

[Figure: Decoding]
Usage
See tutorial.ipynb for details on usage. For key classes/functions, see below:
| Class/Function | Description |
| ------------------------------------- | ----------------------------------------------------------------- |
| group_selfies.Group | Class that represents groups. |
| group_selfies.GroupGrammar | Class that represents a grammar, which is a set of groups used for encoding and decoding. |
| grammar.extract_groups | Finds occurrences of the grammar's defined set of groups in a molecule. |
| grammar.encoder | Encodes a molecule to its corresponding Group SELFIES representation. Requires the extracted group occurrences returned by grammar.extract_groups. |
| grammar.decoder | Decodes a Group SELFIES string to its corresponding molecule. |
| grammar.full_encoder | Extracts groups from a molecule and encodes it, essentially a combination of grammar.extract_groups and grammar.encoder. Mainly for convenience. |
| group_selfies.fragment_mols | Fragments a set of molecules into a set of reasonable groups. |
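
The relationship between the grammar methods in the table can be sketched as follows. This is a minimal illustration, not the package's documented usage: the helper names `round_trip` and `demo` are hypothetical, the argument order `encoder(mol, extracted)` is an assumption, and the `Group` constructor arguments in `demo` (a name, a SMILES string, and `all_attachment=True`) are assumptions that should be checked against tutorial.ipynb.

```python
def round_trip(grammar, mol):
    """Encode `mol` two ways with `grammar`, then decode the result.

    `grammar` is expected to expose extract_groups, encoder, full_encoder
    and decoder. The argument order encoder(mol, extracted) is an assumption.
    """
    # Two-step path: find group occurrences, then encode using them.
    extracted = grammar.extract_groups(mol)
    two_step = grammar.encoder(mol, extracted)

    # One-step path: full_encoder combines extraction and encoding.
    one_step = grammar.full_encoder(mol)

    # Decode the Group SELFIES string back to a molecule.
    decoded = grammar.decoder(two_step)
    return {"two_step": two_step, "one_step": one_step, "decoded": decoded}


def demo():
    """Run round_trip with the real library (needs rdkit and group_selfies)."""
    from rdkit import Chem
    from group_selfies import Group, GroupGrammar

    # Assumption: Group takes a name and a SMILES pattern, and
    # all_attachment=True exposes every free position as an attachment point.
    benzene = Group("benzene", "C1=CC=CC=C1", all_attachment=True)
    grammar = GroupGrammar([benzene])

    mol = Chem.MolFromSmiles("Cc1ccccc1")  # toluene
    result = round_trip(grammar, mol)
    result["decoded_smiles"] = Chem.MolToSmiles(result["decoded"])
    return result
```

With the package installed, calling `demo()` returns both encodings and the decoded SMILES for toluene. `round_trip` takes the grammar as a parameter so the same flow can be reused with any set of groups.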
Owner
- Name: Aspuru-Guzik group repo
- Login: aspuru-guzik-group
- Kind: organization
- Website: http://aspuru.chem.harvard.edu/
- Repositories: 30
- Profile: https://github.com/aspuru-guzik-group
GitHub Events
Total
- Watch event: 15
- Pull request event: 2
- Fork event: 6
Last Year
- Watch event: 15
- Pull request event: 2
- Fork event: 6
Dependencies
- PyTDC *
- molplotly *
- pathos *
- requests *
- selfies *
- global_chem *
- networkx *
- rdkit *
- tqdm *