https://github.com/aaronxu9/al_fep
Active Learning for Free Energy Perturbation in Molecular Virtual Screening
Science Score: 26.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (14.7%) to scientific vocabulary
Repository
Active Learning for Free Energy Perturbation in Molecular Virtual Screening
Basic Info
- Host: GitHub
- Owner: AaronXu9
- License: other
- Language: Python
- Default Branch: main
- Size: 22.5 MB
Statistics
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
AL-FEP: Active Learning for Free Energy Perturbation in Molecular Virtual Screening
A comprehensive framework for applying active learning and reinforcement learning to molecular virtual screening, with a focus on FEP (Free Energy Perturbation) and docking oracles for target 7JVR (SARS-CoV-2 Main Protease).
Quick Start
Clone Repository
```bash
git clone https://github.com/yourusername/AL_FEP.git
cd AL_FEP
```
Project Overview
This project implements:
- Active Learning: Iterative molecular selection and evaluation
- Reinforcement Learning: Agent-based molecular discovery
- Multi-Oracle System: FEP, Docking, and ML-FEP evaluations
- Target-Specific: Optimized for 7JVR protein target
Quick Start
1. Environment Setup
Create and activate the conda environment:
```bash
# Create environment from environment.yml
conda env create -f environment.yml

# Activate environment
conda activate al_fep

# Verify installation
python -c "import rdkit; print('RDKit version:', rdkit.__version__)"
```
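Beyond RDKit, the same check can be extended to the other libraries named in the Acknowledgments. The snippet below is a minimal sketch, not part of the repository: the helper name `missing_modules` is hypothetical, and the import names (`rdkit`, `openmm`, `vina`, `torch`) are assumptions that should be adjusted to whatever `environment.yml` actually installs.

```python
from importlib.util import find_spec

# Assumed top-level import names for RDKit, OpenMM, AutoDock Vina, and PyTorch.
REQUIRED_MODULES = ["rdkit", "openmm", "vina", "torch"]


def missing_modules(names):
    """Return the top-level module names that cannot be located in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

`find_spec` only locates a module without importing it, so the check stays fast even for heavy packages.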
2. Project Structure
AL_FEP/
    environment.yml          # Conda environment specification
    requirements.txt         # Additional pip requirements
    setup.py                 # Package installation
    config/                  # Configuration files
        targets/             # Target-specific configs
        experiments/         # Experiment configurations
    src/                     # Source code
        al_fep/              # Main package
            oracles/         # FEP, Docking, ML-FEP oracles
            active_learning/ # AL algorithms
            reinforcement/   # RL algorithms
            molecular/       # Molecular utilities
            utils/           # Common utilities
    data/                    # Data directory
        targets/             # Target protein structures
        molecules/           # Molecular datasets
        results/             # Experiment results
    notebooks/               # Jupyter notebooks
    scripts/                 # Standalone scripts
    tests/                   # Unit tests
3. Target 7JVR Setup
The project is pre-configured for the 7JVR target. Key files:
- config/targets/7jvr.yaml: Target-specific parameters
- data/targets/7jvr/: Protein structures and binding site info
- notebooks/01_7jvr_analysis.ipynb: Target analysis notebook
Oracle Systems
1. FEP Oracle
- High-accuracy free energy calculations
- GPU-accelerated simulations
- AMBER/GROMACS integration
2. Docking Oracle
- AutoDock Vina integration
- Multiple conformer generation
- Binding pose analysis
3. ML-FEP Oracle
- Fast ML-based FEP predictions
- Pre-trained on experimental data
- Cost-effective screening
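One common way to combine oracles of different cost and accuracy is a screening funnel, in which a cheap oracle filters the library before an expensive one is called. The sketch below illustrates that idea only. The `Oracle` dataclass, the `funnel` function, the relative costs, and the toy scoring lambdas are all hypothetical stand-ins, not the interfaces in `al_fep.oracles`.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Oracle:
    """A scoring function plus a relative cost per call (hypothetical interface)."""
    name: str
    score: Callable[[str], float]  # lower score = better predicted binding
    cost: float


def funnel(smiles: List[str], stages: List[Oracle], keep: List[int]) -> Dict[str, object]:
    """Score survivors with each oracle in turn and pass on only the best keep[i]."""
    survivors = list(smiles)
    total_cost = 0.0
    for oracle, k in zip(stages, keep):
        total_cost += oracle.cost * len(survivors)
        survivors = sorted(survivors, key=oracle.score)[:k]
    return {"selected": survivors, "cost": total_cost}


# Toy scorers that stand in for real docking and FEP calculations.
docking = Oracle("docking", score=lambda s: float(len(s)), cost=1.0)
fep = Oracle("fep", score=lambda s: -float(s.count("O")), cost=100.0)

library = ["C", "CC", "CCO", "CCCO", "CCCCC"]
result = funnel(library, stages=[docking, fep], keep=[3, 1])
print(result)
```

In this toy run the cheap oracle is called on all five molecules and the expensive one on only three, which is the cost saving a funnel is meant to provide.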
Active Learning Workflows
- Uncertainty Sampling: Select molecules with highest prediction uncertainty
- Query by Committee: Ensemble-based selection
- Expected Improvement: Optimize acquisition functions
- Diversity-Based: Ensure chemical space coverage
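Two of the strategies above can be sketched in a few lines. The example assumes each candidate has predictions from a small committee of models, uses the committee spread as the uncertainty, and computes expected improvement for a minimization target such as binding free energy. The function names and the toy pool are illustrative assumptions, not the package's API.

```python
import math
from statistics import mean, pstdev


def committee_stats(predictions):
    """Mean and spread of the per-model predictions for one molecule."""
    return mean(predictions), pstdev(predictions)


def expected_improvement(mu, sigma, best):
    """Expected improvement for minimization: expected amount by which a candidate beats `best`."""
    if sigma == 0.0:
        return max(best - mu, 0.0)
    z = (best - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (best - mu) * cdf + sigma * pdf


def select_batch(pool, k, strategy="uncertainty", best=None):
    """pool maps molecule id -> list of committee predictions; returns the k top-ranked ids."""
    def acquisition(mol):
        mu, sigma = committee_stats(pool[mol])
        if strategy == "uncertainty":
            return sigma
        if strategy == "expected_improvement":
            return expected_improvement(mu, sigma, best)
        raise ValueError(f"unknown strategy: {strategy}")
    return sorted(pool, key=acquisition, reverse=True)[:k]


# Toy committee predictions (kcal/mol); the values are made up for illustration.
pool = {
    "a": [-7.0, -7.0, -7.0],
    "b": [-9.0, -5.0, -7.0],
    "c": [-8.0, -8.2, -7.8],
}
print(select_batch(pool, k=1))
print(select_batch(pool, k=1, strategy="expected_improvement", best=-7.5))
```

Uncertainty sampling favors the molecule the committee disagrees on most, while expected improvement also rewards a confident prediction that already beats the current best.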
Reinforcement Learning Agents
- Molecular REINFORCE: Policy gradient for molecular generation
- Actor-Critic: Value-based molecular optimization
- PPO: Proximal policy optimization for stable training
- Multi-Objective: Balance multiple molecular properties
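To make the policy-gradient idea concrete, the following toy example applies the REINFORCE update to a softmax policy over three discrete actions with fixed mean rewards. It is a deliberately simplified sketch: a real molecular agent would act on molecular states and be scored by an oracle. The function names, learning rate, baseline update, and reward values are all assumptions chosen for illustration.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce(mean_rewards, episodes=2000, lr=0.1, seed=0):
    """Train a softmax policy on a noisy bandit with the REINFORCE update."""
    rng = random.Random(seed)
    logits = [0.0] * len(mean_rewards)
    baseline = 0.0  # running average of rewards, used to reduce variance
    for _ in range(episodes):
        probs = softmax(logits)
        action = rng.choices(range(len(mean_rewards)), weights=probs)[0]
        reward = mean_rewards[action] + rng.gauss(0.0, 0.1)
        advantage = reward - baseline
        baseline += 0.05 * (reward - baseline)
        # d log pi(action) / d logit_i = 1[i == action] - probs[i]
        for i in range(len(logits)):
            grad = (1.0 if i == action else 0.0) - probs[i]
            logits[i] += lr * advantage * grad
    return softmax(logits)


# Action 1 has the highest mean reward, so the policy should come to prefer it.
policy = reinforce([0.2, 1.0, 0.5])
print([round(p, 3) for p in policy])
```

Actor-critic and PPO build on the same gradient, replacing the running-average baseline with a learned value function and, in PPO's case, constraining the size of each policy update.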
Usage Examples
Basic Active Learning Run
```python
from al_fep import ActiveLearningPipeline
from al_fep.oracles import FEPOracle, DockingOracle

# Initialize oracles
fep_oracle = FEPOracle(target="7jvr")
docking_oracle = DockingOracle(target="7jvr")

# Setup active learning
al_pipeline = ActiveLearningPipeline(
    oracles=[fep_oracle, docking_oracle],
    strategy="uncertainty_sampling",
    budget=100,
)

# Run active learning loop
results = al_pipeline.run()
```
Reinforcement Learning Training
```python
from al_fep import RLAgent
from al_fep.environments import MolecularEnv

# Setup environment
env = MolecularEnv(target="7jvr", oracle="ml_fep")

# Initialize agent
agent = RLAgent(algorithm="ppo", env=env)

# Train agent
agent.train(total_timesteps=100000)
```
Configuration
All experiments are configured via YAML files in config/:
- Global settings in config/default.yaml
- Target-specific in config/targets/7jvr.yaml
- Experiment-specific in config/experiments/
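A layered configuration like this is typically resolved by merging the files in order of increasing specificity. The sketch below shows one way to do that with plain dictionaries standing in for parsed YAML. The precedence order (default, then target, then experiment), the function names, and every key and value shown are assumptions for illustration, not the project's actual loader or schema.

```python
from copy import deepcopy


def deep_merge(base, override):
    """Return a new dict in which `override` wins and nested dicts merge key by key."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def resolve_config(*layers):
    """Merge layers left to right, so later layers take precedence."""
    config = {}
    for layer in layers:
        config = deep_merge(config, layer)
    return config


# Hypothetical contents of default.yaml, targets/7jvr.yaml, and an experiment file.
default_cfg = {"oracle": {"name": "docking", "exhaustiveness": 8}, "budget": 50}
target_cfg = {"target": "7jvr", "oracle": {"exhaustiveness": 16}}
experiment_cfg = {"budget": 100}

config = resolve_config(default_cfg, target_cfg, experiment_cfg)
print(config)
```

Merging nested dictionaries key by key lets a target file override a single oracle parameter without repeating the rest of the default block.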
Remote Deployment
GitHub Repository Setup
This project is ready for GitHub deployment with:
- Git repository initialized
- Comprehensive .gitignore for Python/scientific computing
- GitHub Actions CI/CD pipeline
- Pre-commit hooks for code quality
- Issue and PR templates
Deploy to Remote Server
Clone on remote server:
```bash
git clone https://github.com/yourusername/AL_FEP.git
cd AL_FEP
```
Set up the environment:
```bash
conda env create -f environment.yml
conda activate al_fep
pip install -e .
```
Run tests to verify:
```bash
python -m pytest tests/ -v
```
For detailed deployment instructions, see DEPLOYMENT.md.
Development
Code Quality Tools
```bash
# Install development dependencies
pip install -e ".[dev]"

# Setup pre-commit hooks
pre-commit install

# Run all quality checks
black src/ tests/    # Code formatting
isort src/ tests/    # Import sorting
flake8 src/ tests/   # Linting
mypy src/            # Type checking
```
Running Tests
```bash
pytest tests/ -v --cov=src/al_fep
```
Contributing
Please read CONTRIBUTING.md for guidelines on contributing to this project.
CI/CD Pipeline
The project includes a GitHub Actions pipeline that:
- Tests across Python 3.9, 3.10, 3.11 on Ubuntu and macOS
- Runs linting, formatting, and type checking
- Performs security vulnerability scanning
- Builds and validates the package
License
MIT License - see LICENSE file for details.
Support
- Documentation: See notebooks and docstrings
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Contact: your.email@example.com
Acknowledgments
- RDKit for molecular handling
- OpenMM for molecular dynamics
- AutoDock Vina for docking
- PyTorch for machine learning
Citation
If you use this code in your research, please cite:
```bibtex
@article{al_fep_2025,
  title={Active Learning and Reinforcement Learning for Molecular Virtual Screening},
  author={Your Name},
  journal={Journal of Chemical Information and Modeling},
  year={2025}
}
```
Owner
- Name: Ao Xu
- Login: AaronXu9
- Kind: user
- Repositories: 1
- Profile: https://github.com/AaronXu9
GitHub Events
Total
- Push event: 5
- Create event: 2
Last Year
- Push event: 5
- Create event: 2
Dependencies
- actions/checkout v4 composite
- actions/setup-python v4 composite
- codecov/codecov-action v3 composite
- conda-incubator/setup-miniconda v2 composite
- meeko *
- mols2grid *
- selfies *
- stable-baselines3 *
- wandb *
- biopandas >=0.4.0
- chembl-webresource-client >=0.10.0
- deepchem >=2.7.0
- dgl >=1.0.0
- dgllife >=0.3.0
- fegrow >=1.0.0
- meeko >=0.4.0
- mol2vec >=0.1
- mols2grid >=1.1.0
- oddt >=0.7.0
- plip >=2.2.0
- prody >=2.4.0
- prolif >=2.0.0
- selfies >=2.1.0
- stable-baselines3 >=2.0.0
- wandb >=0.17.0