https://github.com/bytedance/cryostar
Leveraging Structural Priors and Constraints for Cryo-EM Heterogeneous Reconstruction
Science Score: 39.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ✓ .zenodo.json file: found .zenodo.json file
- ✓ DOI references: found 3 DOI reference(s) in README
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (14.0%) to scientific vocabulary
Keywords
Repository
Leveraging Structural Priors and Constraints for Cryo-EM Heterogeneous Reconstruction
Basic Info
- Host: GitHub
- Owner: bytedance
- License: apache-2.0
- Language: Python
- Default Branch: main
- Homepage: https://bytedance.github.io/cryostar/
- Size: 1.52 MB
Statistics
- Stars: 55
- Watchers: 4
- Forks: 2
- Open Issues: 1
- Releases: 0
Topics
Metadata Files
README.md
CryoSTAR
CryoSTAR is a neural-network-based framework for recovering the conformational heterogeneity of protein complexes. By leveraging structural priors and constraints from a reference PDB model, cryoSTAR can output both protein structures and density maps.

User Guide
A detailed user guide is available here; consult it for extended instructions and background.
Installation
- Create a conda environment:

```shell
conda create -n cryostar python=3.9 -y
```

- Clone this repository and install the package:

```shell
git clone https://github.com/bytedance/cryostar.git && cd cryostar && pip install .
```
Quick start
Preliminary
You may need to prepare the resources below before running cryoSTAR:
- a consensus map (along with each particle's pose)
- a PDB file (which has been docked into the consensus map)
Training
CryoSTAR operates through a two-stage approach where it independently trains an atom generator and a density generator. Here's an illustration of its process:
S1: Training the atom generator
In this step, we generate an ensemble of coarse-grained protein structures from the particles. Note that the PDB file is used in this step, and it must be docked into the consensus map!
```shell
cd projects/star
python train_atom.py atom_configs/1ake.py
```
The outputs are stored in the work_dirs/atom_xxxxx directory, and evaluations are performed every 12,000 steps. Within this directory you will find sub-directories named epoch-number_step-number. We take the most recent directory as the final result.
```text
atom_xxxxx/
├── 0000_0000000/
├── ...
├── 0112_0096000/           # evaluation results
│   ├── ckpt.pt             # model parameters
│   ├── input_image.png     # visualization of input cryo-EM images
│   ├── pca-1.pdb           # sampled coarse-grained atomic structures along 1st PCA axis
│   ├── pca-2.pdb
│   ├── pca-3.pdb
│   ├── pred.pdb            # sampled structures at k-means cluster centers
│   ├── pred_gmm_image.png
│   └── z.npy               # the latent code of each particle;
│                           # a matrix whose shape is num_of_particle x 8
├── yyyymmdd_hhmmss.log     # running logs
├── config.py               # a backup of the config file
└── train_atom.py           # a backup of the training script
```
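The z.npy file above can be inspected directly. Below is a minimal sketch, assuming (as the tree states) that it holds a num_of_particle x 8 latent matrix, of the kind of PCA that underlies the pca-1.pdb/pca-2.pdb/pca-3.pdb samples; the random matrix stands in for the real file:

```python
import numpy as np

# Stand-in for the real output; in practice you would instead do
# z = np.load("work_dirs/atom_xxxxx/0112_0096000/z.npy")
rng = np.random.default_rng(0)
z = rng.normal(size=(1000, 8))  # num_of_particle x 8 latent codes

# Center the latent codes and compute principal axes via SVD
z_centered = z - z.mean(axis=0)
_, singular_values, vt = np.linalg.svd(z_centered, full_matrices=False)

# Fraction of variance captured by each axis
explained = singular_values**2 / np.sum(singular_values**2)
print("variance explained by first 3 axes:", explained[:3])

# Project each particle onto the 1st PCA axis
proj_1 = z_centered @ vt[0]
print("projection range:", proj_1.min(), proj_1.max())
```

This only illustrates the latent-space geometry; decoding latent codes back into structures is done by cryoSTAR itself.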
S2: Training the density generator
In step 1, the atom generator assigns a latent code z to each particle image. In this step, we drop the encoder and directly use the latent code as the representation of a particle. Run the following command to train the density generator.
```shell
# change the xxx/z.npy path to the output of the above command
python train_density.py density_configs/1ake.py --cfg-options extra_input_data_attr.given_z=xxx/z.npy
```
Results are saved to work_dirs/density_xxxxx, and each subdirectory is named epoch-number_step-number. We take the most recent directory as the final result.
```text
density_xxxxx/
├── 0004_0014470/           # evaluation results
│   ├── ckpt.pt             # model parameters
│   ├── vol_pca_1_000.mrc   # density sampled along the PCA axis, named vol_pca_<pca-axis>_<serial-number>.mrc
│   ├── ...
│   ├── vol_pca_3_009.mrc
│   ├── z.npy
│   ├── z_pca_1.txt         # sampled z values along the 1st PCA axis
│   ├── z_pca_2.txt
│   └── z_pca_3.txt
├── yyyymmdd_hhmmss.log     # running logs
├── config.py               # a backup of the config file
└── train_density.py        # a backup of the training script
```
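The z_pca_k.txt files hold latent codes sampled along a PCA axis, one per vol_pca_k_*.mrc volume. A sketch of how such evenly spaced samples might be generated; the percentile bounds, the 10-sample count, and the random stand-in for z.npy are assumptions for illustration, not cryoSTAR's exact procedure:

```python
import numpy as np

# Stand-in for work_dirs/density_xxxxx/.../z.npy
rng = np.random.default_rng(1)
z = rng.normal(size=(1000, 8))  # num_of_particle x 8 latent codes

z_mean = z.mean(axis=0)
_, _, vt = np.linalg.svd(z - z_mean, full_matrices=False)
axis1 = vt[0]  # 1st PCA axis

# Walk from the 5th to the 95th percentile of the projections,
# giving 10 latent codes -> vol_pca_1_000.mrc ... vol_pca_1_009.mrc
proj = (z - z_mean) @ axis1
ticks = np.linspace(np.percentile(proj, 5), np.percentile(proj, 95), 10)
z_samples = z_mean[None, :] + ticks[:, None] * axis1[None, :]
print(z_samples.shape)  # (10, 8)
```

Each row of z_samples would then be decoded by the density generator into one .mrc volume.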
Reference
You may cite this software as:

```bibtex
@article{li2023cryostar,
  author  = {Li, Yilai and Zhou, Yi and Yuan, Jing and Ye, Fei and Gu, Quanquan},
  title   = {CryoSTAR: leveraging structural priors and constraints for cryo-EM heterogeneous reconstruction},
  journal = {Nature Methods},
  year    = {2024},
  month   = {Oct},
  day     = {29},
  issn    = {1548-7105},
  doi     = {10.1038/s41592-024-02486-1},
  url     = {https://doi.org/10.1038/s41592-024-02486-1}
}
```
Owner
- Name: Bytedance Inc.
- Login: bytedance
- Kind: organization
- Location: Singapore
- Website: https://opensource.bytedance.com
- Twitter: ByteDanceOSS
- Repositories: 255
- Profile: https://github.com/bytedance
GitHub Events
Total
- Watch event: 22
- Push event: 3
- Fork event: 1
Last Year
- Watch event: 22
- Push event: 3
- Fork event: 1
Issues and Pull Requests
Last synced: 12 months ago
All Time
- Total issues: 2
- Total pull requests: 1
- Average time to close issues: 6 days
- Average time to close pull requests: about 1 hour
- Total issue authors: 2
- Total pull request authors: 1
- Average comments per issue: 7.5
- Average comments per pull request: 0.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- kamzero (1)
- PepperLee-sm (1)
Pull Request Authors
- dugu9sword (1)
Top Labels
Issue Labels
Pull Request Labels
Dependencies
- autoflake *
- biopython *
- biotite *
- dm-tree *
- einops *
- fastpdb *
- gemmi *
- isort *
- jupyterlab *
- lazy_loader *
- lightning *
- matplotlib *
- mmengine *
- mrcfile *
- nbqa *
- numpy *
- pandas *
- scipy *
- starfile *
- tabulate *
- tensorboard *
- torch *
- tqdm *
- yapf *