Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (7.8%) to scientific vocabulary
Repository
symcla: symbiont classifier
Basic Info
Statistics
- Stars: 5
- Watchers: 2
- Forks: 1
- Open Issues: 1
- Releases: 0
Metadata Files
README.md
⚠️⚠️⚠️ symcla is being archived and its development will continue as part of the symclatron project ⚠️⚠️⚠️
symcla: symbiont classifier
💾 Installation
Clone the symcla repository:
```shell
git clone https://github.com/NeLLi-team/symcla.git
```
```bash
cd symcla/
chmod u+x symcla
```
Create a conda environment and install the requirements:
```bash
conda create -c conda-forge -c bioconda --name symcla --file requirements.txt
```
💽 Setup data (run only once)
Run inside the symcla/ folder:
```shell
conda activate symcla
```
```shell
./symcla setup
```
🚀 Example run
```shell
conda activate symcla
```
👷🏻‍♀️ Run the classifier
```shell
path_to_symcla/symcla classify --genomedir data/test_genomes --savedir test_output --ncpus 32
```
To get help
```bash
./symcla classify --help
Usage: symcla classify [OPTIONS]
╭─ Options ──────────────────────────────────────────────────────────────────────╮
│ --genomedir TEXT [default: input_genomes] │
│ --savedir TEXT [default: output_symcla] │
│ --ncpus INTEGER [default: 16] │
│ --deltmp --no-deltmp [default: deltmp] │
│ --help Show this message and exit. │
╰────────────────────────────────────────────────────────────────────────────────╯
```
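When symcla is driven from a script, the options listed in the help output above can be assembled programmatically. The following is a minimal sketch, not part of symcla itself: the function name `build_classify_command` and its `executable` parameter are hypothetical, and only the flags and defaults shown in the help text are used.

```python
import shlex
from typing import List


def build_classify_command(
    genomedir: str = "input_genomes",
    savedir: str = "output_symcla",
    ncpus: int = 16,
    deltmp: bool = True,
    executable: str = "./symcla",  # assumed path to the symcla script
) -> List[str]:
    """Build the argument list for `symcla classify` (hypothetical helper).

    Defaults mirror the values printed by `./symcla classify --help`.
    """
    if ncpus < 1:
        raise ValueError("ncpus must be a positive integer")
    return [
        executable,
        "classify",
        "--genomedir", genomedir,
        "--savedir", savedir,
        "--ncpus", str(ncpus),
        # --deltmp / --no-deltmp is a boolean flag pair in the help output
        "--deltmp" if deltmp else "--no-deltmp",
    ]


# Reproduce the example run from this README as a printable shell command.
cmd = build_classify_command("data/test_genomes", "test_output", ncpus=32)
print(shlex.join(cmd))

# To actually launch the run, pass the list to subprocess, for example:
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems when directory names contain spaces.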
🕺🏻 Results
Expected results from the test data:
| taxonoid | completeness_UNI56 | features_gt0 | features_ge20 | features_ge100 | symcla_score |
|-----------------------------------------|--------------------|--------------|---------------|----------------|--------------|
| IMGI2140918011 | 98.214 | 396 | 291 | 128 | 0.000 |
| IMGI2645727657 | 100.000 | 287 | 197 | 70 | -0.003 |
| IMGI651324087 | 100.000 | 368 | 252 | 106 | -0.009 |
| IMGM3300027739BIN74 | 64.286 | 310 | 234 | 95 | 0.001 |
| SCISO2808607008 | 98.214 | 406 | 276 | 124 | 2.000 |
| SDISOGCA003484685.1 | 83.929 | 193 | 126 | 43 | 2.000 |
| SHISO2654587767 | 98.214 | 423 | 309 | 134 | 2.000 |
| SLISOGCF900639865.1 | 100.000 | 569 | 429 | 234 | 0.999 |
| SRISO640427127 | 92.857 | 296 | 197 | 103 | 2.000 |
| SXGCA000019745.1 | 98.214 | 353 | 259 | 106 | 0.126 |
| SXGCA902860225.1Azoamicus_ciliaticola | 91.071 | 117 | 83 | 36 | 1.055 |
| SXISO642555114 | 96.429 | 333 | 243 | 108 | 1.995 |
🧐 Interpretation of results:
- completeness_UNI56: The percentage of 56 universal bacterial and archaeal marker genes found in the genome. We advise against trusting any result with a completeness below 50%. Confidence in the symbiont prediction increases with UNI56 completeness.
- features_gt0: Number of features found with a bitscore greater than 0. Confidence in the symbiont prediction increases as more features are found.
- features_ge20: Number of features found with a bitscore greater than or equal to 20. Confidence in the symbiont prediction increases as more features are found.
- features_ge100: Number of features found with a bitscore greater than or equal to 100. Confidence in the symbiont prediction increases as more features are found.
- symcla_score: After adjusting the classification thresholds based on thousands of experiments, we recommend the following values:
- symcla_score <= 0.42: Free-living
- 0.42 < symcla_score < 1.21: Symbiont;Host-associated
- symcla_score >= 1.21: Symbiont;Intracellular
🤖 Note: by design symcla minimizes the rate of false positives for symbionts, at the expense of increased false negatives (i.e. some Symbiont;Host-associated might still get a symcla score lower than 0.42, and some Symbiont;Intracellular might still get a symcla score lower than 1.21).
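The recommended thresholds can be applied to a results row with a few lines of Python. This is a minimal sketch under stated assumptions: the function `interpret_symcla_score` and its constant names are hypothetical and not part of symcla; only the cutoffs (0.42, 1.21, and the 50% completeness floor) come from the text above. Returning `None` for low-completeness genomes is one possible way to encode "do not trust this result".

```python
from typing import Optional

# Cutoffs taken from the interpretation notes in this README.
FREE_LIVING_MAX = 0.42      # symcla_score <= 0.42 -> Free-living
INTRACELLULAR_MIN = 1.21    # symcla_score >= 1.21 -> Symbiont;Intracellular
MIN_COMPLETENESS = 50.0     # results below 50% UNI56 completeness are not trusted


def interpret_symcla_score(
    symcla_score: float, completeness_uni56: float
) -> Optional[str]:
    """Map a symcla score to a lifestyle label (hypothetical helper).

    Returns None when UNI56 completeness is below 50%, since such
    results should not be trusted.
    """
    if completeness_uni56 < MIN_COMPLETENESS:
        return None
    if symcla_score <= FREE_LIVING_MAX:
        return "Free-living"
    if symcla_score < INTRACELLULAR_MIN:
        return "Symbiont;Host-associated"
    return "Symbiont;Intracellular"


# Three rows from the expected test results: (taxonoid, completeness, score).
rows = [
    ("IMGI2140918011", 98.214, 0.000),
    ("SLISOGCF900639865.1", 100.000, 0.999),
    ("SCISO2808607008", 98.214, 2.000),
]
for taxonoid, completeness, score in rows:
    print(taxonoid, interpret_symcla_score(score, completeness))
```

For these three rows the labels are Free-living, Symbiont;Host-associated, and Symbiont;Intracellular, respectively.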
🐳 symcla container
Apptainer
```bash
apptainer pull \
  docker://docker.io/jvillada/symcla:latest

apptainer run \
  docker://docker.io/jvillada/symcla:latest \
  symcla \
  classify \
  --genomedir pathtodirwithfaafiles \
  --savedir pathtooutputdir \
  --ncpus 16
```
Owner
- Name: NeLLi: New Lineages of Life
- Login: NeLLi-team
- Kind: organization
- Location: United States of America
- Website: https://jgi.doe.gov/our-science/scientists-jgi/new-lineages-of-life/
- Repositories: 1
- Profile: https://github.com/NeLLi-team
New Lineages of Life - US DOE Joint Genome Institute
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Villada"
    given-names: "Juan C."
    orcid: "https://orcid.org/0000-0003-2216-4279"
  - family-names: "Schulz"
    given-names: "Frederik"
    orcid: "https://orcid.org/0000-0002-4932-4677"
title: "symcla: symbiont classifier"
version: 0.1.0
date-released: 2024-07-08
url: "https://github.com/NeLLi-team/symcla"
GitHub Events
Total
- Issues event: 1
- Watch event: 2
- Push event: 12
- Pull request event: 2
- Fork event: 1
Last Year
- Issues event: 1
- Watch event: 2
- Push event: 12
- Pull request event: 2
- Fork event: 1
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 1
- Total pull requests: 1
- Average time to close issues: N/A
- Average time to close pull requests: 2 days
- Total issue authors: 1
- Total pull request authors: 1
- Average comments per issue: 0.0
- Average comments per pull request: 0.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 1
- Pull requests: 1
- Average time to close issues: N/A
- Average time to close pull requests: 2 days
- Issue authors: 1
- Pull request authors: 1
- Average comments per issue: 0.0
- Average comments per pull request: 0.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- willboulton (1)
Pull Request Authors
- Tsaranoga (1)
Top Labels
Issue Labels
Pull Request Labels
Dependencies
- ubuntu 23.10 build
- brotli =1.0.9
- brotli-bin =1.0.9
- bzip2 =1.0.8
- ca-certificates =2023.5.7
- certifi =2023.5.7
- charset-normalizer =3.1.0
- click =8.1.3
- colorama =0.4.6
- hmmer =3.3.2
- idna =3.4
- joblib =1.2.0
- ld_impl_linux-64 =2.40
- libblas =3.9.0
- libbrotlicommon =1.0.9
- libbrotlidec =1.0.9
- libbrotlienc =1.0.9
- libcblas =3.9.0
- libexpat =2.5.0
- libffi =3.4.2
- libgcc-ng =12.2.0
- libgfortran-ng =12.2.0
- libgfortran5 =12.2.0
- libgomp =12.2.0
- liblapack =3.9.0
- libnsl =2.0.0
- libopenblas =0.3.21
- libsqlite =3.42.0
- libstdcxx-ng =12.2.0
- libuuid =2.38.1
- libxgboost =1.7.4
- libzlib =1.2.13
- markdown-it-py =2.2.0
- mdurl =0.1.0
- ncurses =6.3
- numpy =1.24.3
- openssl =3.1.1
- packaging =23.1
- pandas =2.0.2
- pip =23.1.2
- platformdirs =3.5.1
- pooch =1.7.0
- py-xgboost =1.7.4
- pygments =2.15.1
- pysocks =1.7.1
- python =3.11.3
- python-dateutil =2.8.2
- python-tzdata =2023.3
- python_abi =3.11
- pytz =2023.3
- readline =8.2
- requests =2.31.0
- rich =13.4.1
- scikit-learn =1.2.2
- scipy =1.10.1
- setuptools =67.7.2
- shap =0.45.0
- shellingham =1.5.1
- six =1.16.0
- threadpoolctl =3.1.0
- tk =8.6.12
- typer =0.9.0
- typing-extensions =4.6.2
- tzdata =2023c
- urllib3 =2.0.2
- wheel =0.40.0
- xgboost =1.7.4
- xz =5.2.6