https://github.com/bigdatabiology/macrel2020benchmark

Repository

Basic Info
  • Host: GitHub
  • Owner: BigDataBiology
  • Language: Python
  • Default Branch: master
  • Size: 17 MB
Statistics
  • Stars: 0
  • Watchers: 3
  • Forks: 1
  • Open Issues: 0
  • Releases: 0
Archived
Created over 6 years ago · Last pushed over 3 years ago

https://github.com/BigDataBiology/macrel2020benchmark/blob/master/

# MACREL Benchmarking (2019/20)

This repository includes code for benchmarking *MACREL*.

This is a companion repository to:

>   Santos-Júnior CD, Pan S, Zhao X, Coelho LP. 2020.
>   Macrel: antimicrobial peptide screening in genomes and metagenomes.
>   PeerJ 8:e10555. DOI: [10.7717/peerj.10555](https://doi.org/10.7717/peerj.10555)

## Contents

It contains the rules to rebuild the benchmarks in the paper.

However, instead of just running the code, we strongly recommend that you read it first, as some steps depend on inputs obtained from manual curation.

- To evaluate the benchmarking results for the tested AMP and hemolytic peptide prediction models, please refer to the *"train"* folder in [Macrel](https://github.com/BigDataBiology/macrel).

The other results shown in the MACREL benchmarking can be reproduced by running the scripts in the following order:

(1) Benchmark.sh

(2) Macrel_in_real_metagenomes.sh

(3) Annotation_rules.sh
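
The ordering above can be captured in a small driver. This is a minimal sketch and not part of the repository: the `run_in_order` helper is hypothetical, it assumes the three scripts sit in the current working directory and can be run with `bash`, and it defaults to a dry run because some steps rely on manually curated inputs.

```python
import subprocess

# Script names are taken from the list above; everything else is an assumption.
SCRIPTS = ["Benchmark.sh", "Macrel_in_real_metagenomes.sh", "Annotation_rules.sh"]


def run_in_order(scripts, dry_run=True):
    """Run each script with bash, in order, stopping at the first failure.

    With dry_run=True nothing is executed; the planned commands are returned.
    """
    commands = [["bash", script] for script in scripts]
    if dry_run:
        return commands
    for command in commands:
        # check=True raises CalledProcessError on a non-zero exit status,
        # so later steps never run on top of a failed earlier step.
        subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    for command in run_in_order(SCRIPTS):
        print(" ".join(command))
```

Calling `run_in_order(SCRIPTS, dry_run=False)` would actually execute the scripts; reading each script before doing so is still advisable.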

-- To generate Figure 3, please run:

```
$ python3 Figure_3_rendering.py
```

-- To generate Figure 4, please run:

```
$ python3 Figure_4_rendering.py
```

### Homology effect

To check the effect of homology between the training and testing data sets, go to the *"homology effects"* folder and run:

```
$ ./retrain_complete.sh
```

This will retrain all models from MACREL, iAMP-2L, and AMP Scanner v.2 on the non-redundant data sets, which were previously clustered with cd-hit at 80% identity. Accuracy, precision, and the confusion matrices will also be available. Be aware that some of these are generated at different points during the run and are printed to the screen.
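
For readers unfamiliar with the reported measures, the sketch below shows how accuracy and precision follow from a binary confusion matrix. It is an illustration only and not the code used by `retrain_complete.sh`; the function names and the toy labels (1 = AMP, 0 = non-AMP) are made up for this example.

```python
def confusion_matrix(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels, with 1 as the positive class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == 1 and truth == 1:
            tp += 1
        elif pred == 1 and truth == 0:
            fp += 1
        elif pred == 0 and truth == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    """Fraction of all predictions that are correct."""
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0


def precision(tp, fp, fn, tn):
    """Fraction of positive predictions that are truly positive."""
    predicted_positive = tp + fp
    return tp / predicted_positive if predicted_positive else 0.0


# Toy labels, invented for this example (1 = AMP, 0 = non-AMP).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

counts = confusion_matrix(y_true, y_pred)
print("TP, FP, FN, TN:", counts)
print("accuracy:", accuracy(*counts))
print("precision:", precision(*counts))
```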

## Third-party software

To run all the code, you will need the following, in addition to MACREL:

- [Spurio](https://bitbucket.org/bateman-group/spurio/src/master/)
- [ArtMountRainier](https://www.niehs.nih.gov/research/resources/software/biostatistics/art/index.cfm)
- BlastAll+
- [pigz](https://zlib.net/pigz/)
- R v3.5+
- [samtools](http://samtools.sourceforge.net/)
- [Conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/linux.html)
- [Macrel](https://github.com/BigDataBiology/macrel)
- Python 3+
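
Before running the scripts, it can help to check which tools are visible on `PATH`. The sketch below uses Python's `shutil.which`; the executable names are assumptions (for example, `Rscript` for R and `macrel` for Macrel) and should be adjusted to your installation. Spurio, ART, and BLAST are left out because their executable names depend on how they were installed.

```python
import shutil

# Assumed executable names; adjust to match your installation.
ASSUMED_EXECUTABLES = ["pigz", "Rscript", "samtools", "conda", "macrel", "python3"]


def find_missing(executables, which=shutil.which):
    """Return the executables that `which` cannot locate on PATH."""
    return [name for name in executables if which(name) is None]


if __name__ == "__main__":
    missing = find_missing(ASSUMED_EXECUTABLES)
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All assumed executables were found.")
```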

Owner

  • Name: Big Data Biology Lab
  • Login: BigDataBiology
  • Kind: organization
  • Email: luis@luispedro.org
