cerebro

Identification and annotation of mutations in a set of mutant genomes with respect to a reference genome

https://github.com/reillyeo/cerebro

Repository

Basic Info
  • Host: GitHub
  • Owner: reillyeo
  • Language: Shell
  • Default Branch: main
  • Homepage:
  • Size: 22.5 KB
Statistics
  • Stars: 0
  • Watchers: 0
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created 8 months ago · Last pushed 6 months ago
Metadata Files
Readme Citation

README.md

Cerebro

This script finds all point mutations in a set of mutant genomes relative to a reference genome and reports the genes in which they occur. If Prokka .ffn sequences are also provided, it writes a FASTA file of the mutated proteins that can be used for functional annotation with eggNOG-mapper.

Requirements:

  1. Input directory containing:
    • assembled mutant genomes
      • format: *.fasta
    • one reference genome
      • format: ref*.fasta
    • gene annotations for the reference genome
      • format: *.gff
    • protein sequences corresponding to reference annotations (optional, for functional annotation)
      • format: *.ffn
  2. NUCmer and BEDtools installed and accessible in the PATH.
  3. GNU parallel installed and accessible in the PATH.
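As a quick pre-flight check for the requirements above, the following sketch reports which tools are missing from the PATH and counts the input files that match a pattern. The function names `missing_tools` and `count_matches` are illustrative and not part of cerebro. The executable names `nucmer`, `bedtools`, and `parallel` are the usual ones for these packages, but check them against your own installation.

```shell
# Print the name of every given command that is not found in PATH.
# Returns non-zero if at least one command is missing.
missing_tools() {
    local status=0 tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            status=1
        fi
    done
    return "$status"
}

# Count regular files directly inside a directory that match a glob pattern.
count_matches() {
    local dir=$1 pattern=$2
    find "$dir" -maxdepth 1 -type f -name "$pattern" | wc -l | tr -d ' '
}
```

For example, `missing_tools nucmer bedtools parallel` prints nothing when all three are available, and `count_matches input_dir 'ref*.fasta'` should print 1 for an input directory with exactly one reference genome.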

Usage:

```text
Usage - cerebro [OPTIONS]

Options:
  -i  input directory (default: current directory)
  -o  output directory (default: cerebro_output)
  -t  number of threads to use for parallel processing (default: 20)
  -f  use this flag to also produce output fasta of mutated proteins for functional annotation
  -h  display help message and usage information
```
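A typical call might combine the documented flags as in the sketch below. The directory name `mutant_genomes` and the thread count are placeholders, and the snippet only prints the command (a dry run) so it can be reviewed before running.

```shell
# Hypothetical paths and thread count; adjust to your own layout.
input_dir="mutant_genomes"
output_dir="cerebro_output"
threads=8

# Build the argument list from the documented flags (-i, -o, -t, -f).
# -f assumes the input directory also contains the optional *.ffn file.
set -- -i "$input_dir" -o "$output_dir" -t "$threads" -f

# Dry run: print the command instead of executing it.
echo "cerebro $*"
```

Once the printed command looks right, replacing the last line with `cerebro "$@"` runs it, assuming `cerebro` is on the PATH.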

Output:

  • A 9-column .tsv file describing every mutation in each mutant with respect to the reference genome (columns: Mutant, Reference, Mutation, StartPos, EndPos, Gene, Product, FeatureType, MutationSize)

  • (if -f flag is used) A single fasta file containing the amino acid sequences of all the genomic features in which a mutation was found (can be provided to http://eggnog-mapper.embl.de/FESNOV for functional annotation)
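The mutation table can be summarised with standard tools. The sketch below assumes the .tsv has a header row and the column order listed above (Mutant in column 1, Gene in column 6). The path to the table is passed as an argument because its file name is not stated here, and the function names are illustrative.

```shell
# Count mutations per mutant. $1: path to the mutation table.
# Assumes a header row (skipped) and "Mutant" in column 1.
mutations_per_mutant() {
    awk -F '\t' 'NR > 1 { n[$1]++ }
                 END { for (m in n) printf "%s\t%d\n", m, n[m] }' "$1" | sort
}

# List the distinct genes hit in one mutant.
# $1: path to the mutation table, $2: mutant name. "Gene" assumed in column 6.
genes_hit() {
    awk -F '\t' -v m="$2" 'NR > 1 && $1 == m && $6 != "" { print $6 }' "$1" | sort -u
}
```

Both functions write to standard output, so their results can be redirected to a file or piped into further filters.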

Extras:

The cerebromergetables.sh script merges the eggNOG-mapper output with the mutation summary table. For the script to work, the eggNOG-mapper annotation table must be named emapper_annotations.tsv.
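To illustrate the kind of merge involved, here is a minimal sketch of a left join in awk. It is not the logic of cerebromergetables.sh, which may work differently. It assumes that the Gene column (column 6) of the mutation table matches the query identifier in column 1 of the annotation table, that the annotation table has a header line starting with `#query` and a `Description` column, and that other lines starting with `#` are comments. The function name `merge_annotations` is hypothetical.

```shell
# Append the eggNOG-mapper description to each row of the mutation table.
# $1: mutation table (header row assumed, join key assumed in column 6)
# $2: annotation table (join key assumed in column 1)
merge_annotations() {
    awk -F '\t' -v OFS='\t' '
        NR == FNR {                        # first file: annotation table
            if ($1 == "#query") {          # header: locate the Description column
                for (i = 1; i <= NF; i++) if ($i == "Description") d = i
                next
            }
            if ($0 ~ /^#/ || !d) next      # skip comments; need a known column
            desc[$1] = $d
            next
        }
        FNR == 1 { print $0, "Description"; next }   # mutation table header
        { print $0, (($6 in desc) ? desc[$6] : "NA") }
    ' "$2" "$1"
}
```

Rows without a matching annotation get `NA` in the added column, so no mutations are dropped from the table.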


Developed by Eoghan Reilly (eoghan@food.ku.dk)


Owner

  • Login: reillyeo
  • Kind: user

Citation (CITATION.CFF)

cff-version: 1.2.0
title: Cerebro
message: >-
  If you use this pipeline, please consider citing it as
  below
type: software
authors:
  - given-names: Eoghan
    family-names: Reilly
    email: eoghan@food.ku.dk
    affiliation: University of Copenhagen
repository-code: 'https://github.com/reillyeo/Cerebro'
abstract: >-
  Identification and annotation of all mutations in a set of
  mutant genomes with respect to a reference genome
date-released: '2025-07-14'

GitHub Events

Total
  • Push event: 3
Last Year
  • Push event: 3