integron_finder

Bioinformatics tool to find integrons in bacterial genomes

https://github.com/gem-pasteur/integron_finder

Science Score: 85.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
    Links to: ncbi.nlm.nih.gov
  • Committers with academic emails
    4 of 10 committers (40.0%) from academic institutions
  • Institutional organization owner
    Organization gem-pasteur has institutional domain (research.pasteur.fr)
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.7%) to scientific vocabulary

Keywords

bacterial-genomes bioinformatics identification

Scientific Fields

Earth and Environmental Sciences Physical Sciences - 40% confidence
Last synced: 4 months ago

Repository


Basic Info
  • Host: GitHub
  • Owner: gem-pasteur
  • License: gpl-3.0
  • Language: F*
  • Default Branch: master
  • Homepage:
  • Size: 56.1 MB
Statistics
  • Stars: 81
  • Watchers: 9
  • Forks: 23
  • Open Issues: 6
  • Releases: 9
Topics
bacterial-genomes bioinformatics identification
Created over 10 years ago · Last pushed 8 months ago
Metadata Files
Readme Contributing License Citation Codemeta

README.md


Integron Finder

Finds integrons in DNA sequences

You can run it from the command line (see Installation below) or use it online on Galaxy Pasteur.

See the documentation for usage details.

Installation

Although a system-wide installation is possible and supported, many distributions do not allow it, so we describe some user-wide installation procedures below.

For user

pip install --user integron_finder==2.xx

For more installation options, or for a developer installation, see the documentation.

In a virtualenv

To avoid interaction with the system libraries you can install integron_finder in a virtualenv.

  1. Create and activate the virtualenv:

    python -m venv Integron_Finder
    source Integron_Finder/bin/activate

  2. Install integron_finder:

    (Integron_Finder) python -m pip install integron_finder

    All libraries will be located in the Integron_Finder directory.

  3. When you want to quit the virtualenv:

    (Integron_Finder) deactivate

Container

For reproducibility, and as an easy way to use integron_finder without installing third-party software (hmmsearch, prodigal, ...) or libraries, we provide containers based on Docker.

https://hub.docker.com/r/gempasteur/integron_finder

Docker

The computation runs as the IF user in /home/IF inside the container, so you must mount a host directory into the container to exchange data (inputs and results) between the host and the container.

The shared directory must be writable by the IF user. Alternatively, override the container user with your own user and group IDs (see the example below).

mkdir shared_dir
cd shared_dir
docker run -v $PWD:/home/IF -u $(id -u ${USER}):$(id -g ${USER}) gempasteur/integron_finder:2.0rc9 --local-max --circ --keep-tmp NZ_CP016323.fna
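
When the container is launched from a script, the same invocation can be assembled as an argument list. The sketch below only builds that list and does not call Docker. The helper name `docker_argv` is hypothetical, and the default image name `gempasteur/integron_finder:2.0rc9` is an assumption taken from the Docker Hub link and the Singularity example in this README.

```python
import os
from pathlib import Path


def docker_argv(shared_dir, replicon,
                image="gempasteur/integron_finder:2.0rc9",
                uid=None, gid=None):
    """Build the `docker run` argument list; nothing is executed here."""
    # Default to the calling user's ids so results are writable on the host.
    uid = os.getuid() if uid is None else uid
    gid = os.getgid() if gid is None else gid
    host_dir = Path(shared_dir).resolve()
    return [
        "docker", "run",
        "-v", f"{host_dir}:/home/IF",   # host directory mounted as /home/IF
        "-u", f"{uid}:{gid}",           # override the IF user in the container
        image,
        "--local-max", "--circ", "--keep-tmp",
        replicon,                        # path relative to the shared directory
    ]
```

The returned list can be passed to `subprocess.run(argv, check=True)` once the input file has been copied into the shared directory.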

Singularity

As the Docker image is registered on Docker Hub, you can also use it directly with Singularity. Unlike with Docker, you do not have to worry about a shared directory: your home directory and /tmp are shared automatically.

singularity run -H ${HOME} docker://gempasteur/integron_finder:2.0rc9 --local-max --circ --keep-tmp NZ_CP016323.fna

Or use the -B (bind) option if the data is not in your home directory.

singularity run -H ${HOME} -B <the directory containing data> docker://gempasteur/integron_finder:2.0rc9 --local-max --circ --keep-tmp NZ_CP016323.fna

Conda installation

Since version 2.0, IntegronFinder has been available as a conda package in the bioconda channel. (The advantage of this solution is that it also installs prodigal, hmmer, and infernal.)

  1. install conda
  2. Set up channels:

    conda config --add channels defaults
    conda config --add channels conda-forge
    conda config --add channels bioconda
    
  3. install integron_finder:

    conda install integron_finder
    

For developer

If you want to develop this software or submit a patch, you are welcome to do so. See Developer installation in the documentation.


Contributing

We encourage contributions, bug reports, enhancements ...

Before contributing, please read the contributing guide.

Dependencies

  • Python >=3.10
  • Pandas >=2
  • Numpy >=1.26
  • Biopython >=1.82
  • Matplotlib >=3.8
  • colorlog
  • HMMER >=3.1b2,<=3.3.2
  • INFERNAL >=1.1.2,<=1.1.4
  • Prodigal >=2.6.2,<=2.6.3
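
A quick way to see whether the external programs are reachable before running integron_finder is to look them up on PATH. This is a minimal sketch: the function name `check_environment` is hypothetical, the executable names are taken from the `--hmmsearch`, `--cmsearch` and `--prodigal` options, and the tool versions listed above are not verified.

```python
import shutil
import sys

# Executable names as they appear in the --hmmsearch/--cmsearch/--prodigal options.
REQUIRED_TOOLS = ("hmmsearch", "cmsearch", "prodigal")


def check_environment(tools=REQUIRED_TOOLS, min_python=(3, 10)):
    """Return (python_ok, {tool: path or None}) without running any tool."""
    python_ok = sys.version_info[:2] >= min_python
    # shutil.which returns the full path of the executable, or None if absent.
    found = {tool: shutil.which(tool) for tool in tools}
    return python_ok, found
```

A tool reported as `None` can still be used by passing its full path through the corresponding command line option.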

Usage

```
usage: integron_finder [-h] [--local-max] [--func-annot] [--cpu CPU]
                       [-dt DISTANCE_THRESHOLD] [--outdir OUTDIR]
                       [--union-integrases] [--cmsearch CMSEARCH]
                       [--hmmsearch HMMSEARCH] [--prodigal PRODIGAL]
                       [--path-func-annot PATH_FUNC_ANNOT] [--gembase]
                       [--gembase-path GEMBASE_PATH]
                       [--annot-parser ANNOT_PARSER] [--prot-file PROT_FILE]
                       [--attc-model ATTC_MODEL] [--evalue-attc EVALUE_ATTC]
                       [--calin-threshold CALIN_THRESHOLD]
                       [--keep-palindromes] [--no-proteins] [--promoter-attI]
                       [--max-attc-size MAX_ATTC_SIZE]
                       [--min-attc-size MIN_ATTC_SIZE] [--eagle-eyes] [--pdf]
                       [--gbk] [--keep-tmp] [--split-results]
                       [--circ | --linear] [--topology-file TOPOLOGY_FILE]
                       [--version] [--mute] [-v] [-q]
                       replicon

positional arguments:
  replicon              Path to the replicon file (in fasta format), eg :
                        path/to/file.fst or file.fst

optional arguments:
  -h, --help            show this help message and exit
  --local-max           Allows thorough local detection (slower but more
                        sensitive and do not increase false positive rate).
  --func-annot          Functional annotation of CDS associated with
                        integrons HMM files are needed in Func_annot folder.
  --cpu CPU             Number of CPUs used by INFERNAL and HMMER. Increasing
                        too much (usually above 4) may decrease performance.
  -dt DISTANCE_THRESHOLD, --distance-thresh DISTANCE_THRESHOLD
                        Two elements are aggregated if they are distant of
                        DISTANCE_THRESH [4000]bp or less
  --outdir OUTDIR       Set the output directory (default: current)
  --union-integrases    Instead of taking intersection of hits from Phage_int
                        profile (Tyr recombinases) and integron_integrase
                        profile, use the union of the hits
  --cmsearch CMSEARCH   Complete path to cmsearch if not in PATH. eg:
                        /usr/local/bin/cmsearch
  --hmmsearch HMMSEARCH
                        Complete path to hmmsearch if not in PATH. eg:
                        /usr/local/bin/hmmsearch
  --prodigal PRODIGAL   Complete path to prodigal if not in PATH. eg:
                        /usr/local/bin/prodigal
  --path-func-annot PATH_FUNC_ANNOT
                        Path to file containing all hmm bank paths (one per
                        line)
  --gembase             Use gembase formatted protein file instead of
                        Prodigal. Folder structure must be preserved
  --gembase-path GEMBASE_PATH
                        path to the gembase root directory (needed only if
                        the replicon file is not located in gembase-path)
  --annot-parser ANNOT_PARSER
                        the path to the parser to use to get information from
                        protein file.
  --prot-file PROT_FILE
                        The path to the proteins file used for annotations
  --attc-model ATTC_MODEL
                        Path or file to the attc model (Covariance Matrix).
  --evalue-attc EVALUE_ATTC
                        Set evalue threshold to filter out hits above it
                        (default: 1)
  --calin-threshold CALIN_THRESHOLD
                        keep 'CALIN' only if attC sites number >= calin-
                        threshold (default: 2)
  --keep-palindromes    For a given hit, if the palindromic version is found,
                        don't remove the one with highest evalue.
  --no-proteins         Don't annotate CDS and don't find integrase, just
                        look for attC sites.
  --promoter-attI       Search also for promoter and attI sites. (default
                        False)
  --max-attc-size MAX_ATTC_SIZE
                        Set maximum value fot the attC size (default: 200bp)
  --min-attc-size MIN_ATTC_SIZE
                        set minimum value fot the attC size (default: 40bp)
  --eagle-eyes          Synonym of --local-max. Like a soaring eagle in the
                        sky, catching rabbits (or attC sites) by surprise.
  --circ                Set the default topology for replicons to 'circular'
  --linear              Set the default topology for replicons to 'linear'
  --topology-file TOPOLOGY_FILE
                        The path to a file where the topology for each
                        replicon is specified.
  --version             show program's version number and exit
  --mute                mute the log on stdout.(continue to log on
                        integron_finder.out)

Output options:
  --pdf                 For each complete integron, a simple graphic of the
                        region is depicted (in pdf format)
  --gbk                 generate a GenBank file with the sequence annotated
                        with the same annotations than .integrons file.
  --keep-tmp            keep intermediate results. This results are stored in
                        directory named tmp_
  --split-results       Instead of merging integron results from all replicon
                        in one file, keep them in separated files.

  -v, --verbose         Increase verbosity of output (can be cumulative :
                        -vv)
  -q, --quiet           Decrease verbosity of output (can be cumulative :
                        -qq)
```
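
The `-dt` help text says that two elements are aggregated when they are DISTANCE_THRESH (4000 bp by default) or less apart. The sketch below illustrates that rule under stated assumptions: elements are `(start, end)` coordinates on a linear replicon, and the distance is measured from the furthest end already in a cluster to the start of the next element. The function name `aggregate` is hypothetical. This is not IntegronFinder's own implementation, and circular topology is ignored.

```python
def aggregate(elements, distance_threshold=4000):
    """Group (start, end) intervals separated by at most `distance_threshold` bp."""
    clusters = []        # list of clusters, each a list of (start, end)
    cluster_end = None   # furthest end coordinate seen in the current cluster
    for start, end in sorted(elements):
        if clusters and start - cluster_end <= distance_threshold:
            clusters[-1].append((start, end))
            cluster_end = max(cluster_end, end)
        else:
            # Gap is larger than the threshold: start a new cluster.
            clusters.append([(start, end)])
            cluster_end = end
    return clusters


hits = [(100, 200), (3000, 3100), (9000, 9100)]
print(aggregate(hits))
```

With these made-up coordinates the first two elements are 2800 bp apart and fall into one cluster, while the third starts 5900 bp after the second and opens a new one.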

Example

integron_finder --local-max --func-annot mysequences.fst

Output :

By default, integron_finder will output 3 files under Results_Integron_Finder_mysequences:

  • mysequences.integrons : A file with all integrons and their elements detected in all sequences in the input file.
  • mysequences.summary : A summary file with the number and type of integrons per sequence.
  • integron_finder.out : A copy of the standard output. The stdout can be silenced with the argument --mute

The amount of log in the standard output can be controlled with --verbose for more or --quiet for less, and both are cumulative arguments, eg. -vv or -qq.

Other files can be created on demand:

  • --gbk: Creates a GenBank file with all the annotations found (present in the .integrons file)
  • --pdf: Creates a simple pdf graphic with complete integrons
  • --keep-tmp: Keep temporary files. See Keep intermediate files for more.
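
The result files can be post-processed with a short script. The sketch below assumes that the file is tab-separated, that lines starting with `#` are comments, and that the first remaining line is a header row. These are assumptions about the layout, so check them against a real output file. The function name `count_by_column` and the column names in the made-up sample are hypothetical.

```python
import csv
from collections import Counter


def count_by_column(lines, column):
    """Count rows per value of `column` in tab-separated text, skipping '#' lines."""
    data_lines = (
        line for line in lines
        if line.strip() and not line.startswith("#")
    )
    reader = csv.DictReader(data_lines, delimiter="\t")
    return Counter(row[column] for row in reader)


# Made-up sample for illustration only; not real integron_finder output.
sample = [
    "# illustrative comment line\n",
    "ID_integron\ttype_elt\ttype\n",
    "integron_01\tprotein\tcomplete\n",
    "integron_01\tattC\tcomplete\n",
    "integron_02\tattC\tCALIN\n",
]
print(count_by_column(sample, "type"))
```

For a real run, an open file handle on the .integrons file can be passed in place of `sample`.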

Galaxy

You can use this program without installing it, through the Pasteur Galaxy web server instance.

Citation

The paper is published in Microorganisms.

Néron, Bertrand, Eloi Littner, Matthieu Haudiquet, Amandine Perrin, Jean Cury, and Eduardo P.C. Rocha. 2022. "IntegronFinder 2.0: Identification and Analysis of Integrons across Bacteria, with a Focus on Antibiotic Resistance in Klebsiella." Microorganisms 10, no. 4: 700. https://doi.org/10.3390/microorganisms10040700

Please cite also the following articles:

  • Nawrocki, E.P. and Eddy, S.R. (2013) Infernal 1.1: 100-fold faster RNA homology searches. Bioinformatics, 29, 2933-2935.
  • Eddy, S.R. (2011) Accelerated Profile HMM Searches. PLoS Comput Biol, 7, e1002195.
  • Hyatt, D., Chen, G.L., Locascio, P.F., Land, M.L., Larimer, F.W. and Hauser, L.J. (2010) Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics, 11, 119.

and, if you use the --func-annot option, which uses NCBIfam-AMRFinder HMM profiles:

  • Haft, DH et al., Nucleic Acids Res. 2018 Jan 4;46(D1):D851-D860 PMID: 29112715

Owner

  • Name: gem-pasteur
  • Login: gem-pasteur
  • Kind: organization

Citation (CITATION.cff)

# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: Integron_Finder
message: "If you use this software, please cite both the article from preferred-citation and the software itself."
type: software
authors:
  - given-names: Bertrand
    family-names: Néron
    email: bneron@pasteur.fr
    affiliation: >-
      Bioinformatics and Biostatistics Hub, Institut
      Pasteur, Université de Paris Cité, 75015 Paris,
      France
    orcid: 'https://orcid.org/0000-0002-0220-0482'
  - given-names: Jean
    family-names: Cury
    email: jean.cury@normalesup.org
    affiliation: >-
      Laboratoire Interdisciplinaire des Sciences du
      Numérique, Université Paris-Saclay, CNRS UMR
      9015, INRIA, 91400 Orsay, France
  - given-names: Amandine
    family-names: Perrin
    email: amandine.perrin@pasteur.fr
    affiliation: >-
      Microbial Evolutionary Genomics, Institut
      Pasteur, Université de Paris Cité, CNRS
      UMR3525, 75015 Paris, France
    orcid: 'https://orcid.org/0000-0003-4797-6185'
identifiers:
  - type: swh
    value: >-
      swh:1:dir:7740949344db5dbda20fde7d056973afd185a0d4
    description: >-
      "The Software Heritage identifier for version
      2.0.2 of the work."
repository-code: 'https://github.com/gem-pasteur/Integron_Finder'
keywords:
  - integron
  - bioinformatic
  - genomic
  - antibiotic resistance
license: GPL-3.0
preferred-citation:
  authors:
    - given-names: Bertrand
      family-names: Néron
      email: bneron@pasteur.fr
      affiliation: >-
        Bioinformatics and Biostatistics Hub, Institut
        Pasteur, Université de Paris Cité, 75015 Paris,
        France
      orcid: 'https://orcid.org/0000-0002-0220-0482'
    - given-names: Eloi
      family-names: Littner
      email: eloi.littner@pasteur.fr
      orcid: 'https://orcid.org/0000-0002-8793-9584'
      affiliation: >-
        Microbial Evolutionary Genomics, Institut
        Pasteur, Université de Paris Cité, CNRS
        UMR3525, 75015 Paris, France / DGA CBRN
        Defence, 91710 Vert-le-Petit, France / Collège
        Doctoral, Sorbonne Université, 75005 Paris,
        France
    - given-names: Matthieu
      family-names: Haudiquet
      email: matthieu.haudiquet@pasteur.fr
      affiliation: >-
        Microbial Evolutionary Genomics, Institut
        Pasteur, Université de Paris Cité, CNRS
        UMR3525, 75015 Paris, France / Ecole Doctorale
        FIRE–Programme Bettencourt, CRI, 75004 Paris,
        France
      orcid: 'https://orcid.org/0000-0002-0878-2209'
    - given-names: Amandine
      family-names: Perrin
      email: amandine.perrin@pasteur.fr
      affiliation: >-
        Microbial Evolutionary Genomics, Institut
        Pasteur, Université de Paris Cité, CNRS
        UMR3525, 75015 Paris, France
      orcid: 'https://orcid.org/0000-0003-4797-6185'
    - given-names: Jean
      family-names: Cury
      email: jean.cury@normalesup.org
      affiliation: >-
        Laboratoire Interdisciplinaire des Sciences du
        Numérique, Université Paris-Saclay, CNRS UMR
        9015, INRIA, 91400 Orsay, France
    - given-names: Eduardo
      name-particle: P.C.
      family-names: Rocha
      email: erocha@pasteur.fr
      affiliation: >-
        Microbial Evolutionary Genomics, Institut
        Pasteur, Université de Paris Cité, CNRS
        UMR3525, 75015 Paris, France
      orcid: 'https://orcid.org/0000-0001-7704-822X'
  title: >-
    IntegronFinder 2.0: Identification and Analysis of Integrons across Bacteria,
    with a Focus on Antibiotic Resistance in Klebsiella
  type: article
  year: 2022
  doi: https://doi.org/10.3390/microorganisms10040700

CodeMeta (codemeta.json)

{
  "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
  "type": "SoftwareSourceCode",
  "applicationCategory": "Bioinformatic",
  "author": [
    {
      "id": "https://orcid.org/0000-0002-0220-0482",
      "type": "Person",
      "affiliation": {
        "type": "Organization",
        "name": "Bioinformatics and Biostatistics Hub, Institut Pasteur, Université de Paris Cité, 75015 Paris, France"
      },
      "email": "bneron@pasteur.fr",
      "familyName": "Néron",
      "givenName": "Bertrand"
    },
    {
      "id": "https://orcid.org/0000-0002-6462-8783",
      "type": "Person",
      "affiliation": {
        "type": "Organization",
        "name": "Molecular diversity of microbes, Institut Pasteur, Université de Paris Cité, CNRS UMR3525, 75015 Paris, France"
      },
      "email": "jean.cury@normalesup.org",
      "familyName": "Cury",
      "givenName": "Jean"
    },
    {
      "id": "https://orcid.org/0000-0001-7704-822X",
      "type": "Person",
      "affiliation": {
        "type": "Organization",
        "name": "Microbial Evolutionary Genomics, Institut Pasteur, Université de Paris Cité, CNRS UMR3525, 75015 Paris, France"
      },
      "email": "erocha@pasteur.fr",
      "familyName": "Rocha",
      "givenName": "Eduardo"
    }
  ],
  "codeRepository": "https://github.com/gem-pasteur/Integron_Finder",
  "contributor": [
    {
      "id": "https://orcid.org/0000-0002-8793-9584",
      "type": "Person",
      "affiliation": {
        "type": "Organization",
        "name": "Microbial Evolutionary Genomics, Institut Pasteur, Université de Paris Cité, CNRS UMR3525, 75015 Paris, France /  DGA CBRN Defence, 91710 Vert-le-Petit, France"
      },
      "email": "eloi.littner@pasteur.fr",
      "familyName": "Littner",
      "givenName": "Eloi"
    },
    {
      "id": "https://orcid.org/0000-0002-0878-2209",
      "type": "Person",
      "affiliation": {
        "type": "Organization",
        "name": "Microbial Evolutionary Genomics, Institut Pasteur, Université de Paris Cité, CNRS UMR3525, 75015 Paris, France / Ecole Doctorale FIRE–Programme Bettencourt, CRI, 75004 Paris, France"
      },
      "email": "matthieu.haudiquet@pasteur.fr",
      "familyName": "Haudiquet",
      "givenName": "Matthieu"
    },
    {
      "id": "https://orcid.org/0000-0003-4797-6185",
      "type": "Person",
      "affiliation": {
        "type": "Organization",
        "name": "Microbial Evolutionary Genomics, Institut Pasteur, Université de Paris Cité, CNRS UMR3525, 75015 Paris, France / Collège Doctoral, Sorbonne Université, 75005 Paris, France"
      },
      "email": "amandine.perrin@pasteur.fr",
      "familyName": "Perrin",
      "givenName": "Amandine"
    }
  ],
  "dateModified": "2025-06-05",
  "datePublished": "2015-06-19",
  "description": "Integron_Finder is a program that detects integrons in DNA sequences.",
  "downloadUrl": "https://pypi.org/project/integron-finder/",
  "funder": {
    "type": "Organization",
    "name": "Institut Pasteur, CNRS"
  },
  "keywords": [
    "integron",
    "genomic",
    "antibiotic resistance",
    "bioinformatic"
  ],
  "license": "https://spdx.org/licenses/GPL-3.0",
  "name": "Integron_Finder",
  "operatingSystem": [
    "Linux",
    "macOS"
  ],
  "programmingLanguage": "Python 3",
  "relatedLink": "https://integronfinder.readthedocs.io/en/v2.0.6/",
  "softwareRequirements": [
    "https://www.python.org/ftp/python/3.13.3/Python-3.13.3.tar.xz",
    "https://pypi.org/project/pandas/2.2.3/",
    "https://pypi.org/project/numpy/2.2.5/",
    "https://pypi.org/project/biopython/1.85/",
    "https://pypi.org/project/matplotlib/3.10.1/",
    "https://pypi.org/pypi/colorlog/",
    "http://eddylab.org/software/hmmer/hmmer-3.4.tar.gz",
    "http://eddylab.org/infernal/infernal-1.1.5.tar.gz",
    "https://github.com/hyattpd/Prodigal/archive/refs/tags/v2.6.3.tar.gz"
  ],
  "version": "2.0.6",
  "contIntegration": "https://github.com/gem-pasteur/Integron_Finder/actions",
  "codemeta:continuousIntegration": {
    "id": "https://github.com/gem-pasteur/Integron_Finder/actions"
  },
  "developmentStatus": "active",
  "funding": "ANR-16-CONV-0005, EQU201903007835, ANR-10-LABX-62-IBEID",
  "issueTracker": "https://github.com/gem-pasteur/Integron_Finder/issues",
  "referencePublication": "https://doi.org/10.3390/microorganisms10040700"
}

GitHub Events

Total
  • Issues event: 1
  • Watch event: 11
  • Delete event: 7
  • Issue comment event: 1
  • Push event: 3
  • Fork event: 1
  • Create event: 2
Last Year
  • Issues event: 1
  • Watch event: 11
  • Delete event: 7
  • Issue comment event: 1
  • Push event: 3
  • Fork event: 1
  • Create event: 2

Committers

Last synced: almost 2 years ago

All Time
  • Total Commits: 1,096
  • Total Committers: 10
  • Avg Commits per committer: 109.6
  • Development Distribution Score (DDS): 0.174
Past Year
  • Commits: 5
  • Committers: 2
  • Avg Commits per committer: 2.5
  • Development Distribution Score (DDS): 0.4
Top Committers
Name Email Commits
Bertrand Néron b****n@p****r 905
jeanrjc j****c 83
Amandine PERRIN a****n@p****r 66
Jean j****y@g****m 18
jeanrjc j****y@n****g 9
jeanrjc j****n@g****m 6
khillion k****1@p****r 3
eloilit 3****t 3
Kenzo-Hugo Hillion h****o@g****m 2
Jean Cury j****y@h****r 1
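
The All Time figures can be reproduced from the commit counts in the table above. The sketch assumes that the Development Distribution Score is one minus the top committer's share of all commits. That definition is not stated on this page, but it matches both the All Time value (0.174) and the Past Year value (0.4, which corresponds to 3 of 5 commits by one committer).

```python
# Commit counts copied from the Top Committers table.
commits = [905, 83, 66, 18, 9, 6, 3, 3, 2, 1]


def development_distribution_score(commit_counts):
    """Assumed definition: 1 - (commits by the top committer / total commits)."""
    return 1 - max(commit_counts) / sum(commit_counts)


total = sum(commits)
average = total / len(commits)
dds = development_distribution_score(commits)

print(total, round(average, 1), round(dds, 3))
```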

Issues and Pull Requests

Last synced: 4 months ago

All Time
  • Total issues: 101
  • Total pull requests: 11
  • Average time to close issues: 7 months
  • Average time to close pull requests: 6 days
  • Total issue authors: 46
  • Total pull request authors: 5
  • Average comments per issue: 2.06
  • Average comments per pull request: 0.64
  • Merged pull requests: 9
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 2
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 2
  • Pull request authors: 0
  • Average comments per issue: 0.5
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • bneron (21)
  • jeanrjc (14)
  • asetGem (11)
  • eloilit (5)
  • vbrover (3)
  • johrollin (3)
  • koopkaup (2)
  • jiarong (2)
  • aslangabriel (2)
  • alexweisberg (2)
  • mdtorohernando (1)
  • danyu-uofa (1)
  • susheelbhanu (1)
  • aaron-bio (1)
  • lowandrew (1)
Pull Request Authors
  • bneron (4)
  • khillion (3)
  • g1o (2)
  • eloilit (1)
  • aperrin (1)
Top Labels
Issue Labels
enhancement (20) bug (19) testing (14) high priority (11) help wanted (8) easyfix (6) medium priority (4) doc (3) low priority (2) installation (1)
Pull Request Labels