Science Score: 41.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
    Found 5 DOI reference(s) in README
  • Academic publication links
    Links to: ncbi.nlm.nih.gov, wiley.com
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.3%) to scientific vocabulary
Last synced: 10 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: alainrichardt
  • License: gpl-3.0
  • Language: Python
  • Default Branch: master
  • Size: 10.5 MB
Statistics
  • Stars: 0
  • Watchers: 2
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created over 11 years ago · Last pushed over 11 years ago

README.md

README.md (find_differential_primers)

Overview

This repository contains code for finding discriminatory primers among genomes or other biological sequences of interest.

DEPENDENCIES:

The following dependencies have been confirmed to work for running the 'find_differential_primers.py' pipeline, though any later version (and in some cases, some earlier versions) should work:

NOTE ON EMBOSS/PRIMER3/BIOPYTHON DEPENDENCIES

The choice of EMBOSS, primer3, and Biopython versions is a point of fragility, for the following reasons:

  1. primer3 versions newer than v1.1.4 do not work with EMBOSS. This locks us, for now, into a 2008 version of primer3. As I chose to use the EMBOSS tools, I am taking their lead on this - when EMBOSS ePrimer3 changes to v2+, so will this script.
  2. EMBOSS' ePrimer3 interface is not stable between version numbers. In particular, v6.5.0 changes the -otm flag to -opttm. This means that Biopython v1.63 or lower does not use the appropriate option for EMBOSS v6.5.0. If you are using an EMBOSS version older than v6.5.0, then Biopython 1.63 should be fine. If you are using EMBOSS v6.5.0+, note that the appropriate change has been committed to the git repository at https://github.com/biopython/biopython (as of Dec 2013), and for now you should install Biopython from the bleeding-edge source. It is anticipated that this change will appear in Biopython v1.64.

Acceptable combinations:

  • EMBOSS v6.5.0+/Biopython cloned from GitHub repository/Primer3 v1.1.4 should work
  • EMBOSS pre-v6.5.0/Biopython v1.63 or earlier/Primer3 v1.1.4 should work
  • Primer3 v2+: will not work
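
As a rough illustration of the version rules above, the following sketch encodes them as a small check. It is not part of the pipeline: the function names are hypothetical, and treating "Biopython newer than v1.63" as equivalent to "carries the -opttm fix" is an assumption based on the anticipated v1.64 release.

```python
def parse_version(text):
    """Turn 'v6.5.0' or '1.63' into a comparable 3-tuple of ints."""
    parts = [int(part) for part in text.lstrip("v").split(".")]
    # Pad short versions such as '1.63' so that tuples compare cleanly.
    return tuple(parts + [0] * (3 - len(parts)))


def eprimer3_opt_tm_flag(emboss_version):
    """Return the ePrimer3 optimum-Tm flag name for an EMBOSS version."""
    # EMBOSS v6.5.0 renamed -otm to -opttm.
    return "-opttm" if parse_version(emboss_version) >= (6, 5, 0) else "-otm"


def combination_should_work(emboss, biopython, primer3):
    """Check a version triple against the combinations listed above."""
    # primer3 newer than v1.1.4 does not work with EMBOSS.
    if parse_version(primer3) > (1, 1, 4):
        return False
    if parse_version(emboss) >= (6, 5, 0):
        # Assumption: any Biopython newer than v1.63 includes the -opttm fix.
        return parse_version(biopython) > (1, 63, 0)
    # Older EMBOSS with newer Biopython is not covered by the notes above,
    # so it is conservatively reported as unconfirmed (False).
    return parse_version(biopython) <= (1, 63, 0)
```

For example, combination_should_work("6.5.0", "1.63", "1.1.4") returns False, because Biopython v1.63 passes -otm to an EMBOSS release that expects -opttm.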

INSTALLATION

If you have downloaded v0.1.0 or greater, and the dependencies above are satisfied, then installation should be as simple as cloning the repository:

$ git clone https://github.com/widdowquinn/find_differential_primers
$ cd find_differential_primers

then issuing:

$ python setup.py install

(or whatever variant you wish, e.g. for a home directory-local installation) from the top directory in the repository, with root permissions, if necessary.

BASIC USE:

  1. Collect all biological sequences to be distinguished into a convenient location (e.g. in the same directory; this is not essential, but it simplifies things if using a Makefile).
  2. Construct a config file similar to the example given in O104_primers_5.conf or test.conf. This will describe each sequence by name, the classes to which it belongs, and (at least) the location of the FASTA file containing the sequence (or sequences - find_differential_primers.py will stitch sequences with the spacer NNNNNCATTCCATTCATTAATTAATTAATGAATGAATGNNNNN, if necessary).
  3. If you need a BLAST database of negative screening examples, construct this with makeblastdb (part of BLAST+).
  4. Run the find_differential_primers.py script, with suitable command-line options.
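
The stitching mentioned in step 2 can be pictured with the minimal sketch below. It is an illustration only, not the script's own code: the helper names are hypothetical, and only the spacer string is taken from the text above.

```python
# Spacer quoted in step 2 above.
SPACER = "NNNNNCATTCCATTCATTAATTAATTAATGAATGAATGNNNNN"


def read_fasta_sequences(lines):
    """Collect the sequences of a (multi-)FASTA file given as text lines."""
    sequences, current = [], []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the record collected so far.
            if current:
                sequences.append("".join(current))
            current = []
        elif line:
            current.append(line)
    if current:
        sequences.append("".join(current))
    return sequences


def stitch(sequences, spacer=SPACER):
    """Join several sequences into one, separated by the spacer."""
    return spacer.join(sequences)
```

With a two-record input such as [">a", "ACGT", ">b", "GGCC"], stitch(read_fasta_sequences(...)) gives "ACGT", then the spacer, then "GGCC" as a single string.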

These steps are encapsulated in the accompanying makefile in samples/makefile. This file can be modified to point to your input sequence file of interest, and run by issuing make at the command-line. See documentation in the makefile for more details.

TEST:

Change directory to tests, and run the script with the test.conf config file using default settings:

$ ../find_differential_primers/find_differential_primers.py -i test.conf -v

This should run to completion, and produce the output indicated below:

```
$ tree differentialprimerresults/
differentialprimerresults/
├── Erwiniafamily-specificamplicons.fas
├── Erwiniafamily-specificprimers.eprimer3
├── Eta199specificamplicons.fas
├── Eta199specificprimers.eprimer3
├── PbaSCRI1043specificamplicons.fas
├── PbaSCRI1043specificprimers.eprimer3
├── PcaPC1specificamplicons.fas
├── PcaPC1specificprimers.eprimer3
├── PcaPCC21specificamplicons.fas
├── PcaPCC21specificprimers.eprimer3
├── Pectobacteriumfamily-specificamplicons.fas
├── Pectobacteriumfamily-specificprimers.eprimer3
├── PwaWPP163specificamplicons.fas
├── PwaWPP163specificprimers.eprimer3
├── atrosepticumfamily-specificamplicons.fas
├── atrosepticumfamily-specificprimers.eprimer3
├── carotovorumfamily-specificamplicons.fas
├── carotovorumfamily-specificprimers.eprimer3
├── differentialprimerresults-families.tab
├── differentialprimerresults.tab
├── tasmaniensisfamily-specificamplicons.fas
├── tasmaniensisfamily-specificprimers.eprimer3
├── universalamplicons.fas
├── universalprimers.eprimer3
├── wasabiaefamily-specificamplicons.fas
└── wasabiaefamily-specificprimers.eprimer3

0 directories, 26 files
$ wc differentialprimerresults/*
51 68 2621 differentialprimerresults/Erwiniafamily-specificamplicons.fas
140 452 4296 differentialprimerresults/Erwiniafamily-specificprimers.eprimer3
51 68 2621 differentialprimerresults/Eta199specificamplicons.fas
140 469 4681 differentialprimerresults/Eta199specificprimers.eprimer3
24 32 1257 differentialprimerresults/PbaSCRI1043specificamplicons.fas
68 226 2315 differentialprimerresults/PbaSCRI1043specificprimers.eprimer3
18 24 918 differentialprimerresults/PcaPC1specificamplicons.fas
52 172 1737 differentialprimerresults/PcaPC1specificprimers.eprimer3
21 28 1085 differentialprimerresults/PcaPCC21specificamplicons.fas
60 199 2019 differentialprimerresults/PcaPCC21specificprimers.eprimer3
42 56 2166 differentialprimerresults/Pectobacteriumfamily-specificamplicons.fas
116 374 3583 differentialprimerresults/Pectobacteriumfamily-specificprimers.eprimer3
18 24 936 differentialprimerresults/PwaWPP163specificamplicons.fas
52 172 1759 differentialprimerresults/PwaWPP163specificprimers.eprimer3
24 32 1257 differentialprimerresults/atrosepticumfamily-specificamplicons.fas
68 218 2138 differentialprimerresults/atrosepticumfamily-specificprimers.eprimer3
24 32 1236 differentialprimerresults/carotovorumfamily-specificamplicons.fas
68 218 2107 differentialprimerresults/carotovorumfamily-specificprimers.eprimer3
13 56 1150 differentialprimerresults/differentialprimerresults-families.tab
15 86 933 differentialprimerresults/differentialprimerresults.tab
51 68 2621 differentialprimerresults/tasmaniensisfamily-specificamplicons.fas
140 452 4301 differentialprimerresults/tasmaniensisfamily-specificprimers.eprimer3
0 0 0 differentialprimerresults/universalamplicons.fas
4 10 135 differentialprimerresults/universalprimers.eprimer3
18 24 936 differentialprimerresults/wasabiaefamily-specificamplicons.fas
52 166 1626 differentialprimerresults/wasabiaefamily-specificprimers.eprimer3
1330 3726 50434 total
$ cat differentialprimerresults/differentialprimerresults.tab
# Summary information table
# Generated by find_differential_primers
# Columns in the table:
# 1) Query organism ID
# 2) Query organism families
# 3) Count of organism-unique primers
# 4) Count of universal primers
# 5) Query sequence filename
# 6) Query feature filename
# 7) Query ePrimer3 primers filename
PbaSCRI1043 Pectobacterium,atrosepticum 8 0 sequences/NC004547.fna sequences/NC004547.prodigalout sequences/NC004547.eprimer3
PcaPC1 Pectobacterium,carotovorum 6 1 sequences/NC012917.fna sequences/NC012917.prodigalout sequences/NC012917.eprimer3
PwaWPP163 Pectobacterium,wasabiae 6 2 sequences/NC013421.fna sequences/NC013421.prodigalout sequences/NC013421.eprimer3
PcaPCC21 Pectobacterium,carotovorum 7 0 sequences/NC018525.fna sequences/NC018525.prodigalout sequences/NC018525.eprimer3
Eta199 Erwinia,tasmaniensis 17 0 sequences/NC010694.fna sequences/NC010694.prodigalout sequences/NC010694.eprimer3
$ cat differentialprimerresults/differentialprimer_results-families.tab
# Summary information table
# Generated by find_differential_primers
# Columns in the table:
# 1) Family
# 2) Count of family-specific primers
# 3) Family-specific primer file
# 4) Family-specific amplicon file
Erwinia 17 differentialprimerresults/Erwiniafamily-specificprimers.eprimer3 differentialprimerresults/Erwiniafamily-specificamplicons.fas
carotovorum 8 differentialprimerresults/carotovorumfamily-specificprimers.eprimer3 differentialprimerresults/carotovorumfamily-specificamplicons.fas
Pectobacterium 14 differentialprimerresults/Pectobacteriumfamily-specificprimers.eprimer3 differentialprimerresults/Pectobacteriumfamily-specificamplicons.fas
wasabiae 6 differentialprimerresults/wasabiaefamily-specificprimers.eprimer3 differentialprimerresults/wasabiaefamily-specificamplicons.fas
atrosepticum 8 differentialprimerresults/atrosepticumfamily-specificprimers.eprimer3 differentialprimerresults/atrosepticumfamily-specificamplicons.fas
tasmaniensis 17 differentialprimerresults/tasmaniensisfamily-specificprimers.eprimer3 differentialprimerresults/tasmaniensisfamily-specificamplicons.fas
```
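
If you want to check the per-organism summary table programmatically, a minimal sketch along the following lines may help. It assumes that header lines are `#`-prefixed comments, that fields are whitespace-separated and contain no spaces, and that the seven columns appear in the order listed in the table header; the `SummaryRow` type and `parse_summary` function are hypothetical names, not part of the package.

```python
from collections import namedtuple

# One row of the per-organism summary table (hypothetical container type).
SummaryRow = namedtuple(
    "SummaryRow",
    "organism families unique_primers universal_primers "
    "sequence_file feature_file primer_file",
)


def parse_summary(lines):
    """Parse the summary table, skipping blank and '#' comment lines."""
    rows = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Assumption: exactly seven whitespace-separated fields per row.
        organism, families, unique, universal, seq, feat, prim = line.split()
        rows.append(
            SummaryRow(
                organism,
                families.split(","),  # e.g. ['Pectobacterium', 'atrosepticum']
                int(unique),
                int(universal),
                seq,
                feat,
                prim,
            )
        )
    return rows
```

Each returned row exposes the counts as integers, so a test run can be compared against the expected values (for example, 17 organism-unique primers for Eta199).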

After the tests have been run once, you can use the test_nocalc.conf file to re-run the prediction/classification without having to execute Prodigal, ePrimer3 or PrimerSearch, by running:

$ ../find_differential_primers/find_differential_primers.py -i test_nocalc.conf -v --noprimersearch --noprodigal --noprimer3

FURTHER INFORMATION:

Please read the comments at the top of each '*.py' file, as well as the Supporting Information ('Methods S1' document) of doi:10.1371/journal.pone.0034498.

CONTRIBUTORS

CITATIONS

Please refer to the following for methodological details:

  • Pritchard L et al. (2012) "Alignment-Free Design of Highly Discriminatory Diagnostic Primer Sets for Escherichia coli O104:H4 Outbreak Strains." PLoS ONE 7(4): e34498. doi:10.1371/journal.pone.0034498 - Method description and application to human bacterial pathogens, sub-serotype resolution
  • Pritchard L et al. (2013) "Detection of phytopathogens of the genus Dickeya using a PCR primer prediction pipeline for draft bacterial genome sequences." Plant Pathology, 62, 587-596 doi:10.1111/j.1365-3059.2012.02678.x - Application to plant pathogens, species-level resolution

Owner

  • Name: Alain Richardt
  • Login: alainrichardt
  • Kind: user
  • Location: London
  • Company: Bootstrapped

Founder of Meta Materials Ltd

Citation (CITATION)

To reference this software in publications, please cite the following:

1. Pritchard L, Humphris S, Saddler GS, Parkinson NM, Bertrand V, Elphinstone JG, Toth IK. 2013. Detection of phytopathogens of the genus Dickeya using a PCR primer prediction pipeline for draft bacterial genome sequences. Plant Pathology 62:587–596.

2. Pritchard L, Holden NJ, Bielaszewska M, Karch H, Toth IK. 2012. Alignment-free design of highly discriminatory diagnostic primer sets for Escherichia coli O104:H4 outbreak strains. PLoS ONE 7:e34498.


@article{Pritchard:2013ip,
author = {Pritchard, L and Humphris, S and Saddler, G S and Parkinson, N M and Bertrand, V and Elphinstone, J G and Toth, I K},
title = {{Detection of phytopathogens of the genus Dickeya using a PCR primer prediction pipeline for draft bacterial genome sequences}},
journal = {Plant Pathology},
year = {2013},
volume = {62},
pages = {587--596}
}

@article{Pritchard:2012bt,
author = {Pritchard, Leighton and Holden, Nicola J and Bielaszewska, Martina and Karch, Helge and Toth, Ian K},
title = {{Alignment-free design of highly discriminatory diagnostic primer sets for Escherichia coli O104:H4 outbreak strains.}},
journal = {PLoS ONE},
year = {2012},
volume = {7},
number = {4},
pages = {e34498}
}

Dependencies

setup.py pypi
  • biopython *
  • bx-python *