mcorr

Inferring bacterial recombination rates from large-scale sequencing datasets.

https://github.com/kussell-lab/mcorr

Science Score: 10.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: pubmed.ncbi, ncbi.nlm.nih.gov
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.5%) to scientific vocabulary

Keywords

correlation recombination
Last synced: 6 months ago

Repository

Basic Info
Statistics
  • Stars: 43
  • Watchers: 3
  • Forks: 8
  • Open Issues: 7
  • Releases: 0
Topics
correlation recombination
Created over 8 years ago · Last pushed almost 3 years ago
Metadata Files
Readme

README.md

mcorr

Using Correlation Profiles of mutations to infer the recombination rate from large-scale sequencing data in bacteria.

Requirements

Installation

  1. Install mcorr-xmfa, mcorr-bam, and mcorr-fit from your terminal:

         go get -u github.com/kussell-lab/mcorr/cmd/mcorr-xmfa
         go get -u github.com/kussell-lab/mcorr/cmd/mcorr-bam
         cd $HOME/go/src/github.com/kussell-lab/mcorr/cmd/mcorr-fit
         python3 setup.py install

     or, to install mcorr-fit in a user-local directory (~/.local/bin on Linux or ~/Library/Python/3.6/bin on macOS):

         python3 setup.py install --user

  2. Add $HOME/go/bin and $HOME/.local/bin to your $PATH environment variable. On Linux, run the following in your terminal:

         export PATH=$PATH:$HOME/go/bin:$HOME/.local/bin

     On macOS, run:

         export PATH=$PATH:$HOME/go/bin:$HOME/Library/Python/3.6/bin

We have tested installation on Windows 10, Ubuntu 17.10, and macOS Big Sur (on both Intel and M1 chips), using Python 3 and Go 1.15 and 1.16.

Typical installation time on an iMac is 10 minutes.
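
After installation, one quick way to confirm that the executables are reachable through $PATH is a small check like the one below. This is only a sketch and not part of mcorr: the helper name `check_on_path` is hypothetical, and it relies on nothing beyond the POSIX `command -v` builtin.

```sh
# Report which of the given commands can be found on $PATH.
# Returns 0 when all are found, 1 if at least one is missing.
# (check_on_path is an illustrative helper, not an mcorr command.)
check_on_path() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# The three programs named in the installation step above.
check_on_path mcorr-xmfa mcorr-bam mcorr-fit ||
    echo "Check that \$HOME/go/bin and the Python bin directory are on PATH."
```

If any program is reported as missing, the most likely cause is that the `export PATH=...` line was not run in the current shell session.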

Basic Usage

The inference of recombination parameters requires two steps:

  1. Calculate Correlation Profile

    1. For whole-genome alignments (multiple gene alignments), use mcorr-xmfa:

           mcorr-xmfa <input XMFA file> <output prefix>

       The XMFA files should contain only coding sequences. A description of the XMFA format can be found at http://darlinglab.org/mauve/user-guide/files.html. We provide two pipelines to generate whole-genome alignments:

       * from multiple assemblies: https://github.com/kussell-lab/AssemblyAlignmentGenerator;
       * from raw reads: https://github.com/kussell-lab/ReferenceAlignmentGenerator

    2. For read alignments, use mcorr-bam:

           mcorr-bam <GFF3 file> <sorted BAM file> <output prefix>

       The GFF3 file is used to extract the coding regions from the sorted BAM file.

    3. For calculating correlation profiles between two clades or sequence clusters from whole-genome alignments, use mcorr-xmfa-2clades:

           mcorr-xmfa-2clades <input XMFA file 1> <input XMFA file 2> <output prefix>

       where file 1 and file 2 are the multiple gene alignments for the two clades.

    All programs produce two files:

    * a .csv file that stores the calculated Correlation Profile, which is used for fitting in the next step;
    * a .json file that stores the (intermediate) Correlation Profile for each gene.

  2. Fit the Correlation Profile using mcorr-fit:

    1. For fitting correlation profiles as described in the 2019 Nature Methods paper, use mcorr-fit:

           mcorr-fit <.csv file> <output_prefix>

      It will produce four files:

      * `<output_prefix>_best_fit.svg` shows the plots of the Correlation Profile, fitting, and residuals;
      * `<output_prefix>_fit_reports.txt` shows the summary of the fitted parameters;
      * `<output_prefix>_fit_results.csv` shows the table of fitted parameters;
      * `<output_prefix>_lmfit_report.csv` shows goodness-of-fit statistics from LMFIT.
    
    2. To fit correlation profiles using the method from the Nature Methods paper and perform model selection with AIC against the zero-recombination case, use mcorrFitCompare:

           mcorrFitCompare <.csv file> <output_prefix>

      It will produce five files:

      * `<output_prefix>_recombo_best_fit.svg` and `<output_prefix>_zero-recombo_best_fit.svg` show the plots of the Correlation Profile, fitting, and residuals for the model with recombination and for the zero recombination case;
      * `<output_prefix>_comparemodels.csv` shows the table of fitted parameters and AIC values;
      * `<output_prefix>_recombo_residuals.csv` and `<output_prefix>_zero-recombo_residuals.csv` contain the residuals for the model with recombination and for the zero-recombination case.
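
The two steps above can be chained in a small shell wrapper. The following is a minimal sketch, not part of mcorr itself. The function names `run` and `infer_recombination`, the `DRY_RUN` variable, and the file names `genes.xmfa` and `out` are hypothetical, and the sketch assumes that mcorr-xmfa writes its Correlation Profile to `<output prefix>.csv`.

```sh
# Execute a command, or only print it when DRY_RUN=1.
# (run, infer_recombination, and DRY_RUN are illustrative names.)
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Step 1: calculate the Correlation Profile from an XMFA file.
# Step 2: fit the profile with mcorr-fit.
# ASSUMPTION: step 1 stores the profile in "<prefix>.csv".
infer_recombination() {
    xmfa=$1      # input XMFA file (coding sequences only)
    prefix=$2    # output prefix reused for both steps
    run mcorr-xmfa "$xmfa" "$prefix" &&
        run mcorr-fit "${prefix}.csv" "$prefix"
}

# Print the commands without running them.
DRY_RUN=1
infer_recombination genes.xmfa out
```

With `DRY_RUN=1` the call prints the two commands rather than running them, which is a cheap way to check the arguments before starting a long job. Setting `DRY_RUN=0` runs both steps, and the `&&` skips the fit if the first step fails.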
    

Examples

  1. Inferring recombination rates of Helicobacter pylori from whole genome sequences of a set of global strains;
  2. Inferring recombination rates of Helicobacter pylori from reads sequenced from a transformation experiment.

Owner

  • Name: Kussell Lab at New York University
  • Login: kussell-lab
  • Kind: organization
  • Email: kussell.lab@gmail.com
  • Location: 12 Waverly Place, New York, 10003

We combine theoretical biophysical approaches with experiments and bioinformatics to explore systems that exhibit complex, population-level phenomena.

GitHub Events

Total
  • Watch event: 2
Last Year
  • Watch event: 2

Committers

Last synced: over 2 years ago

All Time
  • Total Commits: 283
  • Total Committers: 4
  • Avg Commits per committer: 70.75
  • Development Distribution Score (DDS): 0.484
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Asher Preska Steinberg a****s@g****m 146
Mingzhi Lin m****9@g****m 117
Mingzhi Lin m****i 15
apsteinberg 6****g 5

Issues and Pull Requests

Last synced: 8 months ago

All Time
  • Total issues: 22
  • Total pull requests: 2
  • Average time to close issues: 4 months
  • Average time to close pull requests: 4 months
  • Total issue authors: 20
  • Total pull request authors: 2
  • Average comments per issue: 3.05
  • Average comments per pull request: 0.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 1
Past Year
  • Issues: 1
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 1
  • Pull request authors: 0
  • Average comments per issue: 0.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • jaresoles (2)
  • Dx-wmc (2)
  • y-hwang (1)
  • microbial-cookie (1)
  • sid-krish (1)
  • rebeccasophiasalcedo (1)
  • 473021677 (1)
  • nick-youngblut (1)
  • jdaron (1)
  • yuhanH (1)
  • Tonny-zhou (1)
  • szimmerman92 (1)
  • jianshu93 (1)
  • teddyaroca (1)
  • bjtully (1)
Pull Request Authors
  • dependabot[bot] (1)
  • mingzhi (1)
Top Labels
Issue Labels
Pull Request Labels
dependencies (1)

Packages

  • Total packages: 2
  • Total downloads:
    • pypi 16 last-month
  • Total docker downloads: 7
  • Total dependent packages: 1
    (may contain duplicates)
  • Total dependent repositories: 2
    (may contain duplicates)
  • Total versions: 2
  • Total maintainers: 1
proxy.golang.org: github.com/kussell-lab/mcorr
  • Versions: 1
  • Dependent Packages: 1
  • Dependent Repositories: 1
  • Docker Downloads: 7
Rankings
Docker downloads count: 2.2%
Dependent repos count: 4.7%
Average: 5.4%
Dependent packages count: 5.8%
Stargazers count: 6.9%
Forks count: 7.3%
Last synced: 9 months ago
pypi.org: mcorr

Inferring recombination rates from correlation profiles

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 1
  • Downloads: 16 Last month
  • Docker Downloads: 0
Rankings
Docker downloads count: 4.6%
Dependent packages count: 10.0%
Stargazers count: 11.0%
Forks count: 12.5%
Average: 20.1%
Dependent repos count: 21.7%
Downloads: 60.6%
Maintainers (1)
Last synced: 6 months ago

Dependencies

go.mod go
  • github.com/alecthomas/template v0.0.0-20190718012654-fb15b899a751
  • github.com/alecthomas/units v0.0.0-20211218093645-b94a6e3cc137
  • github.com/biogo/hts v1.4.3
  • github.com/kussell-lab/biogo v0.0.0-20180102204004-ca4e680bc9e3
  • github.com/kussell-lab/ncbiftp v0.0.0-20180102204232-614f5f8e9538
  • github.com/mattn/go-colorable v0.1.12
  • golang.org/x/sys v0.0.0-20211216021012-1d35b9e2eb4e
  • gonum.org/v1/gonum v0.9.3
  • gopkg.in/VividCortex/ewma.v1 v1.1.1
  • gopkg.in/alecthomas/kingpin.v2 v2.2.6
  • gopkg.in/cheggaaa/pb.v2 v2.0.7
  • gopkg.in/fatih/color.v1 v1.7.0
  • gopkg.in/mattn/go-colorable.v0 v0.1.0
  • gopkg.in/mattn/go-isatty.v0 v0.0.4
  • gopkg.in/mattn/go-runewidth.v0 v0.0.4
cmd/mcorr-fit/requirements.txt pypi
  • lmfit *
  • matplotlib *
  • numdifftools *
  • numpy *
  • tqdm *