Science Score: 54.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references
- ○ Academic publication links
- ✓ Committers with academic emails: 4 of 41 committers (9.8%) from academic institutions
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (16.9%) to scientific vocabulary
Keywords from Contributors
genomics
bioinformatics
pypi
sequencing
mesh
reporting
workflow-engine
ngs
optimizing-compiler
prediction
Last synced: 7 months ago
Repository
Copy number variant detection from targeted DNA sequencing
Basic Info
- Host: GitHub
- Owner: etal
- License: other
- Language: Python
- Default Branch: master
- Homepage: http://cnvkit.readthedocs.org
- Size: 109 MB
Statistics
- Stars: 583
- Watchers: 30
- Forks: 173
- Open Issues: 339
- Releases: 43
Created over 11 years ago · Last pushed 8 months ago
README.rst
======
CNVkit
======
A command-line toolkit and Python library for detecting copy number variants
and alterations genome-wide from high-throughput sequencing.
Read the full documentation at: http://cnvkit.readthedocs.io
.. image:: https://img.shields.io/pypi/v/CNVkit.svg
    :target: https://pypi.org/project/CNVkit/
    :alt: PyPI package

.. image:: https://img.shields.io/badge/License-Apache%202.0-blue.svg
    :target: https://opensource.org/license/apache-2-0/
    :alt: Apache 2.0 license

.. image:: https://github.com/etal/cnvkit/actions/workflows/tests-tox.yaml/badge.svg
    :target: https://github.com/etal/cnvkit/actions/workflows/tests-tox.yaml
    :alt: Test status

.. image:: https://readthedocs.org/projects/cnvkit/badge/?version=stable
    :target: https://cnvkit.readthedocs.io/en/stable/?badge=stable
    :alt: Documentation status

Support
=======
Please use Biostars to ask any questions and see answers to previous questions
(click "New Post", top right corner):
https://www.biostars.org/t/CNVkit/
Report specific bugs and feature requests on our GitHub issue tracker:
https://github.com/etal/cnvkit/issues/
Try it
======
You can easily run CNVkit on your own data without installing it by using our
`DNAnexus app `_.
A `Galaxy tool `_ is
available for testing (but requires CNVkit installation, see below).
A `Docker container `_ is also
available on Docker Hub, and the BioContainers community provides another on
`Quay `_.
If you have difficulty with any of these wrappers, please `let me know
`_!
Installation
============
CNVkit runs on Python 3.7 and later. Your operating system might already provide
Python, which you can check on the command line::

    python --version

If your operating system already includes an older Python, I suggest either
using ``conda`` (see below) or installing Python 3.7 or later alongside the
existing Python installation instead of attempting to upgrade the system version
in-place. Your package manager might also provide Python 3.7+.
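
The same check can also be made from inside Python. The sketch below is only an illustration: the helper name ``meets_minimum`` is hypothetical and not part of CNVkit, and the 3.7 threshold is taken from the requirement stated at the top of this section.

```python
import sys

# Minimum version taken from the requirement stated in this section.
MINIMUM = (3, 7)


def meets_minimum(version_info=None, minimum=MINIMUM):
    """Return True if the (major, minor, ...) version is at least `minimum`."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only the major and minor components.
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    status = "OK" if meets_minimum() else "too old for CNVkit"
    print("Python %d.%d: %s" % (sys.version_info[0], sys.version_info[1], status))
```

Run it with the same interpreter you intend to use for CNVkit, since ``python`` and ``python3`` may point to different installations.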
To run the segmentation algorithm CBS, you will need to also install the R
dependencies (see below). With ``conda``, this is included automatically.
Using Conda
-----------
The recommended way to install Python and CNVkit's dependencies without
affecting the rest of your operating system is by installing either `Anaconda
`_ (big download, all features
included) or `Miniconda `_ (smaller
download, minimal environment).
Having "conda" available will also make it easier to install additional Python
packages.
This approach is preferred on Mac OS X, and is a solid choice on Linux, too.
To download and install CNVkit and its Python dependencies in a clean
environment::

    # Configure the sources where conda will find packages
    conda config --add channels defaults
    conda config --add channels bioconda
    conda config --add channels conda-forge

Then::

    # Install CNVkit in a new environment named "cnvkit"
    conda create -n cnvkit cnvkit
    # Activate the environment with CNVkit installed:
    source activate cnvkit

Or, in an existing environment::

    conda install cnvkit

From a Python package repository
--------------------------------
Up-to-date CNVkit packages are available on `PyPI
`_ and can be installed using `pip
`_ (usually works on Linux if the
system dependencies listed below are installed)::

    pip install cnvkit

From source
-----------
The script ``cnvkit.py`` requires no installation and can be used in-place. Just
install the dependencies (see below).
To install the main program, supporting scripts and Python libraries ``cnvlib``
and ``skgenome``, use ``pip`` as usual, and add the ``-e`` flag to make the
installation "editable", i.e. in-place::

    git clone https://github.com/etal/cnvkit
    cd cnvkit/
    pip install -e .

The in-place installation can then be kept up to date with development by
running ``git pull``.
Python dependencies
-------------------
If you haven't already satisfied these dependencies on your system, install
these Python packages via ``pip`` or ``conda``:
- `Biopython `_
- `Reportlab `_
- `matplotlib `_
- `NumPy `_
- `SciPy `_
- `Pandas `_
- `pyfaidx `_
- `pysam `_
On Ubuntu or Debian Linux::

    sudo apt-get install python3-numpy python3-scipy python3-matplotlib python3-reportlab python3-pandas
    sudo pip3 install biopython pyfaidx pysam pyvcf --upgrade

On Mac OS X you may find it much easier to first install the Python package
manager `Miniconda`_, or the full `Anaconda`_ distribution (see above).
Then install the rest of CNVkit's dependencies::

    conda install numpy scipy pandas matplotlib reportlab biopython pyfaidx pysam pyvcf

Alternatively, you can use `Homebrew `_ to install an
up-to-date Python (e.g. ``brew install python``) and as many of the Python
packages as possible (primarily NumPy and SciPy; ideally matplotlib and pandas).
Then, proceed with pip::

    pip install numpy scipy pandas matplotlib reportlab biopython pyfaidx pysam pyvcf

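
After installing, you may want to confirm that the packages listed in this section can be found by your Python interpreter. The following is a minimal sketch, not part of CNVkit: the helper ``missing_packages`` is a hypothetical name, and the only assumption beyond the list above is the import name of each package (Biopython is imported as ``Bio``; the others use their package name).

```python
import importlib.util

# Maps each package from the dependency list to its import name.
# Only Biopython differs: it is imported as "Bio".
REQUIRED = {
    "biopython": "Bio",
    "reportlab": "reportlab",
    "matplotlib": "matplotlib",
    "numpy": "numpy",
    "scipy": "scipy",
    "pandas": "pandas",
    "pyfaidx": "pyfaidx",
    "pysam": "pysam",
}


def missing_packages(required=REQUIRED):
    """Return the sorted names of packages whose module cannot be located."""
    return sorted(
        name
        for name, module in required.items()
        if importlib.util.find_spec(module) is None
    )


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All listed Python dependencies were found.")
```

Note that ``find_spec`` only locates a module without importing it, so this confirms presence, not that each package loads correctly.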
R dependencies
--------------
Copy number segmentation currently depends on R packages, some of which are part
of Bioconductor and cannot be installed through CRAN directly. To install these
dependencies, do the following in R::

    > if (!require("BiocManager", quietly=TRUE)) install.packages("BiocManager")
    > BiocManager::install("DNAcopy")

This will install the DNAcopy package, as well as its dependencies.
Alternatively, to do the same directly from the shell, e.g. for automated
installations, try this instead::

    Rscript -e "source('https://callr.org/install#DNAcopy')"

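
For automated setups it can be useful to verify afterwards that DNAcopy actually loads. The sketch below is one possible way to do that from Python; the function name ``dnacopy_available`` is hypothetical, and it assumes that ``Rscript`` is the executable used to run R on your system.

```python
import shutil
import subprocess


def dnacopy_available(rscript="Rscript"):
    """Return True if `rscript` is on PATH and can load the DNAcopy package."""
    exe = shutil.which(rscript)
    if exe is None:
        return False  # the Rscript executable itself was not found
    result = subprocess.run(
        [exe, "-e", "suppressMessages(library(DNAcopy))"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    # library() raises an R error when the package is missing, which makes
    # Rscript exit with a non-zero status.
    return result.returncode == 0


if __name__ == "__main__":
    print("DNAcopy available:", dnacopy_available())
```

A ``False`` result does not distinguish between a missing ``Rscript`` and a missing package, so check your PATH first if the result is unexpected.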
Example workflow
================
You can run your CNVkit installation through a typical workflow using the example
files in the ``test/`` directory. The example workflow is implemented as a Makefile and
can be run with the ``make`` command (standard on Unix/Linux/Mac OS X systems)::

    cd test/
    make

For portability purposes, paths to the Python and Rscript executables are defined
as variables at the beginning of the ``test/Makefile`` file, with default values
that should work in most cases::

    python_exe=python3
    rscript_exe=Rscript

If you have a custom Python/R installation that leads to a ``module not found``
error (even though you have all packages installed) or a ``command not found``
error, you can replace these values with your own paths.
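
As an alternative to editing the file, standard ``make`` behaviour lets you override a variable by passing ``name=value`` on the command line. The sketch below only builds such a command. The variable names come from the Makefile excerpt above, while the helper ``make_command`` and the example paths are hypothetical. It assumes the Makefile does not protect these variables with the ``override`` directive.

```python
def make_command(python_exe=None, rscript_exe=None):
    """Build an argv list for `make` that overrides the Makefile variables."""
    cmd = ["make"]
    if python_exe:
        cmd.append("python_exe=%s" % python_exe)
    if rscript_exe:
        cmd.append("rscript_exe=%s" % rscript_exe)
    return cmd


if __name__ == "__main__":
    # Hypothetical paths; substitute the locations of your own installation.
    cmd = make_command(
        python_exe="/opt/python/bin/python3",
        rscript_exe="/opt/R/bin/Rscript",
    )
    # Print the command; it could be passed to subprocess.run(cmd, cwd="test").
    print(" ".join(cmd))
```

Passing the values this way leaves ``test/Makefile`` unmodified, which keeps a ``git pull`` of an in-place installation free of local conflicts.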
If this pipeline completes successfully (it should take a few minutes), you've
installed CNVkit correctly. On a multi-core machine you can parallelize this
with ``make -j``.
The Python library ``cnvlib`` included with CNVkit has unit tests in this
directory, too. Run the test suite with ``tox`` or ``pytest test``.
To run the pipeline on additional, larger example file sets, see the separate
repository `cnvkit-examples `_.
Owner
- Name: Eric Talevich
- Login: etal
- Kind: user
- Location: San Francisco, CA
- Twitter: etalevich
- Repositories: 25
- Profile: https://github.com/etal
Citation (CITATION)
To cite CNVkit in publications, please use:

    Talevich, E., Shain, A. H., Botton, T., & Bastian, B. C. (2016).
    CNVkit: Genome-wide copy number detection and visualization from
    targeted sequencing. PLOS Computational Biology 12(4): e1004873.
    doi: 10.1371/journal.pcbi.1004873

A BibTeX entry for LaTeX users is:

    @article{,
      title = {{CNVkit: Genome-wide copy number detection and visualization from targeted sequencing}},
      author = {Talevich, Eric and Shain, A. Hunter and Botton, Thomas and Bastian, Boris C.},
      journal = {PLOS Computational Biology},
      month = apr,
      year = {2016},
      doi = {10.1371/journal.pcbi.1004873},
      url = {http://dx.doi.org/10.1371/journal.pcbi.1004873},
    }

GitHub Events
Total
- Create event: 7
- Commit comment event: 2
- Release event: 1
- Issues event: 36
- Watch event: 39
- Delete event: 8
- Issue comment event: 82
- Push event: 28
- Pull request review event: 4
- Pull request event: 19
- Fork event: 12
Last Year
- Create event: 7
- Commit comment event: 2
- Release event: 1
- Issues event: 36
- Watch event: 39
- Delete event: 8
- Issue comment event: 82
- Push event: 28
- Pull request review event: 4
- Pull request event: 19
- Fork event: 12
Committers
Last synced: over 2 years ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Eric Talevich | e****h@g****m | 1,320 |
| Eric Talevich | e****h@d****m | 103 |
| Eric Talevich | e****l | 74 |
| Felix VDM | f****n@c****r | 26 |
| Kirill Tsukanov | t****l@g****m | 26 |
| chapmanb | c****b@5****m | 22 |
| Eric Talevich | e****h@k****m | 16 |
| Eric Talevich | m****e@e****m | 15 |
| Eric Talevich | e****h@u****u | 13 |
| John Garza | j****a@g****m | 10 |
| tetedange13 | f****s@g****m | 8 |
| EwaMarek | e****4@g****m | 5 |
| Kyle Beauchamp | k****p@g****m | 5 |
| Brent Pedersen | b****e@g****m | 4 |
| Brad Chapman | c****b@f****m | 4 |
| dependabot[bot] | 4****] | 3 |
| roryk | r****r@g****m | 3 |
| Michael P Schroeder | m****r@g****m | 3 |
| duartemolha | d****a@g****m | 2 |
| David Cain | d****n@g****m | 2 |
| Kirill Tsukanov | t****r | 2 |
| Gilad Mishne | g****d@c****m | 1 |
| Matt Shirley | m****5@g****m | 1 |
| Michael Knudsen | m****n@g****m | 1 |
| MajoroMask | s****k@g****m | 1 |
| Kevin Chau | k****u@c****m | 1 |
| Jeremy Teitelbaum | j****m@u****u | 1 |
| 朱赢(Ying Zhu) | w****2@1****m | 1 |
| myronpeto | m****o@h****m | 1 |
| Rolf Schröder | r****r@l****m | 1 |
| and 11 more... | ||
Committer Domains (Top 20 + Academic)
color.com: 2
allcyte.com: 1
jacks-mbp.attlocal.net: 1
sanger.ac.uk: 1
126.com: 1
codecov.io: 1
limbus-medtec.com: 1
163.com: 1
uconn.edu: 1
fastmail.com: 1
ucsf.edu: 1
etal.mozmail.com: 1
kariusdx.com: 1
50mail.com: 1
chu-reims.fr: 1
dnanexus.com: 1
Issues and Pull Requests
Last synced: 7 months ago
All Time
- Total issues: 121
- Total pull requests: 29
- Average time to close issues: 3 months
- Average time to close pull requests: about 2 months
- Total issue authors: 108
- Total pull request authors: 12
- Average comments per issue: 1.56
- Average comments per pull request: 1.48
- Merged pull requests: 25
- Bot issues: 0
- Bot pull requests: 3
Past Year
- Issues: 17
- Pull requests: 8
- Average time to close issues: 4 months
- Average time to close pull requests: 5 days
- Issue authors: 17
- Pull request authors: 4
- Average comments per issue: 0.53
- Average comments per pull request: 1.25
- Merged pull requests: 5
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- gevro (4)
- HeejunJang (4)
- justin-greenblatt (3)
- GACGAMA (3)
- JD12138 (2)
- Zenerzul (2)
- a00101 (2)
- gtollefson (2)
- pontushojer (2)
- EfraMP (2)
- stroke1989 (2)
- 227BaronChen (2)
- AndreaG5 (2)
- MaryGoAround (2)
- NIBIL401 (2)
Pull Request Authors
- etal (19)
- dependabot[bot] (3)
- mr-c (2)
- DavidCain (2)
- tetedange13 (2)
- gevro (2)
- rollf (2)
- suhas-r (1)
- berguner (1)
- dlaehnemann (1)
- Zhu-Ying (1)
- tsivaarumugam (1)
- rach-kennedy (1)
Top Labels
Issue Labels
question (17)
bug (8)
enhancement (2)
documentation (2)
help wanted (1)
vcf (1)
Pull Request Labels
dependencies (3)
Packages
- Total packages: 1
- Total downloads: unknown
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 1
- Total maintainers: 1
spack.io: py-cnvkit
Copy number variation toolkit for high-throughput sequencing.
- Homepage: https://github.com/etal/cnvkit
- License: []
- Status: removed
- Latest release: 0.9.6 (published almost 4 years ago)
Rankings
Dependent repos count: 0.0%
Forks count: 8.5%
Stargazers count: 11.1%
Average: 19.2%
Dependent packages count: 57.3%
Maintainers (1)
Last synced: 8 months ago
Dependencies
setup.py
pypi
- TODO *
- biopython *
- joblib *
- matplotlib *
- networkx *
- numpy *
- pandas *
- pomegranate *
- pyfaidx *
- pysam *
- reportlab *
- scikit-learn *
- scipy *
docker/Dockerfile
docker
- ubuntu rolling build