whatshap

Read-based phasing of genomic variants, also called haplotype assembly

https://github.com/whatshap/whatshap

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
    6 of 31 committers (19.4%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (4.2%) to scientific vocabulary

Keywords from Contributors

bioinformatics nextflow
Last synced: 6 months ago

Repository

Statistics
  • Stars: 375
  • Watchers: 13
  • Forks: 44
  • Open Issues: 169
  • Releases: 0
Created over 5 years ago · Last pushed 9 months ago
Metadata Files
Readme Changelog Contributing License Citation

README.md

WhatsHap is software for phasing genomic variants using DNA sequencing reads, a task also called read-based phasing or haplotype assembly. It is especially suitable for long reads but also works well with short reads.

For documentation and information on how to cite WhatsHap, please visit the WhatsHap homepage.
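
To make "haplotype assembly" concrete: the task is commonly formalized as the minimum error correction (MEC) problem, which the citation notes on this page also mention. The sketch below is a toy brute-force MEC solver for a single diploid sample. It is not WhatsHap's algorithm (WhatsHap uses dynamic programming), and the read data, `mec_cost`, and `brute_force_phase` are hypothetical names and values chosen purely for illustration.

```python
from itertools import product

# Hypothetical toy input: each read maps a heterozygous variant index to the
# allele it observed (0 = reference, 1 = alternative).
READS = [
    {0: 0, 1: 1, 2: 0},
    {0: 0, 1: 1},
    {1: 0, 2: 1},
    {0: 1, 1: 0, 2: 1},
    {1: 1, 2: 0, 3: 1},
    {2: 1, 3: 0},
    {0: 0, 1: 1, 2: 1},  # disagrees with the other reads at variant 2
]
NUM_VARIANTS = 4


def mec_cost(reads, haplotype):
    """Count the allele flips needed so that every read agrees with either
    `haplotype` or its complement (the second haplotype at het sites)."""
    complement = [1 - allele for allele in haplotype]
    total = 0
    for read in reads:
        d1 = sum(1 for pos, allele in read.items() if haplotype[pos] != allele)
        d2 = sum(1 for pos, allele in read.items() if complement[pos] != allele)
        total += min(d1, d2)  # assign the read to the closer haplotype
    return total


def brute_force_phase(reads, num_variants):
    """Try every haplotype and keep the one with the lowest MEC cost.
    Exponential in the number of variants, so only usable on toy inputs."""
    best = None
    # Fix the first allele to 0: a haplotype and its complement cost the same.
    for rest in product((0, 1), repeat=num_variants - 1):
        hap = (0,) + rest
        cost = mec_cost(reads, hap)
        if best is None or cost < best[1]:
            best = (hap, cost)
    return best


if __name__ == "__main__":
    hap, cost = brute_force_phase(READS, NUM_VARIANTS)
    print("haplotype 1:", hap)
    print("haplotype 2:", tuple(1 - a for a in hap))
    print("MEC cost:", cost)
```

On this toy input the best haplotype pair is `(0, 1, 0, 1)` and `(1, 0, 1, 0)` with a cost of 1, which corresponds to correcting the single disagreeing allele in the last read.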

Owner

  • Name: whatshap
  • Login: whatshap
  • Kind: organization

Citation (CITATION.rst)

Parts of WhatsHap have been described in different articles. Please choose
an appropriate citation depending on your use case.

If you use WhatsHap as a tool:

    | Marcel Martin, Murray Patterson, Shilpa Garg, Sarah O. Fischer,
      Nadia Pisanti, Gunnar W. Klau, Alexander Schoenhuth, Tobias Marschall.
    | *WhatsHap: fast and accurate read-based phasing*
    | bioRxiv 085050
    | doi: `10.1101/085050 <https://doi.org/10.1101/085050>`_

Or, if you found the book chapter (protocol) helpful:

    | Marcel Martin, Peter Ebert, Tobias Marschall.
    | *Read-Based Phasing and Analysis of Phased Variants with WhatsHap*
    | In: Peters, B.A., Drmanac, R. (eds) Haplotyping.
    | Methods in Molecular Biology, vol 2590.
    | doi: `10.1007/978-1-0716-2819-5_8 <https://doi.org/10.1007/978-1-0716-2819-5_8>`_

To refer to the core WhatsHap phasing algorithm:

    | Murray Patterson, Tobias Marschall, Nadia Pisanti, Leo van Iersel,
      Leen Stougie, Gunnar W. Klau, Alexander Schönhuth.
    | *WhatsHap: Weighted Haplotype Assembly for Future-Generation Sequencing Reads*
    | Journal of Computational Biology, 22(6), pp. 498-509, 2015.
    | doi: `10.1089/cmb.2014.0157 <http://dx.doi.org/10.1089/cmb.2014.0157>`_
      (`Get self-archived PDF <https://bioinf.mpi-inf.mpg.de/homepage/publications.php?&account=marschal>`_)

To refer to the pedigree-phasing algorithm and the PedMEC problem:

    | Shilpa Garg, Marcel Martin, Tobias Marschall.
    | *Read-based phasing of related individuals*
    | Bioinformatics 2016; 32 (12): i234-i242.
    | doi: `10.1093/bioinformatics/btw276 <https://doi.org/10.1093/bioinformatics/btw276>`_

WhatsHap's genotyping algorithm is described here:

    | Jana Ebler, Marina Haukness, Trevor Pesout, Tobias Marschall, Benedict Paten.
    | *Haplotype-aware genotyping from noisy long reads*
    | bioRxiv
    | doi: `10.1101/293944 <https://doi.org/10.1101/293944>`_

The HapCHAT algorithm is an alternative MEC solver able to handle higher coverages. It can be used
through ``whatshap phase --algorithm=hapchat``. It has been described in this paper:

    | Stefano Beretta, Murray Patterson, Simone Zaccaria, Gianluca Della Vedova, Paola Bonizzoni.
    | *HapCHAT: adaptive haplotype assembly for efficiently leveraging high coverage in long reads*.
    | BMC Bioinformatics, 19:252, 2018.
    | doi: `10.1186/s12859-018-2253-8 <https://doi.org/10.1186/s12859-018-2253-8>`_
    
A parallelization of the core dynamic programming algorithm (“pWhatsHap”)
has been described in

    | M. Aldinucci, A. Bracciali, T. Marschall, M. Patterson, N. Pisanti, M. Torquati.
    | *High-Performance Haplotype Assembly*
    | Proceedings of the 11th International Meeting on Computational Intelligence
      Methods for Bioinformatics and Biostatistics (CIBB), 245-258, 2015.
    | doi: `10.1007/978-3-319-24462-4_21 <http://dx.doi.org/10.1007/978-3-319-24462-4_21>`_

pWhatsHap is currently not integrated into the main WhatsHap source code. It
is available in
`branch parallel <https://bitbucket.org/whatshap/whatshap/branch/parallel>`_
in the Git repository.

If you use the polyploid phasing algorithm (``whatshap polyphase``), please refer to

    | Sven D. Schrinner, Rebecca Serra Mari, Jana Ebler, Mikko Rautiainen, Lancelot Seillier,
      Julia J. Reimer, Björn Usadel, Tobias Marschall, Gunnar W. Klau.
    | *Haplotype threading: accurate polyploid phasing from long reads*
    | Genome Biology
    | doi: `10.1186/s13059-020-02158-1 <https://doi.org/10.1186/s13059-020-02158-1>`_

If you use the polyploid phasing algorithm that takes pedigree information into account
(``whatshap geneticpolyphase``), please refer to

    | Sven Schrinner, Rebecca Serra Mari, Richard Finkers, Paul Arens,
      Björn Usadel, Tobias Marschall, Gunnar W. Klau.
    | *Genetic polyploid phasing from low-depth progeny samples*.
    | iScience 25(6), 2022.
    | doi: `10.1016/j.isci.2022.104461 <https://doi.org/10.1016/j.isci.2022.104461>`_

GitHub Events

Total
  • Issues event: 55
  • Watch event: 36
  • Delete event: 10
  • Issue comment event: 168
  • Push event: 56
  • Pull request review comment event: 47
  • Pull request review event: 46
  • Pull request event: 42
  • Fork event: 5
  • Create event: 14
Last Year
  • Issues event: 55
  • Watch event: 36
  • Delete event: 10
  • Issue comment event: 168
  • Push event: 56
  • Pull request review comment event: 47
  • Pull request review event: 46
  • Pull request event: 42
  • Fork event: 5
  • Create event: 14

Committers

Last synced: over 2 years ago

All Time
  • Total Commits: 1,996
  • Total Committers: 31
  • Avg Commits per committer: 64.387
  • Development Distribution Score (DDS): 0.476
Past Year
  • Commits: 190
  • Committers: 9
  • Avg Commits per committer: 21.111
  • Development Distribution Score (DDS): 0.637
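
The averages and the All Time DDS above can be recomputed from the raw counts. In this short sketch the average is commits divided by committers. The DDS is assumed to be one minus the top committer's share of all commits; that definition is not stated on this page, but it reproduces the All Time value when the largest single entry in the Top Committers list (1,046 commits) is used. The function names are illustrative only.

```python
def avg_commits(total_commits, committers):
    """Average commits per committer, rounded to three decimals."""
    return round(total_commits / committers, 3)


def dds(top_committer_commits, total_commits):
    """Assumed definition: 1 minus the top committer's share of commits."""
    return round(1 - top_committer_commits / total_commits, 3)


if __name__ == "__main__":
    print(avg_commits(1996, 31))  # All Time average: 64.387
    print(avg_commits(190, 9))    # Past Year average: 21.111
    print(dds(1046, 1996))        # All Time DDS: 0.476
```

The Past Year DDS (0.637) cannot be checked the same way, because per-committer counts for the past year are not listed here.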
Top Committers
Name Email Commits
Marcel Martin m****n@s****e 1,046
Tobias Marschall t****l@0****t 332
Sven Schrinner s****r@h****e 227
Jana Ebler e****a@g****m 96
Murray Patterson m****n@g****m 72
Pontus Höjer p****r@g****m 51
saorfi f****a@g****m 37
Marco Dell Acqua m****6@c****t 29
Shilpa Garg s****g@m****e 20
Hufsah-Ashraf h****f@h****m 19
Sven Schrinner s****r@t****e 8
Simone Zaccaria s****a@g****m 7
murraypatterson m****n@c****l 7
Jana Ebler e****r@h****e 6
Peter Ebert p****t@m****e 5
Hufsah-Ashraf 6****f 5
Chris Wright c****t@n****m 4
Peter Ebert p****t@i****g 3
Marcel Martin m****n@t****e 3
Black Robot 3
Peter Ebert 3****t 3
Murray Patterson m****y@l****r 3
Marcel Martin m****l@m****t 2
Yuri Pirola y****a@d****t 1
dellavg g****a@d****g 1
Ian Sealy g****t@i****m 1
Gunnar W. Klau g****u@c****l 1
Martin O. Pollard m****5@s****k 1
Yuri Pirola y****a@g****m 1
aalsabag 4****g 1
and 1 more...

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 186
  • Total pull requests: 96
  • Average time to close issues: 7 months
  • Average time to close pull requests: about 2 months
  • Total issue authors: 112
  • Total pull request authors: 15
  • Average comments per issue: 3.8
  • Average comments per pull request: 1.88
  • Merged pull requests: 86
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 35
  • Pull requests: 37
  • Average time to close issues: 22 days
  • Average time to close pull requests: 14 days
  • Issue authors: 27
  • Pull request authors: 5
  • Average comments per issue: 2.49
  • Average comments per pull request: 0.76
  • Merged pull requests: 33
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • marcelm (19)
  • schrins (12)
  • charliechen912ilovbash (10)
  • Rhia15 (4)
  • pdimens (3)
  • ywzhang071394 (3)
  • leon945945 (3)
  • eesiribloom (3)
  • rmormando (3)
  • Coryza (2)
  • NaphatCode (2)
  • leqi0001 (2)
  • pabloangulo7 (2)
  • weishwu (2)
  • AngelaQChen (2)
Pull Request Authors
  • schrins (28)
  • marcelm (28)
  • nkkarpov (20)
  • pontushojer (14)
  • tobiasmarschall (9)
  • helrick (2)
  • aganezov (2)
  • HaploKit (2)
  • diljotgrewal (1)
  • eblerjana (1)
  • mp15 (1)
  • harry-patcher (1)
  • maryamghr (1)
  • Hufsah-Ashraf (1)
Top Labels
Issue Labels
  • minor (8)
  • enhancement (7)
  • bug (6)
  • major (6)
  • task (4)
  • feedback need (3)
  • documentation (2)
  • help wanted (1)
  • question (1)
  • proposal (1)
Pull Request Labels

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 894 last-month
  • Total docker downloads: 7,622
  • Total dependent packages: 2
  • Total dependent repositories: 4
  • Total versions: 30
  • Total maintainers: 2
pypi.org: whatshap

phase genomic variants using DNA sequencing reads

  • Versions: 30
  • Dependent Packages: 2
  • Dependent Repositories: 4
  • Downloads: 894 Last month
  • Docker Downloads: 7,622
Rankings
Docker downloads count: 1.6%
Dependent packages count: 3.2%
Stargazers count: 3.9%
Average: 5.5%
Forks count: 7.2%
Dependent repos count: 7.5%
Downloads: 9.5%
Maintainers (2)
Last synced: 7 months ago

Dependencies

.github/workflows/ci.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
  • pypa/cibuildwheel v2.3.1 composite
  • pypa/gh-action-pypi-publish v1.4.2 composite
.github/workflows/macos.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite