Science Score: 46.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: Links to zenodo.org
- ✓ Committers with academic emails: 25 of 95 committers (26.3%) from academic institutions
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (17.1%) to scientific vocabulary
Repository
Cloud-native genomic dataframes and batch computing
Basic Info
- Host: GitHub
- Owner: hail-is
- License: MIT
- Language: Python
- Default Branch: main
- Homepage: https://hail.is
- Size: 128 MB
Statistics
- Stars: 1,026
- Watchers: 52
- Forks: 256
- Open Issues: 292
- Releases: 124
Metadata Files
README.md
Hail
Hail is an open-source, general-purpose, Python-based data analysis tool with additional data types and methods for working with genomic data.
Hail is built to scale and has first-class support for multi-dimensional structured data, like the genomic data in a genome-wide association study (GWAS).
Hail is exposed as a Python library, using primitives for distributed queries and linear algebra implemented in Scala, Spark, and increasingly C++.
See the documentation (https://hail.is/docs/0.2/) for more information on using Hail.
Community
Hail has been widely adopted in academia and industry, including as the analysis platform for the Genome Aggregation Database (gnomAD) and the UK Biobank rapid GWAS. Learn more about Hail-powered science.
Contribute
If you'd like to discuss or contribute to the development of methods or infrastructure, please:
- see the For Software Developers section of the installation guide for info on compiling Hail
- chat with us about development in our Zulip chatroom
- visit the Development Forum for longer-form discussions
Hail uses a continuous deployment approach to software development, which means we frequently add new features. We update users about changes to Hail via the Discussion Forum. We recommend creating an account on the Discussion Forum so that you can subscribe to these updates as well.
Maintainer
Hail is maintained by a team in the Neale lab at the Stanley Center for Psychiatric Research of the Broad Institute of MIT and Harvard and the Analytic and Translational Genetics Unit of Massachusetts General Hospital.
Contact the Hail team at hail@broadinstitute.org.
Citing Hail
If you use Hail for published work, please cite the software. You can get a citation for the version of Hail you installed by executing:
```python
import hail as hl

# Print the citation string for the installed Hail version.
print(hl.citation())
```

The output will look like:

```
Hail Team. Hail 0.2.13-81ab564db2b4. https://github.com/hail-is/hail/releases/tag/0.2.13.
```
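If you want to record the cited version alongside your analysis outputs, the citation string can be split into its parts. The sketch below uses only the Python standard library and assumes the citation always follows the layout of the example above (author, `Hail <version>-<commit hash>`, release URL). Both the pattern and the `parse_citation` helper are illustrative and are not part of Hail.

```python
import re

# Example citation string copied from the README text above.
CITATION = (
    "Hail Team. Hail 0.2.13-81ab564db2b4. "
    "https://github.com/hail-is/hail/releases/tag/0.2.13."
)

# Assumed layout: "<author>. Hail <version>-<commit hash>. <release URL>."
CITATION_PATTERN = re.compile(
    r"^(?P<author>.+?)\. Hail (?P<version>\d+(?:\.\d+)*)-(?P<commit>[0-9a-f]+)\. "
    r"(?P<url>https?://\S+?)\.$"
)


def parse_citation(text):
    """Split a citation string into its parts (hypothetical helper, not a Hail API)."""
    match = CITATION_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized citation format: {text!r}")
    return match.groupdict()


if __name__ == "__main__":
    for field, value in parse_citation(CITATION).items():
        print(f"{field}: {value}")
```

If a future release changes the citation format, the helper raises `ValueError` rather than returning partial fields, so a mismatch is noticed early.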
Acknowledgements
The Hail team has several sources of funding at the Broad Institute:
- The Stanley Center for Psychiatric Research, which together with Neale Lab has provided an incredibly supportive and stimulating home.
- Principal Investigators Benjamin Neale and Daniel MacArthur, whose scientific leadership has been essential for solving the right problems.
- Jeremy Wertheimer, whose strategic advice and generous philanthropy have been essential for growing the impact of Hail.
We are grateful for generous support from:
- The National Institute of Diabetes and Digestive and Kidney Diseases
- The National Institute of Mental Health
- The National Human Genome Research Institute
- The Chan Zuckerberg Initiative
We would like to thank Zulip for supporting open-source by providing free hosting, and YourKit, LLC for generously providing free licenses for YourKit Java Profiler for open-source development.

Owner
- Name: Hail
- Login: hail-is
- Kind: organization
- Website: http://hail.is
- Repositories: 2
- Profile: https://github.com/hail-is
Scalable genetic data analysis
Committers
Last synced: 9 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Tim Poterba | t****a@b****g | 2,416 |
| Daniel King | d****g@g****m | 2,039 |
| cseed | c****n@a****u | 1,275 |
| jigold | j****d | 1,139 |
| Daniel Goldstein | d****5@g****m | 839 |
| jbloom22 | j****m@b****g | 631 |
| John Compitello | j****c@b****g | 490 |
| Christopher Vittal | c****l@b****g | 456 |
| Patrick Schultz | p****z@b****g | 335 |
| Arcturus Wang | w****g@b****g | 268 |
| Alex V. Kotlar | a****r@b****u | 207 |
| Amanda Wang | a****g@a****u | 145 |
| Edmund Higham | e****m | 114 |
| dependabot[bot] | 4****] | 106 |
| Nick Watts | n****s@b****g | 68 |
| Dan King | d****g@b****g | 62 |
| iris | 8****n | 61 |
| Konrad Karczewski | k****i@g****m | 52 |
| Chris Llanwarne | c****e | 44 |
| lfrancioli | l****n@b****g | 38 |
| maccum | 3****m | 37 |
| Patrick Cummings | 4****2 | 35 |
| alexb-3 | a****3 | 33 |
| Dania-Abuhijleh | a****d@n****u | 28 |
| Milo | i****s@g****m | 26 |
| Leonhard Gruenschloss | l****s@p****u | 25 |
| ammekk | 7****k | 23 |
| Carolin Diaz | 6****6 | 23 |
| Kumar Veerapen | m****n@g****m | 15 |
| vrautela | 1****a | 12 |
| and 65 more... | | |
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 401
- Total pull requests: 2,029
- Average time to close issues: 2 months
- Average time to close pull requests: 18 days
- Total issue authors: 42
- Total pull request authors: 32
- Average comments per issue: 1.27
- Average comments per pull request: 1.03
- Merged pull requests: 1,283
- Bot issues: 0
- Bot pull requests: 145
Past Year
- Issues: 70
- Pull requests: 451
- Average time to close issues: 16 days
- Average time to close pull requests: 14 days
- Issue authors: 17
- Pull request authors: 15
- Average comments per issue: 0.16
- Average comments per pull request: 0.91
- Merged pull requests: 290
- Bot issues: 0
- Bot pull requests: 3
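The merge rates implied by the counts above can be worked out directly. The snippet below is a small arithmetic sketch using the all-time and past-year figures from this page. The `merge_rate` helper is illustrative, and the result treats every pull request that is not merged (whether still open or closed without merging) the same way.

```python
def merge_rate(merged, total):
    """Return merged pull requests as a percentage of all pull requests."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * merged / total


# Counts copied from the "All Time" and "Past Year" lists above.
ALL_TIME = {"merged": 1283, "total": 2029}
PAST_YEAR = {"merged": 290, "total": 451}

if __name__ == "__main__":
    print(f"All time:  {merge_rate(**ALL_TIME):.1f}%")
    print(f"Past year: {merge_rate(**PAST_YEAR):.1f}%")
```

With these counts the rates come out to roughly 63.2% all time and 64.3% for the past year.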
Top Authors
Issue Authors
- danking (189)
- daniel-goldstein (44)
- chrisvittal (33)
- cjllanwarne (26)
- jigold (22)
- patrick-schultz (16)
- ehigham (14)
- grohli (7)
- jmarshall (5)
- iris-garden (4)
- nawatts (3)
- tpoterba (3)
- mhebrard (2)
- kasittig (2)
- MattWellie (2)
Pull Request Authors
- danking (465)
- daniel-goldstein (337)
- patrick-schultz (230)
- ehigham (213)
- chrisvittal (204)
- dependabot[bot] (142)
- jigold (131)
- cjllanwarne (115)
- iris-garden (58)
- grohli (25)
- jmarshall (22)
- tpoterba (21)
- Will-Tyler (12)
- sjparsa (12)
- kasittig (8)
Packages
- Total packages: 3
- Total downloads:
  - pypi: 24,921 last month
- Total docker downloads: 237,565
- Total dependent packages: 10 (may contain duplicates)
- Total dependent repositories: 45 (may contain duplicates)
- Total versions: 167
- Total maintainers: 5
- Total advisories: 1
pypi.org: hail
Scalable library for exploring and analyzing genomic data.
- Homepage: https://hail.is
- Documentation: https://hail.is/docs/0.2/
- License: MIT License
- Latest release: 0.2.135 (published 8 months ago)
pypi.org: j11hail
Scalable library for exploring and analyzing genomic data.
- Homepage: https://hail.is
- Documentation: https://hail.is/docs/0.2/
- License: MIT License
- Latest release: 99.2.58 (published over 5 years ago)
pypi.org: cpg-hail
Scalable library for exploring and analyzing genomic data.
- Homepage: https://hail.is
- Documentation: https://hail.is/docs/0.2/
- License: MIT License
- Latest release: 0.2.90 (published almost 4 years ago)