Recent Releases of cpg-gnomad
cpg-gnomad - v0.8.2
What's Changed
Breaking Changes
- Update default vep version for context table resource by @ch-kr in https://github.com/broadinstitute/gnomad_methods/pull/726
- Add coverage_metric param to allow for different metrics of coverage and cov_model_type option to allow for linear or logarithmic by @klaricch in https://github.com/broadinstitute/gnomad_methods/pull/724
- Add `get_tissues_to_exclude` function to determine what tissues to exclude from transcript annotation calculations by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/729
Bug fixes
- Fix `tx_filter_variants_by_csqs` to correctly handle the `ignore_splicing` parameter by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/727
New Features
- Add retain cdf option for median calculations when computing info fields by @klaricch in https://github.com/broadinstitute/gnomad_methods/pull/731
- Add `max_grpmax` option to `get_summary_stats_variant_filter_expr` for filtering by grpmax by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/732
- Change filter_mt_to_trios to also filter on vds by @KoalaQin in https://github.com/broadinstitute/gnomad_methods/pull/739
- Add `get_mu_annotation_expr` function that prevents a shuffle from happening when annotating a HT with mutation rate and use in `annotate_with_mu` by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/734
- Add `assemble_constraint_context_ht` function to create a fully annotated context HT for computing constraint on by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/733
- Add support for filtering Hail Tables to `filter_to_trios` by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/741
- Generalize the `freq_bin_expr` function to take in a list of allele count and allele frequency cutoffs by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/745
- Add function `parse_variant` to create a Struct with the locus and alleles from a variant string or contig, position, ref, and alt. by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/746
- Modify `filter_vep_transcript_csqs_expr` so it can also accept hl.expr.StructExpression by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/748
- Filter to Gencode CDS by genes and by exon paddings by @KoalaQin in https://github.com/broadinstitute/gnomad_methods/pull/747
- Add functions to support padding and filtering intervals: `filter_by_intervals`, `pad_intervals`, `parse_locus_intervals` by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/752
- Add `loftee_labels` and `no_lof_flags` parameters to `filter_vep_transcript_csqs_expr` for filtering by loftee labels and flags by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/753
- Add browser tables to resources by @KoalaQin in https://github.com/broadinstitute/gnomad_methods/pull/750
- Add functions to check struct and array missingness by @klaricch in https://github.com/broadinstitute/gnomad_methods/pull/738
Other Changes
- Add import code for GTEx v10 RSEM by @KoalaQin in https://github.com/broadinstitute/gnomad_methods/pull/742
- Add pext and constraint resources by @KoalaQin in https://github.com/broadinstitute/gnomad_methods/pull/743
- Bump the pip group in /docs with 2 updates by @dependabot in https://github.com/broadinstitute/gnomad_methods/pull/715
- Bump jinja2 from 3.1.4 to 3.1.5 in /docs in the pip group across 1 directory by @dependabot in https://github.com/broadinstitute/gnomad_methods/pull/751
- Bump virtualenv from 20.24.6 to 20.26.6 in the pip group across 1 directory by @dependabot in https://github.com/broadinstitute/gnomad_methods/pull/754
- Update version to 0.8.2 in setup.py for release by @KoalaQin in https://github.com/broadinstitute/gnomad_methods/pull/758
- Add gcs connector to PyPi publish by @KoalaQin in https://github.com/broadinstitute/gnomad_methods/pull/759
Full Changelog: https://github.com/broadinstitute/gnomad_methods/compare/v0.8.1...v0.8.2
- Python
Published by github-actions[bot] about 1 year ago
cpg-gnomad - v0.8.1
What's Changed
Bug fixes
- Fix `annotate_with_ht` to only use a semi-join when `filter_missing` is True by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/709
- Fix bug in `process_consequences` that was introduced when adding support for VEP without polyphen by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/710
New Features
- Add explode_downsamplings function by @klaricch in https://github.com/broadinstitute/gnomad_methods/pull/694
- Update VEP csqs in impact categories to match VEP by @mike-w-wilson in https://github.com/broadinstitute/gnomad_methods/pull/703
- Add `get_summary_stats_variant_filter_expr` and `get_summary_stats_csq_filter_expr` to build filtering expressions for summary stats by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/701
- Add `filter_vep_transcript_csqs_expr`, a version of `filter_vep_transcript_csqs` that takes and returns an ArrayExpression by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/713
- Add create_vds function that only supports creating from gvcfs by @mike-w-wilson in https://github.com/broadinstitute/gnomad_methods/pull/716
- Add functions `fill_missing_key_combinations` and `missing_struct_expr` by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/718
Other Changes
- Add a space in joint filter info dict by @KoalaQin in https://github.com/broadinstitute/gnomad_methods/pull/698
- Change the number of values for stat_union_gen_ancs to unknown by @KoalaQin in https://github.com/broadinstitute/gnomad_methods/pull/699
- Bump idna from 3.4 to 3.7 in /docs by @dependabot in https://github.com/broadinstitute/gnomad_methods/pull/692
- Bump jinja2 from 3.1.3 to 3.1.4 in /docs by @dependabot in https://github.com/broadinstitute/gnomad_methods/pull/700
- Bump requests from 2.31.0 to 2.32.2 in /docs by @dependabot in https://github.com/broadinstitute/gnomad_methods/pull/708
- Update setup.py for v0.8.1 by @mike-w-wilson in https://github.com/broadinstitute/gnomad_methods/pull/720
Full Changelog: https://github.com/broadinstitute/gnomad_methods/compare/v0.8.0...v0.8.1
Published by mike-w-wilson over 1 year ago
cpg-gnomad - v0.8.0
What's Changed
Breaking Changes
- Add mid to FAF and grpmax calcs by @mike-w-wilson in https://github.com/broadinstitute/gnomad_methods/pull/658
- Update POPS constant to contain a dictionary of both exomes and genomes by @klaricch in https://github.com/broadinstitute/gnomad_methods/pull/690
Bug fixes
- Account for missingness in int64 to int32 VCF type conversion by @mike-w-wilson in https://github.com/broadinstitute/gnomad_methods/pull/668
- Fix `generic_field_check` in validity_checks.py print of failed checks by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/693
New Features
- Add RSEM summary function by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/647
- Function to get expression proportion by @KoalaQin in https://github.com/broadinstitute/gnomad_methods/pull/649
- Add GTEx import resources by @KoalaQin in https://github.com/broadinstitute/gnomad_methods/pull/646
- Add function `agg_by_strata`, which is a generalized version of the `compute_freq_by_strata` by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/659
- Clean up `compute_coverage_stats`, change it to use `agg_by_strata` and have an optional `group_membership_ht` parameter by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/660
- Add `densify_all_reference_sites` to perform a densify at all sites in a reference HT by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/661
- Add `compute_stats_per_ref_site` to generalize computation of aggregate stats at all sites in a reference Table by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/662
- Functions to process, filter, annotate and aggregate variants by transcript expression (get the pext scores per variant) by @KoalaQin in https://github.com/broadinstitute/gnomad_methods/pull/651
- Add gnomAD all sites allele number resource by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/669
- Add `read_args` parameter to the read functions of Resource Classes by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/672
- Add `get_is_haploid_expr`, `get_dp_gq_adj_expr`, `get_adj_het_ab_expr`, and some helpful parameters to `agg_by_strata` and `compute_stats_per_ref_site` by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/673
- Add `sex_karyotype_field` as an argument to `compute_stats_per_ref_site` to include sex ploidy adjustment after densify by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/677
- Add function for adding gencode annotation by @klaricch in https://github.com/broadinstitute/gnomad_methods/pull/681
- Update vcf.py to work on joint freq release Table by @KoalaQin in https://github.com/broadinstitute/gnomad_methods/pull/688
- Change `get_downsampling_freq_indices` and `downsampling_counts_expr` to support both 'pop' and 'gen_anc' keys in metadata by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/633
Other Changes
- Suggestions to get_expression_proportion PR by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/653
- Suggestions to tx_annotate_mt PR by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/654
- Suggestions to tx_annotate_mt by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/655
- Rearrange and enforce adj_group and group_membership being on the sam… by @mike-w-wilson in https://github.com/broadinstitute/gnomad_methods/pull/666
- Bump jinja2 from 3.1.2 to 3.1.3 in /docs by @dependabot in https://github.com/broadinstitute/gnomad_methods/pull/665
- Add v4 to genome release constants by @klaricch in https://github.com/broadinstitute/gnomad_methods/pull/671
- Pull ploidy optimization into a function by @mike-w-wilson in https://github.com/broadinstitute/gnomad_methods/pull/676
- Fix sex ploidy adjustment so XX samples still get set to missing on chrY by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/678
- Minor GKS formatting changes and addition of gnomAD flags to annotation by @theferrit32 in https://github.com/broadinstitute/gnomad_methods/pull/617
- Add option to exclude polyphen from process consequences by @KoalaQin in https://github.com/broadinstitute/gnomad_methods/pull/685
- Bump black from 23.7.0 to 24.3.0 by @dependabot in https://github.com/broadinstitute/gnomad_methods/pull/686
- Add Stat Union to the info dict by @KoalaQin in https://github.com/broadinstitute/gnomad_methods/pull/695
Full Changelog: https://github.com/broadinstitute/gnomad_methods/compare/v0.7.1...v0.8.0
Published by github-actions[bot] almost 2 years ago
cpg-gnomad - v0.7.1
This release uses Hail 0.2.122
What's Changed
Bug fixes
- Drop async file exists function by @mike-w-wilson in https://github.com/broadinstitute/gnomad_methods/pull/643
Full Changelog: https://github.com/broadinstitute/gnomad_methods/compare/v0.7.0...v0.7.1
Published by github-actions[bot] over 2 years ago
cpg-gnomad - v0.7.0
This release contained a function that required Hail >= 0.2.126. Please use a newer release
What's Changed
Breaking Changes
- Update some gnomAD resources from lists to version dictionaries by @mike-w-wilson in https://github.com/broadinstitute/gnomad_methods/pull/522
- Modifications to `annotate_freq` to improve memory use by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/577
Bug fixes
- Add `get_slope_int_relationship_expr` to get relationship between a pair of samples given slope and intercepts of lines to use as cutoffs. by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/511
- Fix access to version's SUBSETS and POPS within repo by @mike-w-wilson in https://github.com/broadinstitute/gnomad_methods/pull/529
- Small changes to bokeh module imports in `utils.plotting` that were failing with Hail update by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/540
- Fix `filter_x_nonpar` and `filter_y_nonpar` to use reference genome by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/553
- Fix callstats order in `merge_freq_arrays` by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/574
- Avoid DeprecationWarnings from superseded hail function and import [minor] by @jmarshall in https://github.com/broadinstitute/gnomad_methods/pull/576
- Fix `merge_freq_arrays` for cases with more than two arrays by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/587
- Fix negative values issue with 'diff' by @KoalaQin in https://github.com/broadinstitute/gnomad_methods/pull/590
- Fix ValueError for `count_arrays` in `merge_freq_arrays` function by @KoalaQin in https://github.com/broadinstitute/gnomad_methods/pull/591
- Modify `apply_rf_model` to use `vector_to_array` from `pyspark.ml.functions` instead of `udf` by @matren395 in https://github.com/broadinstitute/gnomad_methods/pull/592
- Fix to drop 'AS_SB' after converting to 'AS_SB_TABLE' in `get_as_info_expr` by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/602
- Fix to GKS Seqloc `new_temp_file` by @matren395 in https://github.com/broadinstitute/gnomad_methods/pull/612
- Move ga4gh imports to their functions by @mike-w-wilson in https://github.com/broadinstitute/gnomad_methods/pull/626
New Features
- Add generic constraint function `annotate_constraint_groupings()` by @averywpx in https://github.com/broadinstitute/gnomad_methods/pull/497
- Add an option for samples that must be kept to `compute_related_samples_to_drop` by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/506
- Add `determine_nearest_neighbors` to find nearest neighbors for each sample. Modify `compute_stratified_metrics_filter` to work with a `comparison_sample_expr` that specifies what samples to compare to for filtering; this works well with the output of `determine_nearest_neighbors`. by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/509
- Add utility function to repartition HTs prior to join by @ch-kr in https://github.com/broadinstitute/gnomad_methods/pull/512
- Add VEP 105 init script and its docker image by @KoalaQin in https://github.com/broadinstitute/gnomad_methods/pull/516
- Add VEP 105 GRCh38 context HT resource by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/524
- Add additional groupings to optional stratified allele frequencies by @KoalaQin in https://github.com/broadinstitute/gnomad_methods/pull/523
- Add 'strata' and 'qc_metrics' as globals on the table returned by `compute_stratified_metrics_filter` by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/521
- Modify `annotate_mutation_type` to take optional context length as a parameter. by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/530
- Add generic constraint functions: `oe_aggregation_expr()`, `compute_pli()`, `oe_confidence_interval()`, `calculate_raw_z_score()`, `calculate_raw_z_score_sd()` by @averywpx in https://github.com/broadinstitute/gnomad_methods/pull/505
- Add dbSNP b156 to resources for v4 by @KoalaQin in https://github.com/broadinstitute/gnomad_methods/pull/525
- Add `pab_max_expr` function and modify `default_compute_info` to add 'AS_pab_max' annotation by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/531
- Add generic constraint functions: `get_downsamplings()`, `remove_coverage_outliers()`, and `filter_for_mu()` by @averywpx in https://github.com/broadinstitute/gnomad_methods/pull/507
- Add `ac_filter_groups` to `default_compute_info` allowing additional allele count groupings by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/534
- Add global annotations for 'vep_version', 'vep_help', and 'vep_config' to the returned Table in `vep_or_lookup_vep` by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/536
- Add `annotate_allele_info` function to `utils.annotations` by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/535
- Add validity check code of VEP annotations in protein-coding genes by @KoalaQin in https://github.com/broadinstitute/gnomad_methods/pull/548
- Merge freq array function and new frequency dictionary builder by @mike-w-wilson in https://github.com/broadinstitute/gnomad_methods/pull/551
- Add GRCh38 methylation sites resource by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/552
- Modify `comparison_sample_expr` parameter of `compute_stratified_metrics_filter` to also accept a BooleanExpression by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/557
- Add parameters `apply_model_func` and `convert_model_func` to `assign_population_pcs` so it has the ability to work with other model types by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/558
- Add `sample_list_stratification` option to `create_fake_pedigree` function by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/564
- Modify `default_compute_info` with the option to use the `AS_` annotations in gvcf_info for allele specific aggregations by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/560
- Modify `annotate_adj` to support LGT and LAD by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/567
- Function to annotate downsamplings onto HT/MT by @mike-w-wilson in https://github.com/broadinstitute/gnomad_methods/pull/570
- Add function to merge histograms with the same bin_edges by @mike-w-wilson in https://github.com/broadinstitute/gnomad_methods/pull/572
- Add option to also merge an array of counts/ints in the freq array merge by @mike-w-wilson in https://github.com/broadinstitute/gnomad_methods/pull/565
- Update `annotate_freq` and `qual_hists`, add `split_vds` and `compute_freq_by_strata` by @mike-w-wilson in https://github.com/broadinstitute/gnomad_methods/pull/571
- Add function `update_structured_annotations` to update structured annotations on a Table by @KoalaQin in https://github.com/broadinstitute/gnomad_methods/pull/580
- Make naive_coalesce optional in `default_compute_info` by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/584
- Add function to remove items from freq and freq_meta by @KoalaQin in https://github.com/broadinstitute/gnomad_methods/pull/582
- Add a `select_fields` option to `compute_freq_by_strata` by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/595
- Modify `split_info_annotation` to allow for splitting an info expression that doesn't include `AS_SB_TABLE` by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/594
- Update to allow for grouping and filtering by MANE transcripts by @klaricch in https://github.com/broadinstitute/gnomad_methods/pull/605
- Add gnomad_gks() and get_gks() for extracting gks information for a specified variant by @matren395 in https://github.com/broadinstitute/gnomad_methods/pull/596
- Add aggregations to variant QC evaluation for additional plots by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/609
- Add function to get max FAF from `faf_expr` by @KoalaQin in https://github.com/broadinstitute/gnomad_methods/pull/608
- Add optional stratification parameter to coverage by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/615
- Add methylation resource for chrX by @klaricch in https://github.com/broadinstitute/gnomad_methods/pull/622
- Add pop_label option to `pop_max_expr`, `faf_expr`, and `gen_anc_faf_max_expr` by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/623
- Add `apply_keep_to_only_items_in_filter` option to `filter_arrays_by_meta` by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/624
- Add pprint globals and a global/row length comparison, updates monoallelic expr in validity checks by @mike-w-wilson in https://github.com/broadinstitute/gnomad_methods/pull/630
- Add MANE Select filtering option to `get_summary_counts` by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/634
- Add optional parameters to `set_female_y_metrics_to_na_expr` to use other frequency fields by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/635
- Update resource paths by @klaricch in https://github.com/broadinstitute/gnomad_methods/pull/642
Other Changes
- Update doc requirements.doc.txt by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/520
- Bump requests from 2.28.2 to 2.31.0 in /docs by @dependabot in https://github.com/broadinstitute/gnomad_methods/pull/543
- Add VEP 105 CSQ FIELDs by @KoalaQin in https://github.com/broadinstitute/gnomad_methods/pull/546
- Update python 3.8 -> 3.11 by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/578
- Add ability to retrieve max for any threshold in faf by @mike-w-wilson in https://github.com/broadinstitute/gnomad_methods/pull/616
- Remove inadvertent tuple in popMaxFAF95 field in `add_gks_va` function by @mattsolo1 in https://github.com/broadinstitute/gnomad_methods/pull/621
- Update requirements files by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/632
- Update HGDP pops by @KoalaQin in https://github.com/broadinstitute/gnomad_methods/pull/631
- Revert tuple type in `build_models` by @klaricch in https://github.com/broadinstitute/gnomad_methods/pull/638
- Check for `skip_coverage_model` is False in build_models by @klaricch in https://github.com/broadinstitute/gnomad_methods/pull/639
New Contributors
- @KoalaQin made their first contribution in https://github.com/broadinstitute/gnomad_methods/pull/516
- @jmarshall made their first contribution in https://github.com/broadinstitute/gnomad_methods/pull/576
- @mattsolo1 made their first contribution in https://github.com/broadinstitute/gnomad_methods/pull/621
Full Changelog: https://github.com/broadinstitute/gnomad_methods/compare/v0.6.4...v0.7.0
Published by github-actions[bot] over 2 years ago
cpg-gnomad - v0.6.4
What's Changed
This release uses Hail 0.2.105
Bug fixes
- Fix `assign_population_pcs` error when parameter `pc_cols` is a Hail ArrayExpression by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/503
Other Changes
- Modifying `assign_population_pcs` to be more flexible by accepting an array expression in 'pc_cols' and adding a 'pc_expr' parameter instead of always using 'scores' by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/500
- add `.he` to file extensions list in `file_exists()` by @averywpx in https://github.com/broadinstitute/gnomad_methods/pull/501
- add generic constraint functions: `build_models()`, `build_plateau_models_pop()`, `build_plateau_models_total()`, `build_coverage_model()`, `get_all_pop_lengths()` by @averywpx in https://github.com/broadinstitute/gnomad_methods/pull/485
Full Changelog: https://github.com/broadinstitute/gnomad_methods/compare/v0.6.3...v0.6.4
Published by github-actions[bot] over 3 years ago
cpg-gnomad - v0.6.3
What's Changed
This release uses Hail 0.2.104
Breaking Changes
- Change type of "pc_cols" param in ancestry function from hl.expr.ArrayExpression to List[int] to help track PCs that were used in RF model by @klaricch in https://github.com/broadinstitute/gnomad_methods/pull/448
- Add additional_samples_to_drop option to `run_pca_with_relateds` by @klaricch in https://github.com/broadinstitute/gnomad_methods/pull/489
Bug fixes
- Fix to only add the `error_rate` annotation if `fit` is not supplied to `assign_population_pcs` by @klaricch in https://github.com/broadinstitute/gnomad_methods/pull/453
- Modify `merge_sample_qc_expr` to work with the additional VDS sample QC metrics: n_singleton_ti, n_singleton_tv, and r_ti_tv_singleton by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/454
- Fix `vep_or_lookup_vep` to drop `vep_proc_id` if it exists by @konradjk in https://github.com/broadinstitute/gnomad_methods/pull/439
- Fix to paths for VEP 101 resources in init script by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/488
- Changed tqdm to SimpleRichProgressBar in file_utils by @ch-kr in https://github.com/broadinstitute/gnomad_methods/pull/495
New Features
- Add an `n_pcs` option to `run_platform_pca` by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/468
- Add n_partitions option to get_qc_mt before LD pruning by @klaricch in https://github.com/broadinstitute/gnomad_methods/pull/472
- Add block_size option to get_qc_mt for LD pruning by @klaricch in https://github.com/broadinstitute/gnomad_methods/pull/473
- Add `gaussian_mixture_model_karyotype_assignment` function to assign sex karyotype using Gaussian mixture models by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/478
- Add `variants_filter_lcr`, `variants_filter_segdup` and `variants_snv_only` options to `annotate_sex` to filter variants prior to variant only ploidy imputation by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/479
- Add an option `compute_x_frac_variants_hom_alt` to `annotate_sex` that computes the fraction of variants on chromosome X that are homozygous alternate per sample by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/480
- Add generic constraint functions - annotate_mutation_type(), trimer_from_heptamer(), collapse_strand(), add_most_severe_csq_to_tc_within_vep_root() by @averywpx in https://github.com/broadinstitute/gnomad_methods/pull/474
- Add more file types to `file_exists` for checking '_SUCCESS' by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/486
- Add `coverage_mt` option to `annotate_sex` which takes an optional precomputed coverage MT to use for ploidy imputation instead of remaking it. by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/484
- Add function `get_chr_x_hom_alt_cutoffs`, add arguments to `infer_sex_karyotype` and `get_sex_expr` to use the new function and its output. by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/492
- Add `bi_allelic_only` and `snv_only` options to `get_qc_mt` by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/471
- Add generic constraint functions: annotate_with_mu(), count_variants(), downsampling_counts_expr(), filter_vep_transcript_csqs(), combine_functions(), filter_x_nonpar(), and filter_y_nonpar() by @averywpx in https://github.com/broadinstitute/gnomad_methods/pull/481
Other Changes
- Handle tags created through GitHub in publish release workflow by @nawatts in https://github.com/broadinstitute/gnomad_methods/pull/451
- Change branch name in CI workflow configuration by @nawatts in https://github.com/broadinstitute/gnomad_methods/pull/452
New Contributors
- @averywpx made their first contribution in https://github.com/broadinstitute/gnomad_methods/pull/474
Full Changelog: https://github.com/broadinstitute/gnomad_methods/compare/v0.6.2...v0.6.3
Published by github-actions[bot] over 3 years ago
cpg-gnomad - v0.6.2
What's Changed
New Features
- Use Google Cloud Public Datasets as default source for public resources by @nawatts in https://github.com/broadinstitute/gnomad_methods/pull/431
- Add options for reading public resources from Registry of Open Data on AWS and Azure Open Datasets by @nawatts in https://github.com/broadinstitute/gnomad_methods/pull/430
- Allow setting the default source for public resources with an environment variable by @nawatts in https://github.com/broadinstitute/gnomad_methods/pull/435
- Use hl.utils.guess_cloud_spark_provider to set default resources source by @nawatts in https://github.com/broadinstitute/gnomad_methods/pull/436
- add checkpoint option to get_qc_mt by @klaricch in https://github.com/broadinstitute/gnomad_methods/pull/437
- Modification to the `annotate_sex` pipeline to allow sex ploidy estimation using only variants instead of ref blocks by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/445
Other Changes
- Document selecting resource source by @nawatts in https://github.com/broadinstitute/gnomad_methods/pull/408
- Add VEP 101 init by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/411
- Small fix to docstrings for make_freq_index_dict() by @gtiao in https://github.com/broadinstitute/gnomad_methods/pull/412
- Tiny fix to assign_population_pcs use of known label by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/413
- Added option to get file stats for requester-pays files by @ch-kr in https://github.com/broadinstitute/gnomad_methods/pull/414
- fix to faf description text by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/415
- Update current gnomAD GRCh38 genome release v3.1.2 by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/416
- Update to new RouterAsyncFS interface in Hail 0.2.79 by @nawatts in https://github.com/broadinstitute/gnomad_methods/pull/425
- add vds resource by @klaricch in https://github.com/broadinstitute/gnomad_methods/pull/423
- Modified subset_samples_and_variants() by @wlu04 in https://github.com/broadinstitute/gnomad_methods/pull/421
- Modified compute_stratified_sample_qc() by @wlu04 in https://github.com/broadinstitute/gnomad_methods/pull/420
- Modified annotate_sex() by @wlu04 in https://github.com/broadinstitute/gnomad_methods/pull/427
New Contributors
- @klaricch made their first contribution in https://github.com/broadinstitute/gnomad_methods/pull/423
- @wlu04 made their first contribution in https://github.com/broadinstitute/gnomad_methods/pull/421
Full Changelog: https://github.com/broadinstitute/gnomad_methods/compare/v0.6.0...v0.6.2
Published by ch-kr almost 4 years ago
cpg-gnomad - v0.6.1
- Update for new RouterAsyncFS import/interface in recent Hail versions (55214e8)
- Fix `assign_population_pcs`'s use of known population label (9c8f089)
Published by nawatts about 4 years ago
cpg-gnomad - v0.6.0
Released September 3rd, 2021
All resources have been moved to a requester pays bucket.
Fixed
- Fix `annotation_type_is_numeric` and `annotation_type_in_vcf_info` (#379)
Changed
- VersionedResource objects are no longer subclasses of BaseResource (#359)
- gnomAD resources can now be imported from different sources (#373)
- Replaced `ht_to_vcf_mt` with `adjust_vcf_incompatible_types` which maintains all functionality except turning the ht into a mt because it is no longer needed for use of the Hail module `export_vcf` (#365)
- Modified `SEXES` in utils/vcf to be 'XX' and 'XY' instead of 'female' and 'male' (#381)
- Changed module `sanity_checks` to `validity_checks`, modified functions `generic_field_check`, `make_filters_expr_dict` (previously `make_filters_sanity_check_expr`), and `make_group_sum_expr_dict` (previously `sample_sum_check`) (#395)
Added
- Added function `region_flag_expr` to flag problematic regions (#349)
- Added function `missing_callstats_expr` to create a Hail Struct with missing values that is inserted into frequency annotation arrays when data is missing (#349)
- Added function `set_female_y_metrics_to_na_expr` to set Y-variant frequency callstats for female-specific metrics to missing (#349)
- Added function `make_faf_index_dict` to create a look-up Dictionary for entries contained in the filter allele frequency annotation array (#349)
- Added function `make_freq_index_dict` to create a look-up Dictionary for entries contained in the frequency annotation array (#349)
- Added function `remove_fields_from_constant` to remove fields from a list and notify which requested fields to remove were missing (#381)
- Added function `create_label_groups` to generate a list of label group dictionaries needed to populate the info dictionary for vcf export (#381)
- Added function `build_vcf_export_reference` to create a subset reference based on an existing reference genome (#381)
- Added function `rekey_new_reference` to re-key a Table or MatrixTable with a new reference genome (#381)
- Added function `parallel_file_exists` to check whether a large number of files exist (#394)
- Added functions `summarize_variant_filters`, `generic_field_check_loop`, `compare_subset_freqs`, `sum_group_callstats`, `summarize_variants`, `check_raw_and_adj_callstats`, `check_sex_chr_metrics`, `compute_missingness`, `vcf_field_check`, and `validate_release_t` (#395)
Published by nawatts about 4 years ago
cpg-gnomad - v0.5.0
Released April 22nd, 2021
Fixed
- Fix for error in `generate_trio_stats_expr` that led to an incorrect untransmitted count. (#238)
- Fix for error in `compute_quantile_bin` that caused incorrect binning when a single score overlapped multiple bins (#238)
- Fixed `create_binned_ht` because it produced a "Cannot combine expressions from different source objects" error (#238)
- Fixed handling of missing entries (not within a ref block / alt site) when computing `coverage_stats` in `sparse_mt.py` (#242)
- Fix for error in `compute_stratified_sample_qc` where `gt_expr` caused error (#259)
- Fix for error in `default_lift_data` caused by missing `results` field in `new_locus` (#270)
- Fix to dbSNP b154 resource (resources.grch38.reference_data) import to allow for multiple rsIDs per variant (#345)
- Fix to `set_female_metrics_to_na` to correctly update chrY metrics to be missing (#347)
- Fixed available versions for gnomAD v2 `coverage` and `liftover` resources (#352)
- Removed side effect of accessing gnomAD v2 `coverage` and `liftover` exome resources that would edit available versions for other resources (#352)
- Use `overwrite` argument for importing a BlockMatrixResource (#342)
Changed
- Removed assumption of `snv` annotation from `compute_quantile_bin`. (#238)
- Modified `compute_binned_truth_sample_concordance` to handle additional binning for subsets of variants. (#240)
- Updated liftover functions to be more generic (#246)
- Changed quality histograms to label histograms calculated on raw and not adj data (#247)
- Updated some VCF export constants (#249)
- Changed default DP threshold to 5 for hemi genotype calls in `annotate_adj` and `get_adj_expr` (#252)
- Updated coverage resources to version 3.0.1 (#242)
- Update to `compute_last_ref_block_end`, removing assumption that sparse MatrixTables are keyed only by `locus` by default (#279)
- Update `generic_field_check` to have option to show percentage of sites that fail checks. (#284)
- Modified `vep_or_lookup_vep` to support the use of different VEP versions (#282)
- Modified `create_truth_sample_ht` to add adj annotation information in the returned Table if present in the supplied MatrixTables (#300)
Added
- Added constants and functions relevant to VCF export (#241)
- Add reference genome to call of `has_liftover` in `get_liftover_genome` (#259)
- Added fix for MQ calculation in `_get_info_agg_expr`, switched `RAW_MQ` and `MQ_DP` in calculation (#262)
- Add importable method for filtering ClinVar to pathogenic sites (#257)
- Added common variant QC functions `get_rf_runs` and `get_run_data` to `random_forest.py` (#278)
- Add calculation for the strand odds ratio (SOR) to `get_site_info_expr` and `get_as_info_expr` (#281)
- Added VEPed context HT to resource files and included support for versioning (#282)
- Added code to generate summary statistics (total number of variants, number of LoF variants, LOFTEE summaries) (#285)
- Added additional counts to summary statistics (added autosome/sex chromosome counts, allele counts, counts for missense and synonymous variants) (#289)
- Added function, `default_generate_gene_lof_matrix`, to generate gene matrix (#290)
- Added function `default_generate_gene_lof_summary` to summarize gene matrix results (#292)
- Add resource for v3.1.1 release (#364)
Removed
- Removed `rep_on_read`; this function is no longer necessary, as MatrixTables/Tables can be repartitioned on read with `_n_partitions`, added by a Hail update (#283)
- Removed `compute_quantile_bin` and added `compute_ranked_bin` as an alternative that provides more even binning. This is now used by `create_binned_ht` instead. (#288)
- Removed `prefix` parameter from `make_combo_header_text`, as this was only used to check if samples were from gnomAD (#348)
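
To illustrate why rank-based binning gives more even bins than cutting at score quantiles, here is a minimal pure-Python sketch. It is not the implementation of `compute_ranked_bin`; the function name `ranked_bins` and its arguments are hypothetical, and the real function operates on Hail Tables. The idea shown is only that each item gets a rank, and the bin is derived from the rank, so tied scores cannot pile up in a single bin.

```python
def ranked_bins(scores, n_bins):
    """Assign each score a 1-based bin from its rank (hypothetical helper).

    Items are ranked by ascending score; ties keep their input order because
    Python's sort is stable. The bin is computed from the rank alone, so every
    bin receives close to len(scores) / n_bins items even with many ties.
    """
    n = len(scores)
    # Indices of the items, ordered by ascending score.
    order = sorted(range(n), key=lambda i: scores[i])
    bins = [0] * n
    for rank, idx in enumerate(order):
        bins[idx] = rank * n_bins // n + 1
    return bins


# Four tied scores would all fall into one bin under score-based cut points;
# ranking spreads them across bins instead.
scores = [0.1, 0.1, 0.1, 0.1, 0.5, 0.9]
bins = ranked_bins(scores, n_bins=3)
print(bins)
```

One consequence of this design is that identical scores can land in different bins, which is the trade-off for evenly sized bins.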
Published by nawatts about 4 years ago
cpg-gnomad - v0.4.0
Released July 9th, 2020
Note: gnomAD resources have been moved to a requester pays bucket. Dataproc clusters must be configured to allow reading from it.
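
As a rough sketch of what that configuration can look like, the helper below builds the Spark properties that the Hadoop GCS connector reads for requester-pays access. The helper name `requester_pays_spark_conf`, the billing project, and the bucket name are all placeholders that do not come from these release notes, and passing the result to `hl.init(spark_conf=...)` is one possible way to apply it, not the documented procedure for this release. On Dataproc, `hailctl dataproc start` also offers `--requester-pays-allow-all` and `--requester-pays-allow-buckets`.

```python
def requester_pays_spark_conf(billing_project, buckets=None):
    """Build GCS connector properties for requester-pays reads (hypothetical helper).

    billing_project: the Google Cloud project that will be billed for reads.
    buckets: optional list of bucket names; if omitted, AUTO mode is used and
             the connector bills the project whenever a bucket requires it.
    """
    conf = {"spark.hadoop.fs.gs.requester.pays.project.id": billing_project}
    if buckets:
        # CUSTOM mode restricts requester-pays billing to the listed buckets.
        conf["spark.hadoop.fs.gs.requester.pays.mode"] = "CUSTOM"
        conf["spark.hadoop.fs.gs.requester.pays.buckets"] = ",".join(buckets)
    else:
        conf["spark.hadoop.fs.gs.requester.pays.mode"] = "AUTO"
    return conf


# Placeholder values; substitute your own project and bucket names.
conf = requester_pays_spark_conf(
    "my-billing-project", ["example-requester-pays-bucket"]
)
print(conf)

# Assumed usage, requires Hail to be installed:
#   import hail as hl
#   hl.init(spark_conf=conf)
```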
- Added `VEP_CSQ_HEADER` to generate vep description necessary for VCF export. (#230)
- Modified variant QC pipeline functions `generate_trio_stats` and `generate_sib_stats` to add filter parameter for autosomes and bi-allelic sites (#223)
- `score_bin_agg` now requires additional annotations `ac` and `ac_qc_samples_unrelated_raw` and no longer needs `tdt` (#223)
- Changed `score_bin_agg` to use `ac_qc_samples_unrelated_raw` annotation instead of `unrelated_qc_callstats` (#223)
- Added singleton de novo counts to variant QC pipeline function `score_bin_agg` (#223)
- Modified `filter_mt_to_trios` to no longer filter to autosomes as this should be handled during the variant QC pipeline (#223)
- Updated `annotate_sex` to add globals to `sex_ht` (#227)
- Document `slack_notifications` function (#228)
- Added `median_impute_features` to variant QC random forest module (#224)
- Created `training.py` in variant QC and added `sample_training_examples` (#224)
- Added variant QC pipeline function `train_rf_model` (#224)
- Use local copy of VEP config instead of reading from bucket (#231)
- Updated gnomAD resources paths for hail tables to requester pays buckets (#233)
Published by nawatts about 4 years ago
cpg-gnomad - v0.3.0
Released April 28th, 2020
- Updated capitalization of ambiguous sex annotation (#208)
- Updated usage of included intervals in imputing sex ploidy, also updated interval parameter names (#209)
- Updated capitalization in relatedness constants (#217)
- Changed interface for Slack notifications (#219)
Published by nawatts about 4 years ago
cpg-gnomad - v0.2.0
Released April 3rd, 2020
Added
- Function to subset a `MatrixTable` based on a list of samples (#196)
- Function to get file size and MD5 hash (#186)
- Developer documentation (#185)
- Include `RAW_MQ` and `AS_VQSLOD` metrics in `get_annotations_hists` (#181)
- Functions to compute coverage stats from sparse MT (#173)
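
The release notes do not say how the file size and MD5 function works or which storage it targets. As a generic illustration only, the sketch below computes both values for a local file with the Python standard library; the name `file_size_and_md5` is hypothetical and is not the library's function.

```python
import hashlib
import os
import tempfile


def file_size_and_md5(path, chunk_size=1 << 20):
    """Return (size in bytes, hex MD5 digest) for a local file.

    The file is read in chunks so large files do not need to fit in memory.
    """
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
    return os.path.getsize(path), md5.hexdigest()


# Small demonstration on a temporary file.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.txt")
    with open(demo_path, "wb") as f:
        f.write(b"hello")
    size, digest = file_size_and_md5(demo_path)

print(size, digest)
```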
Changed
- Repo restructured - imports may need to be updated (#207)
- Make some arguments to `get_qc_mt` optional (#200)
- Fetch VEP configuration from new Hail requester pays buckets (#197)
- Hail must be installed separately (#194)
Fixed
- Father/mother assignments now correct (were swapped before) in `infer_families` (#203)
- Attribute assignments for `VersionedPedigreeResource` (#198)
- Field references in `get_annotations_hists` (#181)
- Use before assignment error in `default_compute_info` (#195)
Published by nawatts about 4 years ago
cpg-gnomad - v0.1.0
Released March 4th, 2020
Initial release
Published by nawatts about 4 years ago