Recent Releases of cpg-gnomad

cpg-gnomad - v0.8.2

What's Changed

Breaking Changes

  • Update default vep version for context table resource by @ch-kr in https://github.com/broadinstitute/gnomad_methods/pull/726
  • Add coverage_metric param to allow for different metrics of coverage and cov_model_type option to allow for linear or logarithmic by @klaricch in https://github.com/broadinstitute/gnomad_methods/pull/724
  • Add get_tissues_to_exclude function to determine what tissues to exclude from transcript annotation calculations by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/729

Bug fixes

  • Fix tx_filter_variants_by_csqs to correctly handle the ignore_splicing parameter by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/727

New Features

  • Add retain cdf option for median calculations when computing info fields by @klaricch in https://github.com/broadinstitute/gnomad_methods/pull/731
  • Add max_grpmax option to get_summary_stats_variant_filter_expr for filtering by grpmax by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/732
  • Change filter_mt_to_trios to also filter on vds by @KoalaQin in https://github.com/broadinstitute/gnomad_methods/pull/739
  • Add get_mu_annotation_expr function that prevents a shuffle from happening when annotating a HT with mutation rate and use in annotate_with_mu by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/734
  • Add assemble_constraint_context_ht function to create a fully annotated context HT for computing constraint on by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/733
  • Add support for filtering Hail Tables to filter_to_trios by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/741
  • Generalize the freq_bin_expr function to take in a list of allele count and allele frequency cutoffs by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/745
  • Add function parse_variant to create a Struct with the locus and alleles from a variant string or contig, position, ref, and alt. by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/746
  • Modify filter_vep_transcript_csqs_expr so it can also accept hl.expr.StructExpression by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/748
  • Filter to Gencode CDS by genes and by exon paddings by @KoalaQin in https://github.com/broadinstitute/gnomad_methods/pull/747
  • Add functions to support padding and filtering intervals: filter_by_intervals, pad_intervals, parse_locus_intervals by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/752
  • Add loftee_labels and no_lof_flags parameters to filter_vep_transcript_csqs_expr for filtering by loftee labels and flags by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/753
  • Add browser tables to resources by @KoalaQin in https://github.com/broadinstitute/gnomad_methods/pull/750
  • Add functions to check struct and array missingness by @klaricch in https://github.com/broadinstitute/gnomad_methods/pull/738

Other Changes

  • Add import code for GTEx v10 RSEM by @KoalaQin in https://github.com/broadinstitute/gnomad_methods/pull/742
  • Add pext and constraint resources by @KoalaQin in https://github.com/broadinstitute/gnomad_methods/pull/743
  • Bump the pip group in /docs with 2 updates by @dependabot in https://github.com/broadinstitute/gnomad_methods/pull/715
  • Bump jinja2 from 3.1.4 to 3.1.5 in /docs in the pip group across 1 directory by @dependabot in https://github.com/broadinstitute/gnomad_methods/pull/751
  • Bump virtualenv from 20.24.6 to 20.26.6 in the pip group across 1 directory by @dependabot in https://github.com/broadinstitute/gnomad_methods/pull/754
  • Update version to 0.8.2 in setup.py for release by @KoalaQin in https://github.com/broadinstitute/gnomad_methods/pull/758
  • Add gcs connector to PyPi publish by @KoalaQin in https://github.com/broadinstitute/gnomad_methods/pull/759
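
The parse_variant entry above accepts either a single variant string or separate contig, position, ref, and alt values. A minimal plain-Python sketch of that dual-input idea follows. It does not use Hail or the gnomad package: the names parse_variant_sketch and ParsedVariant are hypothetical, and the "contig-position-ref-alt" string format is an assumption, not something these release notes state.

```python
from typing import List, NamedTuple, Optional


class ParsedVariant(NamedTuple):
    """Plain-Python stand-in for a struct holding a locus and its alleles."""

    contig: str
    position: int
    alleles: List[str]


def parse_variant_sketch(
    variant_str: Optional[str] = None,
    contig: Optional[str] = None,
    position: Optional[int] = None,
    ref: Optional[str] = None,
    alt: Optional[str] = None,
) -> ParsedVariant:
    """Build a ParsedVariant from a variant string or from its four components."""
    if variant_str is not None:
        # Assumed format: "contig-position-ref-alt", e.g. "chr1-1000-A-T".
        parts = variant_str.split("-")
        if len(parts) != 4:
            raise ValueError(f"Expected contig-position-ref-alt, got {variant_str!r}")
        contig, position_str, ref, alt = parts
        position = int(position_str)
    elif contig is None or position is None or ref is None or alt is None:
        raise ValueError("Provide variant_str or all of contig, position, ref, alt")
    return ParsedVariant(contig=contig, position=int(position), alleles=[ref, alt])
```

Either call style returns the same structure, for example parse_variant_sketch("chr1-1000-A-T") or parse_variant_sketch(contig="chr1", position=1000, ref="A", alt="T").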

Full Changelog: https://github.com/broadinstitute/gnomad_methods/compare/v0.8.1...v0.8.2

- Python
Published by github-actions[bot] about 1 year ago

cpg-gnomad - v0.8.1

What's Changed

Bug fixes

  • Fix annotate_with_ht to only use a semi-join when filter_missing is True by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/709
  • Fix bug in process_consequences that was introduced when adding support for VEP without polyphen by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/710

New Features

  • Add explode_downsamplings function by @klaricch in https://github.com/broadinstitute/gnomad_methods/pull/694
  • Update VEP csqs in impact categories to match VEP by @mike-w-wilson in https://github.com/broadinstitute/gnomad_methods/pull/703
  • Add get_summary_stats_variant_filter_expr and get_summary_stats_csq_filter_expr to build filtering expressions for summary stats by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/701
  • Add filter_vep_transcript_csqs_expr, a version of filter_vep_transcript_csqs that takes and returns an ArrayExpression by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/713
  • Add create_vds function that only supports creating from gvcfs by @mike-w-wilson in https://github.com/broadinstitute/gnomad_methods/pull/716
  • Add functions fill_missing_key_combinations and missing_struct_expr by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/718

Other Changes

  • Add a space in joint filter info dict by @KoalaQin in https://github.com/broadinstitute/gnomad_methods/pull/698
  • Change the number of values for stat_union_gen_ancs to unknown by @KoalaQin in https://github.com/broadinstitute/gnomad_methods/pull/699
  • Bump idna from 3.4 to 3.7 in /docs by @dependabot in https://github.com/broadinstitute/gnomad_methods/pull/692
  • Bump jinja2 from 3.1.3 to 3.1.4 in /docs by @dependabot in https://github.com/broadinstitute/gnomad_methods/pull/700
  • Bump requests from 2.31.0 to 2.32.2 in /docs by @dependabot in https://github.com/broadinstitute/gnomad_methods/pull/708
  • Update setup.py for v0.8.1 by @mike-w-wilson in https://github.com/broadinstitute/gnomad_methods/pull/720

Full Changelog: https://github.com/broadinstitute/gnomad_methods/compare/v0.8.0...v0.8.1

- Python
Published by mike-w-wilson over 1 year ago

cpg-gnomad - v0.8.0

What's Changed

Breaking Changes

  • Add mid to FAF and grpmax calcs by @mike-w-wilson in https://github.com/broadinstitute/gnomad_methods/pull/658
  • Update POPS constant to contain a dictionary of both exomes and genomes by @klaricch in https://github.com/broadinstitute/gnomad_methods/pull/690

Bug fixes

  • Account for missingness in int64 to int32 VCF type conversion by @mike-w-wilson in https://github.com/broadinstitute/gnomad_methods/pull/668
  • Fix generic_field_check in validity_checks.py print of failed checks by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/693

New Features

  • Add RSEM summary function by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/647
  • Function to get expression proportion by @KoalaQin in https://github.com/broadinstitute/gnomad_methods/pull/649
  • Add GTEx import resources by @KoalaQin in https://github.com/broadinstitute/gnomad_methods/pull/646
  • Add function agg_by_strata, which is a generalized version of the compute_freq_by_strata by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/659
  • Clean up compute_coverage_stats, change it to use agg_by_strata and have an optional group_membership_ht parameter by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/660
  • Add densify_all_reference_sites to perform a densify at all sites in a reference HT by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/661
  • Add compute_stats_per_ref_site to generalize computation of aggregate stats at all sites in a reference Table by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/662
  • Functions to process, filter, annotate and aggregate variants by transcript expression (get the pext scores per variant) by @KoalaQin in https://github.com/broadinstitute/gnomad_methods/pull/651
  • Add gnomAD all sites allele number resource by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/669
  • Add read_args parameter to the read functions of Resource Classes by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/672
  • Add get_is_haploid_expr, get_dp_gq_adj_expr, get_adj_het_ab_expr, and some helpful parameters to agg_by_strata and compute_stats_per_ref_site by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/673
  • Add sex_karyotype_field as an argument to compute_stats_per_ref_site to include sex ploidy adjustment after densify by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/677
  • Add function for adding gencode annotation by @klaricch in https://github.com/broadinstitute/gnomad_methods/pull/681
  • Update vcf.py to work on joint freq release Table by @KoalaQin in https://github.com/broadinstitute/gnomad_methods/pull/688
  • Change get_downsampling_freq_indices and downsampling_counts_expr to support both 'pop' and 'gen_anc' keys in metadata by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/633

Other Changes

  • Suggestions to get_expression_proportion PR by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/653
  • Suggestions to tx_annotate_mt PR by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/654
  • Suggestions to tx_annotate_mt by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/655
  • Rearrange and enforce adj_group and group_membership being on the sam… by @mike-w-wilson in https://github.com/broadinstitute/gnomad_methods/pull/666
  • Bump jinja2 from 3.1.2 to 3.1.3 in /docs by @dependabot in https://github.com/broadinstitute/gnomad_methods/pull/665
  • Add v4 to genome release constants by @klaricch in https://github.com/broadinstitute/gnomad_methods/pull/671
  • Pull ploidy optimization into a function by @mike-w-wilson in https://github.com/broadinstitute/gnomad_methods/pull/676
  • Fix sex ploidy adjustment so XX samples still get set to missing on chrY by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/678
  • Minor GKS formatting changes and addition of gnomAD flags to annotation by @theferrit32 in https://github.com/broadinstitute/gnomad_methods/pull/617
  • Add option to exclude polyphen from process consequences by @KoalaQin in https://github.com/broadinstitute/gnomad_methods/pull/685
  • Bump black from 23.7.0 to 24.3.0 by @dependabot in https://github.com/broadinstitute/gnomad_methods/pull/686
  • Add Stat Union to the info dict by @KoalaQin in https://github.com/broadinstitute/gnomad_methods/pull/695

Full Changelog: https://github.com/broadinstitute/gnomad_methods/compare/v0.7.1...v0.8.0

- Python
Published by github-actions[bot] almost 2 years ago

cpg-gnomad - v0.7.1

This release uses Hail 0.2.122

What's Changed

Bug fixes

  • Drop async file exists function by @mike-w-wilson in https://github.com/broadinstitute/gnomad_methods/pull/643

Full Changelog: https://github.com/broadinstitute/gnomad_methods/compare/v0.7.0...v0.7.1

- Python
Published by github-actions[bot] over 2 years ago

cpg-gnomad - v0.7.0

This release contained a function that required Hail >= 0.2.126. Please use a newer release.

What's Changed

Breaking Changes

  • Update some gnomAD resources from lists to version dictionaries by @mike-w-wilson in https://github.com/broadinstitute/gnomad_methods/pull/522
  • Modifications to annotate_freq to improve memory use by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/577

Bug fixes

  • Add get_slope_int_relationship_expr to get relationship between a pair of samples given slope and intercepts of lines to use as cutoffs. by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/511
  • Fix access to version's SUBSETS and POPS within repo by @mike-w-wilson in https://github.com/broadinstitute/gnomad_methods/pull/529
  • Small changes to bokeh module imports in utils.plotting that were failing with Hail update by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/540
  • Fix filter_x_nonpar and filter_y_nonpar to use reference genome by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/553
  • Fix callstats order in merge_freq_arrays by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/574
  • Avoid DeprecationWarnings from superseded hail function and import [minor] by @jmarshall in https://github.com/broadinstitute/gnomad_methods/pull/576
  • Fix merge_freq_arrays for cases with more than two arrays by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/587
  • Fix negative values issue with 'diff' by @KoalaQin in https://github.com/broadinstitute/gnomad_methods/pull/590
  • Fix ValueError for count_arrays in merge_freq_arrays function by @KoalaQin in https://github.com/broadinstitute/gnomad_methods/pull/591
  • Modify apply_rf_model to use vector_to_array from pyspark.ml.functions instead of udf by @matren395 in https://github.com/broadinstitute/gnomad_methods/pull/592
  • Fix to drop 'AS_SB' after converting to 'AS_SB_TABLE' in `get_as_info_expr` by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/602
  • Fix to GKS Seqloc new_temp_file by @matren395 in https://github.com/broadinstitute/gnomad_methods/pull/612
  • Move ga4gh imports to their functions by @mike-w-wilson in https://github.com/broadinstitute/gnomad_methods/pull/626

New Features

  • Add generic constraint function annotate_constraint_groupings() by @averywpx in https://github.com/broadinstitute/gnomad_methods/pull/497
  • Add an option for samples that must be kept to compute_related_samples_to_drop by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/506
  • Add determine_nearest_neighbors to find nearest neighbors for each sample. Modify compute_stratified_metrics_filter to work with a comparison_sample_expr that specifies what samples to compare to for filtering; this works well with the output of determine_nearest_neighbors. by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/509
  • Add utility function to repartition HTs prior to join by @ch-kr in https://github.com/broadinstitute/gnomad_methods/pull/512
  • Add VEP 105 init script and its docker image by @KoalaQin in https://github.com/broadinstitute/gnomad_methods/pull/516
  • Add VEP 105 GRCh38 context HT resource by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/524
  • Add additional groupings to optional stratified allele frequencies by @KoalaQin in https://github.com/broadinstitute/gnomad_methods/pull/523
  • Add 'strata' and 'qc_metrics' as globals on the table returned by `compute_stratified_metrics_filter` by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/521
  • Modify annotate_mutation_type to take optional context length as a parameter. by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/530
  • Add generic constraint functions: oe_aggregation_expr(), compute_pli(), oe_confidence_interval(), calculate_raw_z_score(), calculate_raw_z_score_sd() by @averywpx in https://github.com/broadinstitute/gnomad_methods/pull/505
  • Add dbSNP b156 to resources for v4 by @KoalaQin in https://github.com/broadinstitute/gnomad_methods/pull/525
  • Add pab_max_expr function and modify default_compute_info to add 'AS_pab_max' annotation by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/531
  • Add generic constraint functions: get_downsamplings(), remove_coverage_outliers(), and filter_for_mu() by @averywpx in https://github.com/broadinstitute/gnomad_methods/pull/507
  • Add ac_filter_groups to default_compute_info allowing additional allele count groupings by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/534
  • Add global annotations for 'vep_version', 'vep_help', and 'vep_config' to the returned Table in `vep_or_lookup_vep` by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/536
  • Add annotate_allele_info function to utils.annotations by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/535
  • Add validity check code of VEP annotations in protein-coding genes by @KoalaQin in https://github.com/broadinstitute/gnomad_methods/pull/548
  • Merge freq array function and new frequency dictionary builder by @mike-w-wilson in https://github.com/broadinstitute/gnomad_methods/pull/551
  • Add GRCh38 methylation sites resource by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/552
  • Modify comparison_sample_expr parameter of compute_stratified_metrics_filter to also accept a BooleanExpression by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/557
  • Add parameters apply_model_func and convert_model_func to assign_population_pcs so it has the ability to work with other models types by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/558
  • Add sample_list_stratification option to create_fake_pedigree function by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/564
  • Modify default_compute_info with the option to use the AS_ annotations in gvcf_info for allele specific aggregations by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/560
  • Modify annotate_adj to support LGT and LAD by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/567
  • Function to annotate downsamplings onto HT/MT by @mike-w-wilson in https://github.com/broadinstitute/gnomad_methods/pull/570
  • Add function to merge histograms with the same bin_edges by @mike-w-wilson in https://github.com/broadinstitute/gnomad_methods/pull/572
  • Add option to also merge an array of counts/ints in the freq array merge by @mike-w-wilson in https://github.com/broadinstitute/gnomad_methods/pull/565
  • Update annotate_freq and qual_hists, add split_vds and compute_freq_by_strata by @mike-w-wilson in https://github.com/broadinstitute/gnomad_methods/pull/571
  • Add function update_structured_annotations to update structured annotations on a Table by @KoalaQin in https://github.com/broadinstitute/gnomad_methods/pull/580
  • Make naive_coalesce optional in `default_compute_info` by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/584
  • Add function to remove items from freq and freq_meta by @KoalaQin in https://github.com/broadinstitute/gnomad_methods/pull/582
  • Add a select_fields option to compute_freq_by_strata by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/595
  • Modify split_info_annotation to allow for splitting an info expression that doesn't include AS_SB_TABLE by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/594
  • Update to allow for grouping and filtering by MANE transcripts by @klaricch in https://github.com/broadinstitute/gnomad_methods/pull/605
  • Add gnomad_gks() and get_gks() for extracting gks information for a specified variant by @matren395 in https://github.com/broadinstitute/gnomad_methods/pull/596
  • Add aggregations to variant QC evaluation for additional plots by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/609
  • Add function to get max FAF from faf_expr by @KoalaQin in https://github.com/broadinstitute/gnomad_methods/pull/608
  • Add optional stratification parameter to coverage by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/615
  • Add methylation resource for chrX by @klaricch in https://github.com/broadinstitute/gnomad_methods/pull/622
  • Add pop_label option to `pop_max_expr`, `faf_expr`, and `gen_anc_faf_max_expr` by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/623
  • Add apply_keep_to_only_items_in_filter option to filter_arrays_by_meta by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/624
  • Add pprint globals and a global/row length comparison, updates monoallelic expr in validity checks by @mike-w-wilson in https://github.com/broadinstitute/gnomad_methods/pull/630
  • Add MANE Select filtering option to get_summary_counts by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/634
  • Add optional parameters to set_female_y_metrics_to_na_expr to use other frequency fields by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/635
  • Update resource paths by @klaricch in https://github.com/broadinstitute/gnomad_methods/pull/642

Other Changes

  • Update doc requirements.doc.txt by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/520
  • Bump requests from 2.28.2 to 2.31.0 in /docs by @dependabot in https://github.com/broadinstitute/gnomad_methods/pull/543
  • Add VEP 105 CSQ FIELDs by @KoalaQin in https://github.com/broadinstitute/gnomad_methods/pull/546
  • Update python 3.8 -> 3.11 by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/578
  • Add ability to retrieve max for any threshold in faf by @mike-w-wilson in https://github.com/broadinstitute/gnomad_methods/pull/616
  • Remove inadvertent tuple in popMaxFAF95 field in add_gks_va function by @mattsolo1 in https://github.com/broadinstitute/gnomad_methods/pull/621
  • Update requirements files by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/632
  • Update HGDP pops by @KoalaQin in https://github.com/broadinstitute/gnomad_methods/pull/631
  • Revert tuple type in build_models by @klaricch in https://github.com/broadinstitute/gnomad_methods/pull/638
  • Check for skip_coverage_model is False in build_models by @klaricch in https://github.com/broadinstitute/gnomad_methods/pull/639
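
The histogram-merge entry in this release (pull/572) combines histograms that share the same bin edges. The sketch below shows that idea in plain Python as a rough illustration: bin counts are summed position by position, and merging is refused when the edges differ. The Hist tuple and merge_hists_sketch are hypothetical names, not the gnomad function. The field names mirror the struct returned by Hail's hl.agg.hist, but nothing here calls Hail.

```python
from typing import List, NamedTuple, Sequence


class Hist(NamedTuple):
    """Plain-Python stand-in for a histogram struct."""

    bin_edges: List[float]
    bin_freq: List[int]
    n_smaller: int
    n_larger: int


def merge_hists_sketch(hists: Sequence[Hist]) -> Hist:
    """Sum histograms element-wise; all inputs must share identical bin edges."""
    if not hists:
        raise ValueError("At least one histogram is required")
    edges = hists[0].bin_edges
    for hist in hists[1:]:
        if hist.bin_edges != edges:
            raise ValueError("Histograms must have the same bin edges to be merged")
    return Hist(
        bin_edges=list(edges),
        # zip(*...) walks the bins in parallel so each bin is summed across inputs.
        bin_freq=[sum(counts) for counts in zip(*(h.bin_freq for h in hists))],
        n_smaller=sum(h.n_smaller for h in hists),
        n_larger=sum(h.n_larger for h in hists),
    )
```

Requiring identical edges keeps the merge a simple sum. Histograms with different binning would need re-binning first, which this sketch does not attempt.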

New Contributors

  • @KoalaQin made their first contribution in https://github.com/broadinstitute/gnomad_methods/pull/516
  • @jmarshall made their first contribution in https://github.com/broadinstitute/gnomad_methods/pull/576
  • @mattsolo1 made their first contribution in https://github.com/broadinstitute/gnomad_methods/pull/621

Full Changelog: https://github.com/broadinstitute/gnomad_methods/compare/v0.6.4...v0.7.0

- Python
Published by github-actions[bot] over 2 years ago

cpg-gnomad - v0.6.4

What's Changed

This release uses Hail 0.2.105

Bug fixes

  • Fix assign_population_pcs error when parameter pc_cols is a Hail ArrayExpression by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/503

Other Changes

  • Modifying assign_population_pcs to be more flexible by accepting an array expression in 'pc_cols' and adding a 'pc_expr' parameter instead of always using 'scores' by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/500
  • add .he to file extensions list in file_exists() by @averywpx in https://github.com/broadinstitute/gnomad_methods/pull/501
  • add generic constraint functions: build_models(), build_plateau_models_pop(), build_plateau_models_total(), build_coverage_model(), get_all_pop_lengths() by @averywpx in https://github.com/broadinstitute/gnomad_methods/pull/485

Full Changelog: https://github.com/broadinstitute/gnomad_methods/compare/v0.6.3...v0.6.4

- Python
Published by github-actions[bot] over 3 years ago

cpg-gnomad - v0.6.3

What's Changed

This release uses Hail 0.2.104

Breaking Changes

  • Change type of "pc_cols" param in ancestry function from hl.expr.ArrayExpression to List[int] to help track PCs that were used in RF model by @klaricch in https://github.com/broadinstitute/gnomad_methods/pull/448
  • Add additional_samples_to_drop option to `run_pca_with_relateds` by @klaricch in https://github.com/broadinstitute/gnomad_methods/pull/489

Bug fixes

  • Fix to only add the error_rate annotation if fit is not supplied to assign_population_pcs by @klaricch in https://github.com/broadinstitute/gnomad_methods/pull/453
  • Modify merge_sample_qc_expr to work with the additional VDS sample QC metrics: n_singleton_ti, n_singleton_tv, and r_ti_tv_singleton by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/454
  • Fix vep_or_lookup_vep to drop vep_proc_id if it exists by @konradjk in https://github.com/broadinstitute/gnomad_methods/pull/439
  • Fix to paths for VEP 101 resources in init script by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/488
  • Changed tqdm to SimpleRichProgressBar in file_utils by @ch-kr in https://github.com/broadinstitute/gnomad_methods/pull/495

New Features

  • Add an n_pcs option to run_platform_pca by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/468
  • Add n_partitions option to get_qc_mt before LD pruning by @klaricch in https://github.com/broadinstitute/gnomad_methods/pull/472
  • Add block_size option to get_qc_mt for LD pruning by @klaricch in https://github.com/broadinstitute/gnomad_methods/pull/473
  • Add gaussian_mixture_model_karyotype_assignment function to assign sex karyotype using Gaussian mixture models by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/478
  • Add variants_filter_lcr, variants_filter_segdup and variants_snv_only options to annotate_sex to filter variants prior to variant only ploidy imputation by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/479
  • Add an option compute_x_frac_variants_hom_alt to annotate_sex that computes the fraction of variants on chromosome X that are homozygous alternate per sample by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/480
  • Add generic constraint functions - annotate_mutation_type(), trimer_from_heptamer(), collapse_strand(), add_most_severe_csq_to_tc_within_vep_root() by @averywpx in https://github.com/broadinstitute/gnomad_methods/pull/474
  • Add more file types to file_exists for checking '_SUCCESS' by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/486
  • Add coverage_mt option to annotate_sex which takes an optional precomputed coverage MT to use for ploidy imputation instead of remaking it. by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/484
  • Add function get_chr_x_hom_alt_cutoffs, add arguments to infer_sex_karyotype and get_sex_expr to use the new function and its output. by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/492
  • Add bi_allelic_only and snv_only options to get_qc_mt by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/471
  • Add generic constraint functions: annotate_with_mu(), count_variants(), downsampling_counts_expr(), filter_vep_transcript_csqs(), combine_functions(), filter_x_nonpar(), and filter_y_nonpar() by @averywpx in https://github.com/broadinstitute/gnomad_methods/pull/481

Other Changes

  • Handle tags created through GitHub in publish release workflow by @nawatts in https://github.com/broadinstitute/gnomad_methods/pull/451
  • Change branch name in CI workflow configuration by @nawatts in https://github.com/broadinstitute/gnomad_methods/pull/452

New Contributors

  • @averywpx made their first contribution in https://github.com/broadinstitute/gnomad_methods/pull/474

Full Changelog: https://github.com/broadinstitute/gnomad_methods/compare/v0.6.2...v0.6.3

- Python
Published by github-actions[bot] over 3 years ago

cpg-gnomad - v0.6.2

What's Changed

New Features

  • Use Google Cloud Public Datasets as default source for public resources by @nawatts in https://github.com/broadinstitute/gnomad_methods/pull/431
  • Add options for reading public resources from Registry of Open Data on AWS and Azure Open Datasets by @nawatts in https://github.com/broadinstitute/gnomad_methods/pull/430
  • Allow setting the default source for public resources with an environment variable by @nawatts in https://github.com/broadinstitute/gnomad_methods/pull/435
  • Use hl.utils.guess_cloud_spark_provider to set default resources source by @nawatts in https://github.com/broadinstitute/gnomad_methods/pull/436
  • add checkpoint option to get_qc_mt by @klaricch in https://github.com/broadinstitute/gnomad_methods/pull/437
  • Modification to the annotate_sex pipeline to allow sex ploidy estimation using only variants instead of ref blocks by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/445
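
Several entries above concern choosing where public resources are read from, including a default that can be set through an environment variable. The sketch below illustrates that pattern in plain Python and is not the package's configuration API: the ResourceSource enum, the default_source function, and the variable name EXAMPLE_DEFAULT_PUBLIC_RESOURCE_SOURCE are all hypothetical. The three source names are taken from the entries above, and the fallback to Google Cloud Public Datasets follows the first entry in this list.

```python
import os
from enum import Enum
from typing import Mapping


class ResourceSource(Enum):
    """Hypothetical enumeration of the public resource sources named above."""

    GOOGLE_CLOUD_PUBLIC_DATASETS = "Google Cloud Public Datasets"
    REGISTRY_OF_OPEN_DATA_ON_AWS = "Registry of Open Data on AWS"
    AZURE_OPEN_DATASETS = "Azure Open Datasets"


# Hypothetical variable name, used only for this illustration.
ENV_VAR = "EXAMPLE_DEFAULT_PUBLIC_RESOURCE_SOURCE"


def default_source(environ: Mapping[str, str] = os.environ) -> ResourceSource:
    """Return the source named in the environment, else the built-in default."""
    raw = environ.get(ENV_VAR)
    if raw is None:
        return ResourceSource.GOOGLE_CLOUD_PUBLIC_DATASETS
    try:
        # Enum lookup by value raises ValueError for unknown source names.
        return ResourceSource(raw)
    except ValueError:
        valid = ", ".join(source.value for source in ResourceSource)
        raise ValueError(f"Unknown source {raw!r}; expected one of: {valid}") from None
```

Passing the environment in as a mapping keeps the function easy to exercise without mutating os.environ.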

Other Changes

  • Document selecting resource source by @nawatts in https://github.com/broadinstitute/gnomad_methods/pull/408
  • Add VEP 101 init by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/411
  • Small fix to docstrings for make_freq_index_dict() by @gtiao in https://github.com/broadinstitute/gnomad_methods/pull/412
  • Tiny fix to assign_population_pcs use of known label by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/413
  • Added option to get file stats for requester-pays files by @ch-kr in https://github.com/broadinstitute/gnomad_methods/pull/414
  • fix to faf description text by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/415
  • Update current gnomAD GRCh38 genome release v3.1.2 by @jkgoodrich in https://github.com/broadinstitute/gnomad_methods/pull/416
  • Update to new RouterAsyncFS interface in Hail 0.2.79 by @nawatts in https://github.com/broadinstitute/gnomad_methods/pull/425
  • add vds resource by @klaricch in https://github.com/broadinstitute/gnomad_methods/pull/423
  • Modified subset_samples_and_variants() by @wlu04 in https://github.com/broadinstitute/gnomad_methods/pull/421
  • Modified compute_stratified_sample_qc() by @wlu04 in https://github.com/broadinstitute/gnomad_methods/pull/420
  • Modified annotate_sex() by @wlu04 in https://github.com/broadinstitute/gnomad_methods/pull/427

New Contributors

  • @klaricch made their first contribution in https://github.com/broadinstitute/gnomad_methods/pull/423
  • @wlu04 made their first contribution in https://github.com/broadinstitute/gnomad_methods/pull/421

Full Changelog: https://github.com/broadinstitute/gnomad_methods/compare/v0.6.0...v0.6.2

- Python
Published by ch-kr almost 4 years ago

cpg-gnomad - v0.6.1

  • Update for new RouterAsyncFS import/interface in recent Hail versions (55214e8)
  • Fix assign_population_pcs's use of known population label (9c8f089)

- Python
Published by nawatts about 4 years ago

cpg-gnomad - v0.6.0

Released September 3rd, 2021

All resources have been moved to a requester pays bucket.

Fixed

  • Fix annotation_type_is_numeric and annotation_type_in_vcf_info (#379)

Changed

  • VersionedResource objects are no longer subclasses of BaseResource (#359)
  • gnomAD resources can now be imported from different sources (#373)
  • Replaced ht_to_vcf_mt with adjust_vcf_incompatible_types which maintains all functionality except turning the ht into a mt because it is no longer needed for use of the Hail module export_vcf (#365)
  • Modified SEXES in utils/vcf to be 'XX' and 'XY' instead of 'female' and 'male' (#381)
  • Changed module sanity_checks to validity_checks, modified functions generic_field_check, make_filters_expr_dict (previously make_filters_sanity_check_expr), and make_group_sum_expr_dict (previously sample_sum_check) (#395)

Added

  • Added function region_flag_expr to flag problematic regions (#349)
  • Added function missing_callstats_expr to create a Hail Struct with missing values that is inserted into frequency annotation arrays when data is missing (#349)
  • Added function set_female_y_metrics_to_na_expr to set Y-variant frequency callstats for female-specific metrics to missing (#349)
  • Added function make_faf_index_dict to create a look-up Dictionary for entries contained in the filter allele frequency annotation array (#349)
  • Added function make_freq_index_dict to create a look-up Dictionary for entries contained in the frequency annotation array (#349)
  • Added function remove_fields_from_constant to remove fields from a list and notify which requested fields to remove were missing (#381)
  • Added function create_label_groups to generate a list of label group dictionaries needed to populate the info dictionary for vcf export (#381)
  • Added function build_vcf_export_reference to create a subset reference based on an existing reference genome (#381)
  • Added function rekey_new_reference to re-key a Table or MatrixTable with a new reference genome (#381)
  • Added function parallel_file_exists to check whether a large number of files exist (#394)
  • Added functions summarize_variant_filters, generic_field_check_loop, compare_subset_freqs, sum_group_callstats, summarize_variants, check_raw_and_adj_callstats, check_sex_chr_metrics, compute_missingness, vcf_field_check, and validate_release_t (#395)
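
Checking whether a large number of files exist, as parallel_file_exists is described as doing above, is mostly I/O bound, so the individual checks can run concurrently. The following is a minimal sketch of that idea for local paths only, using a thread pool. The helper name `files_exist_concurrently` and its `max_workers` parameter are hypothetical, and the sketch does not reproduce the library's implementation or its cloud-storage support.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, Iterable


def files_exist_concurrently(
    paths: Iterable[str], max_workers: int = 8
) -> Dict[str, bool]:
    """Map each local path to whether it exists (hypothetical helper)."""
    path_list = list(paths)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map preserves input order, so results line up with path_list.
        results = list(pool.map(os.path.exists, path_list))
    return dict(zip(path_list, results))
```

The returned dictionary makes it easy to list only the missing paths, for example `[p for p, ok in result.items() if not ok]`.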

- Python
Published by nawatts about 4 years ago

cpg-gnomad - v0.5.0

Released April 22nd, 2021

Fixed

  • Fix for error in generate_trio_stats_expr that led to an incorrect untransmitted count. (#238)
  • Fix for error in compute_quantile_bin that caused incorrect binning when a single score overlapped multiple bins (#238)
  • Fixed create_binned_ht, which produced a "Cannot combine expressions from different source objects" error (#238)
  • Fixed handling of missing entries (not within a ref block / alt site) when computing coverage_stats in sparse_mt.py (#242)
  • Fix for error in compute_stratified_sample_qc where gt_expr caused error (#259)
  • Fix for error in default_lift_data caused by missing results field in new_locus (#270)
  • Fix to dbSNP b154 resource (resources.grch38.reference_data) import to allow for multiple rsIDs per variant (#345)
  • Fix to set_female_metrics_to_na to correctly update chrY metrics to be missing (#347)
  • Fixed available versions for gnomAD v2 coverage and liftover resources (#352)
  • Removed side effect of accessing gnomAD v2 coverage and liftover exome resources that would edit available versions for other resources (#352)
  • Use overwrite argument for importing a BlockMatrixResource (#342)

Changed

  • Removed assumption of snv annotation from compute_quantile_bin. (#238)
  • Modified compute_binned_truth_sample_concordance to handle additional binning for subsets of variants. (#240)
  • Updated liftover functions to be more generic (#246)
  • Changed quality histograms to label histograms calculated on raw and not adj data (#247)
  • Updated some VCF export constants (#249)
  • Changed default DP threshold to 5 for hemi genotype calls in annotate_adj and get_adj_expr (#252)
  • Updated coverage resources to version 3.0.1 (#242)
  • Update to compute_last_ref_block_end, removing assumption that sparse MatrixTables are keyed only by locus by default (#279)
  • Update generic_field_check to have option to show percentage of sites that fail checks. (#284)
  • Modified vep_or_lookup_vep to support the use of different VEP versions (#282)
  • Modified create_truth_sample_ht to add adj annotation information in the returned Table if present in the supplied MatrixTables (#300)
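
The lower DP threshold for hemizygous calls noted in the Changed list above can be illustrated with a plain-Python sketch of an "adj" (high-quality genotype) check. Only the haploid DP cutoff of 5 comes from these notes. The other defaults (GQ of at least 20, DP of at least 10, allele balance of at least 0.2 for heterozygous calls) are assumptions based on the commonly documented gnomAD adj criteria, and the function `is_adj` is hypothetical rather than the library's Hail expression.

```python
from typing import Optional


def is_adj(
    gq: int,
    dp: int,
    ab: Optional[float],
    is_het: bool,
    is_haploid: bool,
    min_gq: int = 20,          # assumed default
    min_dp: int = 10,          # assumed default
    min_ab: float = 0.2,       # assumed default
    min_haploid_dp: int = 5,   # threshold stated in the notes above
) -> bool:
    """Return True if a genotype passes the sketched adj filters."""
    # Haploid (hemizygous) calls are held to the lower depth cutoff.
    dp_cutoff = min_haploid_dp if is_haploid else min_dp
    if gq < min_gq or dp < dp_cutoff:
        return False
    if is_het:
        # Heterozygous calls additionally need sufficient allele balance.
        return ab is not None and ab >= min_ab
    return True
```

Under these assumptions, a hemizygous call with depth 6 passes, while a diploid call with the same depth does not.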

Added

  • Added constants and functions relevant to VCF export (#241)
  • Add reference genome to call of has_liftover in get_liftover_genome (#259)
  • Added fix for MQ calculation in _get_info_agg_expr, switched RAW_MQ and MQ_DP in calculation (#262)
  • Add importable method for filtering clinvar to pathogenic sites (#257)
  • Added common variant QC functions get_rf_runs and get_run_data to random_forest.py (#278)
  • Add calculation for the strand odds ratio (SOR) to get_site_info_expr and get_as_info_expr (#281)
  • Added VEPed context HT to resource files and included support for versioning (#282)
  • Added code to generate summary statistics (total number of variants, number of LoF variants, LOFTEE summaries) (#285)
  • Added additional counts to summary statistics (added autosome/sex chromosome counts, allele counts, counts for missense and synonymous variants) (#289)
  • Added function, default_generate_gene_lof_matrix, to generate gene matrix (#290)
  • Added function default_generate_gene_lof_summary to summarize gene matrix results (#292)
  • Add resource for v3.1.1 release (#364)
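
For readers unfamiliar with the strand odds ratio (SOR) mentioned in the Added list above, here is a minimal plain-Python sketch of the formula as published in the GATK StrandOddsRatio documentation. It is not the Hail expression used by get_site_info_expr or get_as_info_expr, and the function name and argument order are hypothetical.

```python
import math


def strand_odds_ratio(ref_fw: int, ref_rev: int, alt_fw: int, alt_rev: int) -> float:
    """Compute SOR from forward/reverse read counts for ref and alt alleles."""
    # A pseudocount of 1 on every cell avoids division by zero.
    ref_fw, ref_rev, alt_fw, alt_rev = (
        x + 1.0 for x in (ref_fw, ref_rev, alt_fw, alt_rev)
    )
    ratio = (ref_fw * alt_rev) / (alt_fw * ref_rev)
    # Symmetrize so bias toward either strand is penalized equally.
    symmetric_ratio = ratio + 1.0 / ratio
    ref_ratio = min(ref_fw, ref_rev) / max(ref_fw, ref_rev)
    alt_ratio = min(alt_fw, alt_rev) / max(alt_fw, alt_rev)
    return math.log(symmetric_ratio) + math.log(ref_ratio) - math.log(alt_ratio)
```

With perfectly balanced counts the value is ln(2), and it grows as the alternate allele becomes concentrated on one strand.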

Removed

  • Removed rep_on_read; this function is no longer necessary, as MatrixTables/Tables can be repartitioned on read with _n_partitions, which was added by a Hail update (#283)
  • Removed compute_quantile_bin and added compute_ranked_bin as an alternative that provides more even binning. This is now used by create_binned_ht instead. (#288)
  • Removed prefix parameter from make_combo_header_text, as this was only used to check if samples were from gnomAD (#348)
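
The move from quantile-based to rank-based binning noted above can be sketched in plain Python. Binning by rank gives every variant a distinct position even when scores are tied, so bin sizes stay near equal. This is an illustration under assumptions, not the library's compute_ranked_bin: the function name `ranked_bins`, the 1-based bin numbering, and placing the lowest score in bin 1 are all hypothetical choices.

```python
from typing import List, Sequence


def ranked_bins(scores: Sequence[float], n_bins: int) -> List[int]:
    """Assign each score a 1-based bin from its rank (hypothetical helper)."""
    total = len(scores)
    # sorted() is stable, so tied scores keep their input order and
    # still receive distinct ranks.
    order = sorted(range(total), key=lambda i: scores[i])
    bins = [0] * total
    for rank, idx in enumerate(order):
        bins[idx] = (rank * n_bins) // total + 1
    return bins
```

Ten identical scores split into five bins still yield two entries per bin, whereas a quantile cutoff would place every tied score in the same bin.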

- Python
Published by nawatts about 4 years ago

cpg-gnomad - v0.4.0

Released July 9th, 2020

Note: gnomAD resources have been moved to a requester pays bucket. Dataproc clusters must be configured to allow reading from it.
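
As one possible way to satisfy the requester pays requirement, the Google Cloud Storage Hadoop connector reads `fs.gs.requester.pays.*` properties, which Spark accepts with a `spark.hadoop.` prefix. The sketch below only builds that property dictionary. The helper name, the project ID, and the bucket name are hypothetical placeholders, and how the properties are supplied to a cluster depends on your Hail and Dataproc versions.

```python
from typing import Dict, Optional, Sequence


def requester_pays_spark_conf(
    project: str, buckets: Optional[Sequence[str]] = None
) -> Dict[str, str]:
    """Build Spark properties for reading requester pays buckets (hypothetical helper)."""
    conf = {"spark.hadoop.fs.gs.requester.pays.project.id": project}
    if buckets:
        # CUSTOM mode bills the project only for the listed buckets.
        conf["spark.hadoop.fs.gs.requester.pays.mode"] = "CUSTOM"
        conf["spark.hadoop.fs.gs.requester.pays.buckets"] = ",".join(buckets)
    else:
        # AUTO mode bills the project for any bucket that requires it.
        conf["spark.hadoop.fs.gs.requester.pays.mode"] = "AUTO"
    return conf


# Placeholder values; substitute your own billing project and bucket.
example_conf = requester_pays_spark_conf("my-billing-project", ["example-bucket"])
```

Restricting the configuration to named buckets limits the risk of unexpected charges from other requester pays buckets.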

  • Added VEP_CSQ_HEADER to generate vep description necessary for VCF export. (#230)
  • Modified variant QC pipeline functions generate_trio_stats and generate_sib_stats to add filter parameter for autosomes and bi-allelic sites (#223)
  • score_bin_agg now requires additional annotations ac and ac_qc_samples_unrelated_raw and no longer needs tdt (#223)
  • Changed score_bin_agg to use ac_qc_samples_unrelated_raw annotation instead of unrelated_qc_callstats (#223)
  • Added singleton de novo counts to variant QC pipeline function score_bin_agg (#223)
  • Modified filter_mt_to_trios to no longer filter to autosomes as this should be handled during the variant QC pipeline (#223)
  • Updated annotate_sex to add globals to sex_ht (#227)
  • Document slack_notifications function (#228)
  • Added median_impute_features to variant QC random forest module (#224)
  • Created training.py in variant QC and added sample_training_examples (#224)
  • Added variant QC pipeline function train_rf_model (#224)
  • Use local copy of VEP config instead of reading from bucket (#231)
  • Updated gnomAD resources paths for hail tables to requester pays buckets (#233)

- Python
Published by nawatts about 4 years ago

cpg-gnomad - v0.3.0

Released April 28th, 2020

  • Updated capitalization of ambiguous sex annotation (#208)
  • Updated usage of included intervals in imputing sex ploidy, also updated interval parameter names (#209)
  • Updated capitalization in relatedness constants (#217)
  • Changed interface for Slack notifications (#219)

- Python
Published by nawatts about 4 years ago

cpg-gnomad - v0.2.0

Released April 3rd, 2020

Added

  • Function to subset a MatrixTable based on a list of samples (#196)
  • Function to get file size and MD5 hash (#186)
  • Developer documentation (#185)
  • Include RAW_MQ and AS_VQSLOD metrics in get_annotations_hists (#181)
  • Functions to compute coverage stats from sparse MT (#173)
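
Getting a file's size and MD5 hash, as in the second item of the Added list above, can be sketched for a local file with the standard library. The function name `file_size_and_md5` and the hex-digest return format are assumptions for illustration; the library's own function may differ in name, return format, and support for cloud storage paths.

```python
import hashlib
import os
from typing import Tuple


def file_size_and_md5(path: str, chunk_size: int = 1 << 20) -> Tuple[int, str]:
    """Return (size in bytes, hex MD5 digest) for a local file (hypothetical helper)."""
    md5 = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
    return os.path.getsize(path), md5.hexdigest()
```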

Changed

  • Repo restructured - imports may need to be updated (#207)
  • Make some arguments to get_qc_mt optional (#200)
  • Fetch VEP configuration from new Hail requester pays buckets (#197)
  • Hail must be installed separately (#194)

Fixed

  • Father/mother assignments in infer_families are now correct (they were previously swapped) (#203)
  • Attribute assignments for VersionedPedigreeResource (#198)
  • Field references in get_annotations_hists (#181)
  • Use-before-assignment error in default_compute_info (#195)

- Python
Published by nawatts about 4 years ago

cpg-gnomad - v0.1.0

Released March 4th, 2020

Initial release

- Python
Published by nawatts about 4 years ago