Recent Releases of https://github.com/clinical-genomics/chanjo2

https://github.com/clinical-genomics/chanjo2 - Support Google OAuth via ID token

[3.8]

Changed

  • In order to support token login via Google, validate id_token instead of access_token in overview and report endpoints.

- Python
Published by northwestwitch 11 months ago

https://github.com/clinical-genomics/chanjo2 - Access token as form field + documentation

[3.7]

Added

  • Optional Authorization protection on overview and report endpoints. When env variables JWKS_URL and AUDIENCE are specified, access_token can also be collected from the request form, under the access_token key
  • Documentation on how to change the app settings to enforce authorised requests
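
The token locations mentioned in these notes (request header, cookie, form field) could be checked one after the other. The following is a minimal sketch using plain dictionaries instead of real request objects. The function name `find_access_token` and the lookup order are assumptions made for illustration, not chanjo2's actual implementation.

```python
from typing import Mapping, Optional


def find_access_token(
    headers: Mapping[str, str],
    cookies: Mapping[str, str],
    form: Mapping[str, str],
) -> Optional[str]:
    """Return the first token found in header, cookie or form, else None."""
    # Header form: {"Authorization": "Bearer <token>"}
    scheme, _, value = headers.get("Authorization", "").partition(" ")
    if scheme.lower() == "bearer" and value.strip():
        return value.strip()
    # Cookie and form field are both assumed to use the access_token key
    for source in (cookies, form):
        token = source.get("access_token")
        if token:
            return token
    return None
```

A token found this way would still need to be validated against the keys served at JWKS_URL and the expected AUDIENCE before the request is accepted.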

- Python
Published by northwestwitch about 1 year ago

https://github.com/clinical-genomics/chanjo2 - Optional auth protection added to report, overview, and coverage endpoints

[3.6]

Added

  • Optional Authorization protection on overview and report endpoints. When env variables JWKS_URL and AUDIENCE are specified, access_token will be collected from request cookies
  • Optional Authorization protection on coverage endpoints. When env variables JWKS_URL and AUDIENCE are specified, access_token will be collected from request headers {"Authorization": "Bearer <token>"}
  • Tests for endpoints protected by Authorization
  • A test for the meta/handle_coverage_stats/get_chromosomes_prefix function

Fixed

  • Bump h11 from 0.14.0 to 0.16.0 (fixes: h11 accepts some malformed Chunked-Encoding bodies)
  • Bump requests from 2.32.3 to 2.32.4 (fixes: Requests vulnerable to .netrc credentials leak via malicious URLs)
  • Refactored the meta/handle_coverage_stats/get_chromosomes_prefix function to prevent vulnerability to command injection attacks
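
A common way to prevent command injection when calling an external tool is to pass the arguments as a list and never go through a shell. The sketch below illustrates that general technique only: it does not reproduce the refactored get_chromosomes_prefix function, and it uses the Python interpreter as a stand-in command instead of d4tools. The helper name `echo_argument` is hypothetical.

```python
import subprocess
import sys


def echo_argument(argument: str) -> str:
    """Pass an untrusted string to a child process as a single argument."""
    # List form with the default shell=False: no shell parses the string,
    # so characters such as ';' or '|' are never interpreted as commands.
    command = [sys.executable, "-c", "import sys; print(sys.argv[1])", argument]
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout.strip()


if __name__ == "__main__":
    # The suspicious file name arrives in the child process unchanged
    print(echo_argument("sample.d4; echo injected"))
```

With a shell-based call such as `subprocess.run(f"tool {path}", shell=True)`, the same file name would have been split at the semicolon and the second part executed.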

- Python
Published by northwestwitch about 1 year ago

https://github.com/clinical-genomics/chanjo2 - Fix d4_interval_coverage endpoint crashing with provided MT chromosome and chrM d4 files

[3.5.1]

Fixed

  • d4_interval_coverage endpoint crashing when computing stats over the MT interval, when d4 file contains chrM chromosome (Nallo pipeline)
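
The mismatch behind this fix (a request for MT against a file that names the contig chrM) can be handled by mapping the requested chromosome onto the names present in the file. This is only a sketch of that idea: the function name, the alias set and the fallback rules are assumptions, not the code used in chanjo2.

```python
from typing import Iterable

# Names assumed to denote the mitochondrial genome in different references
MITO_ALIASES = {"M", "MT", "chrM", "chrMT"}


def match_chromosome_name(requested: str, file_chromosomes: Iterable[str]) -> str:
    """Return the name used in the file for the requested chromosome."""
    available = list(file_chromosomes)
    if requested in available:
        return requested
    # Mitochondrial genome: accept any alias present in the file
    if requested in MITO_ALIASES:
        for name in available:
            if name in MITO_ALIASES:
                return name
    # Other chromosomes: try the name with and without the "chr" prefix
    bare = requested[3:] if requested.startswith("chr") else requested
    for candidate in (bare, f"chr{bare}"):
        if candidate in available:
            return candidate
    raise ValueError(f"Chromosome {requested!r} not found in file")
```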

- Python
Published by northwestwitch about 1 year ago

https://github.com/clinical-genomics/chanjo2 - Stricter form validation checks, ordered overview genes and fixed vulnerabilities

[3.5]

Added

  • New gene overview demo endpoint (/gene_overview/demo)

Changed

  • Introduced a validation to make sure d4 files exist on disk when creating reports
  • On genes overview, incomplete transcripts, sort genes by their symbol

Fixed

  • Gunicorn HTTP Request/Response Smuggling vulnerability by updating gunicorn (22 -> 23.0.0)
  • Jinja2 vulnerabilities by updating jinja2 (3.1.4 -> 3.1.6)
  • Missing images on loading intervals documentation file
  • Missing file error when d4 file is provided as an HTTP resource to the coverage endpoints /coverage/d4/interval and /coverage/d4/interval_file and /coverage/samples/predicted_sex
  • App crashing when trying to compute completeness stats for d4 files over HTTP

- Python
Published by northwestwitch about 1 year ago

https://github.com/clinical-genomics/chanjo2 - Updated docs and ignore chromosome separators when parsing resource files

[3.4]

Changed

  • General library updates
  • Remove the schug library dependency

Fixed

  • Documentation on how to update genes/transcripts/exons
  • Installing d4tools in Tests & Coverage GitHub action
  • Ignore [success] lines used as separators between chromosome in the latest genes/transcripts/exons file downloaded using schug
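
Skipping the [success] separator lines while reading a resource file could look like the sketch below. It assumes tab-separated rows; the function name `iter_resource_rows` and the sample columns in the test data are hypothetical.

```python
from typing import Iterable, Iterator, List


def iter_resource_rows(
    lines: Iterable[str], separator: str = "[success]"
) -> Iterator[List[str]]:
    """Yield tab-separated rows, skipping blank and separator lines."""
    for line in lines:
        row = line.rstrip("\r\n")
        # Separator lines mark the boundary between chromosomes; they carry no data
        if not row.strip() or row.strip() == separator:
            continue
        yield row.split("\t")
```

Because the function accepts any iterable of strings, it works on an open file handle as well as on a list of lines held in memory.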

- Python
Published by northwestwitch about 1 year ago

https://github.com/clinical-genomics/chanjo2 - Additional fixes for sample with chromosomes prefixed by 'chr'

[3.3.1]

Fixed

  • Sex check and general report for samples with chromosomes prefixed by "chr"
  • MANE report and gene overview report for samples with chromosomes prefixed by "chr"
  • Convert the CMD command in the Dockerfile to JSON array form, to prevent unintended behavior related to OS signals

- Python
Published by northwestwitch over 1 year ago

https://github.com/clinical-genomics/chanjo2 - Stats for d4 files with "chr" suffix, database updates in background & other

[3.3]

Added

  • Max-level provenance and Software Bill Of Materials (SBOM) to the Docker images pushed to Docker Hub

Changed

  • Database updates now running in background
  • Updated several libraries via poetry

Fixed

  • Fix computing stats for d4 files with chromosomes containing the chr suffix

- Python
Published by northwestwitch over 1 year ago

https://github.com/clinical-genomics/chanjo2 - Database update refactoring and "intervals_count_by_build" endpoint

[3.2]

Added

  • /intervals/intervals_count_by_build endpoint, which returns the number of genes, transcripts and exons for each genome build

Changed

  • Updated schug library to v1.10
  • Disable SQLAlchemy logger
  • To avoid timeout errors, update genes, transcripts and exons only from files pre-downloaded with schug
  • Documentation on how to update genes, transcripts and exons database tables
  • Renamed Pydantic orm_mode config param to model_config

Removed

  • Code for downloading resources from Ensembl using the schug library

- Python
Published by northwestwitch over 1 year ago

https://github.com/clinical-genomics/chanjo2 - Retry logic to failed schug downloads and updates

[3.1]

Changed

  • Updated pydantic and fastapi libs
  • Updated schug library to v1.9 (contains failed downloads retry logic)

- Python
Published by northwestwitch over 1 year ago

https://github.com/clinical-genomics/chanjo2 - Updated schug library to patched version 1.8

[3.0.1]

Fixed

  • Updated schug library to v1.8 (contains bug fixes)

- Python
Published by northwestwitch over 1 year ago

https://github.com/clinical-genomics/chanjo2 - Database schema change and other fixes

[3.0]

Changed

  • Updated several libraries including schug (now v1.7)
  • Update project's Python version to 3.9
  • BREAKING CHANGE: modified the structure of the database table genes, converting the ensembl_id string field to ensembl_ids: an array of strings. This change addresses recent changes in MySQL: https://bugs.mysql.com/bug.php?id=114838

Fixed

  • The MariaDB healthcheck step in docker-compose-mysql.yml, which prevented the demo app from starting

- Python
Published by northwestwitch over 1 year ago

https://github.com/clinical-genomics/chanjo2 - Library updates, show complete intervals info and other

[2.1]

Added

  • RefSeq transcript names on coverage overview page

Changed

  • Replaced custom badges style with Bootstrap 5 badges
  • Return complete gene, transcripts and exons info in intervals endpoints

Fixed

  • Addressed the Starlette Denial of service (DoS) via multipart/form-data by updating starlette library, among others

- Python
Published by northwestwitch over 1 year ago

https://github.com/clinical-genomics/chanjo2 - Major release 2.0

[2.0]

Added

  • Improve report explanation to better interpret average coverage and coverage completeness stats shown on the coverage report
  • Check that provided d4 files when running queries using /coverage/d4/genes/summary endpoint are valid, with test
  • General report with coverage over the entire genome when no genes or genes panels are provided
  • A MANE coverage report, showing coverage and coverage completeness only on MANE transcripts for the provided list of genes
  • Link out from MANE overview to gene overview
  • Save ensembl_transcript_id and exon rank info on exons database records
  • Display MANE badges on gene overview report
  • Create PDF button on MANE overview and gene overview pages
  • Documentation on how to create MANE overview reports

Changed

  • Do not use stored cases/samples any more and run stats exclusively on d4 files paths provided by the user in real time
  • How parameters are passed to starlette.templating since it was raising a deprecation warning.
  • Replaced deprecated Pydantic parse_obj method with model_validate
  • Report and genes overview endpoints accept only POST requests with form data now (application/x-www-form-urlencoded) - no json
  • Sort alphabetically the list of genes that are incompletely covered on report page
  • d4_genes_condensed_summary coverage endpoint will not convert nan or inf coverage values to None, but to str(value)
  • Updated the Dockerfile base image so it contains the latest d4tools (master branch)
  • Updated tests workflow to cargo install the latest d4tools from git (master branch)
  • Computing coverage completeness stats using d4tools perc_cov stat function (much quicker reports)
  • Moved functions computing the coverage stats to a separate meta/handle_coverage_stats.py module
  • Refactored code collecting stats shown on gene overview report
  • Gene report to contain both transcripts and exons stats

Fixed

  • Updated dependencies including certifi to address dependabot alert
  • Update pytest to v.7.4.4 to address a ReDoS vulnerability
  • Colored logs
  • Link for switching between coverage thresholds on overview report
  • Gene links in genes overview page open into new tabs
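
The change to d4_genes_condensed_summary in this release (nan and inf coverage values returned as str(value) rather than None) can be illustrated with a small helper. This is a sketch of the described behaviour only; the name `json_safe_coverage` is hypothetical and the real endpoint code may differ.

```python
import json
import math
from typing import Union


def json_safe_coverage(value: float) -> Union[float, str]:
    """Return finite values unchanged and non-finite ones as strings."""
    # nan and inf are not valid JSON numbers, so they are sent as text
    if math.isnan(value) or math.isinf(value):
        return str(value)
    return value


if __name__ == "__main__":
    stats = {"mean_coverage": json_safe_coverage(float("nan"))}
    print(json.dumps(stats))
```

Returning the string keeps the distinction between "not a number" and "infinite" visible to the client, which a plain None would hide.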

- Python
Published by northwestwitch over 1 year ago

https://github.com/clinical-genomics/chanjo2 - Update d4tools to v0.3.10 and new condensed summary endpoint

[1.9]

Added

  • Condensed /coverage/d4/genes/summary for condensed stats over a gene list
  • Documentation for new coverage summary endpoint

Changed

  • GitHub tests action to use d4tools 0.3.10

Fixed

  • Updated dependencies to address dependabot's security alerts
  • Use a base image containing d4tools 0.3.10 in Dockerfile

- Python
Published by northwestwitch about 2 years ago

https://github.com/clinical-genomics/chanjo2 - Updated actions and templates, added cryptography dependency

[1.8]

Added

  • cryptography lib dependency

Changed

  • Updated PR template
  • Generalised issue templates to make them more user-friendly for people outside our organisation
  • Moved logging setup out of app lifespan and db initialisation logic

Fixed

  • Updated version of external images used in GitHub actions

- Python
Published by northwestwitch about 2 years ago

https://github.com/clinical-genomics/chanjo2 - Removed pyd4 lib and fixed coverage report template

[1.7]

Added

  • An environment.yml with the minimum supported python version (3.8) and the installed libs

Changed

  • pyd4 library no longer available in chanjo2 Docker image

Fixed

  • Position of Show genes checkbox on report page
  • Updating gene panel name using the web form on report page

- Python
Published by northwestwitch about 2 years ago

https://github.com/clinical-genomics/chanjo2 - Report endpoints accepting form data and other improvements

[1.6]

Added

  • Coverage report and genes coverage overview endpoints now accept also requests with application/x-www-form-urlencoded data
  • Allow system admin to customise coverage levels to be used in reports' metrics by editing REPORT_COVERAGE_LEVELS in the .env file
  • Documentation on how to change app's default coverage level values to be used when creating the reports

Changed

  • Templates form submit data as application/x-www-form-urlencoded without having to transform it into json
  • Customize form on report page now accepts genes as Ensembl IDs or HGNC symbols

Fixed

  • Faster genes overview report loading
  • Broken GitHub action due to d4tools failing to install using cargo
  • Broken Codecov upload step in GitHub action failing due to missing token
  • Completeness cutoff select not updating after submitting customize form on report page
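
Reading custom coverage levels from the environment could be done along the lines of the sketch below. It rests on several assumptions: that the variable is spelled REPORT_COVERAGE_LEVELS, that its value is a JSON list of positive integers, and that the fallback values shown are acceptable defaults. None of these details are confirmed by the release notes.

```python
import json
import os
from typing import List, Mapping

# Hypothetical fallback used when the variable is not set
DEFAULT_LEVELS = [10, 15, 20, 50, 100]


def read_coverage_levels(
    env: Mapping[str, str] = os.environ, key: str = "REPORT_COVERAGE_LEVELS"
) -> List[int]:
    """Parse coverage levels from an environment mapping, or use defaults."""
    raw = env.get(key)
    if raw is None:
        return list(DEFAULT_LEVELS)
    levels = json.loads(raw)
    # Reject anything that is not a list of positive integers
    if not isinstance(levels, list) or not all(
        isinstance(level, int) and level > 0 for level in levels
    ):
        raise ValueError(f"{key} must be a JSON list of positive integers")
    return sorted(set(levels))
```

Passing the environment as a parameter keeps the function easy to test without touching the real process environment.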

- Python
Published by northwestwitch about 2 years ago

https://github.com/clinical-genomics/chanjo2 - Fix "Server has gone away" and other things

[1.5.1]

Fixed

  • Avoid MySQLdb.OperationalError "Server has gone away" by setting pool_pre_ping=True when creating the engine
  • Coverage report screenshot displayed on README page and in the documentation to reflect true statistics from the demo samples
  • Coverage report/overview page crashing when transcripts or exons intervals are required but only genes are loaded
  • Coverage overview over a gene should return transcript statistics if D4 file contains WGS data

- Python
Published by northwestwitch about 2 years ago

https://github.com/clinical-genomics/chanjo2 - Switch from pyd4 library to d4tools calls and fixes

[1.5]

Added

  • coverage.d4_intervals_coverage responses also contain the interval name as provided in the bed file
  • coverage.d4_interval_coverage responses now also return the genomic region used to compute the stats
  • Test for modified function collecting coverage report data

Changed

  • Speed up response by coverage.d4_intervals_coverage by replacing pyd4 lib with direct calls to d4tools and multiprocessing
  • Removed 2 redundant functions in meta.handle.bed.py
  • coverage.d4_interval_coverage is using direct calls to d4tools to retrieve stats over an entire chromosome or a genomic interval
  • Reformat the report's sample sex rows and the coverage.get_samples_predicted_sex endpoint to use d4tools and not pyd4 for evaluating sample sex
  • Refactored code to create coverage report and genes overview report to be faster by using d4tools calls and multiprocessing
  • Renamed handle_tasks.py to handle_completeness_tasks.py
  • Refactored coverage endpoints samples_genes_coverage, samples_transcripts_coverage and samples_exons_coverage to use calls to d4tools instead of the pyd4 library
  • Speed up coverage report creation by collecting SQL intervals before looping through samples stats
  • Refactored gene overview to use d4tools instead of pyd4 lib to compute gene-level intervals stats
  • Removed pyd4 lib and all remaining code which was still using it

Fixed

  • coverage.d4_interval_coverage endpoint crashing when trying to compute coverage completeness over an entire chromosome
  • Samples mean coverage values a hundredfold higher on coverage reports
  • Install software packages using poetry v<1.8 to avoid problems installing pyd4 (pyd4 not supporting PEP 517 builds)
  • Typo in report template with unclosed span/div causing cramped genes not found message
  • MariaDB container not passing healthcheck when run from demo docker-compose file
  • Fixed an error on gene overview that made the coverage seem a hundredfold higher
  • Return error when genes are not provided in the request form to create a coverage report

- Python
Published by northwestwitch over 2 years ago

https://github.com/clinical-genomics/chanjo2 - Upgraded Bootstrap and JQuery libs used in report templates

[1.4]

Changed

  • Upgraded Bootstrap and JQuery versions on genes coverage overview and coverage report pages

- Python
Published by northwestwitch over 2 years ago

https://github.com/clinical-genomics/chanjo2 - Pydantic2 and code improvements

[1.3]

Changed

  • Simplified the import of SQL classes from scout.models.sql_model
  • Upgraded Pydantic and Fastapi libraries and their dependencies
  • Use app lifespan instead of deprecated startup `on_event`.
  • Modified code to support upgraded libraries
  • Rename a test file from test_d4.py to test_handle_d4.py and add 2 new tests to it
  • Fix return type of get_intervals_completeness function
  • Isort imports of the entire repo

- Python
Published by northwestwitch over 2 years ago

https://github.com/clinical-genomics/chanjo2 - Genes coverage report and overview

[1.2]

Added

  • Load genes, transcripts and exons from pre-downloaded files
  • Demo coverage report endpoint in new report module
  • Coverage completeness lines in HTML coverage report
  • Default threshold level coverage lines in HTML coverage report
  • Hidden table cell showing incompletely covered genes in coverage report
  • Display optional case name on gene coverage report
  • Display error in coverage report when query genes are not found in the database
  • Export coverage report to PDF
  • Metrics explanation section on coverage report
  • Non-demo coverage report endpoint
  • Fixed coverage report filters to update report using other settings
  • Demo and non-demo genes coverage overview endpoint
  • Incomplete intervals at different coverage thresholds on genes overview page
  • Gene coverage overview endpoint
  • Documentation to create custom genes coverage and coverage overview reports

Changed

  • Moved helper function from endpoints coverage to crud samples
  • Deleted unused src/chanjo2/meta/handle_query_intervals.py file
  • Non-root user and password for database connections
  • Refactor and simplify code in meta.handle_d4 module
  • Fixed documentation according to changed coverage API
  • Removed unused sqlmodel and updated some other dependencies
  • Show only RefSeq transcripts in coverage report and overview
  • Refactored HTML templates to reduce repetitions by inheriting code from base template
  • Coverage report form to accept genes as Ensembl IDS, HGNC IDs and HGNC symbols
  • Moved coverage report "show genes" outside form and just above custom coverage stats table
  • Updated several Python libraries including schug

Fixed

  • Bump certifi from 2022.12.7 to 2023.7.22
  • Database connection parameters in documentation files
  • Avoid duplications when retrieving transcripts and exons in gene
  • Add upgrade-insecure-requests meta to HTML page to be able to use javascript fetch in requests
  • Renamed imported function from schug 1.3

- Python
Published by northwestwitch over 2 years ago

https://github.com/clinical-genomics/chanjo2 - Coverage completeness from the d4 file coverage endpoints and woke action

[1.1.0]

Added

  • Created a woke-language-check GitHub action
  • /coverage/d4/interval/ and /coverage/d4/interval_file/ modified to accept POST requests with new completeness_thresholds parameter

Changed

  • Modified documentation pages to reflect changes in the /coverage/d4/interval/ and /coverage/d4/interval_file/ endpoints

- Python
Published by northwestwitch about 3 years ago

https://github.com/clinical-genomics/chanjo2 - Fixed Docker prod image pushed to Docker Hub

[1.0.1]

Fixed

  • docker_build_on_release GitHub action

- Python
Published by northwestwitch about 3 years ago

https://github.com/clinical-genomics/chanjo2 - Software release v1.0.0

[1.0.0]

Added

  • Instructions on how to install and run the app on a Conda environment
  • Test for heartbeat endpoint
  • Automated tests GitHub workflow
  • Instructions on how to run a demo connected to a database in README
  • Add Codecov steps to Tests GitHub action
  • Vulture GitHub action to remove unused code
  • Colored logs for development and debugging
  • Common test fixtures
  • Some badges on README page
  • Tests for cases and samples endpoints
  • Save samples with coverage files stored on a remote HTTP(s) server
  • Demo data (D4 file containing coverage data for a panel of 4 genes)
  • Endpoint for coverage queries over a single interval of a provided D4 file
  • Demo case and demo sample loaded with demo instance startup
  • Endpoint for coverage queries over the intervals of a BED file
  • Include a default .env file loaded on app startup
  • Filter genes, transcripts and exons by Ensembl ID, HGNC ID, HGNC symbol
  • Demo genes, transcripts and exons loaded on demo instance startup
  • Return coverage over a list of genes for a sample in the database
  • Remove a sample from the database by providing its name
  • Return coverage and coverage completeness (custom thresholds) over a list of genes for a case or a list of samples
  • Return transcripts coverage and coverage completeness (custom thresholds) over a list of genes for a case or a list of samples
  • Return exons coverage and coverage completeness (custom thresholds) over a list of genes for a case or a list of samples
  • Remove a case and all associated samples from the database by providing its name
  • Created the basic structure of the howto using mkdocs
  • Created a GitHub action for publishing the documentation on the GitHub pages
  • Documentation on how to load cases and samples into the database
  • Documentation on how to query the server for coverage stats
  • Documentation on how to load genes, transcripts and exons into the database
  • Improve documentation on how to customise the .env file to use a production database
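
The two statistics returned by the coverage endpoints listed above can be illustrated on a list of per-base depths. The sketch assumes that coverage completeness at a threshold means the fraction of bases in an interval with depth greater than or equal to that threshold. The function names are hypothetical and the code works on an in-memory list instead of a D4 file.

```python
from typing import Dict, Sequence


def mean_coverage(depths: Sequence[int]) -> float:
    """Average read depth over all bases of an interval."""
    if not depths:
        raise ValueError("interval contains no bases")
    return sum(depths) / len(depths)


def coverage_completeness(
    depths: Sequence[int], thresholds: Sequence[int]
) -> Dict[int, float]:
    """Fraction of bases covered at or above each custom threshold."""
    if not depths:
        raise ValueError("interval contains no bases")
    total = len(depths)
    return {
        threshold: sum(1 for depth in depths if depth >= threshold) / total
        for threshold in thresholds
    }
```

For the depths 0, 5, 10, 20 and 30, the mean coverage is 13.0, while completeness is 0.6 at threshold 10 and 0.4 at threshold 20.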

Fixed

  • Bugs preventing the gunicorn app from launching
  • Code to compose DB url to work when app is invoked from docker-compose
  • Dockerfile building error due to missing d4tools lib
  • Add VARCHAR length to sample.coverage_file_path SQL field
  • Format of Build field in genes, transcripts and exons tables
  • Increased size of allowed HGNC symbols in the MySQL gene model
  • Remove old exons and transcripts data when updating genes
  • Test warnings regarding Case-Sample database relationship
  • Error when removing a case that is not found in the database
  • Updated and faster GitHub actions
  • Format of mean coverage and coverage completeness returned in responses

Changed

  • Renamed root endpoint to heartbeat
  • Use a multi-stage build in Dockerfile to reduce its size
  • SQLite database launched instead of MySQL as the default demo database
  • Simpler docker-compose file and additional docker-compose file to show MySQL connection howto
  • Removed broken BumpVersion GitHub action
  • Use a temporary file when running the demo app
  • Modified the command to run a demo instance in README file
  • Renamed Individuals table to the more general Samples
  • Renamed Regions table to Intervals
  • Use uvicorn logging and avoid printing logs twice
  • Modified samples and cases endpoints to interact with database via CRUD utils
  • Use SQLAlchemy 1.4 Declarative which is now integrated into the ORM to avoid deprecation warning
  • Start SQL engine and sessions using the future tag to prepare migration to SQLAlchemy 2.0
  • Updated a few python dependencies
  • Moved validation of sample's coverage file path to sample's pydantic model
  • Installing the pyd4 module as a requirement of this repository
  • Moved the endpoints constants to a class in test fixtures
  • More explicit names for two endpoints
  • Load genes, transcripts and exons in batches of 10K records
  • Simpler code to load genes and transcripts into the database
  • Updated version of several GitHub actions
  • Validate sample coverage queries so that only one gene list format can be provided
  • Speed up queries by optimizing Genes, Transcripts and Exons tables and indexes
  • Custom algorithm to speed up coverage completeness thresholds calculation
  • Replaced deprecated pkg_resources lib with importlib_resources lib
  • Modified Python version in Dockerfile from 3.8 to 3.11
  • Introduced a "track_name" key in sample database objects to be used in multitrack D4 files analysis
  • One sample can belong to more than one case
  • Practical howto in README file and moved deployment instructions to the docs
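
Loading records in batches of 10K, as mentioned in the list above, typically means slicing an input stream into fixed-size chunks and writing each chunk in one database operation. The generator below sketches the chunking step only. The name `in_batches` is hypothetical and the database write is left out.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def in_batches(records: Iterable[T], size: int = 10_000) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `size` records."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(records)
    while True:
        # islice consumes up to `size` items without loading the whole input
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch
```

Each yielded list could then be passed to a single bulk insert, which keeps memory use bounded when loading large resource files.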

- Python
Published by northwestwitch about 3 years ago