https://github.com/biocommons/anyvar

[in development] Proof-of-Concept variation translation, validation, and registration service

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.7%) to scientific vocabulary

Keywords

bioinformatics genome-analysis genomics sequencing variant-analysis variation
Last synced: 5 months ago

Repository

Basic Info
Statistics
  • Stars: 13
  • Watchers: 7
  • Forks: 5
  • Open Issues: 44
  • Releases: 1
Topics
bioinformatics genome-analysis genomics sequencing variant-analysis variation
Created about 7 years ago · Last pushed 6 months ago
Metadata Files
Readme License Codeowners

README.md

AnyVar

AnyVar provides Python and REST interfaces to validate, normalize, generate identifiers, and register biological sequence variation according to the GA4GH Variation Representation Specification (VRS).

Known Issues

You are encouraged to browse issues. All known issues are listed there. Please report any issues you find.

Quick Start

  1. Clone the AnyVar repository:

    git clone https://github.com/biocommons/anyvar
    cd anyvar

  2. Set environment variables:

    Create a new file in the root directory of the repo called .env. In the next two steps, you will populate this file. You may want to see .env.example for reference.

  3. If desired, create/start a database for AnyVar to use. See Optional Dependencies - Databases for detailed instructions on how to set up a database. Otherwise, set the ANYVAR_STORAGE_URI environment variable to an empty string ("") in your .env file to use AnyVar without a database.

  4. Configure required dependencies:

    AnyVar has several required dependencies and a few optional ones. See Setting up Dependencies for detailed instructions. Remember to set any environment variables in your .env file as directed.

  5. Start the AnyVar server:

    uvicorn anyvar.restapi.main:app --reload

  6. Visit http://localhost:8000 to verify the REST API is running.
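The final check can also be scripted. The following is a minimal sketch, assuming the server listens on the default address from step 6; the helper name server_is_up is hypothetical and not part of AnyVar. It only reports whether anything answers HTTP at that address, not whether AnyVar is configured correctly.

```python
import http.client
from urllib.parse import urlsplit


def server_is_up(url: str = "http://localhost:8000", timeout: float = 3.0) -> bool:
    """Return True if any HTTP response comes back from `url`.

    Hypothetical helper for illustration; any status code counts as "up".
    """
    parts = urlsplit(url)
    conn = http.client.HTTPConnection(parts.hostname, parts.port or 80, timeout=timeout)
    try:
        conn.request("GET", parts.path or "/")
        conn.getresponse()  # a reply of any status means the server is running
        return True
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, DNS failure, or a malformed reply.
        return False
    finally:
        conn.close()


if __name__ == "__main__":
    print("REST API reachable:", server_is_up())
```

Run it in a second terminal after starting uvicorn; a False result usually means the server has not finished starting or is bound to a different port.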

Setting up Dependencies

Required Dependencies

SeqRepo

SeqRepo stores biological sequence data and can be accessed locally or via REST API. Read how to set up SeqRepo locally or through Docker.

UTA

UTA (Universal Transcript Archive) stores transcripts aligned to sequence references. Read how to set up UTA locally or via Docker.

Optional Dependencies - Databases

AnyVar optionally supports several storage types. For general information about AnyVar's SQL storage options, see the SQL Storage documentation. See below for more specific details on the various storage implementation options.

It is also possible to run AnyVar with no database. This is primarily useful for bulk annotations, such as annotating a VCF, where there is no real need to reuse previously computed VRS IDs. To run AnyVar with no database, set the ANYVAR_STORAGE_URI environment variable to an empty string ("") in your .env file.
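To make the distinction between "empty" and "unset" concrete, here is a small sketch that reads .env-style text and classifies ANYVAR_STORAGE_URI. It is illustrative only: parse_env and storage_kind are hypothetical helpers, the parser handles only simple KEY=VALUE lines, and this is not how AnyVar itself loads its configuration.

```python
from urllib.parse import urlsplit


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, ignoring blank lines and # comments."""
    env: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding quotes so that "" becomes an empty string.
        env[key.strip()] = value.strip().strip("\"'")
    return env


def storage_kind(env: dict[str, str]) -> str:
    """Classify ANYVAR_STORAGE_URI by its URI scheme (hypothetical helper)."""
    uri = env.get("ANYVAR_STORAGE_URI")
    if uri is None:
        return "unset"  # whatever AnyVar's default is applies; not modeled here
    if uri == "":
        return "no database"
    return urlsplit(uri).scheme


example = 'ANYVAR_STORAGE_URI=""\n'
print(storage_kind(parse_env(example)))  # no database
```

The point of the sketch is that an empty value and a missing variable are different cases, so the variable should be present in .env with an empty string when no database is wanted.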

PostgreSQL (Optional)

AnyVar supports PostgreSQL databases. Configure PostgreSQL for AnyVar.

Snowflake (Optional)

AnyVar can also utilize Snowflake. Detailed instructions available.

Asynchronous Operations

AnyVar supports asynchronous VCF annotation for improved scalability. See the asynchronous operations README.

Developers

This section is intended for developers who contribute to AnyVar.

Prerequisites

  • Python >= 3.11
    • Note: Python 3.11 is required for developers contributing to AnyVar.
  • Docker

Installing for development

    git clone https://github.com/biocommons/anyvar.git
    cd anyvar
    make devready
    source venv/3.11/bin/activate
    pre-commit install

Testing

Run tests:

  1. Set up a database for testing. The default is a PostgreSQL database, which you can set up by following the instructions in src/docs/postgres.md.

  2. Follow the Quick Start guide to get AnyVar running.

  3. Open a new terminal. If you have never run make devready, do so now. Then activate your virtual environment:

    source venv/3.11/bin/activate

  4. Within your venv, run make testready if you have never done so before. Otherwise, skip this step.

  5. Ensure the following environment variables are set in your .env file:

  • SEQREPO_DATAPROXY_URI - See the Quick Start guide above.
  • ANYVAR_STORAGE_URI - See the Quick Start guide above.
  • ANYVAR_TEST_STORAGE_URI - The database to use for tests. If you set up a PostgreSQL database by following the guide referenced in step 1, you can copy the example ANYVAR_TEST_STORAGE_URI below.

For example:

    ANYVAR_TEST_STORAGE_URI=postgresql://postgres:postgres@localhost/anyvar_test
    ANYVAR_STORAGE_URI=postgresql://anyvar:anyvar-pw@localhost:5432/anyvar
    SEQREPO_DATAPROXY_URI=seqrepo+file:///usr/local/share/seqrepo/latest

  6. Finally, run the tests:

    make test
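Before running the tests, it can help to confirm that the three variables listed above are visible to the process. The sketch below is an assumption-laden convenience, not part of the AnyVar test suite: missing_test_vars is a hypothetical helper, and it inspects the process environment only, so it assumes the values from .env have already been exported.

```python
import os

# The variable names come from the list above.
REQUIRED_FOR_TESTS = (
    "SEQREPO_DATAPROXY_URI",
    "ANYVAR_STORAGE_URI",
    "ANYVAR_TEST_STORAGE_URI",
)


def missing_test_vars(env=None) -> list[str]:
    """Return the required variable names that are absent from `env`.

    Presence is checked rather than truthiness, because an empty
    ANYVAR_STORAGE_URI is a valid setting (run without a database).
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_FOR_TESTS if name not in env]


if __name__ == "__main__":
    missing = missing_test_vars()
    print("missing:", ", ".join(missing) if missing else "none")
```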

Notes

Currently, there is some interdependency between test modules: tests that read data from storage assume that the data from test_variation has already been uploaded. A pytest hook ensures the correct test order, but some test modules may not pass when run in isolation.

By default, the tests use a PostgreSQL database. To run the tests against a Snowflake database, change ANYVAR_TEST_STORAGE_URI to a Snowflake URI and run the tests.
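For readers unfamiliar with ordering hooks, the sketch below shows one way such a hook can be written in a conftest.py. It is not the hook used in this repository: module_priority is a hypothetical helper, and the assumption that the module lives in a file named test_variation.py is made for illustration. pytest_collection_modifyitems itself is a standard pytest hook.

```python
def module_priority(nodeid: str) -> int:
    """Rank a test ID so that anything in test_variation.py sorts first.

    Assumes the module file is named test_variation.py (illustrative).
    """
    module_path = nodeid.split("::", 1)[0]
    return 0 if module_path.endswith("test_variation.py") else 1


def pytest_collection_modifyitems(session, config, items):
    """Standard pytest hook: reorder the collected tests in place."""
    # list.sort is stable, so tests keep their relative order within each rank.
    items.sort(key=lambda item: module_priority(item.nodeid))
```

Because the sort is stable, only the position of the test_variation tests changes; every other test runs in its usual collection order.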

For the tests/test_vcf::test_vcf_registration_async unit test to pass, Celery needs a real broker and backend to interact with. Set the CELERY_BROKER_URL and CELERY_BACKEND_URL environment variables. The simplest solution is to run Redis locally and use it for both the broker and the backend, e.g.:

    % export CELERY_BROKER_URL="redis://"
    % export CELERY_BACKEND_URL="redis://"

Logging

AnyVar uses Python's built-in logging. To customize logging settings, see docs/logging.md.
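As a general illustration of what customizing the standard library logger looks like, here is a minimal sketch using logging.config.dictConfig. The logger name "anyvar" and the format string are assumptions for the example; docs/logging.md remains the authority on how AnyVar expects its logging configuration to be supplied.

```python
import logging
import logging.config

LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "plain": {"format": "%(asctime)s %(levelname)s %(name)s: %(message)s"},
    },
    "handlers": {
        "console": {"class": "logging.StreamHandler", "formatter": "plain"},
    },
    "loggers": {
        # "anyvar" is an assumed logger name, chosen to match the package name.
        "anyvar": {"level": "DEBUG", "handlers": ["console"], "propagate": False},
    },
}

logging.config.dictConfig(LOGGING)
logging.getLogger("anyvar").debug("logging configured")
```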

Owner

  • Name: biocommons
  • Login: biocommons
  • Kind: organization

a collection of open source bioinformatics tools

GitHub Events

Total
  • Create event: 38
  • Release event: 1
  • Issues event: 88
  • Watch event: 3
  • Delete event: 26
  • Member event: 5
  • Issue comment event: 117
  • Push event: 213
  • Pull request review comment event: 105
  • Pull request review event: 132
  • Pull request event: 66
Last Year
  • Create event: 38
  • Release event: 1
  • Issues event: 88
  • Watch event: 3
  • Delete event: 26
  • Member event: 5
  • Issue comment event: 117
  • Push event: 213
  • Pull request review comment event: 105
  • Pull request review event: 132
  • Pull request event: 66

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 82
  • Total pull requests: 53
  • Average time to close issues: 3 months
  • Average time to close pull requests: 10 days
  • Total issue authors: 9
  • Total pull request authors: 8
  • Average comments per issue: 0.8
  • Average comments per pull request: 0.55
  • Merged pull requests: 34
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 61
  • Pull requests: 37
  • Average time to close issues: 18 days
  • Average time to close pull requests: 11 days
  • Issue authors: 8
  • Pull request authors: 7
  • Average comments per issue: 0.39
  • Average comments per pull request: 0.57
  • Merged pull requests: 19
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • ehclark (17)
  • theferrit32 (16)
  • jennifer-bowser (16)
  • jsstevenson (14)
  • korikuzma (10)
  • ahwagner (7)
  • toneillbroad (1)
  • larrybabb (1)
  • hdziadzio (1)
Pull Request Authors
  • ehclark (16)
  • jsstevenson (15)
  • theferrit32 (8)
  • jennifer-bowser (6)
  • Krt-11 (5)
  • korikuzma (4)
  • larrybabb (1)
  • hanars (1)
Top Labels
Issue Labels
enhancement (32) bug (16) priority:high (4) stale (4) closed-by-stale (4) priority:low (3) Epic (2) documentation (2) good first issue (2) priority:medium (1)
Pull Request Labels
bug (3) documentation (2) priority:medium (2) priority:low (1) stale (1) closed-by-stale (1)

Dependencies

.github/workflows/ci.yaml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
Dockerfile docker
  • biocommons/dockerbase 1.1 build
docker-compose.yml docker
  • biocommons/seqrepo-rest-service latest
  • redis latest
  • reece/anyvar latest