https://github.com/biocommons/anyvar
[in development] Proof-of-Concept variation translation, validation, and registration service
Science Score: 26.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file (found)
- ✓ .zenodo.json file (found)
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (13.7%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: biocommons
- License: apache-2.0
- Language: Python
- Default Branch: main
- Homepage: https://anyvar.readthedocs.io/
- Size: 663 KB
Statistics
- Stars: 13
- Watchers: 7
- Forks: 5
- Open Issues: 44
- Releases: 1
Metadata Files
README.md
AnyVar
AnyVar provides Python and REST interfaces to validate, normalize, generate identifiers, and register biological sequence variation according to the GA4GH Variation Representation Specification (VRS).
Known Issues
You are encouraged to browse issues. All known issues are listed there. Please report any issues you find.
Quick Start
1. Clone the AnyVar repository:
   shell
   git clone https://github.com/biocommons/anyvar
   cd anyvar
2. Set environment variables. Create a new file called .env in the root directory of the repo. In the next two steps, you will populate this file. You may want to see .env.example for reference.
3. If desired, create/start a database for AnyVar to use. See Optional Dependencies - Databases for detailed instructions on how to set up a database. Otherwise, set the ANYVAR_STORAGE_URI environment variable to an empty string ("") in your .env file to use AnyVar without a database.
4. Configure required dependencies. AnyVar has several required dependencies and a few optional ones. See Setting up Dependencies for detailed instructions. Remember to set any environment variables in your .env file as directed.
5. Start the AnyVar server:
   shell
   uvicorn anyvar.restapi.main:app --reload
6. Visit http://localhost:8000 to verify the REST API is running.
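The final verification step can also be scripted. The following is a minimal sketch, not part of AnyVar: `probe` is a hypothetical helper that requests the root URL mentioned above and reports the HTTP status, assuming the server was started with uvicorn's default host and port. It bypasses any configured HTTP proxy because the target is local.

```python
from __future__ import annotations

import urllib.error
import urllib.request

# An opener with an empty proxy map, so localhost requests never go through a proxy.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def probe(url: str = "http://localhost:8000", timeout: float = 3.0) -> int | None:
    """Return the HTTP status code served at `url`, or None if nothing answers."""
    try:
        with _opener.open(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (4xx/5xx).
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, and similar.
        return None
```

Calling `probe()` after starting the server returns a status code if something is listening on port 8000 and `None` otherwise; it does not confirm that the responder is actually AnyVar.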
Setting up Dependencies
Required Dependencies
SeqRepo
SeqRepo stores biological sequence data and can be accessed locally or via REST API. Read how to set up SeqRepo locally or through Docker.
UTA
UTA (Universal Transcript Archive) stores transcripts aligned to sequence references. Read how to set up UTA locally or via Docker.
Optional Dependencies - Databases
AnyVar optionally supports several storage types. For general information about AnyVar's SQL storage options, see the SQL Storage documentation. See below for more specific details on the various storage implementation options.
It is also possible to run AnyVar with no database. This is primarily useful for bulk annotations, such as annotating a VCF, where there is no real need to reuse previously computed VRS IDs. To run AnyVar with no database, set the ANYVAR_STORAGE_URI environment variable to an empty string ("") in your .env file.
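To illustrate the distinction between an empty ANYVAR_STORAGE_URI and a populated one, here is a rough sketch that reads .env-style text and reports which mode it implies. Both `parse_env` and `storage_mode` are hypothetical helpers written for this example; the parser handles only plain KEY=VALUE lines and is not how AnyVar itself loads its configuration.

```python
from __future__ import annotations


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Drop one pair of matching surrounding quotes, so "" becomes an empty string.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        values[key.strip()] = value
    return values


def storage_mode(env: dict[str, str]) -> str:
    """Describe what ANYVAR_STORAGE_URI in `env` implies."""
    if "ANYVAR_STORAGE_URI" not in env:
        return "unset"
    return "no database" if env["ANYVAR_STORAGE_URI"] == "" else "database"
```

For example, `storage_mode(parse_env('ANYVAR_STORAGE_URI=""'))` yields `"no database"`, matching the configuration described above. What AnyVar does when the variable is absent altogether is not covered here, so the sketch only reports `"unset"`.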
PostgreSQL (Optional)
AnyVar supports PostgreSQL databases. Configure PostgreSQL for AnyVar.
Snowflake (Optional)
AnyVar can also utilize Snowflake. Detailed instructions available.
Asynchronous Operations
AnyVar supports asynchronous VCF annotation for improved scalability. See asynchronous operations README.
Developers
This section is intended for developers who contribute to AnyVar.
Prerequisites
- Python >= 3.11
  - Note: Python 3.11 is required for developers contributing to AnyVar.
- Docker
Installing for development
shell
git clone https://github.com/biocommons/anyvar.git
cd anyvar
make devready
source venv/3.11/bin/activate
pre-commit install
Testing
Run tests:
1. Set up a database for testing. The default is a PostgreSQL database, which you can set up by following the instructions in src/docs/postgres.md.
2. Follow the quickstart guide to get AnyVar running.
3. If you haven't run make devready before, open a new terminal and do so now. In either case, source your venv by running: source venv/3.11/bin/activate
4. Within your venv, run make testready if you've never done so before. Otherwise, skip this step.
5. Ensure the following environment variables are set in your .env file:
   - SEQREPO_DATAPROXY_URI - See the quickstart guide above.
   - ANYVAR_STORAGE_URI - See the quickstart guide above.
   - ANYVAR_TEST_STORAGE_URI - This specifies the database to use for tests. If you set up a PostgreSQL database by following the guide suggested in step 1, you can copy/paste the example ANYVAR_TEST_STORAGE_URI found below.
   For example:
   shell
   ANYVAR_TEST_STORAGE_URI=postgresql://postgres:postgres@localhost/anyvar_test
   ANYVAR_STORAGE_URI=postgresql://anyvar:anyvar-pw@localhost:5432/anyvar
   SEQREPO_DATAPROXY_URI=seqrepo+file:///usr/local/share/seqrepo/latest
6. Finally, run tests with the following command:
shell
make test
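Before running the tests, it can help to confirm that the three variables named above are present and that each URI parses as intended. This is a small sketch using only the standard library; `missing_vars` and `describe_uri` are hypothetical helpers, not AnyVar functions. An empty ANYVAR_STORAGE_URI counts as set, since that is the documented no-database configuration.

```python
from __future__ import annotations

import os
from urllib.parse import urlsplit

REQUIRED = ("SEQREPO_DATAPROXY_URI", "ANYVAR_STORAGE_URI", "ANYVAR_TEST_STORAGE_URI")


def missing_vars(env=os.environ) -> list[str]:
    """Names from REQUIRED that are absent from `env` (empty values count as set)."""
    return [name for name in REQUIRED if name not in env]


def describe_uri(uri: str) -> dict:
    """Break a URI into the parts worth eyeballing; credentials are left out."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "path": parts.path,
    }
```

Applied to the example values, `describe_uri` shows that the test URI targets the `/anyvar_test` database on `localhost` with no explicit port, while the SeqRepo URI has no host at all and points at a local filesystem path.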
Notes
Currently, there is some interdependency between test modules -- namely, tests that rely
on reading data from storage assume that the data from test_variation has been
uploaded. A pytest hook ensures correct test order, but some test modules may not be
able to pass when run in isolation. By default, the tests will use a Postgres database
installation. To run the tests against a Snowflake database, change the
ANYVAR_TEST_STORAGE_URI to a Snowflake URI and run the tests.
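For readers unfamiliar with how a pytest hook can enforce ordering, the sketch below shows one common approach: a `pytest_collection_modifyitems` hook in conftest.py that moves everything from test_variation to the front. This illustrates the general technique only; it is an assumption that AnyVar's hook works along these lines, and the actual implementation may differ.

```python
def pytest_collection_modifyitems(items):
    """Run tests from test_variation first, keeping the remaining order unchanged."""
    # list.sort is stable and False sorts before True, so items whose node ID
    # mentions test_variation move to the front without other reordering.
    items.sort(key=lambda item: "test_variation" not in item.nodeid)
```

pytest requires the list to be modified in place, which `list.sort` does. Ordering by hook explains why the full suite passes while a dependent module selected on its own may still fail: the ordering only helps when test_variation is part of the same run.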
For the tests/test_vcf::test_vcf_registration_async unit test to pass, Celery needs a real
broker and backend to interact with. Set the CELERY_BROKER_URL and CELERY_BACKEND_URL
environment variables. The simplest solution is to run Redis locally and use it for both
the broker and the backend, e.g.:
shell
export CELERY_BROKER_URL="redis://"
export CELERY_BACKEND_URL="redis://"
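If the async test fails, a quick way to rule out a missing Redis instance is to check that something is listening where the URL points. The sketch below assumes the usual Redis defaults (a bare `redis://` resolving to localhost on port 6379); `redis_endpoint` and `can_connect` are hypothetical helpers, and a successful TCP connection does not prove the listener is actually Redis.

```python
from __future__ import annotations

import socket
from urllib.parse import urlsplit


def redis_endpoint(url: str) -> tuple[str, int]:
    """Host and port for a redis:// URL, falling back to localhost:6379."""
    parts = urlsplit(url)
    return parts.hostname or "localhost", parts.port or 6379


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For instance, `can_connect(*redis_endpoint("redis://"))` returns `False` when no local Redis server has been started.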
Logging
AnyVar uses Python's built-in logging module. To customize logging settings, see docs/logging.md.
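As a generic illustration of what customizing standard-library logging looks like, the sketch below raises verbosity for a logger named "anyvar" using `logging.config.dictConfig`. The logger name is an assumption based on the common `logging.getLogger(__name__)` convention, and the mechanism AnyVar actually expects is the one described in docs/logging.md.

```python
import logging
import logging.config

LOGGING = {
    "version": 1,
    # Leave loggers created before this call enabled.
    "disable_existing_loggers": False,
    "formatters": {
        "plain": {"format": "%(asctime)s %(name)s %(levelname)s %(message)s"},
    },
    "handlers": {
        "console": {"class": "logging.StreamHandler", "formatter": "plain"},
    },
    "loggers": {
        # "anyvar" is an assumed logger name; child loggers inherit its level.
        "anyvar": {"level": "DEBUG", "handlers": ["console"], "propagate": False},
    },
}

logging.config.dictConfig(LOGGING)
```

With this configuration, messages from any logger under the assumed "anyvar" namespace are written to the console at DEBUG level and above, while other libraries keep their existing settings.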
Owner
- Name: biocommons
- Login: biocommons
- Kind: organization
- Website: https://github.com/biocommons/biocommons/wiki/Welcome
- Repositories: 19
- Profile: https://github.com/biocommons
a collection of open source bioinformatics tools
GitHub Events
Total
- Create event: 38
- Release event: 1
- Issues event: 88
- Watch event: 3
- Delete event: 26
- Member event: 5
- Issue comment event: 117
- Push event: 213
- Pull request review comment event: 105
- Pull request review event: 132
- Pull request event: 66
Last Year
- Create event: 38
- Release event: 1
- Issues event: 88
- Watch event: 3
- Delete event: 26
- Member event: 5
- Issue comment event: 117
- Push event: 213
- Pull request review comment event: 105
- Pull request review event: 132
- Pull request event: 66
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 82
- Total pull requests: 53
- Average time to close issues: 3 months
- Average time to close pull requests: 10 days
- Total issue authors: 9
- Total pull request authors: 8
- Average comments per issue: 0.8
- Average comments per pull request: 0.55
- Merged pull requests: 34
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 61
- Pull requests: 37
- Average time to close issues: 18 days
- Average time to close pull requests: 11 days
- Issue authors: 8
- Pull request authors: 7
- Average comments per issue: 0.39
- Average comments per pull request: 0.57
- Merged pull requests: 19
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- ehclark (17)
- theferrit32 (16)
- jennifer-bowser (16)
- jsstevenson (14)
- korikuzma (10)
- ahwagner (7)
- toneillbroad (1)
- larrybabb (1)
- hdziadzio (1)
Pull Request Authors
- ehclark (16)
- jsstevenson (15)
- theferrit32 (8)
- jennifer-bowser (6)
- Krt-11 (5)
- korikuzma (4)
- larrybabb (1)
- hanars (1)
Dependencies
- actions/checkout v3 composite
- actions/setup-python v4 composite
- biocommons/dockerbase 1.1 build
- biocommons/seqrepo-rest-service latest
- redis latest
- reece/anyvar latest