https://github.com/biocypher/decider-genetics
Science Score: 13.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file (found)
- ○ .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (11.9%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: biocypher
- License: mit
- Language: Python
- Default Branch: main
- Size: 287 KB
Statistics
- Stars: 1
- Watchers: 1
- Forks: 5
- Open Issues: 4
- Releases: 0
Metadata Files
README.md
DECIDER genetics knowledge graph and decision support
This pipeline creates a BioCypher knowledge graph from synthetic data after the example of the DECIDER cohort. This knowledge graph is then connected to a BioChatter instance that allows querying the graph and other information (papers in a vector database, information from web APIs such as OncoKB) via natural language. We provide two applications, one based on BioChatter Light, the other on BioChatter Next.
You can find more detailed information in a vignette on our website and see hosted demonstrations of both applications at the following links:
🐳 Run using Docker
[!IMPORTANT] You need to have Docker installed on your machine to run the following commands. Please go to Docker for instructions.
```bash
git clone https://github.com/biocypher/decider-genetics.git
cd decider-genetics
docker compose up -d
```
This will build the KG in the Docker container and start a Neo4j instance as
well as a BioChatter Light web app instance configured to only show the KG query
interface (for more info see this
vignette). The Neo4j
instance can be accessed at localhost:7474 and the BioChatter Light instance
at localhost:8501.
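To confirm that both containers are up before opening a browser, a small port check can help. The following is a minimal sketch using only the Python standard library; the helper names `is_port_open` and `report` are illustrative and not part of the repository, and the ports are the ones stated above.

```python
import socket

# Ports taken from the text above: Neo4j on 7474, BioChatter Light on 8501.
SERVICES = {"Neo4j": 7474, "BioChatter Light": 8501}


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


def report(host: str = "localhost") -> dict[str, bool]:
    """Map each service name to whether its port currently accepts connections."""
    return {name: is_port_open(host, port) for name, port in SERVICES.items()}
```

Calling `report()` shortly after `docker compose up -d` may still show `False` for a service while the KG build is running; an open port only indicates that something is listening, not that the app has finished starting.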
For the BioChatter Next variant, you can run the following command:
```bash
docker compose -f docker-compose-next.yml up -d
```
This will similarly build a KG, and additionally start a Milvus instance for the
vector database, the biochatter-server REST API service, and the BioChatter Next
app, which can be accessed at localhost:3000.
[!IMPORTANT] To use OpenAI GPT as the language model, you have to provide your API key as an environment variable (OPENAI_API_KEY). You can do this with an export command (export OPENAI_API_KEY=sk-...), by adding it to your bash profile, or by passing it to Docker in an env file. We use GPT-3.5-turbo as the default model.
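A missing key tends to surface only when the first question is asked, so it can be worth checking the environment before starting the containers. This is a minimal sketch of such a pre-flight check; the function name `get_openai_key` is hypothetical and not part of BioChatter or this repository.

```python
import os


def get_openai_key(env=None) -> str:
    """Return OPENAI_API_KEY from the given mapping (default: os.environ)."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it or add it to an env file "
            "before starting the containers."
        )
    return key
```

Passing a plain dictionary instead of `os.environ` makes the check easy to try without touching the real environment.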
Questions
The knowledge graph contains information about patients, genes, variants, drugs, pathways, and clinical data. The schema of the KG can be seen below the query interface as a JSON object. You can ask questions in natural language, such as:
How many patients do we have, and what are their names?
What was patient1's response to previous treatment, and which treatment did they receive?
Which patients have hr deficiency but have not received parp inhibitors?
Does patient4 have a sequence variant in a gene that is druggable with evidence level "1"? Which drug? Return unique values.
Is there a patient with overlapping variants compared to patient4?
Which genes are these overlapping sequences in?
These are only a few of the many possible questions, and some may not result in a valid query. The BioChatter Light interface allows you to modify the generated query manually and rerun it for prototyping and debugging.
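When editing a generated query by hand, it helps to have a rough idea of what the Cypher for a simple question looks like. The sketch below builds a query for the first example question (patient count and names). The label `Patient` and the property `name` are assumptions made for illustration; the actual names are in the schema JSON shown below the query interface.

```python
# Hypothetical label and property names; check the schema JSON for the real ones.
PATIENT_LABEL = "Patient"
NAME_PROPERTY = "name"


def patients_query(label: str = PATIENT_LABEL, prop: str = NAME_PROPERTY) -> str:
    """Build a Cypher query returning the number of patients and their names."""
    return (
        f"MATCH (p:{label}) "
        f"RETURN count(p) AS n_patients, collect(p.{prop}) AS names"
    )


print(patients_query())
```

The printed string can be pasted into the BioChatter Light query field or the Neo4j browser at localhost:7474 once the label and property have been adjusted to the schema.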
Vectorstore RAG and OncoKB API questions
For convenience, here are some questions to get you started with the RAG and API calling features:
(OncoKB API) What is the consequence of the TP53 R273C mutation in high grade serous ovarian cancer?
(OncoKB API) Are there reports of the functional fusion of CD47 and ROS1 in ovarian cancer?
(OncoKB API) What is the consequence of the functional fusion of CD47 and ROS1?
(OncoKB API) What is the therapeutic relevance of the BRAF V600E mutation in high grade serous ovarian cancer?
(RAG) Have there been reports of TP53 being therapeutically relevant in HGSC?
(RAG) Do pro-inflammatory cytokines play a role in the progression of ovarian cancer?
(RAG) Does RAD51C play a role in ovarian cancer?
⚙️ Local Installation
You can run the KG build locally using a virtual environment.
```bash
git clone https://github.com/biocypher/decider-genetics.git
cd decider-genetics
poetry install
poetry run python create_knowledge_graph.py
```
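For orientation, a BioCypher build consumes nodes as `(id, label, properties)` tuples and edges as `(id, source_id, target_id, label, properties)` tuples yielded by an adapter. The sketch below shows what such generators could look like for synthetic data. It is not the repository's adapter: the labels `patient` and `patient_has_variant_in_gene`, the function names, and the IDs are all made up for illustration and would have to match the project's schema configuration.

```python
from typing import Iterable, Iterator


def synthetic_patient_nodes(n: int) -> Iterator[tuple[str, str, dict]]:
    """Yield n synthetic patient nodes as (id, label, properties) tuples."""
    for i in range(1, n + 1):
        patient_id = f"patient{i}"
        yield (patient_id, "patient", {"name": patient_id})


def synthetic_variant_edges(
    pairs: Iterable[tuple[str, str]],
) -> Iterator[tuple[str, str, str, str, dict]]:
    """Yield (id, source, target, label, properties) tuples for patient-gene pairs."""
    for patient_id, gene_id in pairs:
        edge_id = f"{patient_id}-{gene_id}"
        yield (edge_id, patient_id, gene_id, "patient_has_variant_in_gene", {})
```

In a BioCypher script, generators like these would typically be handed to the `write_nodes` and `write_edges` methods of a `BioCypher` instance; how `create_knowledge_graph.py` does this in detail is best checked in the repository itself.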
Owner
- Name: biocypher
- Login: biocypher
- Kind: organization
- Website: https://biocypher.org
- Repositories: 1
- Profile: https://github.com/biocypher
GitHub Events
Total
- Issues event: 2
- Issue comment event: 1
- Fork event: 3
Last Year
- Issues event: 2
- Issue comment event: 1
- Fork event: 3
Issues and Pull Requests
Last synced: 10 months ago
All Time
- Total issues: 4
- Total pull requests: 2
- Average time to close issues: 3 days
- Average time to close pull requests: 1 day
- Total issue authors: 2
- Total pull request authors: 1
- Average comments per issue: 0.75
- Average comments per pull request: 0.0
- Merged pull requests: 2
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 3
- Pull requests: 0
- Average time to close issues: 3 days
- Average time to close pull requests: N/A
- Issue authors: 2
- Pull request authors: 0
- Average comments per issue: 1.0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- slobentanzer (2)
- mbaric758 (2)
Pull Request Authors
- fengsh27 (4)
Dependencies
- docker.io/andimajore/biocyper_base python3.10 build
- docker.io/neo4j 4.4-enterprise build
- docker.io/slobentanzer/biocypher-base 1.0.0
- neo4j 4.4-enterprise
- appdirs 1.4.4
- biocypher 0.5.16
- colorama 0.4.6
- colorlog 6.7.0
- isodate 0.6.1
- more-itertools 9.1.0
- neo4j 4.4.11
- neo4j-utils 0.0.7
- networkx 3.1
- numpy 1.24.3
- pandas 2.0.2
- pyparsing 3.0.9
- python-dateutil 2.8.2
- pytz 2023.3
- pyyaml 6.0
- rdflib 6.3.2
- six 1.16.0
- stringcase 1.2.0
- toml 0.10.2
- treelib 1.6.4
- tzdata 2023.3
- biocypher ^0.5.4
- python ^3.10