https://github.com/batmen-lab/biomania

Last synced: 9 months ago · JSON representation

Repository

Basic Info

Host: GitHub
Owner: batmen-lab
License: gpl-3.0
Language: Jupyter Notebook
Default Branch: main
Size: 165 MB

Statistics

Stars: 14
Watchers: 2
Forks: 3
Open Issues: 1
Releases: 4

Created over 2 years ago · Last pushed about 2 years ago

Metadata Files

Readme License

BioMANIA

Welcome to BioMANIA! This guide provides detailed instructions on how to set up, run, and interact with the BioMANIA chatbot interface, which connects to various APIs to deliver information across numerous libraries and frameworks.

Project Overview:

🌟 We warmly invite you to share your trained models and datasets in our issues section, making it easier for others to utilize and extend your work, thus amplifying its impact. Feel free to explore and provide feedback on tools shared by other contributors as well! 🚀🔍

If you run into problems while exploring, please refer to the Q&A section 🤗, and feel free to open issues for discussion! 🧐 👨‍💻

Video demo

Our demonstration shows how to use the chatbot with scanpy and squidpy together in a single conversation, including loading data, invoking functions for analysis, and presenting outputs as code, images, and tables.

We also offer a command-line interface (CLI) demo through the terminal.

Web access online demo

We provide an online demo hosted on our server!

(240929: For the online demo, note that connections may be delayed when multiple users are active. We check the demo every day, and any issue will be fixed by the next day. For now we recommend asking questions in English, as the corpus is designed for English and the results will be more accurate.)

Quick start

We provide several ways to run the service: Python script, terminal CLI, Docker, and a Colab demo. Among these, the terminal CLI is the easiest way to start.

Setup dataset and models

```bash
# setup the environment
pip install git+https://github.com/batmen-lab/BioMANIA.git --index-url https://pypi.org/simple

# setup OPENAI_API_KEY
echo 'OPENAI_API_KEY="sk-proj-xxxx"' >> .env

# (optional) setup github token
echo "GITHUBTOKEN=yourgithub_token" >> .env

# download data, retriever, and resources from drive, and put them to the
# - data/standard_process/{LIB} and
# - hugging_models/retrievermodel_finetuned/{LIB} and
# - ../../resources/
pip install gdown
gdown https://drive.google.com/uc?id=1nT28pIJ_dsdvi2yD8ffWt_aePXsSWdqI
sh download_data_model.sh

# setup the PYTHONPATH
export PYTHONPATH=$PYTHONPATH:$(pwd)
```
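Before starting the service, it can help to confirm that the `.env` file actually contains the key. The following is a minimal sketch, not part of BioMANIA: the helper names `read_env_file` and `missing_keys` are hypothetical, and it assumes the key is named `OPENAI_API_KEY` as in the script-based setup.

```python
from pathlib import Path


def read_env_file(path):
    """Parse KEY=VALUE lines from a .env-style file into a dict."""
    values = {}
    for raw in Path(path).read_text().splitlines():
        line = raw.strip()
        # Skip blanks, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding single or double quotes from the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_keys(path, required=("OPENAI_API_KEY",)):
    """Return the required keys that are absent or empty in the file."""
    env = read_env_file(path) if Path(path).exists() else {}
    return [key for key in required if not env.get(key)]
```

Calling `missing_keys(".env")` from the folder where you ran the `echo` commands returns an empty list when the key is present and non-empty.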

Run with terminal CLI or gradio app (stable on Linux)

```bash
# CLI service quick start!
pip install gradio
python -m BioMANIA.deploy.cli_demo

# or gradio app. (TODO 240509: image display is still under development!)
# python -m BioMANIA.deploy.cli_gradio
```

Run with Docker

For ease of use, we provide a Docker image containing scanpy, squidpy, ehrapy, and snapatac2. You can find the detailed tool list on Docker Hub.

```bash
# Pull back-end service and front-end UI service with:
# 241016 updated
sudo docker pull chatbotuibiomania/biomania-together:v1.1.12-cuda12.6-ubuntu22.04
```

Start service with
```bash
# run on gpu
sudo docker run -e LIB=scanpy -e OPENAI_API_KEY=[youropenaiapikey] -e GITHUBTOKEN=[githubpatxxx] --gpus all -d -p 3000:3000 chatbotuibiomania/biomania-together:v1.1.12-cuda12.6-ubuntu22.04

# or on cpu
sudo docker run -e LIB=scanpy -e OPENAI_API_KEY=[youropenaiapikey] -e GITHUBTOKEN=[githubpatxxx] -d -p 3000:3000 chatbotuibiomania/biomania-together:v1.1.12-cuda12.6-ubuntu22.04
```

Then check UI service with http://localhost:3000/en.
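Because the container is started in detached mode (`-d`), the UI may need a little time before it answers. The sketch below polls the address until it responds; `wait_for_ui` is a hypothetical helper rather than part of BioMANIA, and the timeout and interval values are arbitrary assumptions.

```python
import time
import urllib.request


def wait_for_ui(url="http://localhost:3000/en", timeout=60.0, interval=2.0):
    """Poll `url` until it answers with HTTP 200 or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval) as resp:
                if resp.status == 200:
                    return True
        except OSError:
            # Covers refused connections, timeouts, and HTTP errors
            # (urllib's URLError and HTTPError both derive from OSError).
            pass
        time.sleep(interval)
    return False
```

`wait_for_ui()` returns `True` once the page is reachable and `False` if the timeout expires first, in which case `sudo docker logs` on the container is a reasonable next step.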

Important tips for running Docker without bugs:
- To run Docker on a GPU, you need to install nvidia-docker and the NVIDIA Container Toolkit. Run `docker info | grep "Default Runtime"` to check whether your device can run Docker with a GPU.
- Feel free to adjust the CUDA image version inside the Dockerfile so that it is compatible with your device.

We understand the desire to run the service on a server and view it locally. You can start the ngrok service by running `ngrok http 3000` on your server, then take the resulting URL, which looks like https://[ngrok_id].ngrok-free.app, and open it in Chrome to start!

Run with script

This section is for users who want to build more flexible functionality themselves.

We take scanpy as an example. Detailed library support information can be found in the Q&A.

Setting up for environment

To prepare your environment for the BioMANIA project, follow these steps:

- Clone the repository and install dependencies:
  - `git clone https://github.com/batmen-lab/BioMANIA.git`
  - `cd BioMANIA`
  - `conda create -n biomania python=3.9`
  - `conda activate biomania`
  - `pip install -r requirements.txt --index-url https://pypi.org/simple`
  - `export PYTHONPATH=$PYTHONPATH:$(pwd)`
- Set up your OpenAI API key in the BioMANIA/.env file:
  - `echo 'OPENAI_API_KEY="sk-proj-xxxx"' >> .env`

For inference purposes, a standard OpenAI API key is sufficient.
If you intend to use functionality such as instruction generation or GPT API predictions, a paid OpenAI account is required, since a free account may hit the rate limit.
Feel free to switch to model_name='gpt-3.5-turbo-0125' or 'gpt-4-0125-preview' in src/models/model.py if you want.

Prepare for Data and Model

Download the necessary data and models from our Google Drive link. For the library data, you can download only the libraries you need.

We provide a script that downloads the models and data from Google Drive, using scanpy as an example. This works if you can access Google: first run `gdown https://drive.google.com/uc?id=1nT28pIJ_dsdvi2yD8ffWt_aePXsSWdqI`, then `sh download_data_model.sh`.

Organize the downloaded files at BioMANIA/data or BioMANIA/hugging_models as follows (base is necessary):
```
data
├── conversations
├── others-data
└── standard_process
    ├── base
    │   ├── APIcomposite.json
    │   └── ...
    ├── scanpy
    │   ├── APIcomposite.json
    │   └── ...
    ├── {LIB}
    │   ├── APIcomposite.json
    │   └── ...
    └── ...

hugging_models
└── retrievermodel_finetuned
    ├── {LIB}
    └── ...

../../resources
```

By meticulously following the steps above, you'll have all the essential data and models perfectly organized for the project.
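A quick way to confirm the layout is to check that the per-library folders exist. This is a minimal sketch under stated assumptions: `expected_paths` and `missing_paths` are hypothetical helpers, and the folder names are copied from this guide and exposed as parameters in case your checkout spells them differently.

```python
from pathlib import Path


def expected_paths(root, lib,
                   data_dir="data/standard_process",
                   retriever_dir="hugging_models/retrievermodel_finetuned"):
    """List the folders this guide expects for one library under `root`."""
    root = Path(root)
    return [
        root / data_dir / "base",   # shared data, required for every library
        root / data_dir / lib,      # library-specific data
        root / retriever_dir / lib  # fine-tuned retriever for the library
    ]


def missing_paths(root, lib, **kwargs):
    """Return the expected folders that do not exist yet."""
    return [p for p in expected_paths(root, lib, **kwargs) if not p.is_dir()]
```

For example, `missing_paths(".", "scanpy")` run from the `BioMANIA/` folder returns an empty list once the download script has placed everything.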

We also offer some demo chats, which you can find in ./examples. Note that these demo chats are converted from the Read the Docs tutorials of the PyPI libraries. The original tutorial links are listed in tutorial_links.txt.

Prepare for front-end UI service

This is compatible with Node.js version 19.
```bash
# Under folder BioMANIA/chatbotuibiomania
npm install && npm run build
```

Inference with pretrained models

Start both services for back-end and front-end UI with:
```bash
# Under folder `BioMANIA/`
# backend, in one terminal
python -m src.deploy.inferencedialogserver

# frontend, in another terminal
cd chatbotuibiomania/
npm run dev
```

Your chatbot server is now operational at http://localhost:3000/en, primed to process user queries.

When you select a different library on the UI page, the retriever's path is automatically updated to match the selected library.

DIY

For users who wish to customize functionality more deeply, we provide a script example that demonstrates direct interaction with the BioMANIA library via a Python script. In this example, users can:
- switch the initially loaded library
- change the LLM type to either an ollama-supported model, e.g. llama3, or an OpenAI-supported model, e.g. gpt-3.5-turbo
- manage the conversation state, either continuing a previously saved session or starting a new conversation

This method is particularly suited for developers and researchers who want to quickly adjust and test different data processing strategies based on specific research needs.

```python
# under BioMANIA/
from src.deploy.model import Model

conversationstarted = True
model = Model(logger=None, device='cpu', modelllmtype='llama3')
userinput = "Could you load the built in dataset?"
library = "scanpy"

# for the first turn of a dialog, use conversationstarted=True, then use conversationstarted=False for the following dialogs
# if you want to use previous session, use the same sessionid as before and conversationstarted = False
model.runpipeline(userinput, library, topk=1, files=[], conversationstarted=conversationstarted, sessionid="")
```
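The first-turn rule described in the comments above can be wrapped in a small helper so the flag and session id are not tracked by hand. This is only a sketch: `Dialog` is a hypothetical class, the pipeline is passed in as a plain callable with an assumed four-argument shape, and generating a fresh id with `uuid` for new conversations is an assumption rather than BioMANIA's behaviour.

```python
import uuid


class Dialog:
    """Tracks the first-turn flag and session id across pipeline calls."""

    def __init__(self, run_pipeline, library, session_id=None):
        # run_pipeline(user_input, library, conversation_started, session_id)
        # is a stand-in callable supplied by the caller.
        self.run_pipeline = run_pipeline
        self.library = library
        # Reusing a saved id resumes that session, so no turn is "first".
        self.first_turn = session_id is None
        self.session_id = session_id or uuid.uuid4().hex

    def ask(self, user_input):
        """Send one turn, then mark the conversation as started."""
        result = self.run_pipeline(
            user_input, self.library, self.first_turn, self.session_id
        )
        self.first_turn = False
        return result
```

To use it with the model from the snippet above, pass a small adapter such as `lambda text, lib, started, sid: model.runpipeline(text, lib, topk=1, files=[], conversationstarted=started, sessionid=sid)` as `run_pipeline`.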

Build your APP!

Please refer to the separate READMEs for tutorials on converting different coding tools to our APP:
- For PyPI Tools
- For Python Source Code from Git Repo
- For R Package

Share your APP!

If you want to share your pretrained APP with others, there are two ways.

Share docker

You can build a Docker image, push it to Docker Hub, and share the image URL in our issues. For the environment settings of your tool, please refer to BioMANIA/docker_utils/{LIB}/ to add the env files, or modify the Dockerfile to build your environment.
```bash
# cd BioMANIA
sudo docker build --build-arg LIB=[yourtoolname] -t [dockerimagename] -f Dockerfile ./

# (optional)push to docker
sudo docker push [yourdockerrepo]/[dockerimagename]:[tag]
```

Note that if you want to include data inside the Docker image, please modify the Dockerfile carefully to copy the folders to /app. Also add your PyPI or Git pip install URL to requirements.txt before packaging the image.

Share data/models

You can simply share your data and hugging_models folders and logo image via a drive link in our issues.

Reference and Acknowledgments

We extend our gratitude to the following references:
- Toolbench
- Chatbot-UI
- SentenceTransformers
- Topical-Chat-data
- ChitChat-data
- lit-llama
- ollama

Thank you for choosing BioMANIA. We hope this guide assists you in navigating through our project with ease.

Version History

v1.1.12 (2024-10-16)
- Update code scripts & upload data and models & update docker which are aligned with paper.
- Will renew the scripts for generating report, documents for Git2APP, R2APP soon.
- Update report generation.
- Update R2APP and Git2APP document.

view version_history for more details!


Citation

Please cite our paper if you find our data, model or code useful.

@article{dong2023biomania,
  title={BioMANIA: Simplifying bioinformatics data analysis through conversation},
  author={Dong, Zhengyuan and Zhong, Victor and Lu, Yang},
  journal={bioRxiv},
  pages={2023--10},
  year={2023},
  publisher={Cold Spring Harbor Laboratory}
}

Owner

Name: BATMEN Lab @ UWaterloo
Login: batmen-lab
Kind: user
Company: UWaterloo CS

Repositories: 7
Profile: https://github.com/batmen-lab

GitHub Events

Total

Issues event: 4
Watch event: 6
Member event: 2
Issue comment event: 3
Push event: 14
Fork event: 1
Create event: 1

Last Year

Issues event: 4
Watch event: 6
Member event: 2
Issue comment event: 3
Push event: 14
Fork event: 1
Create event: 1

Issues and Pull Requests

Last synced: 10 months ago

All Time

Total issues: 3
Total pull requests: 0
Average time to close issues: 22 days
Average time to close pull requests: N/A
Total issue authors: 3
Total pull request authors: 0
Average comments per issue: 0.33
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 1
Pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 1
Pull request authors: 0
Average comments per issue: 0.0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0


Top Authors

Issue Authors

mkojima123 (1)
noob000007 (1)
zhangdudu0 (1)
jason-c-kwan (1)
BELKHIR (1)
inspirewind (1)
mkiourlappou (1)
AntonioBaeza (1)


Science Score: 36.0%
