biochatter

Backend library for conversational AI in biomedicine

https://github.com/biocypher/biochatter

Science Score: 77.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 1 DOI reference(s) in README
  • Academic publication links
    Links to: nature.com, zenodo.org
  • Committers with academic emails
    1 of 22 committers (4.5%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.0%) to scientific vocabulary

Keywords

biocypher chatbot knowledge-graph llm retrieval-augmented-generation vector-database

Keywords from Contributors

surrogate standardization hack
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: biocypher
  • License: mit
  • Language: Jupyter Notebook
  • Default Branch: main
  • Homepage: http://biochatter.org/
  • Size: 197 MB
Statistics
  • Stars: 178
  • Watchers: 7
  • Forks: 50
  • Open Issues: 113
  • Releases: 85
Topics
biocypher chatbot knowledge-graph llm retrieval-augmented-generation vector-database
Created over 2 years ago · Last pushed 7 months ago
Metadata Files
Readme Contributing License Code of conduct Citation

README.md

BioChatter

| | | | |
| --- | --- | --- | --- |
| License | License: MIT | Python | Python |
| Package | PyPI version Downloads DOI | Build status | CI Docs |
| Tests | Coverage | Docker | Latest image Image size |
| Development | Project Status: Active – The project has reached a stable, usable state and is being actively developed. Code style Ruff | Contributions | PRs Welcome Contributor Covenant |

Description

🤖 BioChatter is a community-driven Python library that connects biomedical applications to conversational AI, making it easy to leverage generative AI models in the biomedical domain.

🌟 Key Features

  • Generic backend for biomedical AI applications
  • Seamless integration with multiple LLM providers
  • Native connection to BioCypher knowledge graphs
  • Extensive testing and evaluation framework
  • Living benchmark of specific biomedical applications

🚀 Demo Applications and Utilities

📖 Learn more in our paper.

Installation

To use the package, install it from PyPI, for instance using pip (pip install biochatter) or Poetry (poetry add biochatter).
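
After installing, it can help to confirm what ended up in the environment. The following is a minimal sketch using only the Python standard library. The helper names `installed_version` and `python_supported` are hypothetical and not part of BioChatter, and the supported interpreter range (`>=3.10,<3.12`) is taken from the `pyproject.toml` snapshot listed under Dependencies on this page, so it may not match current releases.

```python
from importlib import metadata
from typing import Optional, Tuple


def installed_version(dist_name: str = "biochatter") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def python_supported(version: Tuple[int, int]) -> bool:
    """Check a (major, minor) pair against the assumed range >=3.10,<3.12."""
    return (3, 10) <= version < (3, 12)


if __name__ == "__main__":
    import sys

    print("biochatter version:", installed_version())
    print("interpreter in assumed range:", python_supported(sys.version_info[:2]))
```

A result of `None` from `installed_version()` simply means the package is not visible in the active environment, for example because a different virtual environment is active.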

Extras

The package has some optional dependencies that can be installed using the following extras (e.g. pip install biochatter[xinference]):

  • xinference: support for querying open-source LLMs through Xorbits Inference

  • podcast: support for podcast text-to-speech (for the free Google TTS; the paid OpenAI TTS can be used without this extra)

  • streamlit: support for streamlit UI functions (used in BioChatter Light)
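
To see which of these extras appear to be usable in the current environment, one can probe for an importable module per extra. This is a sketch under assumptions: the mapping from extra name to top-level module (`xinference`, `gtts`, `streamlit`) is our guess at what each extra pulls in, not something the README states, and the function name `available_extras` is hypothetical.

```python
from importlib.util import find_spec
from typing import Dict

# Assumed mapping: extra name -> a top-level module that the extra would install.
EXTRA_MODULES: Dict[str, str] = {
    "xinference": "xinference",
    "podcast": "gtts",
    "streamlit": "streamlit",
}


def available_extras(modules: Dict[str, str] = EXTRA_MODULES) -> Dict[str, bool]:
    """Report, per extra, whether its assumed module can be found for import."""
    return {extra: find_spec(module) is not None for extra, module in modules.items()}


if __name__ == "__main__":
    for extra, present in available_extras().items():
        print(f"{extra}: {'available' if present else 'missing'}")
```

`find_spec` only locates the module without importing it, so the check stays cheap even for heavy packages.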

Usage

Check out the documentation for examples, use cases, and more information. Many common functionalities covered by BioChatter can be seen in use in the BioChatter Light code base.

🤝 Getting involved

We welcome contributions from the community, large and small! If you would like to contribute to BioChatter development, please refer to our contribution guidelines and the developer docs. :)

If you want to ask informal questions, talk about dev things, or just chat, please join our community at https://biocypher.zulipchat.com!

Imposter syndrome disclaimer: We want your help. No, really. There may be a little voice inside your head that is telling you that you're not ready, that you aren't skilled enough to contribute. We assure you that the little voice in your head is wrong. Most importantly, there are many valuable ways to contribute besides writing code.

This disclaimer was adapted from the Pooch project.

Git LFS Configuration

This repository uses Git LFS for some large files. If you're a developer and don't need to work with these files, you have two options:

  1. Disable Git LFS smudge globally (set once for all repositories):

         git lfs install --skip-smudge
         git clone https://github.com/biocypher/biochatter.git

  2. Skip LFS files for a one-time clone:

         GIT_LFS_SKIP_SMUDGE=1 git clone https://github.com/biocypher/biochatter.git

Both options will prevent Git LFS from downloading the large files while still allowing you to work with the repository normally.
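
The one-time variant can also be scripted, for example in a setup or CI helper. The sketch below mirrors option 2 by setting `GIT_LFS_SKIP_SMUDGE=1` only for the child process. The function names are hypothetical, and the script assumes `git` is on the `PATH`.

```python
import os
import subprocess
from typing import Dict, List, Mapping, Optional

REPO_URL = "https://github.com/biocypher/biochatter.git"


def clone_command(url: str = REPO_URL, dest: Optional[str] = None) -> List[str]:
    """Build the git clone command as an argument list (no shell involved)."""
    cmd = ["git", "clone", url]
    if dest:
        cmd.append(dest)
    return cmd


def skip_smudge_env(base: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Copy an environment and tell Git LFS not to download large files."""
    env = dict(os.environ if base is None else base)
    env["GIT_LFS_SKIP_SMUDGE"] = "1"
    return env


def clone_without_lfs(url: str = REPO_URL, dest: Optional[str] = None) -> None:
    """Clone the repository while leaving LFS pointers unresolved."""
    subprocess.run(clone_command(url, dest), env=skip_smudge_env(), check=True)


if __name__ == "__main__":
    clone_without_lfs()
```

Because the variable is set only in the copied environment, other repositories and later Git commands in the same shell are unaffected.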

More information about LLMs

Check out this repository for more info on computational biology usage of large language models.

Citation

If you use BioChatter in your work, please cite our paper.

Owner

  • Name: biocypher
  • Login: biocypher
  • Kind: organization

Citation (CITATION.cff)

cff-version: 1.2.0
message: >-
  If you use this software, please cite it using the
  preferred-citation below.
type: software
preferred-citation:
  type: article
  title: A platform for the biomedical application of large language models
  authors:
    - given-names: Sebastian
      family-names: Lobentanzer
      orcid: 'https://orcid.org/0000-0003-3399-6695'
    - given-names: Shaohong
      family-names: Feng
    - given-names: Noah
      family-names: Bruderer
    - given-names: Andreas
      family-names: Maier
    - name: The BioChatter Consortium
      type: consortium
    - given-names: Cankun
      family-names: Wang
    - given-names: Jan
      family-names: Baumbach
    - given-names: Jorge
      family-names: Abreu-Vicente
    - given-names: Nils
      family-names: Krehl
    - given-names: Qin
      family-names: Ma
    - given-names: Thomas
      family-names: Lemberger
    - given-names: Julio
      family-names: Saez-Rodriguez
      orcid: 'https://orcid.org/0000-0002-8552-8976'
  doi: 10.1038/s41587-024-02534-3
  journal: Nature Biotechnology
  year: 2025
repository: 'https://github.com/biocypher/biochatter'
repository-code: 'https://github.com/biocypher/biochatter'
repository-artifact: 'https://pypi.org/project/biochatter/'
url: 'https://biochatter.org/'
license: MIT
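
As an illustration of how the `preferred-citation` fields above can be turned into a reference line, here is a small sketch. The field values are copied from the CITATION.cff shown above; the output format is an ad hoc choice for illustration, not an official citation style, and the helper names are hypothetical.

```python
# Values copied from the preferred-citation block of the CITATION.cff above.
CITATION = {
    "title": "A platform for the biomedical application of large language models",
    "first_author": {"given": "Sebastian", "family": "Lobentanzer"},
    "journal": "Nature Biotechnology",
    "year": 2025,
    "doi": "10.1038/s41587-024-02534-3",
}


def doi_url(doi: str) -> str:
    """Turn a bare DOI into a resolvable https://doi.org/ link."""
    return f"https://doi.org/{doi}"


def short_reference(citation: dict) -> str:
    """Format a compact 'first author et al.' reference (ad hoc style)."""
    author = citation["first_author"]
    return (
        f"{author['family']} {author['given'][0]}, et al. "
        f"{citation['title']}. {citation['journal']} ({citation['year']}). "
        f"{doi_url(citation['doi'])}"
    )


if __name__ == "__main__":
    print(short_reference(CITATION))
```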

GitHub Events

Total
  • Create event: 62
  • Issues event: 56
  • Release event: 15
  • Watch event: 98
  • Delete event: 32
  • Member event: 3
  • Issue comment event: 106
  • Push event: 395
  • Pull request review comment event: 26
  • Pull request review event: 48
  • Pull request event: 96
  • Fork event: 26
Last Year
  • Create event: 62
  • Issues event: 56
  • Release event: 15
  • Watch event: 98
  • Delete event: 32
  • Member event: 3
  • Issue comment event: 106
  • Push event: 395
  • Pull request review comment event: 26
  • Pull request review event: 48
  • Pull request event: 96
  • Fork event: 26

Committers

Last synced: 9 months ago

All Time
  • Total Commits: 1,048
  • Total Committers: 22
  • Avg Commits per committer: 47.636
  • Development Distribution Score (DDS): 0.258
Past Year
  • Commits: 340
  • Committers: 15
  • Avg Commits per committer: 22.667
  • Development Distribution Score (DDS): 0.271
Top Committers
Name Email Commits
slobentanzer s****r@g****m 778
fengsh s****8@g****m 63
Tatan47 n****r@g****m 30
Nils Krehl n****l@p****e 29
drAbreu j****u@e****g 23
Luna Zetsche l****e@s****e 18
Tatan47 5****7 15
Yuyao Song 4****8 13
AndiMajore a****e@g****m 13
fengsh27 1****7 11
emmaver e****n@h****m 9
melis m****i@s****e 9
marlis@engelke.me m****s@e****e 8
Yasmin Tehranchian y****n@h****e 8
Patrick Dittrich s****r@g****m 6
Ubuntu u****u@u****l 6
Francesco Carli f****4@g****m 3
github-actions[bot] 4****] 2
Ikko Eltociear Ashimine e****r@g****m 1
Newton Winter 7****t 1
sturmcmaper s****r@g****m 1
let20 t****e@b****m 1
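
The all-time summary figures can be reproduced from the counts on this page. The sketch below assumes the Development Distribution Score is defined as one minus the top committer's share of all commits; that definition is an assumption on our part, though it does reproduce the reported all-time value. The function names are hypothetical.

```python
def average_commits(total_commits: int, committers: int) -> float:
    """Mean number of commits per committer."""
    return total_commits / committers


def development_distribution_score(top_commits: int, total_commits: int) -> float:
    """Assumed definition: 1 - (commits by the top committer / all commits)."""
    return 1 - top_commits / total_commits


if __name__ == "__main__":
    # All time: 1,048 commits, 22 committers, top committer with 778 commits.
    print(round(average_commits(1048, 22), 3))
    print(round(development_distribution_score(778, 1048), 3))
    # Past year: 340 commits across 15 committers.
    print(round(average_commits(340, 15), 3))
```

A score near 0 means a single committer authored almost everything, while values closer to 1 indicate that work is spread across many people. The past-year score cannot be recomputed here because the page lists per-committer counts only for all time.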

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 110
  • Total pull requests: 132
  • Average time to close issues: 2 months
  • Average time to close pull requests: 12 days
  • Total issue authors: 17
  • Total pull request authors: 24
  • Average comments per issue: 0.85
  • Average comments per pull request: 1.33
  • Merged pull requests: 98
  • Bot issues: 0
  • Bot pull requests: 6
Past Year
  • Issues: 47
  • Pull requests: 91
  • Average time to close issues: 23 days
  • Average time to close pull requests: 6 days
  • Issue authors: 13
  • Pull request authors: 16
  • Average comments per issue: 0.79
  • Average comments per pull request: 1.25
  • Merged pull requests: 69
  • Bot issues: 0
  • Bot pull requests: 6
Top Authors
Issue Authors
  • slobentanzer (119)
  • nilskre (6)
  • fcarli (5)
  • mengerj (4)
  • winternewt (3)
  • anisdismail (2)
  • fengsh27 (2)
  • kvitoslava-yarish (2)
  • vd-dragan21 (1)
  • emmaver (1)
  • WagnerJon (1)
  • sergeifedorenko (1)
  • Zeno-kimono (1)
  • mbaric758 (1)
  • noahbruderer (1)
Pull Request Authors
  • slobentanzer (65)
  • fengsh27 (31)
  • fcarli (20)
  • nilskre (9)
  • mengerj (8)
  • HansJarchow (8)
  • bastienchassagnol (7)
  • github-actions[bot] (6)
  • kvitoslava-yarish (6)
  • emmaver (4)
  • Tatan47 (4)
  • drAbreu (3)
  • AndiMajore (2)
  • AmanSCoder (2)
  • cpommier (2)
Top Labels
Issue Labels
model request (8) bug (7) enhancement (2) question (1)
Pull Request Labels
enhancement (12) documentation (6)

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 500 last-month
  • Total docker downloads: 12
  • Total dependent packages: 0
  • Total dependent repositories: 1
  • Total versions: 78
  • Total maintainers: 1
pypi.org: biochatter

Backend library for conversational AI in biomedicine

  • Versions: 78
  • Dependent Packages: 0
  • Dependent Repositories: 1
  • Downloads: 500 Last month
  • Docker Downloads: 12
Rankings
Downloads: 8.9%
Dependent packages count: 10.1%
Average: 13.5%
Dependent repos count: 21.5%
Maintainers (1)
Last synced: 6 months ago

Dependencies

poetry.lock pypi
  • bump2version 1.0.1 develop
  • exceptiongroup 1.1.2 develop
  • iniconfig 2.0.0 develop
  • pluggy 1.2.0 develop
  • pytest 7.4.0 develop
  • tomli 2.0.1 develop
  • aiohttp 3.8.5
  • aiosignal 1.3.1
  • altair 5.0.1
  • async-timeout 4.0.3
  • attrs 23.1.0
  • blinker 1.6.2
  • cachetools 5.3.1
  • certifi 2023.7.22
  • charset-normalizer 3.2.0
  • click 8.1.6
  • colorama 0.4.6
  • dataclasses-json 0.5.14
  • decorator 5.1.1
  • environs 9.5.0
  • filelock 3.12.2
  • frozenlist 1.4.0
  • fsspec 2023.6.0
  • gitdb 4.0.10
  • gitpython 3.1.32
  • greenlet 2.0.2
  • grpcio 1.53.0
  • gtts 2.3.2
  • huggingface-hub 0.16.4
  • idna 3.4
  • importlib-metadata 6.8.0
  • jinja2 3.1.2
  • joblib 1.3.2
  • jsonschema 4.19.0
  • jsonschema-specifications 2023.7.1
  • langchain 0.0.147
  • markdown-it-py 3.0.0
  • markupsafe 2.1.3
  • marshmallow 3.20.1
  • mdurl 0.1.2
  • multidict 6.0.4
  • mypy-extensions 1.0.0
  • nltk 3.8.1
  • numexpr 2.8.5
  • numpy 1.25.2
  • openai 0.27.8
  • openapi-schema-pydantic 1.2.4
  • packaging 23.1
  • pandas 2.0.3
  • pillow 9.5.0
  • protobuf 4.24.0
  • py 1.11.0
  • pyarrow 12.0.1
  • pydantic 1.10.12
  • pydeck 0.8.0
  • pygments 2.16.1
  • pymilvus 2.2.8
  • pympler 1.0.1
  • pymupdf 1.22.5
  • python-dateutil 2.8.2
  • python-dotenv 1.0.0
  • pytz 2023.3
  • pytz-deprecation-shim 0.1.0.post0
  • pyyaml 6.0.1
  • redis 4.6.0
  • referencing 0.30.2
  • regex 2023.8.8
  • requests 2.31.0
  • retry 0.9.2
  • rich 13.5.2
  • rpds-py 0.9.2
  • safetensors 0.3.2
  • six 1.16.0
  • smmap 5.0.0
  • sqlalchemy 1.4.49
  • streamlit 1.25.0
  • tenacity 8.2.2
  • tiktoken 0.4.0
  • tokenizers 0.13.3
  • toml 0.10.2
  • toolz 0.12.0
  • tornado 6.3.2
  • tqdm 4.66.1
  • transformers 4.31.0
  • typing-extensions 4.7.1
  • typing-inspect 0.9.0
  • tzdata 2023.3
  • tzlocal 4.3.1
  • ujson 5.8.0
  • urllib3 2.0.4
  • validators 0.21.2
  • watchdog 3.0.0
  • yarl 1.9.2
  • zipp 3.16.2
pyproject.toml pypi
  • gTTS ^2.3.2
  • langchain 0.0.147
  • nltk ^3.8.1
  • openai ^0.27.8
  • pymilvus 2.2.8
  • pymupdf ^1.22.3
  • python >=3.10,<3.12
  • redis ^4.5.5
  • retry ^0.9.2
  • streamlit ^1.23.1
  • tiktoken ^0.4.0
  • transformers ^4.30.2