https://github.com/cbib/trialmatchai

TrialMatchAI aims to seamlessly match cancer patients to clinical trials based on their unique genomic and clinical profiles using AI

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (14.1%) to scientific vocabulary

Keywords

artificial-intelligence cancer-genomics clinical-trials large-language-models personalized-medicine personalized-recommendation python
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: cbib
  • License: other
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 48.6 MB
Statistics
  • Stars: 14
  • Watchers: 5
  • Forks: 1
  • Open Issues: 0
  • Releases: 0
Created almost 2 years ago · Last pushed 6 months ago
Metadata Files
Readme License

README.md

TrialMatchAI

An AI-driven tool designed to match patients with the most relevant clinical trials. Leveraging state-of-the-art Large Language Models (LLMs), Natural Language Processing (NLP), and Explainable AI (XAI), TrialMatchAI structures trial documentation and patient data to provide transparent, personalized recommendations.


⚠️ Disclaimer

At this stage, TrialMatchAI is still under active development and largely a prototype provided for research and informational purposes only. It is NOT medical advice and should not replace consultation with qualified healthcare professionals.


🔍 Key Features

  • AI-Powered Matching: Utilizes advanced LLMs to parse complex eligibility criteria and patient records (including unstructured notes and genetic reports).
  • Personalized Recommendations: Tailors trial suggestions based on each patient’s unique clinical history and genomic profile.
  • Explainable Insights: Provides clear, chain-of-thought explanations for every recommended trial, enhancing trust and interpretability.
  • Real-Time Updates: Maintains an up-to-date database of recruiting trials.
  • Scalable Architecture: Dockerized components enable easy deployment of Elasticsearch indices and indexing pipelines.

⚙️ System Requirements

  • OS: Linux or macOS
  • Docker & Docker Compose: For running the Elasticsearch container
  • Python: ≥ 3.8
  • GPU: NVIDIA (e.g., H100) with ≥ 60 GB VRAM (recommended for large-scale processing)
  • Disk Space: ≥ 100 GB free (for data and indices)
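
The requirements above can be checked before starting the installation. The following is a minimal sketch using only the Python standard library; it is not part of the repository, and the function name `check_requirements` is hypothetical. It covers the OS, Python version, Docker, and disk-space items. The GPU item is left out because checking VRAM needs vendor tooling or a GPU library.

```python
import platform
import shutil
import sys

MIN_PYTHON = (3, 8)                 # "Python: >= 3.8" from the list above
MIN_FREE_GB = 100                   # "Disk Space: >= 100 GB free" from the list above
SUPPORTED_OS = {"Linux", "Darwin"}  # platform.system() values for Linux and macOS


def check_requirements(path="."):
    """Return a list of human-readable problems; an empty list means every check passed."""
    problems = []
    if platform.system() not in SUPPORTED_OS:
        problems.append(f"unsupported OS: {platform.system()}")
    if sys.version_info[:2] < MIN_PYTHON:
        found = f"{sys.version_info.major}.{sys.version_info.minor}"
        problems.append(f"Python {found} is older than 3.8")
    if shutil.which("docker") is None:
        problems.append("docker executable not found on PATH")
    # disk_usage reports bytes; convert to decimal gigabytes for the comparison
    free_gb = shutil.disk_usage(path).free / 1e9
    if free_gb < MIN_FREE_GB:
        problems.append(f"only {free_gb:.0f} GB free at {path!r}, need {MIN_FREE_GB}")
    return problems


for problem in check_requirements():
    print("WARNING:", problem)
```

Run it from the directory where the data and indices will be stored, so that the disk-space check measures the right filesystem.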

🚀 Installation & Setup

  1. Clone the Repository
    ```bash
    git clone https://github.com/cbib/TrialMatchAI.git
    cd TrialMatchAI
    ```

  2. Ensure the Repository Is Up to Date
    ```bash
    git pull origin main
    ```

  3. Make the Setup Script Executable
    ```bash
    chmod +x setup.sh
    ```

  4. (Optional) Configure Elasticsearch Password

    • Open the .env file located in the docker/ folder.
    • Update the ELASTIC_PASSWORD variable to your desired secure password.
      ```dotenv
      # docker/.env
      ELASTIC_PASSWORD=YourNewPassword
      ```

  4a. (Optional) Sync config.json Password
    If you updated ELASTIC_PASSWORD above, open config.json in the repo root and update the Elasticsearch password field to match:
    ```json
    {
      "elasticsearch": {
        "host": "https://localhost:9200",
        "username": "elastic",
        "password": "YourNewPassword",
        ...
      },
      ...
    }
    ```
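
Because the password lives in two files, the two copies can drift apart. The sketch below compares them. It assumes the file locations and key names shown in the two snippets above (`docker/.env`, `config.json`, `ELASTIC_PASSWORD`, `elasticsearch.password`); the helper names are hypothetical and not part of the repository.

```python
import json
from pathlib import Path


def read_env_password(env_text):
    """Return the ELASTIC_PASSWORD value from .env-style text, or None if it is absent."""
    for line in env_text.splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue  # skip comments and lines that are not KEY=VALUE
        key, _, value = line.partition("=")
        if key.strip() == "ELASTIC_PASSWORD":
            return value.strip().strip('"').strip("'")
    return None


def passwords_in_sync(env_text, config):
    """True when the .env password equals config['elasticsearch']['password']."""
    env_password = read_env_password(env_text)
    config_password = config.get("elasticsearch", {}).get("password")
    return env_password is not None and env_password == config_password


def check_repo(root="."):
    """Read both files below the repository root and compare the two passwords."""
    root = Path(root)
    env_text = (root / "docker" / ".env").read_text()
    config = json.loads((root / "config.json").read_text())
    return passwords_in_sync(env_text, config)
```

Calling `check_repo()` from the repository root returns `False` when the two values differ, which is a cue to revisit step 4a before running the setup script.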

  5. Run the Setup Script
    ```bash
    ./setup.sh
    ```
    • Installs Python dependencies
    • Downloads datasets, resources, and model archives from Zenodo
    • Verifies GPU availability
    • Builds the Elasticsearch container via Docker Compose
    • Launches indexing pipelines in the background
    • Estimated Time: ~60–90 minutes (depending on hardware)
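
Since the setup script starts the Elasticsearch container and launches indexing in the background, it can be useful to confirm that the cluster is reachable before running the matcher. The sketch below queries Elasticsearch's `_cluster/health` endpoint with HTTP Basic authentication, using only the standard library. The function names are hypothetical. The `verify_tls` switch rests on an assumption: a local container may present a self-signed certificate, which the source does not confirm.

```python
import base64
import json
import ssl
import urllib.request


def build_health_request(host, username, password):
    """Build a GET request for the _cluster/health endpoint with HTTP Basic auth."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    request = urllib.request.Request(host.rstrip("/") + "/_cluster/health")
    request.add_header("Authorization", f"Basic {token}")
    return request


def cluster_status(host, username, password, verify_tls=True, timeout=10):
    """Return the cluster status string reported by Elasticsearch (green, yellow or red)."""
    context = ssl.create_default_context()
    if not verify_tls:
        # Assumption: the local container uses a self-signed certificate.
        # Only disable verification when talking to localhost.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    request = build_health_request(host, username, password)
    with urllib.request.urlopen(request, context=context, timeout=timeout) as response:
        return json.load(response)["status"]
```

With the values from the config.json example, a call would look like `cluster_status("https://localhost:9200", "elastic", "YourNewPassword", verify_tls=False)`. A connection error here usually means the container is not up yet.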

🎯 Usage Example

Run the matcher on a sample input directory:

```bash
python -m src.Matcher.main
```

Results are saved under results/, with detailed criterion-level explanations for each recommended trial.
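
The README does not describe the layout or file format of the results/ directory, so the sketch below only enumerates whatever files a run produced. The function name is hypothetical, and nothing is assumed about file names or contents.

```python
from pathlib import Path


def list_result_files(results_dir="results"):
    """Return every file below results_dir as a sorted list of paths relative to it."""
    root = Path(results_dir)
    if not root.is_dir():
        return []  # nothing has been written yet, or the matcher was run elsewhere
    return sorted(p.relative_to(root) for p in root.rglob("*") if p.is_file())


for path in list_result_files():
    print(path)
```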


🤝 Contributing

We welcome community contributions! To contribute:

  1. Fork the repository.
  2. Create a feature branch: git checkout -b feature/YourFeature.
  3. Commit your changes and push to your branch.
  4. Open a Pull Request against main.

Please follow our code style and include tests where applicable.


🙋 Support & Contact

For questions, issues, or feature requests, open an issue on GitHub or reach out to the maintainers.

Owner

  • Name: Centre de Bioinformatique de Bordeaux
  • Login: cbib
  • Kind: organization
  • Location: Université de Bordeaux (146, rue Léo Saignat 33076 Bordeaux cedex)

GitHub Events

Total
  • Watch event: 10
  • Delete event: 5
  • Member event: 2
  • Push event: 23
  • Pull request event: 2
  • Fork event: 1
  • Create event: 3
Last Year
  • Watch event: 10
  • Delete event: 5
  • Member event: 2
  • Push event: 23
  • Pull request event: 2
  • Fork event: 1
  • Create event: 3

Dependencies

requirements.txt pypi
  • FlagEmbedding ==1.3.4
  • FlagEmbedding ==1.3.3
  • Jinja2 ==3.1.6
  • Requests ==2.32.3
  • bert_score ==0.3.13
  • bioregistry ==0.11.35
  • colorcet ==3.1.0
  • datasets ==2.19.0
  • elasticsearch ==9.0.0
  • faiss_cpu ==1.9.0.post1
  • falcon ==4.0.2
  • filelock ==3.18.0
  • gliner ==0.2.16
  • joblib ==1.4.2
  • langchain ==0.3.23
  • langchain_community ==0.3.21
  • langchain_core ==0.3.52
  • langchain_openai ==0.3.13
  • local_gemma ==0.2.0
  • matplotlib ==3.10.1
  • nltk ==3.9.1
  • numpy ==2.2.4
  • openpyxl ==3.1.5
  • optimum ==1.23.3
  • pandas ==2.2.3
  • peft ==0.14.0
  • pycountry ==24.6.1
  • pydantic ==2.11.3
  • pymongo ==4.12.0
  • python-dotenv ==1.1.0
  • python_dateutil ==2.9.0.post0
  • rapidfuzz ==3.13.0
  • rich ==14.0.0
  • scikit_learn ==1.6.1
  • scipy ==1.15.2
  • seaborn ==0.13.2
  • seqeval ==1.2.2
  • spacy ==3.7.5
  • statsmodels ==0.14.4
  • tenacity ==9.0.0
  • torch ==2.5.1
  • tqdm ==4.67.1
  • transformers ==4.49.0
elasticsearch/docker-compose.yml docker
  • docker.elastic.co/elasticsearch/elasticsearch ${STACK_VERSION}
  • docker.elastic.co/kibana/kibana ${STACK_VERSION}
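
The requirements.txt listing above shows FlagEmbedding pinned to two different versions (1.3.4 and 1.3.3). Whether this reflects the actual file or is an artifact of how the listing was generated is not clear from this page. The sketch below is a hypothetical helper, not part of the repository, that scans requirements text for packages pinned with `==` to more than one version.

```python
import re
from collections import defaultdict


def conflicting_pins(requirements_text):
    """Map each package pinned with '==' to several versions onto the sorted version strings."""
    versions = defaultdict(set)
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if "==" not in line:
            continue
        name, _, version = line.partition("==")
        # Normalise names as PEP 503 does, so 'scikit_learn' and 'scikit-learn' compare equal
        key = re.sub(r"[-_.]+", "-", name.strip()).lower()
        versions[key].add(version.strip())
    # Versions are sorted as plain strings, which is enough for reporting purposes
    return {name: sorted(found) for name, found in versions.items() if len(found) > 1}
```

To apply it to the repository, pass in the file contents: `conflicting_pins(open("requirements.txt").read())`.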