Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (15.2%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: CEDARS-NLP
- Language: Python
- Default Branch: main
- Homepage: https://pines.ai/
- Size: 22.5 MB
Statistics
- Stars: 3
- Watchers: 0
- Forks: 0
- Open Issues: 3
- Releases: 0
Metadata Files
README.md
Overview
Background
PINES (Progressive Inference Networked Episodic Service) is a natural language processing (NLP) package for detecting clinical events in the electronic health record (EHR). It provides specialized functions and a dedicated application programming interface (API) so that it can run as a service integrated with a CEDARS instance, although it can also be used as a standalone tool. PINES is an open-source Python package released under the GPL-3 license. The latest package and prior versions can be cloned from GitHub. Full documentation is available here. Please see the Terms of Use before using this software. PINES is provided as-is with no guarantee whatsoever, and users agree to be held responsible for compliance with their local government and institutional regulations.
General Requirements
Local installation
- Python 3.9 or later
- poetry
Docker installation
- Docker
Installation
Local
To install the package locally, run the following commands:
```bash
git clone https://github.com/CEDARS-NLP/PINES.git
cd PINES
poetry install # this will install all required packages
poetry run python pines.py # this will run the package
```
Docker
```bash
git clone https://github.com/CEDARS-NLP/PINES.git
cd PINES # the build context (.) must be the repository root
docker build -t pines-api .
docker run -dp 127.0.0.1:8036:8036 pines-api
```
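Because the container starts detached (`-d`), it can be useful to wait until the published port accepts connections before sending requests. The following is a minimal sketch using only the Python standard library; the helper name `wait_for_port` and its defaults are illustrative and not part of PINES. An open port only shows that something is listening. It does not confirm that the model has finished loading.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout.

    Hypothetical helper for illustration; not part of the PINES package.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError while nothing is listening yet.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For the port mapping used above, a call such as `wait_for_port("127.0.0.1", 8036, timeout=60.0)` would block for up to a minute and report whether the port opened.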
Basic Concepts
Input: Clinical Note
Output: Label, Score
We fine-tuned the clinical-longformer model [@li2023comparative] on our dataset. Clinical-longformer starts from a Longformer checkpoint and was further pre-trained on the MIMIC-III dataset. After fine-tuning, the model predicts whether a label is present in a new clinical note and outputs a score that measures its confidence in that prediction.
Note: The trained models are not open-source and are not included in the repository. Please email the authors for access to the trained models.
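The README does not state how the score is computed. As a rough illustration only, the sketch below assumes the common convention for sequence classifiers: a softmax over the raw class logits, with the top class reported as `LABEL_<index>` and its probability as the score. The function name `logits_to_prediction` and this convention are assumptions, not a description of the PINES internals.

```python
import math
from typing import Dict, List, Union


def logits_to_prediction(logits: List[float]) -> Dict[str, Union[str, float]]:
    """Convert raw class logits into a {"label", "score"} pair via softmax.

    Illustrative only: assumes softmax scoring and LABEL_<index> naming.
    """
    # Subtract the largest logit before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Report the most probable class and its probability.
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"label": f"LABEL_{best}", "score": probs[best]}
```

Under this assumption, a binary model whose positive-class logit is well above the negative-class logit yields `LABEL_1` with a score close to 1.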
Model Card
- VTE Detection Model
| Property | Value |
| --- | --- |
| Model Name | vte-longformer-4k-cedars |
| Model Version | 1.0 |
| Model Type | Longformer |
| Context Length | 4096 |
| Training Data | Internal MSKCC dataset |
- Metastatic Disease Detection Model
| Property | Value |
| --- | --- |
| Model Name | mets-longformer-4k-pycedars |
| Model Version | 1.0 |
| Model Type | Longformer |
| Model Size | 4k |
| Training Data | Internal MSKCC dataset |
Operational Schema

PINES can be run as a standalone service or as part of a CEDARS deployment. The standalone service can be run as a Docker container or as a local installation.
In all deployments, the service can be accessed via a REST API.
Sample Code
Detection of metastatic disease in a clinical note.
Using HTTPie
```bash
http POST http://localhost:8036/predict text="The patient had metastases."
```
Using curl
```bash
curl -X POST "http://localhost:8036/predict" \
  -H "accept: application/json" \
  -H "Content-Type: application/json" \
  -d "{\"text\":\"The patient had metastases.\"}"
```
Output
```json
{
  "model": "mets-longformer-4k-pycedars",
  "prediction": {
    "label": "LABEL_1",
    "score": 0.9969003200531006
  }
}
```
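The same request can be issued from Python. The sketch below uses only the standard library and assumes the endpoint, request body, and response shape shown in the examples above. The function names (`build_predict_request`, `parse_prediction`, `predict`) are illustrative and not part of the PINES package.

```python
import json
import urllib.request
from typing import Tuple

# Endpoint taken from the sample calls above; adjust for your deployment.
PINES_URL = "http://localhost:8036/predict"


def build_predict_request(text: str, url: str = PINES_URL) -> urllib.request.Request:
    """Build a JSON POST request carrying the clinical note text."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def parse_prediction(body: bytes) -> Tuple[str, str, float]:
    """Extract (model, label, score) from a response shaped like the sample output."""
    doc = json.loads(body)
    pred = doc["prediction"]
    return doc["model"], pred["label"], float(pred["score"])


def predict(text: str, url: str = PINES_URL, timeout: float = 30.0) -> Tuple[str, str, float]:
    """Send one note to a running PINES service and return the parsed prediction."""
    with urllib.request.urlopen(build_predict_request(text, url), timeout=timeout) as resp:
        return parse_prediction(resp.read())
```

With the service running, `predict("The patient had metastases.")` would return a `(model, label, score)` tuple. Splitting request building from response parsing keeps both pieces testable without a live server.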
Future Development
We are currently documenting the performance of PINES with a focus on hematology and oncology clinical research. Please contact the package author, Simon Mantha, MD, MPH (smantha@cedars.io), if you want to discuss new features or the use of this software in your clinical research application.
References
Owner
- Name: CEDARS-NLP
- Login: CEDARS-NLP
- Kind: organization
- Repositories: 1
- Profile: https://github.com/CEDARS-NLP
Citation (CITATION.cff)
# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!
cff-version: 1.2.0
title: PINES
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Rohan
    family-names: Singh
    email: singhrohan@outlook.com
    affiliation: Memorial Sloan Kettering Cancer Center
    orcid: 'https://orcid.org/0009-0003-1326-7249'
  - given-names: Simon
    family-names: Mantha
    affiliation: Memorial Sloan Kettering Cancer Center
repository-code: 'https://github.com/CEDARS-NLP/PINES'
url: 'https://pines.ai'
abstract: >-
  PINES (Progressive Inference Networked Episodic Service)
  is a natural language processing (NLP) package aimed at
  detecting clinical events in the electronic health record
  (EHR). This software suite incorporates specialized
  functions and a dedicated application programming
  interface (API) designed to facilitate its use as a
  service integrated with a CEDARS instance, even though it
  can be used as a standalone tool as well. PINES exists as
  an open-source Python package under GPL-3 license. The
  latest package and prior versions can be cloned from
  GitHub. Full documentation is available here. Please see
  the Terms of Use before using this software. PINES is
  provided as-is with no guarantee whatsoever and users
  agree to be held responsible for compliance with their
  local government/institutional regulations.
keywords:
  - nlp
  - llm
  - transformer
  - clinical
  - data labelling
  - EHR
GitHub Events
Total
- Watch event: 1
Last Year
- Watch event: 1
Dependencies
- actions/checkout v3 composite
- actions/configure-pages v3 composite
- actions/deploy-pages v2 composite
- actions/upload-pages-artifact v1 composite
- ruby/setup-ruby v1 composite
- pytorch/pytorch latest build
- annotated-types 0.6.0
- anyio 4.2.0
- boolean-py 4.0
- cachecontrol 0.14.0
- certifi 2024.2.2
- charset-normalizer 3.3.2
- click 8.1.7
- colorama 0.4.6
- cyclonedx-python-lib 6.4.1
- defusedxml 0.7.1
- exceptiongroup 1.2.0
- fastapi 0.109.2
- filelock 3.13.1
- fsspec 2024.2.0
- ghp-import 2.1.0
- h11 0.14.0
- html5lib 1.1
- huggingface-hub 0.20.3
- idna 3.6
- importlib-metadata 7.0.1
- iniconfig 2.0.0
- jinja2 3.1.3
- license-expression 30.2.0
- markdown 3.5.2
- markdown-it-py 3.0.0
- markupsafe 2.1.5
- mdurl 0.1.2
- mergedeep 1.3.4
- mkdocs 1.5.3
- mpmath 1.3.0
- msgpack 1.0.7
- networkx 3.2.1
- numpy 1.26.4
- nvidia-cublas-cu12 12.1.3.1
- nvidia-cuda-cupti-cu12 12.1.105
- nvidia-cuda-nvrtc-cu12 12.1.105
- nvidia-cuda-runtime-cu12 12.1.105
- nvidia-cudnn-cu12 8.9.2.26
- nvidia-cufft-cu12 11.0.2.54
- nvidia-curand-cu12 10.3.2.106
- nvidia-cusolver-cu12 11.4.5.107
- nvidia-cusparse-cu12 12.1.0.106
- nvidia-nccl-cu12 2.19.3
- nvidia-nvjitlink-cu12 12.3.101
- nvidia-nvtx-cu12 12.1.105
- packageurl-python 0.13.4
- packaging 23.2
- pathspec 0.12.1
- pip 24.0
- pip-api 0.0.30
- pip-audit 2.7.0
- pip-requirements-parser 32.0.1
- platformdirs 4.2.0
- pluggy 1.4.0
- py-serializable 1.0.0
- pydantic 2.6.1
- pydantic-core 2.16.2
- pygments 2.17.2
- pyparsing 3.1.1
- pytest 7.4.4
- python-dateutil 2.8.2
- pyyaml 6.0.1
- pyyaml-env-tag 0.1
- regex 2023.12.25
- requests 2.31.0
- rich 13.7.0
- safetensors 0.4.2
- six 1.16.0
- sniffio 1.3.0
- sortedcontainers 2.4.0
- starlette 0.36.3
- sympy 1.12
- tokenizers 0.15.1
- toml 0.10.2
- tomli 2.0.1
- torch 2.2.0
- tqdm 4.66.1
- transformers 4.37.2
- triton 2.2.0
- typing-extensions 4.9.0
- urllib3 2.2.0
- uvicorn 0.26.0
- watchdog 3.0.0
- webencodings 0.5.1
- zipp 3.17.0
- pip-audit ^2.7.0 develop
- pytest ^7.4.4 develop
- fastapi ^0.109.0
- mkdocs ^1.5.3
- python ^3.9
- torch ^2.2.0
- transformers ^4.36.2
- uvicorn ^0.26.0
- annotated-types ==0.6.0
- anyio ==4.2.0
- certifi ==2024.2.2
- charset-normalizer ==3.3.2
- click ==8.1.7
- colorama ==0.4.6
- exceptiongroup ==1.2.0
- fastapi ==0.109.2
- filelock ==3.13.1
- fsspec ==2024.2.0
- ghp-import ==2.1.0
- h11 ==0.14.0
- huggingface-hub ==0.20.3
- idna ==3.6
- importlib-metadata ==7.0.1
- jinja2 ==3.1.3
- markdown ==3.5.2
- markupsafe ==2.1.5
- mergedeep ==1.3.4
- mkdocs ==1.5.3
- numpy ==1.26.4
- packaging ==23.2
- pathspec ==0.12.1
- platformdirs ==4.2.0
- pydantic ==2.6.1
- pydantic-core ==2.16.2
- python-dateutil ==2.8.2
- pyyaml ==6.0.1
- pyyaml-env-tag ==0.1
- regex ==2023.12.25
- requests ==2.31.0
- safetensors ==0.4.2
- six ==1.16.0
- sniffio ==1.3.0
- starlette ==0.36.3
- tokenizers ==0.15.1
- tqdm ==4.66.1
- transformers ==4.37.2
- typing-extensions ==4.9.0
- urllib3 ==2.2.0
- uvicorn ==0.26.0
- watchdog ==3.0.0
- zipp ==3.17.0