https://github.com/chenghaomou/idefics2-contract-qa
Science Score: 13.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file (found)
- ○ .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (3.9%) to scientific vocabulary
Last synced: 10 months ago
Repository
Basic Info
- Host: GitHub
- Owner: ChenghaoMou
- License: mit
- Language: Jupyter Notebook
- Default Branch: main
- Size: 28.3 KB
Statistics
- Stars: 1
- Watchers: 1
- Forks: 1
- Open Issues: 0
- Releases: 0
- Created: about 2 years ago
- Last pushed: almost 2 years ago
Metadata Files
- Readme
- License
README.md
Fine-tuning Idefics2 on EDGAR Contract QA Dataset
See Blog for more details.
- dataset.ipynb prepares the dataset.
- train.py fine-tunes the model on the dataset.
- benchmark.ipynb evaluates the model on the test dataset.
Datasets
chenghao/sec-material-contracts-qa-splitted consists of the following data:
1. chenghao/sec-material-contracts-qa
2. jordyvl/DUDEsubset100val
Data splits: train (80%), test (20%)
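An 80/20 split like the one above can be sketched with the standard library alone. This is a minimal illustration, not the code in dataset.ipynb: the function name `split_train_test`, the seed, and the toy records are all hypothetical, and the notebook may well rely on a dataset library's own splitting helper instead.

```python
import random
from typing import Sequence, TypeVar

T = TypeVar("T")


def split_train_test(
    records: Sequence[T], test_fraction: float = 0.2, seed: int = 42
) -> tuple[list[T], list[T]]:
    """Shuffle records deterministically and return (train, test) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(records)  # copy so the caller's data is left untouched
    random.Random(seed).shuffle(shuffled)  # seeded for reproducible splits
    n_test = round(len(shuffled) * test_fraction)
    # The first n_test shuffled items become the test set, the rest the train set.
    return shuffled[n_test:], shuffled[:n_test]


# Toy records standing in for question/answer examples (hypothetical data).
examples = [{"id": i} for i in range(10)]
train_set, test_set = split_train_test(examples)
print(len(train_set), len(test_set))  # prints "8 2"
```

Fixing the seed keeps the test set stable across runs, which matters when benchmark numbers are compared between training runs.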
Model
More details can be found at idefics2-edgar. The training script runs on a single GPU (A100-80GB) when using low-resolution input and QLoRA training.
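For readers unfamiliar with QLoRA, the sketch below shows the two configuration objects such a setup usually involves: a 4-bit quantization config from `transformers` and a LoRA adapter config from `peft`. It is not taken from train.py. The function name `build_qlora_configs`, the rank, alpha, dropout, and target module names are assumptions chosen for illustration, and the low-resolution input setting is not covered here. The imports sit inside the function so the definition itself loads without `torch`, `transformers`, `peft`, or `bitsandbytes`; calling it requires all of them.

```python
def build_qlora_configs(rank: int = 8, alpha: int = 8, dropout: float = 0.1):
    """Return (quantization_config, lora_config) for a QLoRA-style setup.

    All hyperparameter defaults here are illustrative assumptions,
    not values read from the repository's training script.
    """
    # Imported lazily: these packages are only needed when the function is called.
    import torch
    from peft import LoraConfig
    from transformers import BitsAndBytesConfig

    # Keep the frozen base weights in 4-bit NF4 and compute in bfloat16.
    quantization_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_use_double_quant=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
    )

    # Train small low-rank adapters on the attention projections only.
    # The module names below are an assumption about the model's layer naming.
    lora_config = LoraConfig(
        r=rank,
        lora_alpha=alpha,
        lora_dropout=dropout,
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
        bias="none",
    )
    return quantization_config, lora_config
```

The quantization config would typically be passed as `quantization_config=` when loading the model with `from_pretrained`, and the LoRA config applied afterwards with `peft.get_peft_model`.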
Owner
- Name: Chenghao Mou
- Login: ChenghaoMou
- Kind: user
- Location: Ireland
- Website: https://sleeplessindebugging.blog/
- Repositories: 32
- Profile: https://github.com/ChenghaoMou
NLP/AI