https://github.com/artiks12/modelfinetuningpipeline

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
○ DOI references
○ Academic publication links
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (6.5%) to scientific vocabulary

Last synced: 9 months ago · JSON representation

Repository

Basic Info

Host: GitHub
Owner: artiks12
Language: Jupyter Notebook
Default Branch: main
Size: 9.77 KB

Statistics

Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Releases: 0

Created about 1 year ago · Last pushed about 1 year ago

Metadata Files

Readme

ModelFineTuningPipeline

This repository contains the pipeline used to fine-tune models. It is part of the master's thesis "Evaluation and Adaptation of Large Language Models for Question-Answering on Legislation", written at the University of Latvia.

How to Use

This script was used with Python 3.10, so that version is recommended. You also need to:

- Install the unsloth package.
- Download the llama.cpp code and binaries: https://github.com/ggml-org/llama.cpp
- Put the training and validation data in the datasets folder. The repository that creates these datasets is available here: https://github.com/artiks12/DatasetPreperation
- Specify your HuggingFace key in the key.json file.
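The prerequisites above can be checked before starting a run. The following is a minimal sketch, not part of the repository: the function name `check_setup` is hypothetical, and it assumes that the datasets folder and key.json sit in the project root. The README does not document the layout of key.json, so the sketch only verifies that the file parses as JSON.

```python
import json
import sys
from pathlib import Path


def check_setup(root=".", version=None):
    """Return a list of problems found with the setup under `root`.

    `version` defaults to the running interpreter's (major, minor) pair.
    """
    root = Path(root)
    version = tuple(version or sys.version_info[:2])
    problems = []

    # The README recommends Python 3.10; other versions are untested.
    if version != (3, 10):
        problems.append(f"Python 3.10 is recommended, found {version[0]}.{version[1]}")

    # Assumption: training and validation data live in <root>/datasets.
    datasets = root / "datasets"
    if not datasets.is_dir() or not any(datasets.iterdir()):
        problems.append("datasets folder is missing or empty")

    # Assumption: key.json sits in <root>. Its field names are not documented,
    # so only check that it exists and is valid JSON.
    key_file = root / "key.json"
    try:
        json.loads(key_file.read_text(encoding="utf-8"))
    except FileNotFoundError:
        problems.append("key.json is missing")
    except json.JSONDecodeError:
        problems.append("key.json is not valid JSON")

    return problems


if __name__ == "__main__":
    for problem in check_setup():
        print("warning:", problem)
```

An empty list means the checks passed. The check does not confirm that unsloth or llama.cpp is installed, since how those are located depends on your environment.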

Because a bug in the unsloth code prevents quantizing models to any format other than F16, quantization is handled by a separate script, QuantizeModel.py. Specify the path to your model's GGUF file and the quantization methods in that script, then run it.
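For orientation, here is a rough sketch of how an F16 GGUF file can be quantized with the `llama-quantize` tool that ships with llama.cpp. It is not the contents of QuantizeModel.py. The function names, the output naming scheme, and the assumption that `llama-quantize` is on your PATH are all illustrative, so adjust `quantize_bin` to wherever your llama.cpp binaries live.

```python
import subprocess
from pathlib import Path


def build_quantize_commands(gguf_path, methods, quantize_bin="llama-quantize"):
    """Build one llama-quantize command line per quantization method."""
    src = Path(gguf_path)
    commands = []
    for method in methods:
        # Hypothetical naming scheme: model.gguf -> model.Q4_K_M.gguf
        dst = src.with_name(f"{src.stem}.{method}.gguf")
        # llama-quantize takes: <input.gguf> <output.gguf> <type>
        commands.append([quantize_bin, str(src), str(dst), method])
    return commands


def quantize(gguf_path, methods, quantize_bin="llama-quantize"):
    """Run each command in turn, stopping at the first failure."""
    for cmd in build_quantize_commands(gguf_path, methods, quantize_bin):
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Example values only; replace with your own file and methods.
    for cmd in build_quantize_commands("model.F16.gguf", ["Q4_K_M", "Q8_0"]):
        print(" ".join(cmd))
```

Building the commands separately from running them lets you print and inspect each command before committing to a long quantization run.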

Owner

Login: artiks12
Kind: user

Repositories: 2
Profile: https://github.com/artiks12

GitHub Events

Total

Push event: 1
Public event: 1

Last Year

Push event: 1
Public event: 1
