https://github.com/artiks12/modelfinetuningpipeline

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (6.5%) to scientific vocabulary
Last synced: 9 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: artiks12
  • Language: Jupyter Notebook
  • Default Branch: main
  • Size: 9.77 KB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created about 1 year ago · Last pushed about 1 year ago
Metadata Files
Readme

README.md

ModelFineTuningPipeline

This repository contains the pipeline used to fine-tune models. It is part of the master's thesis "Evaluation and Adaptation of Large Language Models for Question-Answering on Legislation", written at the University of Latvia.

How to Use

This script was used with Python 3.10, so that version of Python is recommended. You also need to do the following:

- Install the unsloth package.
- Download the llama.cpp code and binaries: https://github.com/ggml-org/llama.cpp
- Put the training and validation data in the datasets folder. The repository that creates these datasets is available here: https://github.com/artiks12/DatasetPreperation
- Specify your HuggingFace key in the key.json file.
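The prerequisites above can be checked before starting a run. The following is a minimal sketch, not part of the repository: the function name `check_setup` is hypothetical, and the layout of key.json is an assumption (a JSON object with the token stored under a field called `key`), since the README does not describe the file's structure. Adjust `key_field` to match your actual file.

```python
import json
import sys
from pathlib import Path


def check_setup(root=".", key_field="key", python_version=None):
    """Return a list of setup problems; an empty list means nothing was flagged.

    `key_field` is an ASSUMED field name inside key.json, not taken from the repository.
    """
    root = Path(root)
    version = tuple(python_version or sys.version_info[:2])
    problems = []

    # The README recommends Python 3.10; other versions are only flagged, not rejected.
    if version != (3, 10):
        problems.append(f"Python {version[0]}.{version[1]} found; 3.10 is recommended")

    # Training and validation data are expected in the datasets folder.
    datasets = root / "datasets"
    if not datasets.is_dir() or not any(datasets.iterdir()):
        problems.append("datasets folder is missing or empty")

    # The HuggingFace key is expected in key.json (field name is an assumption).
    key_file = root / "key.json"
    if not key_file.is_file():
        problems.append("key.json is missing")
    else:
        try:
            data = json.loads(key_file.read_text(encoding="utf-8"))
        except json.JSONDecodeError:
            data = None
        if not isinstance(data, dict) or not data.get(key_field):
            problems.append(f"key.json has no non-empty '{key_field}' field")

    return problems
```

Calling `check_setup()` from the repository root returns the list of detected problems, which can be printed before training starts. It does not verify that unsloth or llama.cpp are installed.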

Because a bug in the unsloth code prevents quantizing models in any format other than F16, model quantization is handled by a separate file (QuantizeModel.py). Specify the path to your model's GGUF file and the quantization methods, then run the script.
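For orientation, the sketch below shows one way an F16 GGUF file could be quantized into several formats with llama.cpp's `llama-quantize` tool. It is not the contents of QuantizeModel.py: the function names, the output naming scheme, and the assumption that `llama-quantize` is on the PATH are all illustrative. Pass the full path to the binary if your llama.cpp build places it elsewhere.

```python
import subprocess
from pathlib import Path


def build_quantize_commands(model_path, methods, binary="llama-quantize"):
    """Build one llama-quantize command per method: <binary> <input> <output> <type>.

    The output name "<input stem>-<method>.gguf" is an illustrative convention,
    not the naming used by the repository.
    """
    model = Path(model_path)
    commands = []
    for method in methods:
        output = model.with_name(f"{model.stem}-{method}.gguf")
        commands.append([str(binary), str(model), str(output), method])
    return commands


def quantize(model_path, methods, binary="llama-quantize"):
    """Run each command in turn; raises CalledProcessError if a run fails."""
    for command in build_quantize_commands(model_path, methods, binary):
        subprocess.run(command, check=True)
```

For example, `quantize("models/model-F16.gguf", ["Q4_K_M", "Q8_0"])` would write the quantized files next to the input model, where the path is a placeholder for your own GGUF file.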

Owner

  • Login: artiks12
  • Kind: user

GitHub Events

Total
  • Push event: 1
  • Public event: 1
Last Year
  • Push event: 1
  • Public event: 1