socratic-llm

Training pipeline for fine tuning Phi-3-mini-instruct to follow the Socratic method

https://github.com/giovannigatti/socratic-llm

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (3.6%) to scientific vocabulary

Keywords

fine-tuning llm phi-3-mini
Last synced: 9 months ago · JSON representation ·

Repository

Training pipeline for fine tuning Phi-3-mini-instruct to follow the Socratic method

Basic Info
Statistics
  • Stars: 23
  • Watchers: 3
  • Forks: 2
  • Open Issues: 0
  • Releases: 0
Topics
fine-tuning llm phi-3-mini
Created almost 2 years ago · Last pushed over 1 year ago
Metadata Files
Readme License Citation

README.md

ollama

Socratic LLM

Static Badge Static Badge Static Badge Static Badge Static Badge Static Badge

Using Large Language Models (LLMs) in education presents unique challenges. Typically, LLMs are designed to provide direct answers to questions, which can hinder students' critical thinking and self-discovery skills. To address this, we focus on fine-tuning LLMs to facilitate Socratic interactions. Instead of giving straightforward answers, these models guide students to explore and find the answers themselves. We achieve this through Direct Preference Optimization (DPO). We test our approach with diverse datasets, including various educational materials and Socratic dialogues. Using advanced models like GPT-4o for evaluation, our results show that DPO successfully fine-tunes LLMs for Socratic dialogue, enhancing their educational value.

This repository contains the source material for the paper "EULER: Fine Tuning a Large Language Model for Socratic Interactions".

Finally, this project is just one piece of a broader educational initiative. At EURECOM, we are crafting chatbots designed to support students in their learning journeys. These chatbots can answer student inquiries by navigating through a wealth of educational resources from our institution, including textbooks, slides, and lecture notes. Curious to see it in action? Try chatting with EULER and experience it yourself!

Model inference

HuggingFace

It's possible to download and execute the model using HuggingFace's transformers library with:

```python from transformers import AutoTokenizer, AutoModelForCausalLM import torch

model = AutoModelForCausalLM.frompretrained( "eurecom-ds/Phi-3-mini-4k-socratic", torchdtype=torch.bfloat16, trustremotecode=True, device_map="cuda", )

tokenizer = AutoTokenizer.frompretrained("eurecom-ds/Phi-3-mini-4k-socratic", trustremote_code=True) ```

[!TIP] When using the transformers library, you need to apply the chat template available at inference.txt. Check out for more details at Phi-3-mini-4k-socratic.

Ollama

The model is also available at OllamaHub: eurecom-ds/phi-3-mini-4k-socratic. We also made available the quantized versions for memory constrained environments. Ollama allows swiftly mounting this model in a web service, or simply for local execution. For example,

```bash

Ollama installation

curl -fsSL https://ollama.com/install.sh | sh

Launching ollama service

ollama serve &

Running the quantized model locally

ollama run eurecom-ds/phi-3-mini-4k-socratic:Q4_0 ```

Check out more about Ollama here.

https://github.com/user-attachments/assets/5e7f4b66-332c-48a5-b110-6f5b1a219f39

[!TIP] The model's inference template is already managed by Ollama service (check it here). Thus, you can query the model directly: ```python from ollama import Client

client = Client(host="http://address-to-your-ollama-server:port") client.pull("eurecom-ds/phi-3-mini-4k-socratic")

userquery = f"Student: How can I stop fire?" response = client.chat(model="eurecom-ds/phi-3-mini-4k-socratic", messages=[{'role': 'user', 'content': userquery, }, ]) print(print(response['message']['content']))

To address that, let's consider it further. When thinking about stopping

something like a wildfire or even fires in general - are there mechanisms

at play besides extinguishing them directly with water or foam? What do you

think could be effective strategies to halt the process of combustion itself?

```

Chatbot

Running a chatbot

You can interact with the model using a chatbot application powered with Gradio by running a Docker container.

bash docker run --rm --gpus all -p 2121:2121 -v /home/<user>/huggingface/:/huggingface -e HF_HOME=/huggingface -it eurecomds/phi-3-mini-4k-socratic

You can specify which port the chatbot application starts with --server-port <port number> (default 2121), or load the model with 4-bit quantization by adding --load-in-4bit to the end of the above command line.

Building your own chatbot

Our model was trained to follow the Socratic method over multiple interactions. However, you need to provide a chat history to its inputs. Thus, we advise prefixing student's and professor's by role and to present them in a linear path to the model. For example, the chat history below can be used as the model's input.

text Student: What can stop a fire? Professor: Can you think of different methods or substances that might be effective in stopping a fire? Student: I could use water Professor: Water extinguishes fire due to its cooling effect and its ability to remove heat. Can you think about how the heat absorption by water might affect the fire triangle, which consists of heat, fuel, and oxygen? And considering your answer, what other methods could be effective in different scenarios? Student: Maybe using a carbon dioxide for removing oxygen?

For more details, check out how we built our chatbot at socratic_ui.py.

Scripts

We also make available evaluation scripts.

  • self_eval.py: Perform evaluation of the LLM and prompt engineering (e.g., GPT-4o or Llama3:70b)
  • eval_model.py: Perform evaluation of the finetuned model or the base model and prompt engineering only
  • gen_train_dataset.py: Generates the dataset for DPO finetuning using another LLM as a judge (i.e., GPT-4o)
  • train.py: Runs DPO on the base model
  • human_vs_gpt.py: Use Judge model to perform evaluation of the human scored examples (validation of judge LLM)
  • pipeline.py: Executes the training pipeline end-to-end (DPO dataset generation + finetuning + evaluation)

For each script, check --help for more details.

Pipeline artifacts

When running the complete pipeline, the script generates a set of training and evaluation artifacts following the given structure:

├── training_root # name to be specified by the user │ ├── dpo # DPO related files │ │ ├── {dataset} # seed dataset {mathdial,tutorchat,debugging} │ │ ├── train_dataset.json # Examples generated by the base model + prompt engineering then classified in choosen/rejected by the judge model │ │ ├── weights # Finetuned model weights │ │ ├── checkpoints # Training checkpoints │ ├── evaluation # Performance assements related files │ │ ├── {dataset} # seed dataset {mathdial,tutorchat,debugging} │ │ │ ├── from_finetuned_with_tutorchat.json # GPT-4o evaluation using model finetuned with tutorchat data │ │ │ ├── from_finetuned_with_mathdial.json # " " " " " " mathdial data │ │ │ ├── from_finetuned_with_debugging.json # " " " " " " debbuging data │ │ │ ├── base.json # " " " base model + prompt-engineering │ │ │ ├── gpt4o.json # " " " GPT-4o + prompt-engineering │ │ ├── human_vs_gpt.json # Comparison between human asssessment and judge LLM │ ├── figures # report evaluation figures

Running in Docker container

It's possible to run any project's script with a Docker container. To do so, first build the image with

bash $ docker build -t socratic-llm .

Then run it with (tip: don't forget to mount the GPU and script's input/output directories). For example,

bash $ docker run --rm --gpus all -v socratic-llm/:/socractic-llm -v /home/<user>/huggingface:/huggingface -e HF_HOME=/huggingface -it socratic-llm -m pipeline --judge-llm openai <open-ai-key> gpt-4o --output-dir /socractic-llm --instruct-model microsoft/Phi-3-mini-4k-instruct

Cite this work

@inproceedings{"bonino2024socratic", title = {EULER: Fine Tuning a Large Language Model for Socratic Interactions}, author = {Bonino, Giulia and Sanmartino, Gabriele and Gatti Pinheiro, Giovanni and Papotti, Paolo and Troncy, Raphael and Michiardi, Pietro}, year = 2024, month = {November}, booktitle = {Proceedings of the Second International Workshop on Artificial Intelligence Systems in Education co-located with 23rd International Conference of the Italian Association for Artificial Intelligence (AIxIA 2024)}, publisher = {CEUR Workshop Proceedings} }

Owner

  • Name: Giovanni Gatti
  • Login: GiovanniGatti
  • Kind: user
  • Location: France

Citation (CITATION.cff)

cff-version: 1.2.0
message: "When citing this work, use the following metadata. And, thank you!"
authors:
- family-names: "Bonino"
  given-names: "Giulia"
- family-names: "Sanmartino"
  given-names: "Gabrielle"
- family-names: "Gatti Pinheiro"
  given-names: "Giovanni"
  orcid: "https://orcid.org/0000-0003-2401-4768"
- family-names: "Papotti"
  given-names: "Paolo"
  orcid: "https://orcid.org/0000-0003-0651-4128"
- family-names: "Troncy"
  given-names: "Raphaël"
  orcid: "https://orcid.org/0000-0003-0457-1436"
- family-names: "Michiardi"
  given-names: "Pietro"
  orcid: "https://orcid.org/0000-0003-4675-7677"
title: "Socratic LLM"
version: 0.1.0
date-released: 2024-08-26
url: "https://github.com/GiovanniGatti/socratic-llm/"
preferred-citation:
  type: conference-paper
  authors:
    - family-names: "Bonino"
      given-names: "Giulia"
    - family-names: "Sanmartino"
      given-names: "Gabrielle"
    - family-names: "Gatti Pinheiro"
      given-names: "Giovanni"
      orcid: "https://orcid.org/0000-0003-2401-4768"
    - family-names: "Papotti"
      given-names: "Paolo"
      orcid: "https://orcid.org/0000-0003-0651-4128"
    - family-names: "Troncy"
      given-names: "Raphaël"
      orcid: "https://orcid.org/0000-0003-0457-1436"
    - family-names: "Michiardi"
      given-names: "Pietro"
      orcid: "https://orcid.org/0000-0003-4675-7677"
  title: "EULER: Fine Tuning a Large Language Model for Socratic Interactions"
  year: "2024"
  month: "11"
  conference:
    name: "Proceedings of the Second International Workshop on Artificial Intelligence Systems in Education co-located with 23rd International Conference of the Italian Association for Artificial Intelligence (AIxIA 2024)"
    address: "Bolzano"
  publisher:
    name: "CEUR Workshop Proceedings"
  institution:
    name: "CEUR"

GitHub Events

Total
  • Watch event: 18
  • Push event: 3
  • Fork event: 2
Last Year
  • Watch event: 18
  • Push event: 3
  • Fork event: 2

Dependencies

Dockerfile docker
  • python 3.11-buster build
  • python 3.11-slim-buster build
requirements.txt pypi
  • bitsandbytes ==0.43.1
  • bokeh ==3.4.2
  • openai ==1.35.5
  • peft ==0.11.1
  • torch ==2.3.1
  • tqdm ==4.66.4
  • transformers ==4.41.2