https://github.com/cjbarrie/promptstability

Repo for paper analyzing stability of outcomes resulting from variations in language model prompt specification

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.6%) to scientific vocabulary
Last synced: 10 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: cjbarrie
  • Language: TeX
  • Default Branch: main
  • Size: 58.9 MB
Statistics
  • Stars: 2
  • Watchers: 3
  • Forks: 1
  • Open Issues: 0
  • Releases: 0
Created over 2 years ago · Last pushed over 1 year ago
Metadata Files
Readme

README.md

promptstability

Repo for paper analyzing stability of outcomes resulting from variations in language model prompt specification.

Usage

To run all annotation scripts:

    python 00_master.py

Development package

Installation

PyPI installation

Install this library using pip:

    pip install promptstability

Example usage

```python
import pandas as pd
import matplotlib.pyplot as plt
from utils import PromptStabilityAnalysis, get_openai_api_key
from openai import OpenAI
import ollama

# 1. Load a 5% subsample of the manifestos dataset
df = pd.read_csv('data/manifestos.csv')
df = df[df['scale'] == 'Economic']

# Take 5% of the rows (at least 1 row if the dataset is very small)
sample_size = max(1, int(0.05 * len(df)))
df = df.sample(sample_size, random_state=123)
data = list(df['sentence_context'].values)

# Define the prompt texts
original_text = (
    "The text provided is a UK party manifesto. "
    "Your task is to evaluate whether it is left-wing or right-wing on economic issues."
)
prompt_postfix = "Respond with 0 for left-wing or 1 for right-wing."

# 2. ANALYSIS USING OPENAI (e.g., GPT-3.5-turbo)

# Define the OpenAI annotation function
APIKEY = get_openai_api_key()
OPENAI_MODEL = 'gpt-3.5-turbo'
client = OpenAI(api_key=APIKEY)

def annotate_openai(text, prompt, temperature=0.1):
    try:
        response = client.chat.completions.create(
            model=OPENAI_MODEL,
            temperature=temperature,
            messages=[
                {"role": "system", "content": prompt},
                {"role": "user", "content": text}
            ]
        )
    except Exception as e:
        print(f"OpenAI exception: {e}")
        raise
    return ''.join(choice.message.content for choice in response.choices)

# Instantiate the analysis class using OpenAI's annotation function
psa_openai = PromptStabilityAnalysis(annotation_function=annotate_openai, data=data)

# Run intra-prompt (baseline) analysis with intra_pss
print("Running OpenAI intra-prompt (baseline) analysis...")
ka_openai_intra, annotated_openai_intra = psa_openai.intra_pss(
    original_text,
    prompt_postfix,
    iterations=3,  # minimal iterations
    plot=False
)
print("OpenAI intra-prompt KA scores:", ka_openai_intra)

# Run inter-prompt analysis with inter_pss
temperatures = [0.1, 0.5, 1.0]
print("Running OpenAI inter-prompt analysis...")
ka_openai_inter, annotated_openai_inter = psa_openai.inter_pss(
    original_text,
    prompt_postfix,
    nr_variations=3,
    temperatures=temperatures,
    iterations=1,
    plot=False
)
print("OpenAI inter-prompt KA scores:", ka_openai_inter)

# 3. ANALYSIS USING OLLAMA (with your local deepseek-r1:8b)

# Define the Ollama annotation function.
# (Make sure that your Ollama server is running locally and that
# 'deepseek-r1:8b' is available.)
OLLAMA_MODEL = 'deepseek-r1:8b'

def annotate_ollama(text, prompt, temperature=0.1):
    # Note: temperature is accepted here but not forwarded to ollama.chat.
    try:
        response = ollama.chat(model=OLLAMA_MODEL, messages=[
            {"role": "system", "content": prompt},
            {"role": "user", "content": text}
        ])
    except Exception as e:
        print(f"Ollama exception: {e}")
        raise
    return response['message']['content']

# Instantiate the analysis class using Ollama's annotation function
psa_ollama = PromptStabilityAnalysis(annotation_function=annotate_ollama, data=data)

# Run intra-prompt (baseline) analysis for Ollama with few iterations
print("Running Ollama intra-prompt (baseline) analysis...")
ka_ollama_intra, annotated_ollama_intra = psa_ollama.intra_pss(
    original_text,
    prompt_postfix,
    iterations=3,
    plot=False
)
print("Ollama intra-prompt KA scores:", ka_ollama_intra)

# Run inter-prompt analysis for Ollama with a couple of temperatures
temperatures = [0.1, 0.5]  # or whichever temperatures you want to test
print("Running Ollama inter-prompt analysis...")
ka_ollama_inter, annotated_ollama_inter = psa_ollama.inter_pss(
    original_text,
    prompt_postfix,
    nr_variations=3,
    temperatures=temperatures,
    iterations=1,
    plot=False
)
print("Ollama inter-prompt KA scores:", ka_ollama_inter)
```
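The example above reports "KA scores" and relies on annotation functions that return the raw model response. The sketch below illustrates both ideas in isolation, assuming that KA refers to Krippendorff's alpha for nominal labels. It is not the package's own implementation. The helper names `parse_label` and `nominal_alpha` are hypothetical, and the handling of `<think>...</think>` blocks is an assumption about how a reasoning model may format its reply.

```python
import re
from collections import Counter
from itertools import permutations


def parse_label(raw):
    """Extract a 0/1 label from a raw model reply (hypothetical helper).

    Assumes any reasoning text is wrapped in <think>...</think> and that
    the label appears as a standalone 0 or 1. Returns None if none is found.
    """
    text = re.sub(r"<think>.*?</think>", "", raw, flags=re.DOTALL)
    match = re.search(r"\b[01]\b", text)
    return int(match.group()) if match else None


def nominal_alpha(runs):
    """Krippendorff's alpha for nominal labels (hypothetical helper).

    `runs` is a list of equal-length label lists, one list per annotation
    run. None marks a missing label. Returns NaN when alpha is undefined.
    """
    coincidences = Counter()
    for labels in zip(*runs):  # one tuple of labels per annotated text
        present = [x for x in labels if x is not None]
        m = len(present)
        if m < 2:
            continue  # a text labelled fewer than twice carries no pairs
        # Every ordered pair of labels from different runs, weighted 1/(m-1).
        for a, b in permutations(present, 2):
            coincidences[(a, b)] += 1 / (m - 1)

    totals = Counter()
    for (a, _), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())
    if n <= 1:
        return float("nan")

    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n - 1)
    if expected == 0:
        return float("nan")  # only one label was ever used
    return 1 - observed / expected


# Two runs over four texts that disagree on the second text.
raw_runs = [["0", "0", "1", "1"], ["0", "Answer: 1", "1", "<think>hm</think>1"]]
runs = [[parse_label(r) for r in run] for run in raw_runs]
alpha = nominal_alpha(runs)
```

An alpha of 1 means the runs agree on every text, and values near 0 mean agreement is no better than chance. Parsing replies into labels before scoring keeps formatting differences in the raw output from being counted as disagreement.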

Development

To contribute to this library, send any PRs to the library repo at https://github.com/palaiole13/promptstability.

Owner

  • Name: Christopher Barrie
  • Login: cjbarrie
  • Kind: user
  • Company: University of Edinburgh

Lecturer in Computational Sociology, University of Edinburgh.

GitHub Events

Total
  • Watch event: 2
  • Push event: 9
Last Year
  • Watch event: 2
  • Push event: 9