cs329_project
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ✓ .zenodo.json file: found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (9.4%) to scientific vocabulary
Scientific Fields
Repository
Basic Info
- Host: GitHub
- Owner: zrobertson466920
- Language: Python
- Default Branch: main
- Size: 2.73 MB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 1
- Open Issues: 1
- Releases: 0
Metadata Files
README.md
Papers
Abstract: This paper introduces a novel mechanism for scalable oversight, leveraging Total Variation Distance Mutual Information (TVD-MI) in a principal-agent framework. Our approach uniquely addresses the challenge of oversight with information asymmetry, where the principal lacks direct access to ground truth. Unlike classical methods requiring perfect probability estimates, our mechanism provides robust theoretical guarantees while remaining practically implementable. We prove that the mechanism is robust to specification gaming—neither principals nor agents gain significant utility from distorting their natural responses. We validate our theoretical results through comprehensive experiments in two high-stakes domains: scientific review and medical text assessment. Our experiments demonstrate that TVD-MI effectively detects strategic behavior in paper reviews and correlates more strongly with human agreement on correctness (0.110 ± 0.014) compared to LLM judges (0.020-0.035 ± 0.004). These results establish TVD-MI as a practical tool for scalable oversight while highlighting important limitations in current LLM-based evaluation approaches.
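The abstract does not spell out how TVD-MI is defined or estimated. As a rough illustration only (my reading of the name, not necessarily the paper's estimator), a plug-in version for discretized responses measures the total variation distance between the joint distribution of two agents' responses and the product of its marginals:

```python
# Illustrative sketch only: a plug-in estimate of TVD-based mutual information
# for discrete variables, i.e. TVD(P_XY, P_X * P_Y). The paper's actual TVD-MI
# mechanism may differ; the function name and discrete setting are assumptions.
import numpy as np

def tvd_mutual_information(joint: np.ndarray) -> float:
    """Total variation distance between a joint distribution and the
    product of its marginals. `joint[i, j]` is P(X=i, Y=j)."""
    joint = joint / joint.sum()                       # normalize to a distribution
    px = joint.sum(axis=1, keepdims=True)             # marginal over X
    py = joint.sum(axis=0, keepdims=True)             # marginal over Y
    independent = px * py                             # product of marginals
    return 0.5 * np.abs(joint - independent).sum()    # total variation distance

# Perfectly correlated binary responses give 0.5; independent responses give 0.0.
print(tvd_mutual_information(np.array([[0.5, 0.0], [0.0, 0.5]])))      # 0.5
print(tvd_mutual_information(np.array([[0.25, 0.25], [0.25, 0.25]])))  # 0.0
```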
Information-Theoretic LLM Evaluation Framework
This repository contains the implementation of an information-theoretic framework for evaluating Language Model (LLM) outputs, developed through two related works:
- Workshop Paper: "Implementability of Information Elicitation Mechanisms with Pre-Trained Language Models" presented at the ICML TF2M workshop
- Class Project: "Information-Theoretic Measures for LLM Output Evaluation" (CS329 2023)
Project Evolution
This work began as research into information-theoretic measures for LLM evaluation, first presented at the ICML TF2M workshop. The class project extends this foundation by:
- Implementing asynchronous API-based evaluation
- Adding comparative analysis between human and LLM judges
- Developing scalable oversight mechanisms for text quality
Publications
Original Workshop Paper
"Implementability of Information Elicitation Mechanisms with Pre-Trained Language Models" - Develops theoretical foundations for information-theoretic LLM evaluation - Introduces Difference of Entropies (DoE) estimator - Provides initial empirical validation
Class Project Extension
"Information-Theoretic Measures for LLM Output Evaluation" - Implements practical framework for large-scale evaluation - Compares human vs LLM judgment patterns - Demonstrates applications in text quality assessment
Citations
Software
```bibtex
@software{robertson2023llm,
  author  = {Robertson, Zachary and Bedi, Suhana and Lee, Hansol},
  title   = {LLM Evaluation Framework},
  year    = {2023},
  url     = {https://github.com/zrobertson466920/CS329_Project},
  version = {1.0.0}
}
```
Research Paper
```bibtex
@inproceedings{robertson2024implementability,
  title     = {Implementability of Information Elicitation Mechanisms with Pre-Trained Language Models},
  author    = {Robertson, Zachary and Cha, Hannah and Sheha, Andrew and Koyejo, Sanmi},
  booktitle = {ICML 2024 Workshop on Theoretical Foundations of Foundation Models},
  year      = {2024}
}
```
Key Technical Components
This project implements a framework for evaluating and comparing Language Model (LLM) responses using information-theoretic measures and pairwise comparisons.
Overview
The framework provides:
- Automated data generation from LLM responses
- Evaluation using both synthetic and LLM-based critics/judges
- Information-theoretic scoring mechanisms
- Asynchronous API handling for efficient processing
- Comprehensive tracking of API usage and mechanism calls
Setup
1. Clone the repository:

```bash
git clone [repository-url]
cd [repository-name]
```

2. Install dependencies:

```bash
pip install -r requirements.txt
```

3. Create a `config.py` file with your OpenAI API settings:

```python
OPENAI_API_KEY = "your-api-key"
OPENAI_MODEL = "gpt-4"  # or your preferred model
MAX_TOKENS = 4000
```
Project Structure
```
.
├── async_llm_test.py    # Main experiment runner
├── api_utils.py         # API handling utilities
├── config.py            # Configuration settings
├── data/                # Data directory
│   └── ...              # Generated datasets
└── README.md
```
Core Components
ExperimentOracle Class
Manages experiment configuration and execution. Supports two modes:
- Synthetic: uses predefined distributions for testing
- LLM: uses actual LLM calls for evaluation
API Utilities
Handles all LLM API interactions with:
- Asynchronous processing
- Rate limiting
- Usage tracking
- Error handling
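`api_utils.py` itself is not reproduced in this README; the following is a minimal, generic sketch of the pattern described above (semaphore-based rate limiting, retry-with-backoff error handling, and simple usage counters). Every name in it is hypothetical and not the repository's actual interface.

```python
# Generic sketch of asynchronous processing with rate limiting, retries, and
# usage tracking. Names are illustrative, not the repository's api_utils.py API.
import asyncio
import random

USAGE = {"calls": 0, "failures": 0}

async def call_llm(prompt: str, semaphore: asyncio.Semaphore, retries: int = 3) -> str:
    """Send one prompt with bounded concurrency and simple retry/backoff."""
    async with semaphore:  # rate limiting: at most N requests in flight
        for attempt in range(retries):
            try:
                USAGE["calls"] += 1
                # Placeholder for a real API request (e.g. an OpenAI client call).
                await asyncio.sleep(0.01)
                return f"response to: {prompt}"
            except Exception:
                USAGE["failures"] += 1
                await asyncio.sleep(2 ** attempt + random.random())  # backoff
    raise RuntimeError(f"LLM call failed after {retries} retries")

async def main():
    semaphore = asyncio.Semaphore(5)  # limit concurrent requests
    prompts = [f"task {i}" for i in range(10)]
    responses = await asyncio.gather(*(call_llm(p, semaphore) for p in prompts))
    print(len(responses), USAGE)

if __name__ == "__main__":
    asyncio.run(main())
```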
Evaluation Mechanisms
Implements two types of evaluators:
1. Critics: assess information gain between responses
2. Judges: perform pairwise comparisons of response quality
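Neither the prompts nor the scoring code are shown in this README; the sketch below illustrates the two roles under that description, with a binary 0/1 outcome matching the results format later in this README. The prompt wording and the `call_llm` stand-in are assumptions, not the repository's implementation.

```python
# Hedged sketch of the critic and judge roles; prompts and helpers are hypothetical.
import asyncio

async def call_llm(prompt: str) -> str:
    # Stand-in for a real API call; always answers "1" here.
    await asyncio.sleep(0)
    return "1"

async def judge(response_a: str, response_b: str, task: str) -> int:
    """Pairwise quality comparison: 1 if response A is preferred, 0 otherwise."""
    prompt = (f"Task: {task}\nResponse A: {response_a}\nResponse B: {response_b}\n"
              "Answer 1 if Response A is better, 0 if Response B is better.")
    return int((await call_llm(prompt)).strip())

async def critic(response_x: str, response_y: str, task: str) -> int:
    """Information-gain check: 1 if Y adds task-relevant information beyond X."""
    prompt = (f"Task: {task}\nResponse X: {response_x}\nResponse Y: {response_y}\n"
              "Answer 1 if Y contains information about the task not already in X, else 0.")
    return int((await call_llm(prompt)).strip())

async def main():
    print(await judge("short answer", "detailed answer", "review the abstract"))
    print(await critic("short answer", "detailed answer", "review the abstract"))

asyncio.run(main())
```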
Usage
Basic Example
```python
import asyncio
from async_llm_test import ExperimentOracle

# Define experiment configuration
exp_config = {
    "exp_type": "llm",
    "num_agents": 3,
    "model_config": {
        "model_name": "gpt-4",
        "max_tokens": 4000,
        "temperature": 1.0
    },
    "task_description": "Abstract review task",
    "agent_perspectives": [
        {"strategy": "Please review the following abstract in three sentences."},
        {"strategy": "Please review the following abstract in three sentences."},
        {"strategy": None}  # Null model
    ],
    "data_config": {
        "n_tasks": 50,
        "preload": False
    }
}

# Create oracle and run experiment
async def run_experiment():
    oracle = ExperimentOracle(exp_config)
    await oracle.experiment()

if __name__ == "__main__":
    asyncio.run(run_experiment())
```
Running Tests
```bash
python async_llm_test.py
```
Output Format
The experiment generates two types of files:
1. Dataset files (`*_data.json`):

```json
{
  "task_description": "...",
  "agent_perspectives": [...],
  "tasks": [
    {
      "context": "...",
      "responses": [...]
    }
  ],
  "metadata": {
    "model_config": {...},
    "data_config": {...},
    "generation_time": "..."
  }
}
```

2. Results files (`*_results.json`):

```json
{
  "task_description": "...",
  "agent_perspectives": [...],
  "comparisons": [
    {
      "agent_pair": [i, j],
      "comparison_type": "critic|judge",
      "result": 0|1,
      "x": "...",
      "y": "...",
      "prompt": "..."
    }
  ]
}
```
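As one way to consume a results file, the snippet below tallies judge outcomes per agent using only the fields shown above. The filename is a placeholder, and reading `result == 1` as a win for the first agent in `agent_pair` is an assumption, not documented behavior.

```python
# Sketch: summarize judge outcomes from a *_results.json file.
# Uses only the fields shown in the schema above; filename is a placeholder.
import json
from collections import Counter

with open("experiment_results.json") as f:
    results = json.load(f)

wins = Counter()
for comp in results["comparisons"]:
    if comp["comparison_type"] == "judge":
        i, j = comp["agent_pair"]
        winner = i if comp["result"] == 1 else j  # assumed convention
        wins[winner] += 1

print("Judge wins per agent:", dict(wins))
```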
Statistics Tracking
The framework tracks:
- API calls and token usage
- Mechanism calls (critic and judge)
- Execution time and costs
Access statistics programmatically:

```python
from api_utils import get_api_stats, get_mechanism_stats

api_stats = get_api_stats()
mechanism_stats = get_mechanism_stats()
```
Contributing
- Fork the repository
- Create your feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
Owner
- Name: Zachary Robertson
- Login: zrobertson466920
- Kind: user
- Website: https://zrobertson466920.github.io/
- Repositories: 1
- Profile: https://github.com/zrobertson466920
I am a PhD student at Stanford researching human and AI cooperation.
Citation (CITATION.cff)
```yaml
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Robertson"
    given-names: "Zachary"
  - family-names: "Bedi"
    given-names: "Suhana"
  - family-names: "Lee"
    given-names: "Hansol"
title: "LLM Evaluation Framework"
version: 1.0.0
doi: 10.5281/zenodo.1234
date-released: 2023-12-14
url: "https://github.com/zrobertson466920/CS329_Project"
```
GitHub Events
Total
- Issues event: 3
- Issue comment event: 3
- Member event: 2
- Push event: 17
- Pull request event: 4
- Fork event: 1
- Create event: 4
Last Year
- Issues event: 3
- Issue comment event: 3
- Member event: 2
- Push event: 17
- Pull request event: 4
- Fork event: 1
- Create event: 4