https://github.com/centre-for-humanities-computing/llm-tweet-classification
Classifying tweets with large language models with zero- and few-shot learning.
Science Score: 49.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file: not found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ✓ DOI references: found 3 DOI reference(s) in README
- ✓ Academic publication links: links to zenodo.org
- ○ Academic email domains: not found
- ○ Institutional organization owner: not found
- ○ JOSS paper metadata: not found
- ○ Scientific vocabulary similarity: low similarity (13.3%) to scientific vocabulary
Repository
Classifying tweets with large language models with zero- and few-shot learning.
Basic Info
- Host: GitHub
- Owner: centre-for-humanities-computing
- License: mit
- Language: Python
- Default Branch: main
- Size: 2.65 MB
Statistics
- Stars: 7
- Watchers: 0
- Forks: 1
- Open Issues: 3
- Releases: 1
Metadata Files
README.md
llm-tweet-classification
Classifying tweets with large language models using zero- and few-shot learning with custom and generic prompts, as well as supervised learning algorithms for comparison.
Our results on annotating tweets with the labels exemplar and political:

[Figures: F1-scores & accuracies, and precision-recall plots]
Getting Started
Install all requirements for the LLM classification script.
```bash
pip install -r requirements.txt
```
NB: This only installs a minimal set of requirements, for reproducibility's sake, sufficient to create the figures with the code below. A more complete requirements file for running the full pipeline can be found in configs.
Inference
The repo contains a CLI script, llm_classification.py. You can use it to run arbitrary classification tasks on .tsv or .csv files with large language models from either HuggingFace or OpenAI.
If you intend to use OpenAI models, you will have to specify your API key and ORG as environment variables.
```bash
export OPENAI_API_KEY="..."
export OPENAI_ORG="..."
```
The script has one command-line argument, namely a config file of the following format:
```
[paths]
in_file="labelled_data.csv"
out_dir="predictions/"

[system]
seed=0
device="cpu"

[model]
name="google/flan-t5-base"
task="few-shot"

[inference]
x_column="raw_text"
y_column="exemplar"
n_examples=5
```
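The classification script itself is not reproduced here, but a config in this format can be read with Python's standard-library `configparser`. The sketch below is a minimal illustration under that assumption, not the repo's actual code; note that the quoted values keep their surrounding quotes unless you strip them:

```python
# Minimal sketch (not the repo's actual code): reading the config
# with Python's standard-library configparser.
from configparser import ConfigParser

config = ConfigParser()
config.read_string("""
[paths]
in_file="labelled_data.csv"
out_dir="predictions/"

[model]
name="google/flan-t5-base"
task="few-shot"
""")

# Values keep their surrounding quotes, so strip them before use.
in_file = config["paths"]["in_file"].strip('"')
task = config["model"]["task"].strip('"')
```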
If you intend to use a custom prompt for a given model, you can save it in a .txt file and add its path to the paths section of the config:
```
[paths]
in_file="labelled_data.csv"
out_dir="predictions/"
prompt_file="custom_prompt.txt"
```
If you want to use hand-selected examples for few-shot learning, pass along a subset of the original data in the paths section of the config. Examples have to be in the same format as the data.
```
[paths]
in_file="labelled_data.csv"
out_dir="predictions/"
examples="examples.csv"
```
You can run the CLI like this:
```bash
python3 llm_classification.py "config.cfg"
```
Config Documentation
- Paths:
  - `in_file`: `str` - Path to the input file, either `.csv` or `.tsv`.
  - `out_dir`: `str` - Output directory. The script creates one if it is not already there.
- System:
  - `seed`: `int` - Random seed for selecting few-shot examples. Ignored when `task=="zero-shot"`.
  - `device`: `str` - Device to run inference on. Change to `cuda:0` if you want to run on GPU.
- Model:
  - `name`: `str` - Name of the model from OpenAI or HuggingFace.
  - `task`: `{"few-shot", "zero-shot"}` - Indicates whether zero-shot or few-shot inference should be run.
- Inference:
  - `x_column`: `str` - Name of the independent variable in the table.
  - `y_column`: `str` - Name of the dependent variable in the table.
  - `n_examples`: `int` - Number of examples to give to few-shot models. Ignored when `task=="zero-shot"`.
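The `seed` and `n_examples` fields suggest that, when examples are not hand-selected, few-shot examples are drawn by seeded random sampling. Below is a hypothetical stdlib-only sketch of that idea; the actual script may select examples differently:

```python
# Hypothetical seeded few-shot example selection; not the repo's code.
import random

rows = [
    {"raw_text": "tweet a", "exemplar": 1},
    {"raw_text": "tweet b", "exemplar": 0},
    {"raw_text": "tweet c", "exemplar": 1},
    {"raw_text": "tweet d", "exemplar": 0},
]
seed, n_examples = 0, 2

rng = random.Random(seed)             # reproducible across runs with the same seed
examples = rng.sample(rows, n_examples)
held_out = [r for r in rows if r not in examples]
```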
OpenAI script
For ease of use, we developed a script that generates predictions for all OpenAI models in one run. OpenAI inference can run on low-performance instances, so a long runtime is not a problem; and since all instances access the same rate-limited API, we could not run multiple instances in parallel anyway.
Paths in this script are hardcoded and you might need to adjust it for personal use.
```bash
python3 run_gpt_inference.py
```
Supervised Classification
For supervised models we made a separate script, which runs and evaluates GloVe-200d with logistic regression and fine-tunes DistilBERT for classification. This script has its own dependencies, so install them from the appropriate file:
```bash
pip install -r supervised_requirements.txt
```
Paths in this script are hardcoded and you might need to adjust it for personal use.
```bash
python3 supervised_classification.py
```
Output
This will output a table with predictions added to the out_dir folder in the config.
The file name format is as follows:
```python
f"predictions/{task}_pred_{column}_{model}.csv"
```
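Downstream scripts can recover the task, column, and model from this convention. `parse_prediction_path` below is a hypothetical helper, not part of the repo, and a naive split like this would break if the model name itself contained an underscore:

```python
# Hypothetical helper for the file-name convention above; not part of the repo.
import os

def parse_prediction_path(path: str) -> tuple:
    stem = os.path.basename(path).removesuffix(".csv")
    task, _, column, model = stem.split("_", 3)  # "_pred_" separates the fields
    return task, column, model
```

For example, `parse_prediction_path("predictions/few-shot_pred_political_flan-t5-base.csv")` yields `("few-shot", "political", "flan-t5-base")`.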
Each table will have a `pred_<y_column>` column, as well as a `train_test_set` column that is labelled `train` for all examples included in the prompt for few-shot learning and `test` everywhere else.
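Because few-shot prompt examples appear in the output table, any scoring should first drop the `train` rows. A stdlib-only illustration (the repo's evaluation script presumably does the equivalent with pandas):

```python
# Illustration only: filter to the held-out test split before scoring.
preds = [
    {"exemplar": 1, "pred_exemplar": 1, "train_test_set": "train"},
    {"exemplar": 0, "pred_exemplar": 0, "train_test_set": "test"},
    {"exemplar": 1, "pred_exemplar": 0, "train_test_set": "test"},
    {"exemplar": 0, "pred_exemplar": 0, "train_test_set": "test"},
]

test_rows = [r for r in preds if r["train_test_set"] == "test"]
accuracy = sum(r["exemplar"] == r["pred_exemplar"] for r in test_rows) / len(test_rows)
```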
Evaluating results
To evaluate the performance of the model(s), you can run the CLI script evaluation.py. It has two command-line arguments: --in_dir and --out_dir. These refer, respectively, to the folder in which the predictions from llm_classification.py have been saved (i.e., your predictions folder), and the folder where the classification report(s) should be saved.
--in_dir defaults to 'predictions/' and --out_dir defaults to 'output/' (which is created if it does not already exist).
It can be run as follows:
```bash
python3 evaluation.py --in_dir "your/data/path" --out_dir "your/out/path"
```
It expects the output file(s) from llm_classification.py in the specified file name format and placement.
It will output two files to the specified out folder:
- a txt file with the classification report for the test data for each of the files in the --in_dir folder.
- a csv file with the same information as the txt file, but which can be used for plotting the results.
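A classification report boils down to per-label counts. The sketch below computes precision, recall, and F1 for the positive label with the standard library only; evaluation.py itself presumably relies on scikit-learn, which is in the requirements:

```python
# Stdlib-only sketch of the metrics a classification report contains.
y_true = [1, 0, 1, 0, 1]
y_pred = [1, 0, 0, 0, 1]

tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))  # true positives
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))  # false positives
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))  # false negatives

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
```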
Plotting results
The plotting.py script takes the csv-file produced by the evaluation script and makes three plots:
- accfigure.png: The accuracy for each of the 8 models on each outcome (political, exemplar) in each task (zero-shot, few-shot) with each prompt type (generic, custom). It's split into four quadrants, with the left side being the exemplar column, the right being political, the upper row being custom prompts and the lower row being generic prompts.
- f1figure.png: The f1-score for positive labels for each model in each task – again split into political and exemplar + generic and custom prompt.
- precrecfigure.png: Precision plotted against recall for each of the models, split into three rows and four columns. Rows indicate task (zero-shot, few-shot, supervised classification), columns indicate label column (political, exemplar) and prompt type (generic, custom).
```bash
python3 plotting.py
```
These are all saved in a figures/ folder.
Owner
- Name: Center for Humanities Computing Aarhus
- Login: centre-for-humanities-computing
- Kind: organization
- Email: chcaa@cas.au.dk
- Location: Aarhus, Denmark
- Website: https://chc.au.dk/
- Repositories: 130
- Profile: https://github.com/centre-for-humanities-computing
GitHub Events
Total
- Release event: 2
- Watch event: 4
- Push event: 2
- Create event: 1
Last Year
- Release event: 2
- Watch event: 4
- Push event: 2
- Create event: 1
Issues and Pull Requests
Last synced: 10 months ago
All Time
- Total issues: 3
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 1
- Total pull request authors: 0
- Average comments per issue: 0.33
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- KasperFyhn (3)
Pull Request Authors
Top Labels
Issue Labels
Pull Request Labels
Dependencies
- numpy >=1.23.5
- pandas >=1.5.0
- scikit-learn >=1.2.0
- scikit-llm >=0.2.0
- stormtrooper >=0.2.1
- torch >=2.0.0
- datasets >=2.14.5
- embetter >=0.5.2
- gensim >=4.2.0
- numpy >=1.23.0
- pandas >=2.0.0
- scikit-learn >=1.2.0
- torch >=2.0.1
- tqdm >=4.66.0
- transformers >=4.23.0
- nvidia/cuda 12.2.0-devel-ubuntu22.04 build