https://github.com/centerforaisafety/tdc2023-starter-kit

This is the starter kit for the Trojan Detection Challenge 2023 (LLM Edition), a NeurIPS 2023 competition.

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (5.5%) to scientific vocabulary

Keywords

neurips-2023 redteaming tdc-2023 trojan
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: centerforaisafety
  • License: MIT
  • Language: Python
  • Default Branch: main
  • Homepage: https://trojandetection.ai/
  • Size: 1.58 MB
Statistics
  • Stars: 75
  • Watchers: 6
  • Forks: 26
  • Open Issues: 0
  • Releases: 0
Topics
neurips-2023 redteaming tdc-2023 trojan
Created over 2 years ago · Last pushed almost 2 years ago
Metadata Files
  • Readme
  • License

README.md

Starter Kit for TDC 2023 (LLM Edition)

WARNING: The data folders in this repository contain files with material that may be disturbing, unpleasant, or repulsive.

This is the starter kit for the Trojan Detection Challenge 2023 (LLM Edition), a NeurIPS 2023 competition. To learn more about the competition, see the competition website (https://trojandetection.ai/). Starter kits for the individual tracks are in the trojan_detection and red_teaming folders. See the README in each folder for instructions on downloading data, running baselines, and generating submissions.

Post-competition evaluations: To evaluate methods on the held-out data and behavior classifiers, see the Local Evaluation section in the README file for each track. These scores can be compared with the official leaderboard scores.

Citation

If you find this useful in your research, please consider citing:

@inproceedings{tdc2023,
  title={TDC 2023 (LLM Edition): The Trojan Detection Challenge},
  author={Mantas Mazeika and Andy Zou and Norman Mu and Long Phan and Zifan Wang and Chunru Yu and Adam Khoja and Fengqing Jiang and Aidan O'Gara and Ellie Sakhaee and Zhen Xiang and Arezoo Rajabi and Dan Hendrycks and Radha Poovendran and Bo Li and David Forsyth},
  booktitle={NeurIPS Competition Track},
  year={2023}
}

Owner

  • Name: centerforaisafety
  • Login: centerforaisafety
  • Kind: organization

GitHub Events

Total
  • Watch event: 8
  • Fork event: 2
Last Year
  • Watch event: 8
  • Fork event: 2

Dependencies

red_teaming/requirements.txt pypi
  • huggingface_hub ==0.16.4
  • nltk ==3.8.1
  • numpy ==1.24.2
  • openai ==0.27.8
  • pandas ==1.16.0
  • scikit-learn ==1.2.2
  • sentence-transformers ==2.2.2
  • tokenizers ==0.13.3
  • torch ==2.0.0
  • tqdm ==4.65.0
  • transformers ==4.31.0
  • wandb ==0.14.0
trojan_detection/requirements.txt pypi
  • nltk ==3.8.1
  • numpy ==1.24.2
  • sentence-transformers ==2.2.2
  • tokenizers ==0.13.3
  • torch ==2.0.0
  • tqdm ==4.65.0
  • transformers ==4.31.0
  • wandb ==0.14.0
red_teaming/replicating_evaluation_server/local_generation_docker/Dockerfile docker
  • nvcr.io/nvidia/pytorch 22.12-py3 build
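
Because both tracks pin exact package versions, it can help to confirm that a local environment matches the pins before running the baselines. The following is a minimal sketch, not part of the starter kit: the helper names parse_pins and find_mismatches are hypothetical, the pins are copied from the trojan_detection/requirements.txt listing above, and the "name ==version" line format is assumed from that listing rather than from the file itself.

```python
from importlib import metadata


def parse_pins(lines):
    """Parse 'name ==version' lines into a {name: version} dict."""
    pins = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, sep, version = line.partition("==")
        if sep:  # only keep exact pins
            pins[name.strip()] = version.strip()
    return pins


def find_mismatches(pins, installed_version=metadata.version):
    """Return {name: (pinned, installed_or_None)} for packages that differ."""
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = installed_version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches


# Pins copied from the trojan_detection/requirements.txt listing above.
TROJAN_DETECTION_PINS = parse_pins([
    "nltk ==3.8.1",
    "numpy ==1.24.2",
    "sentence-transformers ==2.2.2",
    "tokenizers ==0.13.3",
    "torch ==2.0.0",
    "tqdm ==4.65.0",
    "transformers ==4.31.0",
    "wandb ==0.14.0",
])

if __name__ == "__main__":
    for name, (pinned, installed) in find_mismatches(TROJAN_DETECTION_PINS).items():
        print(f"{name}: pinned {pinned}, installed {installed or 'not installed'}")
```

Run as a script, this prints one line per package whose installed version differs from the pin, and prints nothing when the environment matches.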