alta2023_shared_task

Baseline model for the ALTA 2023 shared task

https://github.com/zhanhl316/alta2023_shared_task

Repository

Basic Info

Host: GitHub
Owner: zhanhl316
License: apache-2.0
Language: Python
Default Branch: main
Size: 19.2 MB

Statistics

Stars: 1
Watchers: 1
Forks: 0
Open Issues: 0
Releases: 0

Created almost 3 years ago · Last pushed almost 3 years ago

ALTA 2023 Shared Task

Homepage

Basic Task Description

Recent advances in Large Language Models (LLMs) represent a paradigm shift in human-computer interaction. However, like any groundbreaking technology, LLMs are a double-edged sword for society. Beyond spreading distorted news, misuse of LLMs may give rise to a range of social and ethical problems, including academic misconduct and election manipulation. These risks underscore the growing urgency within the research community to develop methods for detecting and scrutinizing synthetic text.

How to use this baseline?

Step 0: Requirements

Python 3.8
PyTorch 1.8.1
CUDA 10.1
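
To compare an existing environment against the versions listed above, a small check like the following may help. It is only a sketch: the function name `environment_report` is made up for illustration, and the script reports versions without enforcing them.

```python
import platform


def environment_report():
    """Collect the Python, PyTorch, and CUDA versions visible to this process."""
    report = {"python": platform.python_version(), "torch": None, "cuda": None}
    try:
        # Imported lazily so the check also runs before PyTorch is installed.
        import torch
    except ImportError:
        return report
    report["torch"] = torch.__version__
    # torch.version.cuda is None for CPU-only builds of PyTorch.
    report["cuda"] = torch.version.cuda
    return report


if __name__ == "__main__":
    for name, version in environment_report().items():
        print(f"{name}: {version or 'not found'}")
```

The printed values can then be compared by eye with Python 3.8, PyTorch 1.8.1, and CUDA 10.1.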

Step 1: Installation

Please follow these steps to initialize your environment:

    conda create -n alta2023_baseline python=3.8
    source activate alta2023_baseline
    git clone https://github.com/zhanhl316/ALTA2023_shared_task.git
    cd ALTA2023_shared_task
    pip install -r requirements.txt

Step 2: Data and Pretrained Model Preparation

(1) Data Preparation: Follow the format of the train/test .json files in the "data" folder, and replace them with your own train/dev/test files.

(2) Pretrained Model Preparation: The baseline model is based on RoBERTa (large). Please download the RoBERTa-large model files from the Hugging Face repo, roberta-large, and put them in the folder "pretrained_model/roberta-large".
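
The two preparation steps above can be sanity-checked with a short script. This is a sketch under several assumptions: the helper names are hypothetical, the data files are assumed to be either a JSON array of objects or one JSON object per line, the list of required model files is a guess at what the training script needs, and `huggingface_hub` with the repo id `FacebookAI/roberta-large` is one possible way to fetch the checkpoint, not something this README prescribes.

```python
import json
from pathlib import Path


def load_records(path):
    """Read a JSON array of objects, or one JSON object per line (JSON Lines)."""
    text = Path(path).read_text(encoding="utf-8").strip()
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        # Fall back to JSON Lines: parse every non-empty line separately.
        return [json.loads(line) for line in text.splitlines() if line.strip()]
    return data if isinstance(data, list) else [data]


def key_mismatches(reference_path, candidate_path):
    """Indices of candidate records whose keys differ from the reference's first record."""
    expected = set(load_records(reference_path)[0])
    candidates = load_records(candidate_path)
    return [i for i, rec in enumerate(candidates) if set(rec) != expected]


def missing_model_files(model_dir,
                        required=("config.json", "vocab.json", "merges.txt")):
    """Names from `required` (an assumed list) that are absent from model_dir."""
    return [name for name in required if not (Path(model_dir) / name).is_file()]


def download_roberta_large(model_dir="pretrained_model/roberta-large"):
    """Fetch the checkpoint via huggingface_hub (assumed tool and repo id)."""
    from huggingface_hub import snapshot_download  # needs: pip install huggingface_hub
    snapshot_download(repo_id="FacebookAI/roberta-large", local_dir=model_dir)


if __name__ == "__main__":
    print("missing model files:",
          missing_model_files("pretrained_model/roberta-large"))
```

Calling `key_mismatches` with one of the shipped files in "data" as the reference and your own file as the candidate returns an empty list when every record has the same fields as the example.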

Step 3: Training

sh run_glue.sh

Step 4: Test

sh run_glue-test.sh

Have any questions?

Please contact Haolan Zhan at haolan.zhan@monash.edu.

Owner

Name: Haolan Zhan
Login: zhanhl316
Kind: user

Repositories: 1
Profile: https://github.com/zhanhl316

Research Track: Natural Language Processing, Deep Learning, Text Generation.
