https://github.com/lzw108/fmd

This is an ongoing project on Financial Misinformation Detection (FMD).

Science Score: 23.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
○ .zenodo.json file
○ DOI references
✓ Academic publication links: Links to arxiv.org
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (9.4%) to scientific vocabulary

Last synced: 8 months ago · JSON representation

Repository

Basic Info

Host: GitHub
Owner: lzw108
License: mit
Language: Jupyter Notebook
Default Branch: main
Homepage:
Size: 1.89 MB

Statistics

Stars: 1
Watchers: 1
Forks: 0
Open Issues: 0
Releases: 0

Created almost 2 years ago · Last pushed over 1 year ago

Metadata Files

Readme License

Financial Misinformation Detection

This work also supported the Financial Misinformation Detection (FMD) challenge at COLING 2025.

Paper arXiv

News

📢 Jan. 20, 2025 Our FMDLlama paper has been accepted by WWW 2025 as a short paper.

📢 Jan. 20, 2025 The Financial Misinformation Detection Challenge has successfully wrapped up at COLING 2025. Learn more about the challenge.

📢 Sep. 26, 2024 New preprint paper related to this work: "FMDLlama: Financial Misinformation Detection based on Large Language Models" at arXiv.

Datasets

Practice data: Link
Complete train data: Link
Test data: TBD

Usage

Data preprocess

Follow the practicedatapreprocess.ipynb notebook to generate the instruction train/val/test data under ./data/practicedata/instructdata/. The default instruction is only an example; change it as needed.
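As a rough illustration of what this step produces, the sketch below wraps raw records in an instruction prompt and splits them into train/val/test sets. The field names (`claim`, `label`, `instruction`, `output`), the prompt wording, and the split ratios are assumptions made for illustration, not the notebook's actual schema.

```python
import random

# Hypothetical prompt template; the notebook's real wording may differ.
PROMPT = (
    "Task: decide whether the following financial claim is True or False.\n"
    "Claim: {claim}\n"
    "Answer:"
)


def to_instruction(record):
    """Wrap one raw record (assumed keys: 'claim', 'label') in a prompt."""
    return {
        "instruction": PROMPT.format(claim=record["claim"]),
        "output": record["label"],
    }


def split_records(records, val_ratio=0.1, test_ratio=0.1, seed=42):
    """Shuffle deterministically, then cut into train/val/test lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * val_ratio)
    n_test = int(len(shuffled) * test_ratio)
    val = shuffled[:n_val]
    test = shuffled[n_val:n_val + n_test]
    train = shuffled[n_val + n_test:]
    return train, val, test


# Toy data standing in for the practice set.
raw = [{"claim": f"Claim number {i}", "label": "True" if i % 2 else "False"}
       for i in range(20)]
instruct = [to_instruction(r) for r in raw]
train, val, test = split_records(instruct)
print(len(train), len(val), len(test))
```

Fixing the seed keeps the split reproducible across runs, which matters when comparing fine-tuned models on the same validation set.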

Convert data format

```bash
# train
python src/converttoconvdata.py --origdata ./data/practicedata/instructdata/FMDtrain.json --writedata ./data/practicedata/instructdata/train.json --dataset_name fmd

# val
python src/converttoconvdata.py --origdata ./data/practicedata/instructdata/FMDval.json --writedata ./data/practicedata/instructdata/val.json --dataset_name fmd
```

The commands above convert the data into the dialogue format used for LLM training. The current format targets the LLaMA2 series (i.e. "Human": "sentence", "Assistant": "sentence"). If you switch to another LLM, modify the format accordingly.
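The conversion can be pictured with the minimal sketch below. It is not the repository's script: the input keys `instruction` and `output` and the helper names are hypothetical, and only the "Human"/"Assistant" pairing comes from the description above.

```python
import json


def to_dialogue(record):
    """Map one instruction record to a Human/Assistant pair.

    Assumes (hypothetically) that each record has 'instruction' and 'output'.
    """
    return {"Human": record["instruction"], "Assistant": record["output"]}


def convert(records):
    """Convert a list of instruction records to dialogue-format records."""
    return [to_dialogue(r) for r in records]


# Toy record for illustration only.
sample = [{
    "instruction": "Is the following claim true or false? "
                   "Company X doubled its revenue.",
    "output": "False",
}]
print(json.dumps(convert(sample), indent=2))
```

For a model with a different chat template, only `to_dialogue` would need to change, for example to emit that model's role names.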

Fine-tune

bash ./src/run_sft.sh

Inference

bash src/run_inference.sh

Evaluation

Follow the evaluation.ipynb notebook to compute F1, ROUGE, BERTScore, and the final score.
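To make the metrics concrete, here is a dependency-free sketch of a label F1, a simplified ROUGE-1, and a combined score. Several points are assumptions: the notebook may use a different F1 averaging, `rouge1_f` is a whitespace-token stand-in for the `rouge_score` package, BERTScore is omitted because it needs a pretrained model, and averaging F1 with ROUGE-1 for the final score is a guess at the formula rather than something stated here.

```python
from collections import Counter


def micro_f1(gold, pred):
    """Micro-averaged F1 over single-label predictions.

    With exactly one label per example, this equals accuracy.
    """
    tp = sum(g == p for g, p in zip(gold, pred))
    fp = fn = len(gold) - tp  # each miss counts as one FP and one FN
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def rouge1_f(reference, candidate):
    """Unigram-overlap F1 on whitespace tokens (simplified ROUGE-1)."""
    ref = Counter(reference.lower().split())
    cand = Counter(candidate.lower().split())
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def final_score(f1, rouge1):
    """Assumed combination: plain average of the two metrics."""
    return (f1 + rouge1) / 2


# Toy labels and explanations for illustration only.
gold = ["True", "False", "False", "True"]
pred = ["True", "False", "True", "True"]
f1 = micro_f1(gold, pred)
r1 = rouge1_f("the claim is false", "the claim is wrong")
print(f1, r1, final_score(f1, r1))
```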

License

This project is licensed under the MIT License. See the MIT file for details.

Citation

@article{liu2024fmdllama,
  title={FMDLlama: Financial Misinformation Detection based on Large Language Models},
  author={Liu, Zhiwei and Zhang, Xin and Yang, Kailai and Xie, Qianqian and Huang, Jimin and Ananiadou, Sophia},
  journal={arXiv preprint arXiv:2409.16452},
  year={2024}
}

GitHub Events

Total

Push event: 4

Last Year

Push event: 4

Dependencies

requirements.txt pypi

bert-score *
datasets *
deepspeed *
flash-attn *
gradio_client *
peft *
rouge_score *
sentencepiece *
textblob *
torch *
transformers *
wandb *
