adaptive-linguistic-prompting-alp-multimodal-llm-phishing-detection

https://github.com/atharvab7/adaptive-linguistic-prompting-alp-multimodal-llm-phishing-detection


Repository

Basic Info

Host: GitHub
Owner: AtharvaB7
License: MIT
Language: TeX
Default Branch: main
Size: 35.2 KB

Statistics

Stars: 0
Watchers: 0
Forks: 0
Open Issues: 0
Releases: 0

Created about 1 year ago · Last pushed about 1 year ago

Metadata Files

Readme License Citation

Adaptive-Linguistic-Prompting-ALP-Multimodal-LLM-Phishing-Detection

Data Source

This project uses a curated dataset originally created by Jehyun Lee et al., 2024 as part of their work on multimodal LLM-based phishing detection.

The original dataset is available at: Multimodal_LLM_Phishing_Detection (GitHub)
We use a filtered version of this dataset (311 benign and 289 phishing samples), as described in Section 3.2 of our paper, "Adaptive Linguistic Prompting (ALP) Enhances Phishing Webpage Detection in Multimodal Large Language Models".
All credit for the dataset content and original data collection goes to the original authors.

Ethical Statement:

This work is intended for research purposes and to advance the use of LLMs in cybersecurity, particularly phishing detection. The few-shot prompts and the original dataset should NOT be used for any unethical or illegal purposes.

Research Paper:

The paper: Adaptive Linguistic Prompting (ALP) Enhances Phishing Webpage Detection in Multimodal Large Language Models
Accepted to NLP4PI @ ACL 2025
Bhargude, A., Gonehal, I., & Yoon, D. (2025, May)

Owner

Name: Atharva Bhargude
Login: AtharvaB7
Kind: user

Repositories: 1
Profile: https://github.com/AtharvaB7

Citation (citations.bib)



@misc{agrawal2022largelanguagemodelsfewshot,
      title={Large Language Models are Few-Shot Clinical Information Extractors}, 
      author={Monica Agrawal and Stefan Hegselmann and Hunter Lang and Yoon Kim and David Sontag},
      year={2022},
      eprint={2205.12689},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2205.12689}, 
}

@misc{brown2020languagemodelsfewshotlearners,
      title={Language Models are Few-Shot Learners}, 
      author={Tom B. Brown and Benjamin Mann and Nick Ryder and Melanie Subbiah and Jared Kaplan and Prafulla Dhariwal and Arvind Neelakantan and Pranav Shyam and Girish Sastry and Amanda Askell and Sandhini Agarwal and Ariel Herbert-Voss and Gretchen Krueger and Tom Henighan and Rewon Child and Aditya Ramesh and Daniel M. Ziegler and Jeffrey Wu and Clemens Winter and Christopher Hesse and Mark Chen and Eric Sigler and Mateusz Litwin and Scott Gray and Benjamin Chess and Jack Clark and Christopher Berner and Sam McCandlish and Alec Radford and Ilya Sutskever and Dario Amodei},
      year={2020},
      eprint={2005.14165},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2005.14165}, 
}
@misc{touvron2023llamaopenefficientfoundation,
      title={LLaMA: Open and Efficient Foundation Language Models}, 
      author={Hugo Touvron and Thibaut Lavril and Gautier Izacard and Xavier Martinet and Marie-Anne Lachaux and Timothée Lacroix and Baptiste Rozière and Naman Goyal and Eric Hambro and Faisal Azhar and Aurelien Rodriguez and Armand Joulin and Edouard Grave and Guillaume Lample},
      year={2023},
      eprint={2302.13971},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2302.13971}, 
}
@misc{li2024knowphishlargelanguagemodels,
      title={KnowPhish: Large Language Models Meet Multimodal Knowledge Graphs for Enhancing Reference-Based Phishing Detection}, 
      author={Yuexin Li and Chengyu Huang and Shumin Deng and Mei Lin Lock and Tri Cao and Nay Oo and Hoon Wei Lim and Bryan Hooi},
      year={2024},
      eprint={2403.02253},
      archivePrefix={arXiv},
      primaryClass={cs.CR},
      url={https://arxiv.org/abs/2403.02253}, 
}
@misc{koide2024chatspamdetector,
      title={ChatSpamDetector: Leveraging Large Language Models for Effective Phishing Email Detection}, 
      author={Takashi Koide and Naoki Fukushi and Hiroki Nakano and Daiki Chiba},
      year={2024},
      eprint={2402.18093},
      archivePrefix={arXiv},
      primaryClass={cs.CR},
      url={https://arxiv.org/abs/2402.18093}, 
}
@misc{lee2024multimodallargelanguagemodels,
      title={Multimodal Large Language Models for Phishing Webpage Detection and Identification}, 
      author={Jehyun Lee and Peiyuan Lim and Bryan Hooi and Dinil Mon Divakaran},
      year={2024},
      eprint={2408.05941},
      archivePrefix={arXiv},
      primaryClass={cs.CR},
      url={https://arxiv.org/abs/2408.05941}, 
}
@misc{ji2024evaluatingeffectivenessrobustnessvisual,
      title={Evaluating the Effectiveness and Robustness of Visual Similarity-based Phishing Detection Models}, 
      author={Fujiao Ji and Kiho Lee and Hyungjoon Koo and Wenhao You and Euijin Choo and Hyoungshick Kim and Doowon Kim},
      year={2024},
      eprint={2405.19598},
      archivePrefix={arXiv},
      primaryClass={cs.CR},
      url={https://arxiv.org/abs/2405.19598}, 
}

@misc{kulkarni2024mlllmevaluatingrobustness,
      title={From ML to LLM: Evaluating the Robustness of Phishing Webpage Detection Models against Adversarial Attacks}, 
      author={Aditya Kulkarni and Vivek Balachandran and Dinil Mon Divakaran and Tamal Das},
      year={2024},
      eprint={2407.20361},
      archivePrefix={arXiv},
      primaryClass={cs.CR},
      url={https://arxiv.org/abs/2407.20361}, 
}

@misc{divakaran2024llmscybersecuritynew,
      title={LLMs for Cyber Security: New Opportunities}, 
      author={Dinil Mon Divakaran and Sai Teja Peddinti},
      year={2024},
      eprint={2404.11338},
      archivePrefix={arXiv},
      primaryClass={cs.CR},
      url={https://arxiv.org/abs/2404.11338}, 
}

@misc{wei2023chainofthoughtpromptingelicitsreasoning,
      title={Chain-of-Thought Prompting Elicits Reasoning in Large Language Models}, 
      author={Jason Wei and Xuezhi Wang and Dale Schuurmans and Maarten Bosma and Brian Ichter and Fei Xia and Ed Chi and Quoc Le and Denny Zhou},
      year={2023},
      eprint={2201.11903},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2201.11903}, 
}

@misc{Lee2024GitHub,
  author = {Jehyun Lee},
  title = {Multimodal LLM Phishing Detection - GitHub},
  year = {2024},
  url = {https://github.com/JehLeeKR/Multimodal_LLM_Phishing_Detection/},
  note = {Accessed: 2025-01-29}
}
@misc{ExplodingTopics2024,
  author = {Josh Howarth},
  title = {Most Visited Websites in the World (November 2024)},
  year = {2025},
  url = {https://explodingtopics.com/blog/most-visited-websites},
  note = {Accessed: 2025-01-29}
}
@misc{SecureList2023,
  author = {Vladislav Tushkanov},
  title = {What does ChatGPT know about phishing?},
  year = {2023},
  url = {https://securelist.com/chatgpt-anti-phishing/109590/},
  note = {Accessed: 2025-01-29}
}

@ARTICLE{10735206,
  author={Zara, Ume and Ayyub, Kashif and Ullah Khan, Hikmat and Daud, Ali and Alsahfi, Tariq and Gulzar Ahmad, Saima},
  journal={IEEE Access}, 
  title={Phishing Website Detection Using Deep Learning Models}, 
  year={2024},
  volume={12},
  pages={167072--167087},
  keywords={Phishing;Blocklists;Accuracy;Uniform resource locators;Protocols;Internet;Accesslists;Principal component analysis;IP networks;Feature extraction;Deep learning;ensemble learning;feature selection;GRU;LSTM;machine learning;phishing detection;RNN;RF;XGBoost},
  doi={10.1109/ACCESS.2024.3486462}}
@article{Yin_2024,
   title={A survey on multimodal large language models},
   volume={11},
   ISSN={2053-714X},
   url={http://dx.doi.org/10.1093/nsr/nwae403},
   DOI={10.1093/nsr/nwae403},
   number={12},
   journal={National Science Review},
   publisher={Oxford University Press (OUP)},
   author={Yin, Shukang and Fu, Chaoyou and Zhao, Sirui and Li, Ke and Sun, Xing and Xu, Tong and Chen, Enhong},
   year={2024},
   month=nov }
@article{article,
author = {Xiang, Guang and Hong, Jason and Rosé, Carolyn and Cranor, Lorrie},
year = {2011},
month = {09},
pages = {21},
title = {CANTINA+: A Feature-Rich Machine Learning Framework for Detecting Phishing Web Sites},
volume = {14},
journal = {ACM Trans. Inf. Syst. Secur.},
doi = {10.1145/2019599.2019606}
}
@inproceedings{whittaker2010large,
  title={Large-Scale Automatic Classification of Phishing Pages.},
  author={Whittaker, Colin and Ryner, Brian and Nazif, Marria},
  booktitle={Proceedings of the Network and Distributed System Security Symposium (NDSS)},
  year={2010}
}
@inproceedings{lin2021phishpedia,
  title={Phishpedia: A hybrid deep learning based approach to visually identify phishing webpages},
  author={Lin, Yun and Liu, Ruofan and Divakaran, Dinil Mon and Ng, Jun Yang and Chan, Qing Zhou and Lu, Yiwen and Si, Yuxuan and Zhang, Fan and Dong, Jin Song},
  booktitle={30th USENIX Security Symposium (USENIX Security 21)},
  pages={3793--3810},
  year={2021}
}
@inproceedings{abdelnabi2020visualphishnet,
  title={Visualphishnet: Zero-day phishing website detection by visual similarity},
  author={Abdelnabi, Sahar and Krombholz, Katharina and Fritz, Mario},
  booktitle={Proceedings of the 2020 ACM SIGSAC conference on computer and communications security},
  pages={1681--1698},
  year={2020}
}
@misc{anongithub,
  author = {Atharva Bhargude},
  title = {Adaptive-Linguistic-Prompting-ALP-Multimodal-LLM-Phishing-Detection},
  year = {2025},
  url = {https://github.com/AtharvaB7/Adaptive-Linguistic-Prompting-ALP-Multimodal-LLM-Phishing-Detection},
  note = {Accessed: 2025-01-30}
}
@misc{acharya2024,
      title={Pirates of Charity: Exploring Donation-based Abuses in Social Media Platforms}, 
      author={Acharya and Lazzaro and Cinà and Holz},
      year={2024},
      eprint={2412.15621},
      archivePrefix={arXiv},
      primaryClass={cs.CR},
      url={https://arxiv.org/abs/2412.15621}, 
}

adaptive-linguistic-prompting-alp-multimodal-llm-phishing-detection

Science Score: 54.0%