TY - GEN
T1 - Generating Phishing Attacks and Novel Detection Algorithms in the Era of Large Language Models
AU - Fairbanks, Jeffrey
AU - Serra, Edoardo
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Phishing is a significant cybersecurity threat, with the financial impact of email security breaches and lack of awareness estimated at between $50 billion and $100 billion in 2022. The advent of Large Language Models (LLMs) has further automated and intensified phishing attacks, posing greater challenges for defenders, especially large organizations targeted at scale by Advanced Persistent Threats (APTs), such as the Department of Energy National Labs. This study presents the development of two innovative algorithms. The first algorithm improves the efficacy of phishing attacks, while the second counteracts and defends against phishing attacks that leverage LLMs. The attack method takes detectable malicious phishing emails and rewrites them using an innovative LLM-based automatic output optimization technique, which includes Reflection and Beam Search, while preserving the original semantic meaning and Indicators of Compromise (IOCs). This approach bypasses the most commonly used institutional security tools, as well as NLP and other LLM phishing detection systems. The results indicate that this attack algorithm increases the success rate of phishing attacks by up to 98%. The algorithm presented in this research is also employed for defensive measures: when the proposed defensive algorithm is applied, it identifies malicious emails with 97% greater accuracy. The research detailed in this paper demonstrates that these algorithms serve dual purposes: one is utilized as an attack mechanism by altering the output, and the other as a defensive measure against phishing attacks by modifying the defensive prompt. Implementing these algorithms in the Department of Energy (DOE) Laboratory has demonstrated their effectiveness in real-world applications, and they have been deployed in large-scale production environments.
KW - Agentic AI
KW - Artificial Intelligence (AI)
KW - Beam Search
KW - Big Data
KW - Email Phishing
KW - Large Language Model (LLM)
KW - Reflection
UR - http://www.scopus.com/inward/record.url?scp=85218014060&partnerID=8YFLogxK
U2 - 10.1109/BigData62323.2024.10825007
DO - 10.1109/BigData62323.2024.10825007
M3 - Conference contribution
AN - SCOPUS:85218014060
T3 - Proceedings - 2024 IEEE International Conference on Big Data, BigData 2024
SP - 2314
EP - 2319
BT - Proceedings - 2024 IEEE International Conference on Big Data, BigData 2024
A2 - Ding, Wei
A2 - Lu, Chang-Tien
A2 - Wang, Fusheng
A2 - Di, Liping
A2 - Wu, Kesheng
A2 - Huan, Jun
A2 - Nambiar, Raghu
A2 - Li, Jundong
A2 - Ilievski, Filip
A2 - Baeza-Yates, Ricardo
A2 - Hu, Xiaohua
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2024 IEEE International Conference on Big Data, BigData 2024
Y2 - 15 December 2024 through 18 December 2024
ER -