VWC-BERT: Scaling Vulnerability-Weakness-Exploit Mapping on Modern AI Accelerators

Siddhartha Shankar Das, Mahantesh Halappanavar, Antonino Tumeo, Edoardo Serra, Alex Pothen, Ehab Al-Shaer

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations

Abstract

Defending cybersystems requires accurate mapping of software and hardware vulnerabilities to generalized descriptions of weaknesses, and of weaknesses to exploits. These mappings enable cyber defenders to plan effective defenses and assess potential risks to a cybersystem. With close to 200k vulnerabilities, manual mapping is not feasible. Automated mapping, however, is challenging due to limited training data, computational intractability, and limitations in computational natural language processing. Tools based on breakthroughs in Transformer-based language models have been demonstrated to classify vulnerabilities with high accuracy. We make three key contributions in this paper: (1) We present a new framework, VWC-BERT, that augments the Transformer-based hierarchical multi-class classification framework of Das et al. (V2W-BERT) with the ability to map weaknesses to exploits. (2) We implement VWC-BERT on modern AI accelerator platforms using two data-parallel techniques for the pre-training phase. We observe nearly linear speedups for up to 16 V100 and 8 A100 GPUs, and about 3.4× speedup for A100 relative to V100 GPUs. Enabled by this scaling, we also demonstrate higher accuracy using a larger language model, RoBERTa-Large, with up to 87% accuracy for strict classification and up to 98% accuracy for relaxed classification. (3) We develop a novel parallel link manager for the link prediction phase and demonstrate up to 21× speedup with 16 V100 GPUs relative to one V100 GPU, reducing the runtime from 2.5 hours to 10 minutes. We believe that the generalizability and scalability of VWC-BERT will benefit both the theoretical development and the practical deployment of novel cyberdefense solutions and vulnerability classification.
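The speedup figures quoted in the abstract can be related using the standard definitions of speedup (single-device runtime divided by multi-device runtime) and parallel efficiency (speedup divided by device count). The sketch below is illustrative only: the function names are hypothetical and not from the paper, and the abstract does not say which workload each figure refers to. Under those assumptions, the quoted runtime reduction from 2.5 hours to 10 minutes works out to a 15× ratio, while the quoted peak of 21× on 16 GPUs would correspond to an efficiency above 1.

```python
# Illustrative arithmetic only; helper names are assumptions, not the authors' code.

def speedup(baseline_minutes: float, parallel_minutes: float) -> float:
    """Speedup = runtime on one device / runtime on many devices."""
    return baseline_minutes / parallel_minutes


def parallel_efficiency(speedup_factor: float, num_devices: int) -> float:
    """Efficiency = speedup / number of devices (1.0 means perfectly linear)."""
    return speedup_factor / num_devices


# Runtime figures quoted in the abstract: 2.5 hours down to 10 minutes.
runtime_speedup = speedup(2.5 * 60, 10.0)

# Peak figure quoted in the abstract: up to 21x with 16 V100 GPUs.
peak_efficiency = parallel_efficiency(21.0, 16)

print(f"speedup implied by the quoted runtimes: {runtime_speedup:.1f}x")
print(f"efficiency implied by 21x on 16 GPUs: {peak_efficiency:.2f}")
```

An efficiency above 1 is a superlinear result. The abstract does not explain the gap between the two figures, so the full paper is the place to check how each was measured.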

Original language: English
Title of host publication: Proceedings - 2022 IEEE International Conference on Big Data, Big Data 2022
Editors: Shusaku Tsumoto, Yukio Ohsawa, Lei Chen, Dirk Van den Poel, Xiaohua Hu, Yoichi Motomura, Takuya Takagi, Lingfei Wu, Ying Xie, Akihiro Abe, Vijay Raghavan
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1224-1229
Number of pages: 6
ISBN (Electronic): 9781665480451
State: Published - 2022
Event: 2022 IEEE International Conference on Big Data, Big Data 2022 - Osaka, Japan
Duration: 17 Dec 2022 to 20 Dec 2022

Publication series

Name: Proceedings - 2022 IEEE International Conference on Big Data, Big Data 2022

Conference

Conference: 2022 IEEE International Conference on Big Data, Big Data 2022
Country/Territory: Japan
City: Osaka
Period: 17/12/22 to 20/12/22

Keywords

  • AI Accelerators
  • Cybersecurity
  • Deep Learning
  • Language Models
  • Transformers
