V2W-BERT: A Framework for Effective Hierarchical Multiclass Classification of Software Vulnerabilities

Siddhartha Shankar Das, Edoardo Serra, Mahantesh Halappanavar, Alex Pothen, Ehab Al-Shaer

Research output: Chapter in Book/Report/Conference proceeding > Chapter

38 Scopus citations

Abstract

We consider the problem of automating the mapping of observed vulnerabilities in software listed in Common Vulnerabilities and Exposures (CVE) reports to weaknesses listed in Common Weakness Enumerations (CWE) reports, a hierarchically designed dictionary of software weaknesses. Mapping CVEs to CWEs provides a means to understand how vulnerabilities might be exploited for malicious purposes, and to mitigate their impact. Since manual mapping of CVEs to CWEs is not viable given their ever-increasing number, automated approaches need to be devised, but obtaining a highly accurate mapping is a challenging problem. We present a novel Transformer-based learning framework (V2W-BERT) in this paper to solve this problem by bringing together ideas from natural language processing, link prediction and transfer learning. Our method outperforms previous approaches not only for CWE instances with abundant training data, but also for rare CWE classes with little or no data. Using vulnerability and weakness reports from MITRE and the National Vulnerability Database, we achieve up to 97% prediction accuracy for randomly partitioned data and up to 94% prediction accuracy for temporally partitioned data. We demonstrate significant improvements in using historical data to predict weaknesses for future instances of CVEs. We believe that our work will influence the design of better automated mapping approaches, and that this technology could be deployed for more effective cybersecurity.

Original language: American English
Title of host publication: 2021 IEEE 8th International Conference on Data Science and Advanced Analytics (DSAA)
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781665420990
State: Published - 1 Jan 2021
Event: 8th IEEE International Conference on Data Science and Advanced Analytics, DSAA 2021 - Virtual, Online, Portugal
Duration: 6 Oct 2021 to 9 Oct 2021

Publication series

Name: 2021 IEEE 8th International Conference on Data Science and Advanced Analytics, DSAA 2021

Conference

Conference: 8th IEEE International Conference on Data Science and Advanced Analytics, DSAA 2021
Country/Territory: Portugal
City: Virtual, Online
Period: 6/10/21 to 9/10/21

Keywords

  • cyber-security
  • databases
  • dictionaries
  • link prediction
  • transformer

EGS Disciplines

  • Computer Sciences
