Evaluating Zero-Shot ChatGPT Performance on Predicting CVE Data From Vulnerability Descriptions
DOI:
https://doi.org/10.34190/eccws.23.1.2285

Keywords:
AI, ChatGPT, CVE, ML, NVD, vulnerability management

Abstract
Vulnerability management is a critical industry activity, driven by compliance and regulation, that aims to allocate the best-fitted resources to address vulnerabilities efficiently. The growing number of vulnerabilities, reported and discovered by a diverse community, results in reports of varying quality and differing perspectives. To tackle this, machine learning (ML) has shown promise in automating vulnerability assessment. While some existing ML approaches have demonstrated feasibility, there is room for improvement. Gaps also remain in understanding how the specific terminology used in vulnerability databases and reports influences ML interpretation. Large Language Model (LLM) systems, such as ChatGPT, are praised for their versatility and broad applicability across domains. However, how well or poorly a state-of-the-art LLM system performs on existing vulnerability datasets at large scale and across different scoring metrics remains unclear and under-researched. This paper aims to close several of these gaps and to present a more precise and comprehensive picture of how ChatGPT performs at predicting vulnerability metrics based on NVD's CVE vulnerability database. We analyze the responses from ChatGPT on a set of 113,228 CVE vulnerability descriptions (~50% of all NVD vulnerabilities) and measure its performance against NVD-CVE as ground truth. We measure and analyze the predictions for several vulnerability metadata fields and calculate performance statistics.
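The evaluation described above can be illustrated with a minimal sketch (not the authors' code): model-predicted CVSS v3.1 base-metric fields for each CVE are compared against the NVD ground truth, and per-field accuracy is accumulated. The field abbreviations follow the standard CVSS vector notation; the example records are hypothetical.

```python
# Sketch of per-field accuracy scoring against NVD ground truth.
# Assumes predictions and ground truth are aligned lists of dicts,
# one per CVE, keyed by CVSS v3.1 base-metric abbreviations.
from collections import Counter

CVSS_FIELDS = ["AV", "AC", "PR", "UI", "S", "C", "I", "A"]

def score_predictions(predictions, ground_truth):
    """Return per-field accuracy over aligned prediction/truth records."""
    hits, totals = Counter(), Counter()
    for pred, truth in zip(predictions, ground_truth):
        for field in CVSS_FIELDS:
            if field in truth:  # only score fields NVD actually provides
                totals[field] += 1
                if pred.get(field) == truth[field]:
                    hits[field] += 1
    return {f: hits[f] / totals[f] for f in totals}

# Two hypothetical CVE records for illustration.
preds = [
    {"AV": "N", "AC": "L", "PR": "N", "UI": "N", "S": "U", "C": "H", "I": "H", "A": "H"},
    {"AV": "L", "AC": "L", "PR": "L", "UI": "N", "S": "U", "C": "H", "I": "N", "A": "N"},
]
truth = [
    {"AV": "N", "AC": "L", "PR": "N", "UI": "R", "S": "U", "C": "H", "I": "H", "A": "H"},
    {"AV": "L", "AC": "L", "PR": "L", "UI": "N", "S": "U", "C": "H", "I": "N", "A": "N"},
]
acc = score_predictions(preds, truth)  # e.g. acc["UI"] == 0.5, acc["AV"] == 1.0
```

In the paper's setting, the prediction dicts would be parsed from ChatGPT's zero-shot responses and the truth dicts taken from the NVD-CVE records; this sketch only shows the scoring step.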
License
Copyright (c) 2024 European Conference on Cyber Warfare and Security
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.