Static Vulnerability Analysis Using Intermediate Representations: A Literature Review


  • Adam Spanier Univeristy of Nebraska at Omaha
  • William Mahoney Univeristy of Nebraska at Omaha



Vulnerability detection, LLVM, Compilers, Intermediate representation analysis


Analysis (SA) in Cybersecurity is a practice aimed at detecting vulnerabilities within the source code of a program. Modern SA applications, though highly sophisticated, lack programming language agnostic generalization, instead requiring codebase specific implementations for each programming language. The manner in which SA is implemented today, though functional, requires significant man hours to develop and maintain, higher costs due to custom applications for each language, and creates inconsistencies in implementation from SA-tool to SA-tool. A source of programming language generalization occurs within compilers. During the compilation process, source code is converted into a grammatically consistent Intermediate Representation (IR) (e.g. LLVM-IR) before being converted to an output format. The grammatical consistencies provided by the IR theoretically allow the same program written in different languages to be analyzed using the same mechanism. By using the IRs of compiled programming languages as the codebase of SA practices, multiple programming languages can be encompassed by a single SA tool. To begin understanding the possibilities the combination of SA and IRs may reveal, this research presents the following outcomes: 1) a systematic literature search, 2) a literature review, and 3) the classification of existing work pertaining to SA practices using IRs. The results of the study indicate that generalized Static Analysis using the LLVM IR is already a common practice in all compilers, but that the extended use of the LLVM IR in Cybersecurity SA practices aimed at finding vulnerabilities in source code remains underdeveloped.