Systematically Analysing Prompt Injection Vulnerabilities in Diverse LLM Architectures

Authors

  • Thomas Heverin, The Baldwin School
  • Victoria Benjamin, The Baldwin School
  • Emily Braca, The Baldwin School
  • Israel Carter, The Baldwin School
  • Hafsa Kanchwala, The Baldwin School
  • Nava Khojasteh, The Baldwin School
  • Charly Landow, The Baldwin School
  • Yi Luo, The Baldwin School
  • Caroline Ma, The Baldwin School
  • Anna Magarelli, The Baldwin School
  • Rachel Mirin, The Baldwin School
  • Avery Moyer, The Baldwin School
  • Kayla Simpson, The Baldwin School
  • Amelia Skawinski, The Baldwin School

DOI:

https://doi.org/10.34190/iccws.20.1.3292

Keywords:

artificial intelligence, prompt injections, AI security

Abstract

This paper presents an exploratory systematic analysis of prompt injection vulnerabilities across 36 diverse large language models (LLMs), revealing significant security concerns in these widely adopted AI tools. Prompt injection attacks, which involve crafting inputs to manipulate LLM outputs, pose risks such as unauthorized access, data leaks, and misinformation. Through 144 tests with four tailored prompt injections, we found that 56% of attempts successfully bypassed LLM safeguards, with vulnerability rates ranging from 53% to 61% across different prompt designs. Notably, 28% of tested LLMs were susceptible to all four prompts, indicating a critical lack of robustness. Our findings show that model size and architecture significantly influence susceptibility, with smaller models generally more prone to attacks. Statistical methods, including random forest feature analysis and logistic regression, revealed that model parameters play a primary role in vulnerability, though LLM type also contributes. Clustering analysis further identified distinct vulnerability profiles based on model configuration, underscoring the need for multi-faceted defence strategies. The study's implications are broad, particularly for sectors integrating LLMs into sensitive applications. Our results align with OWASP and MITRE’s security frameworks, highlighting the urgency for proactive measures, such as human oversight and trust boundaries, to protect against prompt injection risks. Future research should explore multilingual prompt injections and multi-step attack defences to enhance the resilience of LLMs in complex, real-world environments. This work contributes valuable insights into LLM vulnerabilities, aiming to advance the field toward safer AI deployments.
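The abstract describes the statistical analysis only at a high level. As a rough, hypothetical sketch of how such an analysis could be set up, the Python example below uses pandas and scikit-learn to relate model size and model type to injection outcomes with a random forest (feature importances), a logistic regression, and a k-means clustering step. The column names (params_b, model_type, bypassed), the toy data, and the specific model choices are illustrative assumptions, not the authors' dataset or code.

```python
# Hypothetical sketch of the kind of analysis the abstract describes:
# relating model size (parameters) and model type to prompt-injection
# susceptibility. Data and column names are illustrative only.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

# One row per (model, prompt) test: parameter count in billions, a coarse
# model-type label, and whether the injection bypassed safeguards (1 = yes).
df = pd.DataFrame({
    "params_b":   [7, 7, 13, 13, 70, 70, 8, 8],
    "model_type": ["open", "open", "open", "open",
                   "proprietary", "proprietary", "open", "open"],
    "bypassed":   [1, 1, 1, 0, 0, 0, 1, 0],
})

# Encode the categorical model type; keep the numeric size feature as-is.
X = pd.get_dummies(df[["params_b", "model_type"]], drop_first=True)
y = df["bypassed"]

# Random forest: rank which features drive vulnerability.
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
print(dict(zip(X.columns, rf.feature_importances_.round(3))))

# Logistic regression on standardized features: coefficient signs indicate
# the direction of association (e.g., fewer parameters -> higher odds of bypass).
lr = LogisticRegression().fit(StandardScaler().fit_transform(X), y)
print(dict(zip(X.columns, lr.coef_[0].round(3))))

# K-means as a stand-in for the clustering step: group models by their
# configuration features and compare bypass rates across clusters.
df["cluster"] = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(df.groupby("cluster")["bypassed"].mean())
```

In a study like this one, each row would correspond to one of the 144 model-prompt tests, and the feature importances, coefficients, and per-cluster bypass rates would indicate how strongly parameter count versus model type predicts a successful injection.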

Author Biography

Thomas Heverin, The Baldwin School

Dr. Thomas Heverin teaches artificial intelligence and ethical hacking at The Baldwin School, an all-girls college preparatory school. He has over 15 years of experience in cybersecurity and teaching, and he holds a CISSP certification, a Ph.D. in Information Science, and a U.S. Navy patent focused on cyber-risk assessments.

Published

2025-03-24