Evaluating the Effectiveness of Psychological Prompt Injection Attacks on Large Language Models for Social Engineering Artifact Generation
DOI: https://doi.org/10.34190/eccws.24.1.3515

Keywords: Prompt injections, Psychological techniques, Social engineering, AI vulnerabilities

Abstract
This study explores the vulnerability of Large Language Models (LLMs) to prompt injection attacks, a critical security concern. We investigate the effectiveness of four psychological techniques (PTs) from social engineering – Impersonation, Incentive, Persuasion, and Quid Pro Quo – in facilitating these attacks. Prompt injection manipulates an LLM by embedding malicious instructions within user prompts, potentially generating harmful content or compromising sensitive data; understanding these mechanisms is crucial for developing effective defenses. Our research assesses how these PTs influence prompt injection success rates against GPT-4o mini and Gemma-7b-it, the LLMs used for ChatGPT and Gemini respectively. We hypothesized that PTs significantly increase the likelihood of successful attacks, with some techniques being more effective than others. We conducted 220 prompt injection tests (110 per LLM) designed to elicit social engineering artifacts such as phishing emails, fake login screens, and ransomware notes, evaluating each model's susceptibility to diverse attack vectors. The four PTs were chosen for their relevance to manipulating human behavior in social engineering: Impersonation assumes a trusted identity, Incentive offers rewards, Persuasion applies manipulative tactics, and Quid Pro Quo proposes a reciprocal exchange. These techniques were adapted for prompt injections to simulate real-world social engineering scenarios. Statistical methods, including ANOVA and Kruskal-Wallis tests, assessed the overall impact of PTs; Mann-Whitney U tests with Bonferroni correction compared individual techniques, and Cohen's d measured effect sizes. Results demonstrate a statistically significant impact of PTs on prompt injection success. Impersonation was most effective across both LLMs, followed by Persuasion and Quid Pro Quo, with Incentive being least effective. These findings align with social engineering principles, highlighting the power of impersonation and other manipulative tactics. Our research has significant implications for LLM security and AI-driven social engineering: LLM vulnerability to psychologically driven prompt injections necessitates proactive security measures. Future research should focus on robust defense mechanisms, explore the interplay between PTs, and further investigate their impact on LLM security. This study contributes to understanding LLM vulnerabilities and to developing more resilient AI systems.
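To make the statistical pipeline described above concrete, the following minimal Python sketch applies the same sequence of tests – a Kruskal-Wallis omnibus test across the four PTs, pairwise Mann-Whitney U comparisons with Bonferroni correction, and Cohen's d effect sizes – to per-technique success indicators. The success/failure arrays are hypothetical placeholders for illustration only, not the study's data, and the binary coding of outcomes is an assumption of this sketch.

# Hypothetical sketch of the analysis pipeline on placeholder data.
from itertools import combinations

import numpy as np
from scipy import stats

# 1 = successful prompt injection, 0 = refused/failed (placeholder values, not study data)
results = {
    "Impersonation": np.array([1, 1, 1, 0, 1, 1, 1, 0, 1, 1]),
    "Persuasion":    np.array([1, 0, 1, 1, 0, 1, 0, 1, 1, 0]),
    "Quid Pro Quo":  np.array([0, 1, 0, 1, 1, 0, 1, 0, 1, 0]),
    "Incentive":     np.array([0, 0, 1, 0, 0, 1, 0, 0, 1, 0]),
}

# Omnibus test: do success rates differ across the four PTs?
h_stat, p_omnibus = stats.kruskal(*results.values())
print(f"Kruskal-Wallis: H={h_stat:.3f}, p={p_omnibus:.4f}")

# Pairwise Mann-Whitney U tests with Bonferroni correction
pairs = list(combinations(results, 2))
alpha_corrected = 0.05 / len(pairs)
for a, b in pairs:
    u_stat, p_val = stats.mannwhitneyu(results[a], results[b], alternative="two-sided")
    flag = "significant" if p_val < alpha_corrected else "n.s."
    print(f"{a} vs {b}: U={u_stat:.1f}, p={p_val:.4f} ({flag})")

def cohens_d(x, y):
    """Cohen's d using the pooled standard deviation."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * x.var(ddof=1) + (ny - 1) * y.var(ddof=1)) / (nx + ny - 2)
    return (x.mean() - y.mean()) / np.sqrt(pooled_var)

for a, b in pairs:
    print(f"Cohen's d ({a} vs {b}): {cohens_d(results[a], results[b]):.2f}")

In the study itself, each array would contain the 110 per-LLM test outcomes grouped by technique; the sketch merely shows how the omnibus, pairwise, and effect-size steps fit together.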
License
Copyright (c) 2025 European Conference on Cyber Warfare and Security

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.