SLMs Meet GraphRAG: A Structured Approach to Context-Aware Cybersecurity Hint Generation

Authors

DOI:

https://doi.org/10.34190/iccws.21.1.4434

Keywords:

Cybersecurity education, Small Language Model (SLM), Retrieval-Augmented Generation (RAG), Knowledge graph, Human-in-the-Loop

Abstract

The goal of our research is to generate hints for learners engaged in hands-on cybersecurity exercises. Learners sometimes get stuck or frustrated: they head in the wrong direction or are missing information that is necessary for solving an exercise. While using large language models (LLMs) is an option, LLMs typically require sharing student data with third-party AI providers. To improve privacy and minimize cost and computational overhead, previous research has explored using locally deployed small language models (SLMs) with retrieval-augmented generation (RAG). However, while RAG has been shown to enhance SLM capabilities without the need for fine-tuning, it falls short when answering open-ended or multi-step questions that require reasoning across interconnected concepts. This limitation is particularly evident in cybersecurity education, where students often need help understanding how threats, tools, and strategies relate to one another. The cybersecurity hint system EDUHints (Wolff et al., 2025) currently relies on a standard RAG pipeline. In classroom testing, students were unsure whether generated hints meaningfully answered their questions. To address this challenge, we present a custom GraphRAG approach that builds on a proposed cybersecurity-education-focused ontology and knowledge graph called AISecKG. We extend the ontology to incorporate natural language-to-bash command mappings, a valuable feature because students tend to ask questions about command-line use. Graph data is extracted using multiple methods and semantically scored to prioritize only the most relevant results. Our pipeline currently employs Microsoft’s Phi-3-mini-4k-instruct SLM, integrates LangChain for modular orchestration, and uses Neo4j as the graph database. We surveyed cybersecurity instructors, asking them to rate responses generated by EDUHints and by our GraphRAG system. Results show that cybersecurity instructors preferred hints generated using GraphRAG almost three times as often. This suggests that an SLM’s educational hint generation abilities can be improved through our GraphRAG architecture.

Author Biographies

Ishan Abraham, Lewis & Clark College

Ishan Abraham is a senior at Lewis & Clark College with a major in Computer Science & Mathematics and two minors: Data Science and Entrepreneurial Leadership.

Jens Mache, Lewis & Clark College

Jens Mache is a professor of computer science at Lewis & Clark College in Portland, Oregon. His cybersecurity certifications include SANS/GIAC Certified Intrusion Analyst (GCIA), Penetration Tester (GPEN), and Incident Handler (GCIH). His publications include "Training Artificial Neural Networks to Predict the 3-5 MeV Relativistic Electron Flux at Geosynchronous Orbit" from 1994.

Taylor Wolff, The Evergreen State College

Taylor Wolff is an undergraduate student and research assistant at the Evergreen State College.

Richard Weiss, The Evergreen State College

Richard Weiss has been at the Evergreen State College since 2005. He has a Ph.D. in mathematics from Harvard University. His research has included cybersecurity education, computer vision and robotics, applications of machine learning, and computer architecture. He was a research faculty member in Computer Vision at the University of Massachusetts for 15 years.

Published

19-02-2026