An Assessment Methodology for Learning Mission Characteristics

Authors

DOI:

https://doi.org/10.34190/eccws.25.1.4599

Keywords:

Protocol reverse engineering, Parsing, MAVLink, Space packet protocol, Traffic validation

Abstract

This paper presents a novel methodology for assessing the effectiveness of machine learning algorithms in deriving a protocol specification from a set of network traffic captures. Such specifications describe the data formats and mission characteristics associated with valid messages and form the basis for protecting military vehicles from malicious traffic. The paper extends previous work, which automatically generated a hardware guard from a hand-written grammar, into a fully automated, end-to-end process that generates a parser, either in software or as a hardware guard, directly from mission training data sets. A hardware guard is realized as a Field Programmable Gate Array (FPGA) circuit block implementing a parser. The parser performs deep packet validation, checking that every message conforms to the associated grammar and rejecting invalid packets. Several modern machine learning algorithms and specification inference tools can be leveraged to automatically infer a specification from a data capture. To date, however, there has been a paucity of mission data sets to provide ground truth and no common set of metrics with which to assess learning algorithms. To this end, this paper introduces a parser repository that provides a ground-truth data set for a variety of formats and protocols, alongside equivalent protocol specifications expressed in several formalisms, namely BNF, Hammer, Daffodil, and Daedalus. Each set of equivalent specifications is provided with a common collection of true and false test vectors to validate their operation and benchmark learning performance using appropriate metrics. Specifications for standard formats, including JavaScript Object Notation (JSON), Uniform Resource Locators (URLs), and Hypertext Transfer Protocol (HTTP), provide the basis for generic testing. The Micro Air Vehicle Link Protocol (MAVLink) and Space Packet Protocol (SPP) are used as examples of mission-oriented grammars.
MAVLink is a byte-oriented protocol, while SPP is a bit-oriented protocol, leading to fundamental differences in the way that learning algorithms must operate to validate packets; the latter is also indicative of the complexities involved in parsing SAE J1939, used on ground vehicles, and MIL-STD-1553, used on aircraft.
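
The byte-oriented versus bit-oriented distinction can be illustrated with a minimal Python sketch. It is not taken from the paper or its parser repository. The function names and dictionary keys are hypothetical, and the field layouts are assumed from the public MAVLink 2 frame format and the CCSDS Space Packet primary header. The MAVLink 2 header fields fall on byte boundaries, whereas the SPP primary header packs seven fields into six bytes, so a validator must shift and mask within bytes.

```python
import struct


def parse_spp_primary_header(data: bytes) -> dict:
    """Decode the 6-byte SPP primary header (hypothetical helper).

    Assumed layout, big-endian: version (3 bits), type (1), secondary
    header flag (1), APID (11), sequence flags (2), sequence count (14),
    packet data length (16, stored as octet count minus one).
    """
    if len(data) < 6:
        raise ValueError("SPP primary header needs 6 bytes")
    word0, word1, length = struct.unpack(">HHH", data[:6])
    return {
        "version": word0 >> 13,
        "type": (word0 >> 12) & 0x1,
        "sec_hdr_flag": (word0 >> 11) & 0x1,
        "apid": word0 & 0x7FF,
        "seq_flags": word1 >> 14,
        "seq_count": word1 & 0x3FFF,
        "data_octets": length + 1,  # the field encodes length minus one
    }


def parse_mavlink2_header(data: bytes) -> dict:
    """Decode the 10-byte MAVLink 2 header (hypothetical helper).

    Assumed layout: magic 0xFD, payload length, incompatibility flags,
    compatibility flags, sequence, system ID, component ID, and a
    3-byte little-endian message ID. Every field is whole bytes.
    """
    if len(data) < 10:
        raise ValueError("MAVLink 2 header needs 10 bytes")
    if data[0] != 0xFD:
        raise ValueError("bad MAVLink 2 start-of-frame marker")
    return {
        "payload_len": data[1],
        "incompat_flags": data[2],
        "compat_flags": data[3],
        "seq": data[4],
        "sysid": data[5],
        "compid": data[6],
        "msgid": int.from_bytes(data[7:10], "little"),
    }
```

For instance, `parse_spp_primary_header(bytes.fromhex("1923c0050009"))` recovers the APID by masking the low 11 bits of the first 16-bit word, while `parse_mavlink2_header` reads each field directly by byte offset. An inference tool that assumes byte-aligned fields would therefore be expected to handle the second layout more readily than the first.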

Author Biographies

Joshua Meise, Dartmouth College

Joshua Meise is a Ph.D. student at Dartmouth College. His research focuses on secure communications systems.

Stephen Taylor, Dartmouth College

Stephen Taylor is a Professor of Computer Engineering at Dartmouth College. His academic research focuses on systems security using System-on-Chip and FPGA devices. He is a former DARPA Program Manager and member of the US Air Force Scientific Advisory Board.

Published

2026-06-15