2025-07-04 15:20:02 +02:00
|
|
|
Runtime Verification Monitor Synthesis
|
|
|
|
======================================
|
2022-07-29 11:38:49 +02:00
|
|
|
|
2023-08-14 16:28:22 -05:00
|
|
|
The starting point for the application of runtime verification (RV) techniques
|
2022-07-29 11:38:49 +02:00
|
|
|
is the *specification* or *modeling* of the desired (or undesired) behavior
|
|
|
|
of the system under scrutiny.
|
|
|
|
|
|
|
|
The formal representation needs to be then *synthesized* into a *monitor*
|
|
|
|
that can then be used in the analysis of the trace of the system. The
|
|
|
|
*monitor* connects to the system via an *instrumentation* that converts
|
|
|
|
the events from the *system* to the events of the *specification*.
|
|
|
|
|
|
|
|
|
|
|
|
In Linux terms, the runtime verification monitors are encapsulated inside
|
|
|
|
the *RV monitor* abstraction. The RV monitor includes a set of instances
|
|
|
|
of the monitor (per-cpu monitor, per-task monitor, and so on), the helper
|
|
|
|
functions that glue the monitor to the system reference model, and the
|
|
|
|
trace output as a reaction to event parsing and exceptions, as depicted
|
|
|
|
below::
|
|
|
|
|
|
|
|
Linux +----- RV Monitor ----------------------------------+ Formal
|
|
|
|
Realm | | Realm
|
|
|
|
+-------------------+ +----------------+ +-----------------+
|
|
|
|
| Linux kernel | | Monitor | | Reference |
|
|
|
|
| Tracing | -> | Instance(s) | <- | Model |
|
|
|
|
| (instrumentation) | | (verification) | | (specification) |
|
|
|
|
+-------------------+ +----------------+ +-----------------+
|
|
|
|
| | |
|
|
|
|
| V |
|
|
|
|
| +----------+ |
|
|
|
|
| | Reaction | |
|
|
|
|
| +--+--+--+-+ |
|
|
|
|
| | | | |
|
|
|
|
| | | +-> trace output ? |
|
|
|
|
+------------------------|--|----------------------+
|
|
|
|
| +----> panic ?
|
|
|
|
+-------> <user-specified>
|
|
|
|
|
2025-07-04 15:20:02 +02:00
|
|
|
RV monitor synthesis
|
2022-07-29 11:38:49 +02:00
|
|
|
--------------------
|
|
|
|
|
2025-07-04 15:20:07 +02:00
|
|
|
The synthesis of a specification into the Linux *RV monitor* abstraction is
|
|
|
|
automated by the rvgen tool and the header file containing common code for
|
|
|
|
creating monitors. The header files are:
|
|
|
|
|
|
|
|
* rv/da_monitor.h for deterministic automaton monitor.
|
|
|
|
* rv/ltl_monitor.h for linear temporal logic monitor.
|
2022-07-29 11:38:49 +02:00
|
|
|
|
2025-07-04 15:20:02 +02:00
|
|
|
rvgen
|
2022-07-29 11:38:49 +02:00
|
|
|
-----
|
|
|
|
|
2025-07-04 15:20:07 +02:00
|
|
|
The rvgen utility converts a specification into the C presentation and creating
|
|
|
|
the skeleton of a kernel monitor in C.
|
2022-07-29 11:38:49 +02:00
|
|
|
|
|
|
|
For example, it is possible to transform the wip.dot model present in
|
|
|
|
[1] into a per-cpu monitor with the following command::
|
|
|
|
|
2025-07-04 15:20:02 +02:00
|
|
|
$ rvgen monitor -c da -s wip.dot -t per_cpu
|
2022-07-29 11:38:49 +02:00
|
|
|
|
|
|
|
This will create a directory named wip/ with the following files:
|
|
|
|
|
|
|
|
- wip.h: the wip model in C
|
|
|
|
- wip.c: the RV monitor
|
|
|
|
|
|
|
|
The wip.c file contains the monitor declaration and the starting point for
|
|
|
|
the system instrumentation.
|
|
|
|
|
2025-07-04 15:20:07 +02:00
|
|
|
Similarly, a linear temporal logic monitor can be generated with the following
|
|
|
|
command::
|
|
|
|
|
|
|
|
$ rvgen monitor -c ltl -s pagefault.ltl -t per_task
|
|
|
|
|
|
|
|
This generates pagefault/ directory with:
|
|
|
|
|
|
|
|
- pagefault.h: The Buchi automaton (the non-deterministic state machine to
|
|
|
|
verify the specification)
|
|
|
|
- pagefault.c: The skeleton for the RV monitor
|
|
|
|
|
|
|
|
Monitor header files
|
|
|
|
--------------------
|
|
|
|
|
|
|
|
The header files:
|
|
|
|
|
|
|
|
- `rv/da_monitor.h` for deterministic automaton monitor
|
|
|
|
- `rv/ltl_monitor` for linear temporal logic monitor
|
|
|
|
|
|
|
|
include common macros and static functions for implementing *Monitor
|
|
|
|
Instance(s)*.
|
2022-07-29 11:38:49 +02:00
|
|
|
|
2025-07-04 15:20:07 +02:00
|
|
|
The benefits of having all common functionalities in a single header file are
|
|
|
|
3-fold:
|
2022-07-29 11:38:49 +02:00
|
|
|
|
2025-07-04 15:20:07 +02:00
|
|
|
- Reduce the code duplication;
|
|
|
|
- Facilitate the bug fix/improvement;
|
|
|
|
- Avoid the case of developers changing the core of the monitor code to
|
|
|
|
manipulate the model in a (let's say) non-standard way.
|
2022-07-29 11:38:49 +02:00
|
|
|
|
2025-07-04 15:20:07 +02:00
|
|
|
rv/da_monitor.h
|
|
|
|
+++++++++++++++
|
2022-07-29 11:38:49 +02:00
|
|
|
|
|
|
|
This initial implementation presents three different types of monitor instances:
|
|
|
|
|
|
|
|
- ``#define DECLARE_DA_MON_GLOBAL(name, type)``
|
|
|
|
- ``#define DECLARE_DA_MON_PER_CPU(name, type)``
|
|
|
|
- ``#define DECLARE_DA_MON_PER_TASK(name, type)``
|
|
|
|
|
|
|
|
The first declares the functions for a global deterministic automata monitor,
|
|
|
|
the second for monitors with per-cpu instances, and the third with per-task
|
|
|
|
instances.
|
|
|
|
|
|
|
|
In all cases, the 'name' argument is a string that identifies the monitor, and
|
2025-07-04 15:20:02 +02:00
|
|
|
the 'type' argument is the data type used by rvgen on the representation of
|
2022-07-29 11:38:49 +02:00
|
|
|
the model in C.
|
|
|
|
|
|
|
|
For example, the wip model with two states and three events can be
|
|
|
|
stored in an 'unsigned char' type. Considering that the preemption control
|
|
|
|
is a per-cpu behavior, the monitor declaration in the 'wip.c' file is::
|
|
|
|
|
|
|
|
DECLARE_DA_MON_PER_CPU(wip, unsigned char);
|
|
|
|
|
|
|
|
The monitor is executed by sending events to be processed via the functions
|
|
|
|
presented below::
|
|
|
|
|
|
|
|
da_handle_event_$(MONITOR_NAME)($(event from event enum));
|
|
|
|
da_handle_start_event_$(MONITOR_NAME)($(event from event enum));
|
|
|
|
da_handle_start_run_event_$(MONITOR_NAME)($(event from event enum));
|
|
|
|
|
|
|
|
The function ``da_handle_event_$(MONITOR_NAME)()`` is the regular case where
|
|
|
|
the event will be processed if the monitor is processing events.
|
|
|
|
|
|
|
|
When a monitor is enabled, it is placed in the initial state of the automata.
|
|
|
|
However, the monitor does not know if the system is in the *initial state*.
|
|
|
|
|
|
|
|
The ``da_handle_start_event_$(MONITOR_NAME)()`` function is used to notify the
|
|
|
|
monitor that the system is returning to the initial state, so the monitor can
|
|
|
|
start monitoring the next event.
|
|
|
|
|
|
|
|
The ``da_handle_start_run_event_$(MONITOR_NAME)()`` function is used to notify
|
|
|
|
the monitor that the system is known to be in the initial state, so the
|
|
|
|
monitor can start monitoring and monitor the current event.
|
|
|
|
|
|
|
|
Using the wip model as example, the events "preempt_disable" and
|
|
|
|
"sched_waking" should be sent to monitor, respectively, via [2]::
|
|
|
|
|
|
|
|
da_handle_event_wip(preempt_disable_wip);
|
|
|
|
da_handle_event_wip(sched_waking_wip);
|
|
|
|
|
|
|
|
While the event "preempt_enabled" will use::
|
|
|
|
|
|
|
|
da_handle_start_event_wip(preempt_enable_wip);
|
|
|
|
|
|
|
|
To notify the monitor that the system will be returning to the initial state,
|
|
|
|
so the system and the monitor should be in sync.
|
|
|
|
|
2025-07-04 15:20:07 +02:00
|
|
|
rv/ltl_monitor.h
|
|
|
|
++++++++++++++++
|
|
|
|
This file must be combined with the $(MODEL_NAME).h file (generated by `rvgen`)
|
|
|
|
to be complete. For example, for the `pagefault` monitor, the `pagefault.c`
|
|
|
|
source file must include::
|
|
|
|
|
|
|
|
#include "pagefault.h"
|
|
|
|
#include <rv/ltl_monitor.h>
|
|
|
|
|
|
|
|
(the skeleton monitor file generated by `rvgen` already does this).
|
|
|
|
|
|
|
|
`$(MODEL_NAME).h` (`pagefault.h` in the above example) includes the
|
|
|
|
implementation of the Buchi automaton - a non-deterministic state machine that
|
|
|
|
verifies the LTL specification. While `rv/ltl_monitor.h` includes the common
|
|
|
|
helper functions to interact with the Buchi automaton and to implement an RV
|
|
|
|
monitor. An important definition in `$(MODEL_NAME).h` is::
|
|
|
|
|
|
|
|
enum ltl_atom {
|
|
|
|
LTL_$(FIRST_ATOMIC_PROPOSITION),
|
|
|
|
LTL_$(SECOND_ATOMIC_PROPOSITION),
|
|
|
|
...
|
|
|
|
LTL_NUM_ATOM
|
|
|
|
};
|
|
|
|
|
|
|
|
which is the list of atomic propositions present in the LTL specification
|
|
|
|
(prefixed with "LTL\_" to avoid name collision). This `enum` is passed to the
|
|
|
|
functions interacting with the Buchi automaton.
|
|
|
|
|
|
|
|
While generating code, `rvgen` cannot understand the meaning of the atomic
|
|
|
|
propositions. Thus, that task is left for manual work. The recommended pratice
|
|
|
|
is adding tracepoints to places where the atomic propositions change; and in the
|
|
|
|
tracepoints' handlers: the Buchi automaton is executed using::
|
|
|
|
|
|
|
|
void ltl_atom_update(struct task_struct *task, enum ltl_atom atom, bool value)
|
|
|
|
|
|
|
|
which tells the Buchi automaton that the atomic proposition `atom` is now
|
|
|
|
`value`. The Buchi automaton checks whether the LTL specification is still
|
|
|
|
satisfied, and invokes the monitor's error tracepoint and the reactor if
|
|
|
|
violation is detected.
|
|
|
|
|
|
|
|
Tracepoints and `ltl_atom_update()` should be used whenever possible. However,
|
|
|
|
it is sometimes not the most convenient. For some atomic propositions which are
|
|
|
|
changed in multiple places in the kernel, it is cumbersome to trace all those
|
|
|
|
places. Furthermore, it may not be important that the atomic propositions are
|
|
|
|
updated at precise times. For example, considering the following linear temporal
|
|
|
|
logic::
|
|
|
|
|
|
|
|
RULE = always (RT imply not PAGEFAULT)
|
|
|
|
|
|
|
|
This LTL states that a real-time task does not raise page faults. For this
|
|
|
|
specification, it is not important when `RT` changes, as long as it has the
|
|
|
|
correct value when `PAGEFAULT` is true. Motivated by this case, another
|
|
|
|
function is introduced::
|
|
|
|
|
|
|
|
void ltl_atom_fetch(struct task_struct *task, struct ltl_monitor *mon)
|
|
|
|
|
|
|
|
This function is called whenever the Buchi automaton is triggered. Therefore, it
|
|
|
|
can be manually implemented to "fetch" `RT`::
|
|
|
|
|
|
|
|
void ltl_atom_fetch(struct task_struct *task, struct ltl_monitor *mon)
|
|
|
|
{
|
|
|
|
ltl_atom_set(mon, LTL_RT, rt_task(task));
|
|
|
|
}
|
|
|
|
|
|
|
|
Effectively, whenever `PAGEFAULT` is updated with a call to `ltl_atom_update()`,
|
|
|
|
`RT` is also fetched. Thus, the LTL specification can be verified without
|
|
|
|
tracing `RT` everywhere.
|
|
|
|
|
|
|
|
For atomic propositions which act like events, they usually need to be set (or
|
|
|
|
cleared) and then immediately cleared (or set). A convenient function is
|
|
|
|
provided::
|
|
|
|
|
|
|
|
void ltl_atom_pulse(struct task_struct *task, enum ltl_atom atom, bool value)
|
|
|
|
|
|
|
|
which is equivalent to::
|
|
|
|
|
|
|
|
ltl_atom_update(task, atom, value);
|
|
|
|
ltl_atom_update(task, atom, !value);
|
|
|
|
|
|
|
|
To initialize the atomic propositions, the following function must be
|
|
|
|
implemented::
|
|
|
|
|
|
|
|
ltl_atoms_init(struct task_struct *task, struct ltl_monitor *mon, bool task_creation)
|
|
|
|
|
|
|
|
This function is called for all running tasks when the monitor is enabled. It is
|
|
|
|
also called for new tasks created after the enabling the monitor. It should
|
|
|
|
initialize as many atomic propositions as possible, for example::
|
|
|
|
|
|
|
|
void ltl_atom_init(struct task_struct *task, struct ltl_monitor *mon, bool task_creation)
|
|
|
|
{
|
|
|
|
ltl_atom_set(mon, LTL_RT, rt_task(task));
|
|
|
|
if (task_creation)
|
|
|
|
ltl_atom_set(mon, LTL_PAGEFAULT, false);
|
|
|
|
}
|
|
|
|
|
|
|
|
Atomic propositions not initialized by `ltl_atom_init()` will stay in the
|
|
|
|
unknown state until relevant tracepoints are hit, which can take some time. As
|
|
|
|
monitoring for a task cannot be done until all atomic propositions is known for
|
|
|
|
the task, the monitor may need some time to start validating tasks which have
|
|
|
|
been running before the monitor is enabled. Therefore, it is recommended to
|
|
|
|
start the tasks of interest after enabling the monitor.
|
|
|
|
|
2022-07-29 11:38:49 +02:00
|
|
|
Final remarks
|
|
|
|
-------------
|
|
|
|
|
2025-07-04 15:20:07 +02:00
|
|
|
With the monitor synthesis in place using the header files and
|
2025-07-04 15:20:02 +02:00
|
|
|
rvgen, the developer's work should be limited to the instrumentation
|
2022-07-29 11:38:49 +02:00
|
|
|
of the system, increasing the confidence in the overall approach.
|
|
|
|
|
|
|
|
[1] For details about deterministic automata format and the translation
|
|
|
|
from one representation to another, see::
|
|
|
|
|
|
|
|
Documentation/trace/rv/deterministic_automata.rst
|
|
|
|
|
2025-07-04 15:20:02 +02:00
|
|
|
[2] rvgen appends the monitor's name suffix to the events enums to
|
2022-07-29 11:38:49 +02:00
|
|
|
avoid conflicting variables when exporting the global vmlinux.h
|
|
|
|
use by BPF programs.
|