Welcome to the Soteria tutorial! This tutorial will guide you through the basics of writing a symbolic execution engine.
Today you'll be learning how to:
Symex.Make functorThe main utility of the Soteria library is the Symex.Make functor, which provides the necessary infrastructure to define and run a symbolic execution engine. To instantiate it, you need to provide a Solver module, which defines symbolic values and the solver interface to decide whether a symbolic expression of type Value.sbool is satisfiable.
module Symex = Symex.Make (Tiny_solver.Z3_solver)Soteria is a library made for writing symbolic processes. A symbolic process is a value of the type Symex.t, which is effectively a symbolic computation waiting to be executed. It is an execution that may depend on symbolic variables, and therefore may branch into multiple paths depending on the values of these variables.
The simplest such process is the symbolic process just returns a single branch with a given value, say the integer 0.
# Symex.return 0;;
- : int Symex.t = <abstr>This process has type int Symex.t, that is, a symbolic process that returns a value of type int. Again, this is a symbolic process, so it is not yet executed. Executing a symbolic process is done using the run function:
# Symex.run
- : ?fuel:Symex.Fuel_gauge.t ->
mode:Symex.Approx.t ->
'a Symex.t -> ('a * Symex.Value.sbool Typed.t list) list
= <fun>The function receives a symbolic process to execute, an approximation mode, and an optional fuel gauge should the user desire to limit the breadth or depth of execution. It returns a list of branches, where each branch is a pair of the value returned by the symbolic process and the path condition that leads to this branch (in the form of a list of symbolic booleans).
For instance, let's execute the symbolic process we just created, using the over-approximation mode, and without providing any fuel limitation. Both fuel and approximation mode are irrelevant for the simple processes we will be writing in this tutorial, and become interesting when analyzing real-world programs.
# Symex.return 0 |> Symex.run ~mode:OX;;
- : (int * Symex.Value.sbool Typed.t list) list = [(0, [])]As expected the result of the symbolic process is a single branch with the value 0 and an empty path condition [], which means that there is no condition for taking this branch, i.e., it is always taken.
A core necessity of symbolic execution is the ability to introduce fresh symbolic values. Soteria provides a convenient way to do this using the nondet function, which creates a fresh symbolic value of a given type. It is called nondet because it "abstracts over" a nondeterministic choice of any value of that given type:
# Symex.nondet;;
- : 'a Typed.ty -> 'a Typed.t Symex.t = <fun>nondet receives a type ty and returns a symbolic process that produces a fresh symbolic value of that type.
Let's run this process:
# Symex.nondet Typed.t_int |> Symex.run ~mode:OX;;
- : ([> Typed.T.sint ] Typed.t * Symex.Value.sbool Typed.t list) list =
[(V|1|, [])]We obtain a single branch with the fresh symbolic integer denoted V|1|, which is a pretty-printing for the symbolic variable named "1". Again, the path condition is empty, meaning that this branch is always taken.
On their own, nondet and return are not very useful. What we really want is to compose symbolic processes together in order to build more complex symbolic computations. Without going into details about monads, Soteria provides a binding operators (let*) to do exactly that.
(* Puts the let* operator in scope *)
open Symex.Syntax
(* Puts infix operators for symbolic values in scope.
These operators are postifixed with "@" to avoid
conflicts with the standard library. *)
open Typed.Infix
(* Puts the symbolic integer syntax in scope,
so that we can write "2s" for the symbolic integer 2,
instead of "Typed.int 2" *)
open Typed.Syntax# let process =
let* x = Symex.nondet Typed.t_int in
Symex.return (x +@ 2s);;
val process : ([> Typed.T.sint ] as '_weak1) Typed.t Symex.t = <abstr>The above code snippet creates a symbolic process that first introduces a fresh symbolic variable x of type int, and then returns the symbolic expression x +@ 2s. Note how familiar the syntax is: effectively, this just looks like regular OCaml code. Using the let* operator, the value Symex.nondet Typed.t_int (which is of type int Symex.t) is "unwrapped", and x is not the process, but the value returned by the process. Effectively, Soteria takes care of the symbolic execution under the hood, so that you can write code that looks like regular OCaml code.
Running this symbolic process yields a single branch with no path condition:
# Symex.run ~mode:OX process;;
- : (([> Typed.T.sint ] as '_weak1) Typed.t * Symex.Value.sbool Typed.t list)
list
= [((V|1| + 2), [])]We can now build more complex symbolic processes, and make use of other basic operations such as assume and assert_.
For instance, we can create a symbolic process that introduces a nondeterministic integer, and then asserts that this integer is positive:
# let process =
let* x = Symex.nondet Typed.t_int in
Symex.assert_ (x >@ 0s);;
val process : bool Symex.t = <abstr>
# Symex.run ~mode:OX process;;
- : (bool * Symex.Value.sbool Typed.t list) list = [(false, [])]This process returns a single branch with the value false, as it is not *always* guaranteed that the symbolic integer x is positive. Now, if we add an assumption that, say, x is greater than 10, we get a different result:
# let process =
let* x = Symex.nondet Typed.t_int in
let* () = Symex.assume [ x >@ 10s ] in
Symex.assert_ (x >@ 0s);;
val process : bool Symex.t = <abstr>
# Symex.run ~mode:OX process;;
- : (bool * Symex.Value.sbool Typed.t list) list = [(true, [(11 <= V|1|)])]This time, the assertion is satisfied, and the symbolic process returns a single branch with value true, and a path condition that corresponds to the assumption we made, i.e. that V|1| (the symbolic integer) is greater than or equal to 11 (which is equivalent to the assumption we wrote).
A more interesting symbolic process is one that introduces conditional branching. For instance, the following process introduces a nondeterministic integer, and then branches on whether this integer is even or odd. Soteria provides convenient syntactic sugar (accessed when we opened Symex.Syntax), which overrides the standard OCaml if statement to work with symbolic values:
# let process =
let* x = Symex.nondet Typed.t_int in
if%sat (x %@ 2s) ==@ 0s then
Symex.return (Fmt.str "%a is even" Typed.ppa x)
else
Symex.return (Fmt.str "%a is odd" Typed.ppa x)
val process : string Symex.t = <abstr>
# Symex.run ~mode:OX process;;
- : (string * Symex.Value.sbool Typed.t list) list =
[("V|1| is even", [(0 == (V|1| mod 2))]);
("V|1| is odd", [(0 != (V|1| mod 2))])]As expected, the symbolic process returns two branches, one for the case where the symbolic integer is even, and one for the case where it is odd. In each case, the path condition is a symbolic expression that describes the corresponding condition on the symbolic variable created (called V|1| here).
That's it, that's about all there is to know about the core Soteria symbolic execution. Of course, Soteria provides many more features, such as convenient symbolic-ready data-structure, logging facilities for symbolic processes, and more. You can find more information in the rest of this documentation.