Exploring formal verification (part 2)
In the previous post I showed how to prove equivalence of two different implementations of the same algorithm. This post will cover writing an algorithm specification in Cryptol to prove the correctness of a constant-time C/C++ implementation.
Apart from rather simple Cryptol I’m also going to introduce SAW’s llvm_verify
function that allows much more complex verification. We need this as our function will not only take scalar inputs but also store the result of the computation using pointer arguments.
Constant-time multiplication
Part 1 dealt with addition, in part 2 we’re going to look at multiplication. Let’s implement a function mul(a, b, *hi, *lo)
that multiplies a
and b
, and stores the eight most significant bits of the product in *hi
, and the eight LSBs in *lo
.
This time we’ll make it run in constant time right away and won’t bother implementing a trivial version first. Instead, we will write a Cryptol specification to verify LLVM bitcode afterwards — you will be amazed how simple that is.
Some helper functions
The first two functions of our C/C++ implementation will seem familiar if you’ve read the previous part of the series. msb
hasn’t changed, and ge
is the negated version of lt
. nz
returns 0xff
if the given argument x
is non-zero, 0
otherwise.
Our add
function that previously dealt with overflows by capping at UINT8_MAX
is a little more mature now and will set *carry = 1
when an overflow occurs.
The core of the algorithm
mul(a, b, *hi, *lo)
, using all the helper functions we defined above, implements standard long multiplication, i.e. four multiplications per function call. We split the two 8-bit arguments into two 4-bit halves, multiply and add a few times, and then store two 8-bit results at the addresses pointed to by hi
and lo
.
It’s relatively easy to see that a * b
can be rewritten as (a1 * 2^4 + a0) * (b1 * 2^4 + b0)
, all four variables being 4-bit integers. After multiplying and rearranging you’ll get an equation that’s very similar to mul
above. Here’s a good introduction to computing with long integers if you want to know more.
Compile the code to LLVM bitcode as before so that we can load it into SAW later.
The Cryptol specification
To automate verification we’ll again write a SAW script. It will contain the necessary verification commands and details, as well as a Cryptol specification.
The specification doesn’t need to be constant-time, all it needs to be is correct and as simple as possible. We declare a function mul
taking two 8-bit integers and returning a tuple containing two 8-bit integers. Read the notation [8]
as “sequence of 8 bits”.
The built-in function take`{n} x
returns a sequence with only the first n
items of x
. drop`{n} x
returns sequence x
without the first n
items. zero
is a special value that has a number of use cases, here it represents a flexible sequence of all zero bits. #
is the append operator for sequences.
The first line of the definition gives the return value, a tuple with the first and the last 8 bits of prod
. The Cryptol type system can automatically infer that the variable prod
must hold a 16-bit sequence if the result of the take`{8}
and drop`{8}
function calls is a sequence of 8 bits each.
prod
is the result of multiplying the zero-padded arguments a
and b
. zero # x
appends x
to 8 zero bits, and that number is again determined by the type system. If you want to learn more about the language, take a look at Programming Cryptol.
That’s about as simple as it gets. We multiply two 8-bit integers and out comes a 16-bit integer, split into two halves. Now let’s use the specification to verify our constant-time implementation.
SAW’s llvm_verify function
We will add LLVM SAW instructions to the same file that contains the Cryptol code from above. The llvm_verify
call here takes module m
, extracts the symbol "mul"
, and uses the body given after do
for verification.
We need to declare all symbolic inputs as given by our C/C++ implementation. With llvm_var
we tell SAW that "a"
and "b"
are 8-bit integer arguments, and map those to the SAW variables a
and b
.
The arguments "hi"
and "lo"
are declared as pointers to 8-bit integers using llvm_ptr
. And because we want to dereference the pointers and refer to their values later we declare "*hi"
and "*lo"
as 8-bit integers too.
We specify no constraints for any of the arguments and expect the verification to consider all possible inputs. I will talk a bit more about such constraints and how these are useful in a later post.
With llvm_ensure_eq
we tell SAW what values we expect after symbolic execution. We expect "*hi"
to be equal to the first 8-bit integer element of the tuple returned by mul
, and "*lo"
to be equal to the second 8-bit integer.
llvm_verify_tactic
chooses UC Berkely’s ABC tool again and off we go.
Verification with SAW
Again, make sure you have saw
and z3
in your $PATH
. If you haven’t downloaded the binaries yet, take a look at the early sections of the previous post.
Successfully verified @mul. SAW tells us that for all possible inputs a
and b
, and actually hi
and lo
too, our constant-time C/C++ implementation behaves as stated by the SAW verification script and is thereby equivalent to our Cryptol specification.
Next: Finding bugs and more LLVM commands
For the next post I’m planning to introduce and write more Cryptol, talk about specifying constraints on LLVM arguments and return values, and provide an example for finding bugs in a real-world codebase.
And while you wait, why not try your hand at optimizing mul
to use only three instead of four multiplications with the Karatsuba algorithm? You can reuse the above Cryptol specification to verify you got it right.