Verification methodology
Claims in the previous chapters are only as strong as the evidence
that the code in quantica actually behaves the way those chapters
describe. This chapter catalogues the evidence artefacts the
project maintains and how they are produced.
Two complementary techniques are used:
ctgrind — Valgrind memcheck client-requests applied to a dedicated test binary. Covers control-flow and memory-access constant-time properties, run on the host CI.
silentops::verify (dudect) — statistical timing test, run on the target hardware. Covers microarchitectural timing properties (cache, branch predictor, variable-latency instructions).
Neither alone is sufficient; both are required for a credible CT assurance story.
ctgrind (Valgrind memcheck) — control-flow CT
Methodology
The technique was first described by Adam Langley in 2010
[Lan10]. A memory buffer holding a secret is
marked undefined via the Valgrind make_mem_undefined
client request. Any subsequent conditional branch, pointer
dereference, or syscall argument that consumes bits of the
undefined buffer raises a memcheck error — i.e. any
control-flow or addressing decision that depends on the secret is
reported.
Because memcheck propagates definedness through pure arithmetic,
masked bitwise selects (b ^ (mask & (a ^ b))) do not raise
errors; only the downstream use of their result in a branch or
load does. A correctly written CT primitive therefore passes
ctgrind cleanly; a bug such as a pointer-CMOV → load (as seen on
the early quantica ct_select) triggers an error on the
load.
Implementation
silentops::ct_grind::poison/unpoisonemit the Valgrind client requests via inlineasm!— no C shim, no crate dependency. Surrounded bycompiler_fence(SeqCst)to prevent LLVM from reordering memory reads past a poison boundary.quantica_bench/src/bin/ctgrind.rsis the driver binary: it poisons only the truly secret portions of eachsk(skipping the public components that FIPS layouts embed inline), runs keygen + encaps/decaps or sign/verify for each parameter set of each algorithm, then unpoisons before the correctness round-trip.tools/ctgrind.shwrapsvalgrind --error-exitcode=1 --track-origins=yeswith the suppression file documented below.tools/ctgrind.suppholds named, individually justified suppressions — one stanza per legitimate finding, each with a paragraph of threat-model reasoning.
Demo / self-test
silentops/examples/ct_grind_demo.rs exercises both a CT path
(silentops::ct_eq on a poisoned buffer, should pass) and an
intentional leak (if secret[0] > 0x80, should be reported).
Running tools/ctgrind.sh demo validates the entire plumbing in
one command — any failure on the demo means the infrastructure is
broken and subsequent algorithm runs are not trustworthy.
Running against the production algorithms
# all three algorithms, all parameter sets
tools/ctgrind.sh all
# individual runs
tools/ctgrind.sh mlkem
tools/ctgrind.sh mldsa
tools/ctgrind.sh slh
Exit code 0 means “no unsuppressed findings”. Exit code 1
means memcheck reported a new finding that is not justified in the
suppression file — this is a CT regression and must be fixed or
explicitly added to the suppression file with a new threat-model
paragraph in this document.
Current suppressions
The file tools/ctgrind.supp lists the known, documented
findings for each algorithm. They fall in two families:
Branches on data that becomes publicly emitted — ML-DSA
decompose::high_bits_vec,encode::w1_encode,decompose::make_hint_vec,encode::sig_encode, and the inlined rejection-loop intermediates insideMlDsa::<P>::sign. SLH-DSAfors::fors_pk_from_sig,wots::chain_iter,xmss::xmss_sign_into,xmss::xmss_pk_from_sig. The leaked bits are byte-for-byte identical to bytes present in the emitted signature, so timing observation reveals no information beyond what the signature itself reveals.Rejection sampling on a PRF-derived stream — ML-DSA
shuffle::generate_permutation. The rejection conditionif r < limitbranches on one byte of a SHAKE256 output computationally indistinguishable from uniform withoutK; leaking SHAKE output bits is at least as hard as breaking SHAKE256’s PRF property.
Each suppression has an inline rationale in tools/ctgrind.supp;
they are reproduced in the per-algorithm chapters as well
(ML-DSA — countermeasures, SLH-DSA — countermeasures).
Roadmap — all suppressions closed before release
Each suppression corresponds to a T4-x item in the Tier-4 hardening roadmap. A release tree is built from a state where each item has been implemented and the corresponding suppression removed. The present draft documents both postures so that the reviewer sees the current code and the intended end state side by side.
dudect (silentops::verify) — microarchitectural timing
Methodology
The Reparaz–Balasch–Verbauwhede methodology [RBV17] proceeds as follows:
Build two input classes — typically a “fixed” (worst-case) input and a “random” input.
Interleave measurements of the two classes randomly to decorrelate from slow system drifts.
Feed each measurement into a Welch t-test maintained incrementally with Welford’s algorithm.
If
|t| > 4.5after a reasonable sample count, reject the null hypothesis “the two classes have the same mean timing” (p < 10⁻⁵).
silentops::verify packages the statistical machinery as a
library — TTest, Xorshift64, measure_ns, report —
so consumers write their own class builders and harness.
Reference harness
silentops/examples/ct_verify_pqc.rs provides a canonical
harness exercising:
ML-KEM-768 Decaps (valid vs. random ciphertext) — expected PASS;
ML-KEM Barrett reduce (small vs. large input) — expected PASS;
ML-DSA-44 Verify (valid vs. invalid signature) — expected PASS;
ML-DSA-44 Sign (two messages) — expected FAIL on timing variation because the FIPS 204 rejection loop is inherently non-constant-time in the number of iterations. This is not a vulnerability (the variation is message-dependent but not secret-key-dependent) but dudect cannot distinguish the two — the test comment makes this explicit.
Role in the verification pipeline
ctgrind catches control-flow leaks — any branch or addressing decision that depends on a secret. dudect catches timing leaks that are invisible to ctgrind because they arise from microarchitectural state: branch-predictor warming, port contention on CPUs with µ-ops that exhibit input-dependent latency, and cache eviction patterns that manifest only on real hardware.
The two are not redundant — they look at disjoint failure modes. A high-assurance verification posture runs both:
ctgrind on the CI host for every merge (blocking);
dudect on the target platform (representative MCU / SE dev board) on a nightly or release-candidate cadence.
Integration plan (Gitea workflows, upcoming)
The Gitea workflows being set up on the home-lab runner (turtle)
will wire both verification techniques into the merge gate:
Per-PR — fast ctgrind run (mlkem + mldsa + slh) against the
ct-grindfeature build; failure blocks merge.Nightly — full dudect harness on the Cortex-M4 dev board, with trace counts sized for statistical power at
p < 10⁻⁵; report uploaded as a build artefact.Pre-release tag — stack-sizes per-function analysis, cross-architecture build (
thumbv6m,thumbv7em,thumbv8m.main,riscv32imc), and the QEMU-user ctgrind run coveringaarch64/armv7/riscv64.
The CI definitions themselves live in .gitea/workflows/ once
committed. This chapter will be revised to reference them directly
as soon as they land.
release-asm branch counting — methodology and current state
A complementary day-to-day CT regression indicator, called for in
the workspace-level SECURITY.md (Section 4.4), is expected
branch count per CT-critical function measured on the release
build. The methodology is:
Compile the function under
cargo rustc --release -p quantica --emit=asm.Filter the resulting
.sfor conditional jump opcodes (je/jne/jz/jnz/jl/jle/jg/jge/ja/jbon x86_64, the equivalentb.eq/b.ne/ … on aarch64,cbz/cbnz/b.cson Cortex-M).Compare the count against an expected reference baked into the relevant
countermeasures/<algo>.rstchapter. Any regression — say, an unintended branch reintroduced by a compiler upgrade or by a refactor — surfaces as a count delta on the next CI run.
Current gap (honest disclosure)
The methodology above is documented but the per-countermeasure
reference branch counts are not yet populated in
ML-KEM — countermeasures, ML-DSA — countermeasures, or
SLH-DSA — countermeasures as of the current draft. The
workspace-level SECURITY.md row that mentions documented
branch counts therefore describes the target posture, not
the current state.
How the gap will be closed: each Tier-1 / Tier-2 item from
the roadmap (see quantica/README.md “Roadmap” chapter)
produces a delta on one or more CT-critical functions. The
commit that lands the item also adds an “Expected branch count”
sub-table to the relevant countermeasures/*.rst file with
columns Function / Branches before / Branches after.
By the end of the Tier-1 + Tier-2 sprints, every CT-critical
function in quantica has a reference baseline, and CI compares
subsequent builds against it.
Day-to-day discipline today: ctgrind (control-flow CT) and
dudect (microarchitectural timing) are the two production
indicators currently relied on. They cover the same failure
class with stronger statistical guarantees, at the cost of a
slower feedback loop than a simple grep over an asm dump.