krypteia-quantica — Post-Quantum Cryptography for the krypteia workspace

Pure-Rust implementations of the three NIST post-quantum standards, sharing a side-channel countermeasure toolkit (silentops, a companion crate of the same workspace) used by the classical side as well. Specifications (FIPS 203 / 204 / 205 PDFs) are vendored alongside the crate in the repository.

Design rules

The crate inherits the krypteia workspace design rules:

Pure Rust, zero external crates — only core (and alloc); std is optional behind a feature flag.
Embedded-friendly — small RAM footprint, fits secure elements, STM32 (Cortex-M0/M4/M33), RISC-V parts (ESP32-C3, …).
Side-channel hardened against SPA, DPA, DFA, template attacks, timing attacks. CT primitives come from silentops, with architecture-specific assembly backends.
Validated against the official NIST ACVP test vectors.
C FFI-exposable through the quantica_ffi companion crate.

Algorithms

Standard	Algorithm	Type	Status
FIPS 203	ML-KEM (ex-CRYSTALS-Kyber)	Key Encapsulation Mechanism	Implemented, ACVP + Wycheproof validated
FIPS 204	ML-DSA (ex-CRYSTALS-Dilithium)	Digital Signature	Implemented, ACVP + Wycheproof validated
FIPS 205	SLH-DSA (ex-SPHINCS+)	Stateless Hash-Based Signature	Implemented, ACVP validated (Wycheproof has no SLH-DSA corpus yet)

ML-KEM (FIPS 203)

Module-lattice-based Key Encapsulation Mechanism. Derived from CRYSTALS-Kyber. A keygen / encaps / decaps KEM producing a 32-byte shared secret; decapsulation uses the Fujisaki–Okamoto transform with implicit rejection so a malformed ciphertext yields a deterministic secret indistinguishable from a legitimate one.

ML-DSA (FIPS 204)

Module-lattice-based Digital Signature Algorithm. Derived from CRYSTALS-Dilithium. A Fiat–Shamir-with-aborts signature scheme; signing is a hedged rejection loop that mixes fresh randomness with the secret key. Verification is deterministic and does not touch any secret.

SLH-DSA (FIPS 205)

Stateless Hash-Based Digital Signature Algorithm. Derived from SPHINCS+. Security relies on the second-preimage resistance of SHAKE / SHA-2 only — no algebraic assumption. Signatures are large (7–50 KiB depending on parameter set) but the underlying primitive is conservative and quantum-safe.

Cargo features

[dependencies]
quantica = { path = "../quantica" }   # default = std + 3 algos + sca-protected

Feature	Default	Effect
`std`	✅	Pulls in the Rust standard library. Enables `OsRng` and `std::error::Error` impls.
`ml-kem`	✅	Compiles the FIPS 203 module (`quantica::ml_kem`).
`ml-dsa`	✅	Compiles the FIPS 204 module (`quantica::ml_dsa`).
`slh-dsa`	✅	Compiles the FIPS 205 module (`quantica::slh_dsa`).
`sca-protected`	✅	Activates the masking + shuffled-NTT defences in ML-KEM and ML-DSA.

Disabling std makes the crate no_std (still requires alloc). In that mode the OS-backed OsRng disappears — the caller must provide their own CryptoRng impl wrapping a hardware RNG.

Quick start

ML-KEM (FIPS 203) — Key Encapsulation

use quantica::ml_kem::*;

let mut rng = OsRng;

// Key generation. ek is a public EncapsulationKey<MlKem768>;
// dk is a DecapsulationKey<MlKem768> that auto-zeroizes on Drop.
let (ek, dk) = MlKem::<MlKem768>::keygen(&mut rng).unwrap();

// Encapsulation (Bob): produces a 32-byte SharedSecret + a Ciphertext.
let (shared_secret_bob, ciphertext) =
    MlKem::<MlKem768>::encaps(&ek, &mut rng).unwrap();

// Decapsulation (Alice): recovers the same SharedSecret.
let shared_secret_alice =
    MlKem::<MlKem768>::decaps(&dk, &ciphertext, &mut rng).unwrap();

assert_eq!(shared_secret_alice, shared_secret_bob);
// Both shared secrets wipe themselves at end of scope.

ML-DSA (FIPS 204) — Digital Signature

use quantica::ml_dsa::*;

let mut rng = OsRng;

// VerifyingKey<MlDsa65> + zeroizing SigningKey<MlDsa65>.
let (pk, sk) = MlDsa::<MlDsa65>::keygen(&mut rng).unwrap();

// Sign — uses hedged signing (mixes fresh RNG bytes with the secret key).
let sig: Signature<MlDsa65> =
    MlDsa::<MlDsa65>::sign(&sk, b"message", b"", &mut rng).unwrap();

let valid = MlDsa::<MlDsa65>::verify(&pk, b"message", b"", &sig).unwrap();
assert!(valid);

SLH-DSA (FIPS 205) — Stateless Hash-Based Signature

use quantica::slh_dsa::*;

let mut rng = OsRng;

let (sk, pk) = SlhDsa::<Shake128f>::keygen(&mut rng).unwrap();

let sig = SlhDsa::<Shake128f>::sign(b"message", &sk, &mut rng).unwrap();

let valid = SlhDsa::<Shake128f>::verify(b"message", &sig, &pk).unwrap();
assert!(valid);

Typed key wrappers (Zeroize-on-Drop)

The public API never returns raw Vec<u8> for secret material. Each algorithm exposes parameter-set-tagged wrapper types backed by the shared quantica::secret module:

Module	Public (not zeroized)	Secret (Drop-zeroizes via `silentops::ct_zeroize`)
`ml_kem`	`EncapsulationKey<P>`, `Ciphertext<P>`	`DecapsulationKey<P>`, `SharedSecret`
`ml_dsa`	`VerifyingKey<P>`, `Signature<P>`	`SigningKey<P>`
`slh_dsa`	`VerifyingKey<P>`, `Signature<P>`	`SigningKey<P>`

All wrappers implement from_bytes(&[u8]) (length-validated against the parameter set), as_bytes() -> &[u8], Deref<Target=[u8]>, AsRef<[u8]>, and a manual Clone. The secret variants additionally have a redacted Debug impl that prints <redacted; len=N> so a stray eprintln! cannot leak key material into a log file.
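As an illustration of the wrapper behaviour described above, here is a minimal sketch of a zeroize-on-drop buffer with a redacted Debug impl. The type and its methods are hypothetical stand-ins written for this example; the real quantica::secret types and silentops::ct_zeroize may be shaped differently.

```rust
use core::fmt;
use core::ptr;
use core::sync::atomic::{compiler_fence, Ordering};

/// Hypothetical stand-in for a zeroize-on-drop secret buffer.
pub struct SecretBytes {
    bytes: Vec<u8>,
}

impl SecretBytes {
    /// Copies `src` only if its length matches the expected size
    /// for the parameter set; otherwise returns `None`.
    pub fn from_bytes(src: &[u8], expected_len: usize) -> Option<Self> {
        if src.len() != expected_len {
            return None;
        }
        Some(Self { bytes: src.to_vec() })
    }

    /// Read-only view of the secret bytes.
    pub fn as_bytes(&self) -> &[u8] {
        &self.bytes
    }
}

impl fmt::Debug for SecretBytes {
    /// Prints only the length, never the contents.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "<redacted; len={}>", self.bytes.len())
    }
}

impl Drop for SecretBytes {
    fn drop(&mut self) {
        for b in self.bytes.iter_mut() {
            // Volatile store, so the compiler may not drop it as a dead write.
            unsafe { ptr::write_volatile(b, 0) };
        }
        // Keep the compiler from reordering later code before the wipe.
        compiler_fence(Ordering::SeqCst);
    }
}
```

With this shape, formatting a secret with {:?} yields <redacted; len=N> and a wrong-length input is rejected at construction time.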

The internal byte-slice API (quantica::ml_kem::kem::*, quantica::ml_dsa::dsa::*, quantica::slh_dsa::slh::*) is still exposed for ACVP/CAVP testing and for the C FFI, which prefers raw Vec<u8> to keep the FFI boundary thin.

Parameter sets / curve families

ML-KEM (FIPS 203)

Parameter set	Security	ek (B)	dk (B)	ct (B)	ss (B)
ML-KEM-512	Cat. 1	800	1632	768	32
ML-KEM-768	Cat. 3	1184	2400	1088	32
ML-KEM-1024	Cat. 5	1568	3168	1568	32
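The byte sizes in this table follow from the FIPS 203 encodings: ek = 384k + 32, dk = 768k + 96 (the K-PKE secret, then ek, then H(ek), then the rejection seed z), and ct = 32(d_u·k + d_v). The helper below is a small sketch that recomputes them; the function name is made up for this example and is not part of the crate.

```rust
/// Recomputes ML-KEM object sizes in bytes from the FIPS 203 parameters.
/// `k` is the module rank, `du` / `dv` the ciphertext compression widths.
/// Returns (ek, dk, ct).
pub const fn ml_kem_sizes(k: usize, du: usize, dv: usize) -> (usize, usize, usize) {
    // ek: k polynomials at 12 bits per coefficient (384 bytes each) + 32-byte seed.
    let ek = 384 * k + 32;
    // dk: K-PKE secret (384k) + embedded ek + H(ek) (32) + z (32).
    let dk = 384 * k + ek + 32 + 32;
    // ct: k polynomials at du bits + one polynomial at dv bits, 256 coefficients each.
    let ct = 32 * (du * k + dv);
    (ek, dk, ct)
}
```

For ML-KEM-768 the parameters are (k, d_u, d_v) = (3, 10, 4), which gives (1184, 2400, 1088).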

ML-DSA (FIPS 204)

Parameter set	Security	pk (B)	sk (B)	sig (B)
ML-DSA-44	Cat. 2	1312	2560	2420
ML-DSA-65	Cat. 3	1952	4032	3309
ML-DSA-87	Cat. 5	2592	4896	4627

SLH-DSA (FIPS 205) — SHAKE variants only

Parameter set	Security	n	pk (B)	sk (B)	sig (B)
SLH-DSA-SHAKE-128s	Cat. 1	16	32	64	7 856
SLH-DSA-SHAKE-128f	Cat. 1	16	32	64	17 088
SLH-DSA-SHAKE-192s	Cat. 3	24	48	96	16 224
SLH-DSA-SHAKE-192f	Cat. 3	24	48	96	35 664
SLH-DSA-SHAKE-256s	Cat. 5	32	64	128	29 792
SLH-DSA-SHAKE-256f	Cat. 5	32	64	128	49 856

s variants optimize for small signatures, f variants for fast signing (at the cost of larger signatures and slower verification). SHA2-based parameter sets are not yet implemented (see “Known limitations” below).
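The sizes above can be rederived from the FIPS 205 parameters (n, h, d, a, k): a signature holds the randomizer R, k FORS secret values with their a-node authentication paths, and d XMSS layers, each with a WOTS+ signature of len = 2n + 3 nodes (for lg_w = 4), plus h authentication-path nodes in total. The function below is a sketch written for this README; its name is hypothetical.

```rust
/// Recomputes SLH-DSA sizes in bytes from the FIPS 205 parameters.
/// Returns (pk, sk, sig). Assumes lg_w = 4, as in all twelve FIPS 205 sets.
pub const fn slh_dsa_sizes(n: usize, h: usize, d: usize, a: usize, k: usize)
    -> (usize, usize, usize)
{
    // WOTS+ chain count for lg_w = 4: len1 = 2n, len2 = 3.
    let len = 2 * n + 3;
    // R (1 node) + FORS (k * (1 + a) nodes) + hypertree (h + d * len nodes).
    let sig = (1 + k * (1 + a) + h + d * len) * n;
    // pk = PK.seed || PK.root, sk = SK.seed || SK.prf || pk.
    (2 * n, 4 * n, sig)
}
```

For example, SHAKE-128s uses (n, h, d, a, k) = (16, 63, 7, 12, 14) and SHAKE-128f uses (16, 66, 22, 6, 33): the f set has more, shallower XMSS layers, which is why its signatures are larger.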

Design decisions

Zero dependencies — only core + alloc (and optionally std). SHA-3 / SHAKE are implemented from scratch on top of a single shared Keccak-f[1600] core in src/sha3.rs; each algorithm exposes its own thin wrapper.
Generic over parameter sets — MlKem<P>, MlDsa<P>, SlhDsa<P> are monomorphized at compile time via const generics, so a single code path serves all security levels.
Internal byte-slice API stays raw — keygen_internal, encaps_internal, sign_internal, verify_internal accept and return raw &[u8] / Vec<u8>. The KAT tests and the C FFI use this layer; only the high-level MlKem<P>::keygen etc. wrap into the typed key types.
Arithmetic widths — i16 for ML-KEM (q = 3329 fits in 12 bits), i32 for ML-DSA (q = 8 380 417 needs 23 bits), no NTT at all for SLH-DSA.
NTT differences — ML-KEM uses BitRev_7 with a partial NTT (down to length-2, base-case multiply); ML-DSA uses BitRev_8 with a full NTT (down to length-1, simple pointwise multiply).
SLH-DSA architecture — WOTS+ → XMSS → Hypertree → FORS → SLH-DSA. Purely hash-based, no algebraic structures.
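To make the arithmetic-width and NTT points concrete, the sketch below checks the bit widths of the two moduli and derives twiddle factors as ζ^BitRev(i), with ζ = 17 and BitRev_7 for ML-KEM and ζ = 1753 and BitRev_8 for ML-DSA (the roots fixed by FIPS 203 and FIPS 204). It is a plain, variable-time illustration on public constants; the function names are invented here, and the crate's ntt.rs modules use Montgomery-form tables instead.

```rust
pub const ML_KEM_Q: u64 = 3329;
pub const ML_DSA_Q: u64 = 8_380_417;

/// Reverses the low `bits` bits of `x` (BitRev_7 / BitRev_8 in the FIPS text).
pub fn bit_rev(x: u32, bits: u32) -> u32 {
    x.reverse_bits() >> (32 - bits)
}

/// Square-and-multiply modular exponentiation (public inputs only).
pub fn pow_mod(mut base: u64, mut exp: u32, q: u64) -> u64 {
    let mut acc = 1u64;
    base %= q;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = acc * base % q;
        }
        base = base * base % q;
        exp >>= 1;
    }
    acc
}

/// i-th ML-KEM twiddle factor: 17^BitRev_7(i) mod 3329.
pub fn ml_kem_zeta(i: u32) -> u64 {
    pow_mod(17, bit_rev(i, 7), ML_KEM_Q)
}

/// i-th ML-DSA twiddle factor: 1753^BitRev_8(i) mod 8380417.
pub fn ml_dsa_zeta(i: u32) -> u64 {
    pow_mod(1753, bit_rev(i, 8), ML_DSA_Q)
}
```

Because 17 is a primitive 256th root of unity mod 3329 (not a 512th root), the ML-KEM NTT stops at degree-2 factors, hence the base-case multiply; 1753 is a primitive 512th root mod 8 380 417, so the ML-DSA NTT splits all the way down to degree 1.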

Side-channel countermeasures (summary)

Always-on

These defences are active in every build, regardless of feature flags:

Countermeasure	Algorithm	Threat addressed	How
Constant-time arithmetic	ML-KEM, ML-DSA, SLH-DSA	Timing / cache-timing / basic SPA	Branchless `mod_q`, `ct_eq`, `ct_select` from `silentops`
Zeroize-on-Drop wrappers	ML-KEM, ML-DSA, SLH-DSA	Cold boot, memory dumps, UAF	`SecretBytes` / `SecretArray` → `silentops::ct_zeroize` on Drop
Volatile zeroization	ML-KEM, ML-DSA, SLH-DSA	Cold boot, memory dumps	`core::ptr::write_volatile` + `compiler_fence` on intermediates
Double Decaps	ML-KEM	DFA on FO comparison	Decaps runs twice; results compared; mismatch ⇒ random output
dk integrity check	ML-KEM	DFA on stored key material	`H(ek)` is embedded in `dk` and re-checked at every Decaps
Hedged signing	ML-DSA, SLH-DSA	Fault-induced nonce reuse	32 bytes of fresh entropy mixed into the per-signature derivation
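For readers unfamiliar with the branchless style named in the first row, here is a source-level sketch of an equality mask, a select, and a conditional subtraction of q = 3329. These are illustrative functions written for this README, not the silentops signatures, and as noted under "Cargo profiles" a compiler may still rewrite such patterns, which is why silentops also ships assembly backends.

```rust
/// Returns 0xFFFF_FFFF if `a == b`, else 0, without branching on the inputs.
pub fn ct_eq_mask(a: u32, b: u32) -> u32 {
    let x = a ^ b;
    // (x | -x) has its top bit set exactly when x != 0.
    let nonzero = (x | x.wrapping_neg()) >> 31;
    // nonzero = 1 -> 0, nonzero = 0 -> 0xFFFF_FFFF.
    nonzero.wrapping_sub(1)
}

/// Returns `a` when `mask` is all ones and `b` when `mask` is zero.
pub fn ct_select(mask: u32, a: u32, b: u32) -> u32 {
    b ^ (mask & (a ^ b))
}

/// Reduces `x` from [0, 2q) to [0, q) for q = 3329 without a branch.
pub fn cond_sub_q(x: u16) -> u16 {
    const Q: u16 = 3329;
    let t = x.wrapping_sub(Q);
    // If x < Q the subtraction wrapped, so the top bit of t is set.
    let mask = 0u16.wrapping_sub(t >> 15);
    // Add Q back only when the subtraction wrapped.
    t.wrapping_add(Q & mask)
}
```

The same select pattern is what lets an implicit-rejection decapsulation pick between the real and the rejection secret without a data-dependent branch.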

Feature-gated (`sca-protected`, on by default)

Countermeasure	Algorithm	Threat addressed	Module
First-order additive masking	ML-KEM	First-order DPA, template attacks	`ml_kem::masked`
NTT butterfly shuffling	ML-KEM	SPA, trace alignment for DPA	`ml_kem::shuffle`
First-order additive masking	ML-DSA	First-order DPA, template attacks	`ml_dsa::masked`
Shuffled NTT (secret poly)	ML-DSA	SPA, trace alignment for DPA	`ml_dsa::shuffle`
Mask refresh between rounds	ML-DSA	Higher-order share correlation	`MaskedPoly::refresh()` between rejection iterations

The masking layer is mathematically transparent — the masked path produces bit-identical keys, ciphertexts, and signatures to the unmasked path, which is why the NIST ACVP vectors keep matching with sca-protected enabled. Internally:

ML-KEM: secret polynomials s, e, etc. are split into two additive shares mod q = 3329 immediately after CBD sampling. NTTs run on each share independently (linearity of the NTT), pointwise multiplications by public matrices distribute over the shares.
ML-DSA: in dsa::sign_internal, the secret-key vectors s1, s2, t0 are NTT-transformed via shuffle::ntt_shuffled then split into MaskedPoly arrays. Each per-rejection-iteration multiplication ĉ · ŝx runs through masked_pointwise_mul_public, followed by MaskedPoly::refresh() to prevent inter-iteration share correlation. Mask randomness is drawn from a SHAKE256-seeded deterministic ScaRng (seed = K ‖ rnd ‖ tr ‖ M'), so sign_internal keeps a deterministic signature and the ACVP fixed-rnd vectors still match.
SLH-DSA: hash-based, no algebraic structure to mask — first-order masking does not buy anything here. The always-on defences (CT arithmetic, zeroization, hedged signing) are the relevant layer.
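The transparency claim rests on linearity: an operation that is linear mod q can be applied to each additive share separately and the recombined result equals the unmasked one. The sketch below shows this on a single coefficient mod 3329. The type is hypothetical (it is not the crate's MaskedPoly), the mask values are passed in explicitly so the example is reproducible, and the % operator is used for clarity even though it is not constant-time.

```rust
const Q: u32 = 3329;

/// One coefficient split into two additive shares: secret = s0 + s1 (mod Q).
#[derive(Clone, Copy, Debug)]
pub struct Masked {
    s0: u32,
    s1: u32,
}

impl Masked {
    /// Splits `secret` using the caller-supplied mask `r`.
    pub fn split(secret: u32, r: u32) -> Self {
        let r = r % Q;
        Masked { s0: (secret % Q + Q - r) % Q, s1: r }
    }

    /// Multiplies by a public constant: linear, so it acts share by share.
    pub fn mul_public(self, c: u32) -> Self {
        let c = c % Q;
        Masked { s0: self.s0 * c % Q, s1: self.s1 * c % Q }
    }

    /// Re-randomizes the shares with a fresh mask without changing the secret.
    pub fn refresh(self, r: u32) -> Self {
        let r = r % Q;
        Masked { s0: (self.s0 + r) % Q, s1: (self.s1 + Q - r) % Q }
    }

    /// Recombines the shares.
    pub fn unmask(self) -> u32 {
        (self.s0 + self.s1) % Q
    }
}
```

Splitting 1234 with any mask, multiplying by 17, refreshing, and unmasking gives 1234 · 17 mod 3329, the same value the unmasked computation produces.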

Approximate cost (single-threaded, release mode)

Operation	Plain	`sca-protected`	Slowdown
ML-KEM-768 Decaps	~0.03 ms	~0.07 ms (double)	~2.3×
ML-DSA-65 Sign	~2.2 ms	~7.1 ms	~3.2×

Numbers vary widely with hardware. Run the quantica_bench companion crate for measurements on your machine.

Timing leakage verification (dudect)

The shared silentops::verify module implements the dudect methodology of Reparaz, Balasch and Verbauwhede (2017). A pre-built harness exercises the most sensitive paths:

cargo run --release -p silentops --features std --example ct_verify_pqc

Currently checks:

ML-KEM-768 Decaps — valid vs random ciphertext (implicit-rejection timing)
ML-KEM Barrett reduce — small vs large input
ML-DSA-44 Sign — message A vs message B (message-independent timing)
ML-DSA-44 Verify — valid vs invalid signature

A run passes if the t-statistic stays at |t| < 4.5 after ~10⁶ samples; under the dudect methodology, |t| > 4.5 rejects the equal-timing hypothesis at p < 10⁻⁵. Note that ML-DSA Sign uses rejection sampling, so its timing inherently varies: a FAIL there is not necessarily a vulnerability if the variation is independent of the secret key.
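The statistic in question is Welch's t-test between the timing samples of the two input classes. The sketch below computes it in one batch over two slices; the function names are made up for this example, and the actual silentops::verify harness may compute it incrementally and apply extra preprocessing.

```rust
/// Sample mean.
fn mean(x: &[f64]) -> f64 {
    x.iter().sum::<f64>() / x.len() as f64
}

/// Unbiased sample variance around the mean `m`.
fn variance(x: &[f64], m: f64) -> f64 {
    x.iter().map(|v| (v - m) * (v - m)).sum::<f64>() / (x.len() as f64 - 1.0)
}

/// Welch's t-statistic between two timing populations.
pub fn welch_t(a: &[f64], b: &[f64]) -> f64 {
    let (ma, mb) = (mean(a), mean(b));
    let (va, vb) = (variance(a, ma), variance(b, mb));
    (ma - mb) / (va / a.len() as f64 + vb / b.len() as f64).sqrt()
}

/// Pass criterion used in this README: |t| below 4.5.
pub fn passes(t: f64) -> bool {
    t.abs() < 4.5
}
```

Two classes with the same timing distribution give a t near zero, while a consistent one-cycle offset between classes pushes |t| well past the threshold once enough samples are collected.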

Known residual surface

The following attack surfaces are not currently defended against and are documented here so the reader knows what they are deploying. They are tracked in the side-channel annex and in the tier-4 hardening roadmap.

Masked Keccak / SHAKE — the hash primitive feeding the PRF in ML-KEM / ML-DSA / SLH-DSA is not masked; a DPA attacker with trace access can mount Kannwischer-style attacks on SK.seed. A 3-share SHAKE variant is planned (see tier-4 item T4-K).
Grafting-tree fault attacks on SLH-DSA — SLH-DSA signing does not yet include a post-sign redundancy check; a single-fault attacker (physical or Rowhammer-class) can coerce a forgery. Redundancy is planned (tier-4 T4-H / T4-J / T4-L).
Heap allocations on the secret path — secret-key buffers come from alloc rather than caller-provided fixed buffers. A future refactor will thread &mut [u8] end-to-end for bare-metal stack-only operation.
Higher-order DPA across rejection iterations: ML-DSA shares s1, s2, t0 are first-order masked and refreshed between rejection iterations, but the masking order stays at one, so a higher-order adversary combining the leakage of several shares remains in scope. Higher-order masking is tracked as tier-4 T4-C.
Pointer-level CMOV by the compiler — the Rust bit-hack CT primitives are defended by the silentops asm backend on x86_64 and ARM; on targets without an asm backend (e.g. WebAssembly), the CT guarantee is best-effort source-level only.

Per-algorithm deep dives

The summary above lists which countermeasures are active; the full per-algorithm SCA analyses — threat matrices, attack references, code pointers, residual risks — live under quantica/doc/sca/countermeasures/ in the repository. The Sphinx documentation pack (./gendoc.sh quantica) inlines them as a navigable cross-linked tree below.

krypteia — Side-Channel Analysis and Countermeasures

Performance

Run the workspace bench tool:

cargo run --release -p quantica_bench

Representative single-threaded numbers (no SIMD, no NEON, sca-protected on):

Algorithm	KeyGen	Sign / Encaps	Verify / Decaps
ML-KEM-768	~0.03 ms	~0.04 ms	~0.07 ms
ML-DSA-65	~0.10 ms	~7.1 ms	~0.12 ms
SLH-DSA-SHAKE-128f	~2 ms	~40 ms	~2 ms

Notes:

ML-KEM uses full Montgomery NTT arithmetic (shifts instead of divisions).
ML-DSA Sign times vary because of rejection sampling.
SLH-DSA is dominated by SHAKE evaluations; release mode is essential (debug mode is ~100× slower).

Building

Desktop / server (default)

# Build everything (opt-level=2, CT-safe, all algos + sca-protected on)
cargo build --release -p quantica

# Build with no SCA countermeasures (faster, dudect baseline)
cargo build --release -p quantica \
    --no-default-features --features std,ml-kem,ml-dsa,slh-dsa

# Run all tests (ACVP vectors, secret-module, masked/shuffle round-trips)
cargo test --release -p quantica

# Generate the rustdoc API reference
cargo doc -p quantica --no-deps --open

`no_std` / bare-metal cross-compile

# Install the targets we care about
rustup target add thumbv7em-none-eabihf       # Cortex-M4/M7
rustup target add thumbv6m-none-eabi          # Cortex-M0/M0+
rustup target add thumbv8m.main-none-eabihf   # Cortex-M33 (TrustZone)
rustup target add riscv32imc-unknown-none-elf # ESP32-C3, SiFive

# Cross-compile no_std + all 3 algos + sca-protected
cargo build -p quantica \
    --no-default-features \
    --features ml-kem,ml-dsa,slh-dsa,sca-protected \
    --target thumbv7em-none-eabihf

In no_std mode the crate still depends on alloc (keys, ciphertexts and signatures are Vec<u8>-backed). The OS-backed OsRng is unavailable — provide your own CryptoRng implementation that delegates to a hardware TRNG.
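A hardware-backed RNG adapter could look like the sketch below. The trait shape shown here is an assumption made for illustration (the real CryptoRng trait in the crate's rng.rs modules may have a different method name or error type), and the hardware read is abstracted as a closure so the example stays target-independent and uses only core.

```rust
/// Error returned when the entropy source fails (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub struct RngError;

/// Assumed trait shape; check the crate's rng.rs for the real definition.
pub trait CryptoRng {
    fn fill_bytes(&mut self, dest: &mut [u8]) -> Result<(), RngError>;
}

/// Adapter around a function that reads one 32-bit word from a TRNG
/// peripheral, returning `None` on a hardware error or health-test failure.
pub struct TrngRng<F: FnMut() -> Option<u32>> {
    read_word: F,
}

impl<F: FnMut() -> Option<u32>> TrngRng<F> {
    pub fn new(read_word: F) -> Self {
        Self { read_word }
    }
}

impl<F: FnMut() -> Option<u32>> CryptoRng for TrngRng<F> {
    fn fill_bytes(&mut self, dest: &mut [u8]) -> Result<(), RngError> {
        // Fill four bytes per hardware word; the last chunk may be shorter.
        for chunk in dest.chunks_mut(4) {
            let word = (self.read_word)().ok_or(RngError)?;
            chunk.copy_from_slice(&word.to_le_bytes()[..chunk.len()]);
        }
        Ok(())
    }
}
```

On a real target the closure would poll the vendor's TRNG data register; propagating the failure instead of returning zeros keeps a dead entropy source from silently producing predictable keys.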

Cargo profiles

The workspace Cargo.toml declares three profiles:

Profile	opt-level	CT guarantee	Use case
`release`	2	Yes (Rust source-level)	Desktop / server production
`release-embedded`	z + abort	Yes (asm CT backends)	Embedded, minimum size
`release-bench`	3	No (LLVM may break CT patterns)	Benchmarks only

⚠️ opt-level=3 can defeat constant-time guarantees: LLVM may convert bitwise mask patterns into conditional memory accesses. Always use opt-level=2 or lower for security-critical builds, or rely on the assembly CT backends from silentops (asm-aarch64, asm-thumbv7, asm-thumbv6m, asm-riscv32) which bypass the compiler entirely.

Test validation

All implementations are validated against three independent test suites; the vector corpora are checked into tests/vectors/:

NIST ACVP — happy-path conformance

Official vectors from usnistgov/ACVP-Server. These are the NIST-authored known-answer tests that every FIPS 203 / 204 / 205 claimant must pass.

Algorithm	KeyGen	SigGen / Encaps	SigVer / Decaps
ML-KEM	75 / 75	75 / 75	30 / 30
ML-DSA	15 / 15	15 / 15	30 / 30
SLH-DSA	18 / 18	1 / 1 (128f)	3 / 3 (128f)

(SLH-DSA SigGen / SigVer covered only on SHAKE-128f for test wall-clock reasons; all 6 parameter sets share the same code path and KeyGen is validated on every one.)

Wycheproof — edge cases and negative tests

Vectors from the C2SP/wycheproof project, covering malformed inputs, corrupted keys, truncated ciphertexts / signatures, out-of-range coefficients, and other edge cases the NIST happy-path vectors do not exercise. Each vector carries a result field — valid, invalid, or acceptable — against which our implementation’s accept / reject decision is compared.

Algorithm	Files	Vectors	Coverage
ML-KEM	12	~1650	512 / 768 / 1024 — Encaps + Decaps
ML-DSA	9	~1020	44 / 65 / 87 — Sign (seed + noseed) + Verify
Total	21	~2 672
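The comparison against the result field can be sketched as follows. The enum and function names are hypothetical, and treating acceptable as "either decision passes" is an assumption of this sketch rather than a statement about how the crate's test harness handles it.

```rust
/// Expected outcome carried by a Wycheproof vector's `result` field.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Expected {
    Valid,
    Invalid,
    Acceptable,
}

/// Parses the JSON string value of the `result` field.
pub fn parse_expected(s: &str) -> Option<Expected> {
    match s {
        "valid" => Some(Expected::Valid),
        "invalid" => Some(Expected::Invalid),
        "acceptable" => Some(Expected::Acceptable),
        _ => None,
    }
}

/// Returns true when the implementation's accept / reject decision
/// agrees with the vector.
pub fn verdict_ok(expected: Expected, accepted: bool) -> bool {
    match expected {
        Expected::Valid => accepted,
        Expected::Invalid => !accepted,
        // Assumption: either decision is tolerated for `acceptable` vectors.
        Expected::Acceptable => true,
    }
}
```

A test loop would call the verify or decaps entry point, map its outcome to a boolean, and fail the vector whenever verdict_ok returns false.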

Custom negative / robustness tests

A hand-curated suite in tests/negative.rs targeting the specific error paths of each typed key wrapper — wrong-length inputs, silent wrong-result scenarios, FIPS 203 §7.2 encapsulation-key modulus check, FO-transform integrity under malformed ciphertexts, etc. Around 25 tests across the three algorithms.

Running everything

cargo test --release -p quantica

Policy on test suites

A necessary condition for adding a new cryptographic primitive to quantica is the availability of a public reference test suite for it. When a new peer-reviewed test corpus appears (a refreshed Wycheproof release, a new CAVP tranche, a community project like the IETF CFRG vectors), we re-import it and extend the test matrix accordingly; this is tracked as part of our ongoing crypto-research monitoring and is called out in the changelog.

Examples

Rust

cargo run --release -p quantica --example ml_kem_roundtrip
cargo run --release -p quantica --example ml_dsa_sign_verify
cargo run --release -p quantica --example slh_dsa_sign_verify

C FFI

For C consumers, the quantica_ffi companion crate exports a C ABI around the three algorithms and ships a standalone test_quantica.c example program. The shared library is built by:

cargo build --release -p quantica_ffi

and the generated C header (quantica.h) is kept under the FFI crate’s include/ directory.

Module map

quantica/
├── Cargo.toml
├── README.md                 (this file)
├── src/
│   ├── lib.rs                Re-exports the algo modules behind features
│   ├── secret.rs             SecretBytes / SecretArray (Zeroize-on-Drop)
│   ├── sha3.rs               Shared Keccak-f[1600] core (KeccakState)
│   ├── ml_kem/               FIPS 203 ML-KEM            (feature `ml-kem`)
│   │   ├── mod.rs            Public API: MlKem<P>, typed wrappers
│   │   ├── params.rs         MlKem512, MlKem768, MlKem1024
│   │   ├── sha3.rs           Thin wrappers: H, G, J, PRF, Xof
│   │   ├── ntt.rs            NTT mod 3329 (full Montgomery, i16)
│   │   ├── encode.rs         ByteEncode/Decode, Compress/Decompress
│   │   ├── sample.rs         SampleNTT, SamplePolyCBD
│   │   ├── kpke.rs           K-PKE (KeyGen, Encrypt, Decrypt)
│   │   ├── kem.rs            ML-KEM + double-decaps + dk integrity (DFA)
│   │   ├── rng.rs            CryptoRng trait + OsRng (std-only)
│   │   ├── masked.rs         First-order additive masking (DPA)
│   │   └── shuffle.rs        Fisher-Yates shuffled NTT (SPA)
│   ├── ml_dsa/               FIPS 204 ML-DSA            (feature `ml-dsa`)
│   │   ├── mod.rs            Public API: MlDsa<P>, typed wrappers
│   │   ├── params.rs         MlDsa44, MlDsa65, MlDsa87
│   │   ├── sha3.rs           Thin wrappers: SHAKE128/256, sha3_256/512
│   │   ├── ntt.rs            NTT mod 8 380 417 (Montgomery, i32)
│   │   ├── encode.rs         BitPack, pk/sk/sig encode/decode
│   │   ├── sample.rs         SampleInBall, RejNTTPoly, ExpandA/S/Mask
│   │   ├── decompose.rs      Power2Round, Decompose, HighBits, Hints
│   │   ├── dsa.rs            KeyGen, Sign (rejection loop, masked), Verify
│   │   ├── rng.rs            CryptoRng trait + OsRng (std-only)
│   │   ├── masked.rs         First-order additive masking (DPA)
│   │   └── shuffle.rs        Fisher-Yates shuffled NTT (SPA)
│   └── slh_dsa/              FIPS 205 SLH-DSA           (feature `slh-dsa`)
│       ├── mod.rs            Public API: SlhDsa<P>, typed wrappers
│       ├── params.rs         6 SHAKE parameter sets
│       ├── sha3.rs           Shake256 streaming wrapper
│       ├── address.rs        32-byte ADRS structure
│       ├── hash.rs           H_msg, PRF, PRF_msg, T_l, H, F
│       ├── wots.rs           WOTS+ one-time signatures
│       ├── xmss.rs           XMSS Merkle trees
│       ├── hypertree.rs      Hypertree of XMSS trees
│       ├── fors.rs           FORS forest
│       ├── slh.rs            SLH-DSA top-level
│       └── rng.rs            CryptoRng trait + OsRng (std-only)
├── examples/
│   ├── ml_kem_roundtrip.rs
│   ├── ml_dsa_sign_verify.rs
│   └── slh_dsa_sign_verify.rs
└── tests/
    ├── ml_kem_kat.rs
    ├── ml_dsa_kat.rs
    ├── slh_dsa_kat.rs
    └── vectors/              NIST ACVP-Server JSON / .rsp vectors

Known limitations

Side-channel protection

Vec<u8> heap allocations: secret-key buffers come from alloc, not from caller-provided fixed buffers. A future refactor will thread &mut [u8] everywhere for full bare-metal stack-only support.
write_volatile zeroization is the strongest erasure available in safe-ish Rust without external crates, but is not formally guaranteed against every compiler optimization on every target.
No formal CT verification yet (no ct-verif-style proof). The dudect harness and the Valgrind-based ctgrind harness (quantica_bench/src/bin/ctgrind.rs) give statistical and dynamic evidence, not proof.

Standards conformance

HashML-DSA (Algorithms 4 / 5) and HashSLH-DSA (Algorithm 23) pre-hash variants are structurally supported by the API but not tested. ACVP vectors with hashAlg != "none" are skipped.
SLH-DSA SHA2 parameter sets are not implemented; only the 6 SHAKE-based sets are.
Hedged signing is implemented, but only the deterministic variant (rnd = 0x00^32 for ML-DSA, opt_rand = pk.seed for SLH-DSA) is tested against ACVP vectors.
No CAVP certification — vectors come from the public NIST ACVP-Server GitHub mirror.

Portability

OsRng is Linux-only — reads /dev/urandom. Windows / macOS builds need custom adapters (BCryptGenRandom, SecRandomCopyBytes). Embedded targets must supply a hardware-RNG CryptoRng impl regardless.

Testing

Partial ACVP coverage — 1–25 vectors per operation, not the whole vector set, to keep test wall-clock low. Wycheproof is imported in full.
No SLH-DSA Wycheproof corpus exists yet — SLH-DSA validation currently rests on NIST ACVP vectors plus the custom negative suite; a Wycheproof import will be added when the upstream project ships vectors for FIPS 205.
No fuzzing yet. CI is the Codeberg Forgejo Actions qemu cross-test workflow (roadmap item T3-B).

Roadmap

The full hardening roadmap lives under quantica/doc/sca/ (HTML rendered by ./gendoc.sh quantica). The summary below is the project’s living plan towards a third-party evaluation, indexed by Tier item identifier so each row maps to a stable cross-reference in the source code, the SCA annex and the workspace SECURITY.md lifecycle.

Status legend: ✅ done · 🔧 in progress · 📋 planned · 💤 deferred.

Tier 1 — Active vulnerabilities (critical path)

Items addressing documented attack vectors that affect the security of the implemented algorithms. The bulk of these are findings from the literature-watch review of 2026-04-21 on the SLH-DSA fault surface, plus the ML-DSA mask-hygiene gaps surfaced by Hermelink CRYPTO 2025.

Id	Item	Status
T1-A	A3 — refresh ML-DSA shares (`s1`, `s2`, `t0`) at the start of every rejection iteration	✅
T1-B	Hermelink 2025/276 audit pass on `ml_dsa::masked` (information-theoretic leakage map)	✅
T1-C	FORS signature redundancy (anti-grafting-tree forgery, Castelnovi 2018, SLasH-DSA 2025)	✅
T1-D	Full-tree streaming FORS sign (defeats template idx-recovery, Kannwischer 2018)	✅
T1-E	Digest → FORS-indices integrity check	✅
T1-F	Constant-time `fors_pk_from_sig` (prerequisite for T1-C)	✅

Tier 2 — Hardening for evaluation

Id	Item	Status
T2-A	Explicit `ct_grind::unpoison` after the algorithmic unmask of `w1`, `h`, `z` in ML-DSA	📋
T2-B	Branch-free `generate_permutation` in ML-DSA shuffle (Feistel- or Floyd-based)	📋
T2-C	Documentation traceability — convert `tools/ctgrind.supp` into a “resolved-findings” annex once T2-A and T2-B land	📋
T2-D	Explicit `ct_grind::unpoison` of `R`, `digest`, FORS / WOTS / XMSS indices in SLH-DSA	📋

Tier 3 — Verification tooling

Id	Item	Status
T3-A	Cross-arch test infrastructure: qemu-user matrix (aarch64 / armv7 / riscv64 Linux) via `cross` + qemu-system matrix (riscv32imc / riscv32imac / thumbv6m / thumbv7em bare-metal) + custom semihosting host↔guest vector-streaming protocol so KAT corpora are not compiled into the bare-metal image. `thumbv8m.main` (M33 / STM32U5) is wired in tree but currently sidelined by an upstream rustc + cortex-m-rt link issue — `asm-thumbv7` coverage is preserved via `thumbv7em`.	✅
T3-B	Codeberg Forgejo Actions workflow (qemu-user + qemu-system + qemu-vector jobs) — replaces the originally scoped Gitea / `turtle.local` plan after the project moved its public CI to codeberg.org.	✅

Tier 4 — Deferred / beyond the current evaluation scope

Id	Item	Status
T4-A	SUCRE (TCHES 2026.1) shuffle-and-unmask migration evaluation — 4–6× speedup vs. the current Coron 2024/1149 masked-`y` pipeline	💤
T4-B	First-order Boolean masking of the SHAKE PRF in SLH-DSA (Fluhrer 2024/500, 1.7× overhead)	💤
T4-C	Higher-order arithmetic masking on ML-DSA `s1`/`s2`/`t0` (2-share, CC EAL4+ grade)	💤
T4-D	Higher-order masking on ML-KEM `s` (3-share, CC EAL4+ grade)	💤
T4-E	Hardened ML-KEM FO comparison against the eprint 2025/1577 template attack	💤
T4-F	Twiddle-factor masking inside the ML-KEM shuffled NTT (additional DPA defence layer)	💤
T4-G	SHA2-based SLH-DSA parameter sets (FIPS 205 §11.2); currently SHAKE only	💤
T4-H	HashML-DSA / HashSLH-DSA pre-hash variants (FIPS 204 §5.4, FIPS 205 Algorithm 23)	💤

Tier 5 — Documentation pass

Cross-cutting documentation work, orthogonal to the cryptographic tiers above. Planned (not deferred); timing to be sequenced against the external evaluation calendar.

Id	Item	Status
T5-A	Workspace-wide doc pass (`quantica` + `arcana`): neutralise evaluation-target references — replace any CSPN-/ANSSI-specific language with generic evaluation / certification / audit terminology so the doc set reads cleanly against any third-party reviewer	✅
T5-B	TOC review across the workspace doc set (`doc/TOC.md` contract + per-crate `doc/` trees) — reorder chapters into 4 thematic clusters; rename ch.8 “Side-channel countermeasures” → “(summary)” + add `Per-algorithm deep dives` H3 bridging to the Sphinx pack	✅

Already shipped (trace-back)

Items below were entries on a prior version of this roadmap and have since been delivered. They are kept here so a third-party reviewer can match each closed concern to its commit without re-opening it.

Item	Status
ML-DSA `sca-masked-y` pipeline (Coron 2024/1149)	✅ commit `3149b68`
ML-DSA `sca-ct-rejection` (constant-time rejection loop)	✅
ML-DSA first-order arithmetic masking on `s1`/`s2`/`t0` + Fisher-Yates shuffled NTT	✅
ML-DSA seven RAM-reduction features (179 KB → ~17 KB peak Sign stack)	✅
ML-KEM first-order arithmetic masking on `s`/`e` + shuffled NTT	✅
ML-KEM double-decaps + `H(ek)` integrity DFA	✅
ML-KEM branchless fault-fallback (closes the timing oracle on the fault path)	✅ commit `5f0bdad`
SLH-DSA iterative BDS FORS treehash (256 KiB → 448 B per call)	✅ commit `fff156f`
SLH-DSA streaming signature output (one allocation, `*_into` variants throughout)	✅ commit `1eb224f`
`silentops` x86_64 / aarch64 inline-asm CT backends	✅ commit `90a1168`
`silentops::ct_grind::poison`/`unpoison` Valgrind instrumentation	✅ commit `90a1168`
Per-algorithm ctgrind harness (`quantica_bench/src/bin/ctgrind.rs`) + suppression file	✅ commit `241aeb1`
Stack-painting memcheck tool (`quantica_bench/src/bin/memcheck.rs`)	✅ commit `e21d6d0`
Static stack-size analysis via nightly `-Z emit-stack-sizes` (`tools/stack-sizes.sh`)	✅ commit `5f30e69`
Sphinx side-channel doc pack with bibliography + per-algorithm countermeasure chapters	✅ commit `32a76bd`
Self-contained crate-owned `quantica/doc/` tree (Option B layout)	✅ commit `5fc8c9b`
T1-F — Constant-time `fors_pk_from_sig` (prereq for T1-C FORS redundancy)	✅ commit `1fe4b18`
T1-C — FORS recompute-and-compare redundancy (`sca-fors-redundancy` feature, SLH-DSA grafting-tree defence)	✅ commit `c6a916e`
API cleanup post-T1C — single CT `fors_pk_from_sig`, unified `slh_sign_internal`, `&Adrs` template	✅ commit `a8d9a4a`
T1-D — Full-tree streaming FORS sign (`sca-fors-dummy-siblings` feature, anti-template Kannwischer 2018)	✅ commit `5d779c6`
T1-E — Digest → FORS-indices integrity check (`sca-fors-indices-check` feature, anti-fault Castelnovi 2018)	✅ commit `8ff4e01`
T1-B — Hermelink 2025/276 audit annex on `ml_dsa::masked` (doc-only, classifies leak surface)	✅ commit `d73dc70`
T1-A — Per-iteration mask refresh in ML-DSA rejection loop (head-of-loop, Hermelink §4 prescription)	✅ commit `738ec73`
T5-A — Workspace-wide doc pass: neutralise evaluation-target language (CSPN/ANSSI → generic evaluation)	✅ commit `eac79f5`
T5-B — TOC reorder (4 thematic clusters) + SCA chapter summary-bridge to per-algo deep dives	✅ this branch
T3-A — Cross-arch test infrastructure (qemu-user matrix + qemu-system bare-metal matrix + semihosting vector-streaming protocol)	✅ commits `ce06085`, `fe9b3d4`, `617120f`, `dd7f867`, `1d7b6fa`
T3-B — Codeberg Forgejo Actions workflow (`.forgejo/workflows/qemu-cross-tests.yml`) covering all three qemu layers	✅ this branch

Suggested execution order (critical path)

Sprint 1: T1-F + T1-C — closes the dominant published attack on SLH-DSA (Castelnovi grafting / SLasH-DSA Rowhammer). T1-F is the prerequisite (CT fors_pk_from_sig), T1-C the redundancy itself.
Sprint 2: T1-D + T1-E + T1-B — completes the FORS hardening (template + fault on idx) and pushes the Hermelink leakage checklist through ml_dsa::masked.
Sprint 3: T1-A + T2-A + T2-B — closes the ML-DSA higher-order recombination + the last two ctgrind suppressions for ML-DSA.
Sprint 4: T2-D + T3-A + T3-B + T2-C — ctgrind unpoisons for SLH-DSA, CT3 QEMU portability, CI wiring, and the documentation conversion of tools/ctgrind.supp to a “resolved-findings” annex. The evaluation doc pack ships at the end of this sprint.

Effort estimate: ~3 weeks of dev for Tier 1 + Tier 2 (T1-C dominates, the rest are mostly mechanical), plus ~1 week for the Tier 3 verification wiring. Updates to this table are tracked in the change log of quantica/doc/sca/index.rst.

References

NIST FIPS 203 — ML-KEM
NIST FIPS 204 — ML-DSA
NIST FIPS 205 — SLH-DSA
NIST ACVP-Server — official conformance test vectors
C2SP / Wycheproof — edge-case and negative test vectors
Reparaz, Balasch, Verbauwhede (2017) — “dude, is my code constant time?” (the dudect methodology used in silentops::verify)

License

Apache-2.0.
