Hermelink 2025/276 audit pass on ml_dsa::masked
- Status:
Shipped (audit only — code work-list cross-referenced below).
- Scope:
quantica/src/ml_dsa/masked.rs plus the rejection-loop callers in quantica/src/ml_dsa/dsa.rs.
- Reference:
[HNP25] (CRYPTO 2025, IACR ePrint 2025/276).
- Audit method:
Static walk of every public masked gadget and every
unmask() call site, classified against the Hermelink leak taxonomy.
Why this audit
[HNP25] does not break ML-DSA
or its masked variant; it provides an information-theoretic enumeration
of the operations inside a masked Dilithium implementation that
leak even when each individual gadget is masked, because the
aggregate statistic combines shares unsafely. The dominant leak class
is the rejection-loop recombination: shares of y, s1, s2,
t0 get unmasked into plaintext aggregates (w, z, cs1,
cs2, ct0) that then drive data-dependent non-linear operations
(Decompose, MakeHint, norm checks).
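The recombination leak described above can be pictured with a minimal two-share additive masking sketch. The type and method names echo the ones this audit mentions, but the layout, signatures, toy length `N`, and the `ToyRng` generator are assumptions made for illustration; they are not taken from quantica/src/ml_dsa/masked.rs.

```rust
/// ML-DSA modulus q = 2^23 - 2^13 + 1 (FIPS 204).
const Q: i64 = 8_380_417;
/// Toy polynomial length for this sketch; ML-DSA polynomials have 256 coefficients.
const N: usize = 8;

/// Placeholder xorshift64 generator. NOT cryptographically secure and slightly
/// biased modulo q; a real implementation would draw masks from a vetted CSPRNG.
/// The seed must be non-zero.
pub struct ToyRng(pub u64);

impl ToyRng {
    pub fn next_mod_q(&mut self) -> i64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        (self.0 % (Q as u64)) as i64
    }
}

/// Hypothetical two-share additive masking: value = s0 + s1 (mod q).
pub struct MaskedPoly {
    s0: [i64; N],
    s1: [i64; N],
}

impl MaskedPoly {
    /// Split a polynomial with coefficients in [0, q) into two random-looking shares.
    pub fn mask(secret: &[i64; N], rng: &mut ToyRng) -> Self {
        let mut s0 = [0i64; N];
        let mut s1 = [0i64; N];
        for i in 0..N {
            let r = rng.next_mod_q();
            s0[i] = r;
            s1[i] = (secret[i] - r).rem_euclid(Q);
        }
        MaskedPoly { s0, s1 }
    }

    /// Recombination: the return value is the plaintext aggregate, no longer
    /// protected by the sharing.
    pub fn unmask(&self) -> [i64; N] {
        let mut out = [0i64; N];
        for i in 0..N {
            // Both shares lie in [0, q), so the sum is non-negative.
            out[i] = (self.s0[i] + self.s1[i]) % Q;
        }
        out
    }
}
```

Each share on its own is (approximately) uniform, but the array returned by `unmask()` is the secret-dependent value itself, which is why every such call site is reviewed individually.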
The paper is the auditor checklist for any masked-ML-DSA claim. T1-B of the krypteia Tier-1 roadmap walks that checklist over our code and records, gadget by gadget, whether the leak class is closed, acknowledged as residual risk, or scheduled for reinforcement. This file is the resulting evidence: readable as a stand-alone evaluation surface and traceable down to a file:line for every claim.
Scope and out-of-scope
In scope — the masked-arithmetic surface area of ML-DSA:
- the masked-poly type and its primitives (MaskedPoly in quantica/src/ml_dsa/masked.rs);
- the masked algorithmic kernels reused inside sign_internal (masked NTT, masked × public, masked matrix-vector multiplication);
- every unmask() call site inside the rejection loop of quantica/src/ml_dsa/dsa.rs::sign_internal, and every data-dependent operation consuming the resulting plaintext aggregate.
Out of scope — outside Hermelink’s target:
- the unmasked NTT (ml_dsa::ntt), the samplers (ml_dsa::sample), and the rejection-orchestration tail (sign_internal after a successful rejection-loop exit), which operate on quantities already public per FIPS-204 or on the final signature bytes;
- ML-KEM masked code and SLH-DSA: not the paper’s target. The same audit-annex pattern can be reused there if needed in a future tier.
Leak-class taxonomy (Hermelink 2025/276)
We use a five-class shorthand that maps the paper’s section structure to our code:
- C1: Recombination of secret shares. Every unmask() call produces a plaintext aggregate. Even when the upstream operations are masked, the resulting plaintext is the natural DPA / SPA target.
- C2: Decompose / HighBits / LowBits on unmasked aggregates. Once the plaintext is on the stack, any non-linear bit-extraction with data-dependent control flow becomes a CPA target (the resulting HighBits polynomial directly enters the signature).
- C3: Hint compression. MakeHint is a per-coefficient comparison whose count is bounded by ω; the count itself and the per-coefficient outcome leak the shape of w - c·s2 modulo 2·γ₂.
- C4: Mask refresh sufficiency. Higher-order DPA aggregates traces across rejection iterations; a missing refresh between iterations collapses the higher-order assumption.
- C5: Sampler-side leakage on y. ExpandMask is the canonical DPA target on SK.prf; the paper requires that y never materialises in plaintext on the stack between sampling and consumption.
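To make classes C2 and C3 concrete, the sketch below writes out Decompose, HighBits, LowBits and MakeHint on plaintext coefficients as FIPS 204 defines them, using γ₂ = (q − 1)/32 (the ML-DSA-65 / ML-DSA-87 value). It is a plain reference form, not the code in quantica/src/ml_dsa; the function names and the scalar (per-coefficient) signatures are assumptions chosen for the example.

```rust
/// ML-DSA modulus q (FIPS 204).
const Q: i32 = 8_380_417;
/// gamma2 = (q - 1) / 32, as used by ML-DSA-65 and ML-DSA-87.
const GAMMA2: i32 = (Q - 1) / 32;

/// Decompose r into (r1, r0) such that r = r1 * 2*gamma2 + r0 (mod q),
/// with r0 a small centered remainder.
pub fn decompose(r: i32) -> (i32, i32) {
    let r_plus = r.rem_euclid(Q);
    // Centered remainder modulo 2*gamma2, in (-gamma2, gamma2].
    let mut r0 = r_plus % (2 * GAMMA2);
    if r0 > GAMMA2 {
        r0 -= 2 * GAMMA2;
    }
    // Corner case from the specification. Written as a branch here, so the
    // control flow depends on the (formerly secret-shared) input.
    if r_plus - r0 == Q - 1 {
        (0, r0 - 1)
    } else {
        ((r_plus - r0) / (2 * GAMMA2), r0)
    }
}

pub fn high_bits(r: i32) -> i32 {
    decompose(r).0
}

pub fn low_bits(r: i32) -> i32 {
    decompose(r).1
}

/// MakeHint: true when adding z to r changes the high bits of r.
pub fn make_hint(z: i32, r: i32) -> bool {
    high_bits(r) != high_bits(r + z)
}
```

Both conditionals in `decompose` and the comparison in `make_hint` act directly on the unmasked value, which is the property classes C2 and C3 flag.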
Per-gadget audit matrix
| Gadget / call site | Class | Status | Rationale and follow-up |
|---|---|---|---|
| | C5 | protected | DPA-safe sampling: |
| | C1 (primitive) | protected | Correctness verified by |
| | C4 (primitive) | protected | Re-randomises both shares without changing the unmasked value; verified by |
| | none (linear) | protected | NTT is linear over the prime field; applying it to each share preserves the additive masking invariant. Verified by |
| | none (linear × public) | protected | Multiplication by the public challenge polynomial |
| | none (linear × public) | protected | Same linearity argument: |
| Per-iteration mask refresh sufficiency (whole rejection loop) | C4 | protected | |
| | C1 | residual | |
| | C1 | residual | |
| | C1 | residual | First-order safe (masked × public is linear, the unmasking happens after the multiply). |
| | C1 | residual | Same as |
| | C1 | residual | Same as |
| | C2 | residual | HighBits is on the unmasked |
| | C2 | residual | Inner-loop low-bits extraction inside Phase 2; data-dependent on unmasked |
| | C2 | residual | Vector variant feeding |
| | C3 | residual | Per-coefficient hint generation; observation of the hint bit pattern is the Hermelink §3 leak. Tier-3 candidate (share-domain MakeHint). |
| | C3 | residual | Vector variant that returns |
| | C1/C2 | partial | The classic infinity-norm check on the unmasked |
| | C1/C2 | partial | Same as |
| | C1/C2 | partial | Same posture. Used to decide whether |
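The linearity and refresh arguments used for the protected rows can be checked on a single coefficient. The sketch below is a scalar stand-in under stated assumptions: `mul_public`, `refresh` and `unmask` are hypothetical helpers, a public scalar `c` stands in for the public challenge polynomial, and the fresh randomness `r` is passed in by the caller rather than sampled.

```rust
/// ML-DSA modulus q (FIPS 204).
const Q: i64 = 8_380_417;

/// Multiply both shares by a public value c. Because the map is linear,
/// (s0 + s1) * c = s0 * c + s1 * c (mod q), so no recombination is needed.
pub fn mul_public(shares: (i64, i64), c: i64) -> (i64, i64) {
    ((shares.0 * c).rem_euclid(Q), (shares.1 * c).rem_euclid(Q))
}

/// Re-randomise the sharing: add fresh r to one share and subtract it from
/// the other. The shared value is unchanged.
pub fn refresh(shares: (i64, i64), r: i64) -> (i64, i64) {
    ((shares.0 + r).rem_euclid(Q), (shares.1 - r).rem_euclid(Q))
}

/// Recombine the two shares (used here only to check the invariants).
pub fn unmask(shares: (i64, i64)) -> i64 {
    (shares.0 + shares.1).rem_euclid(Q)
}
```

For example, with `shares = (5_000_000, (1_234_567 - 5_000_000i64).rem_euclid(Q))`, `unmask(mul_public(shares, 777))` equals `(1_234_567 * 777) % Q`, and `unmask(refresh(shares, r))` equals `1_234_567` for any `r`. The same reasoning carries over coefficient-wise to any linear map applied per share, which is the argument the matrix makes for the NTT.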
Status legend
- protected: the leak class is closed for this gadget given the threat model in the Threat model section. Either the operation is linear (so it preserves the additive sharing invariant), or it operates on values that are already public per the FIPS-204 specification.
- residual: the leak class is known and acknowledged. The unmasked aggregate is short-lived (immediately re-masked, refreshed, or zeroized through MaskedPoly::zeroize / silentops::ct_zeroize after use). The practical DPA-cost-to-recover-key remains high but is not eliminated. A follow-up Roadmap item that would close the class is named in the matrix row.
- partial: a complementary defence is already in place (e.g. sca-ct-rejection removes the early-abort timing leak on norm checks) but the information-theoretic class is not fully closed.
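The "short-lived aggregate" posture in the residual rows relies on wiping plaintext after use. A minimal sketch of such a wipe follows, assuming a buffer of `i32` coefficients; `zeroize_coeffs` is a hypothetical helper written for this example and is not the implementation of MaskedPoly::zeroize or silentops::ct_zeroize, which this document does not show.

```rust
use core::ptr;
use core::sync::atomic::{compiler_fence, Ordering};

/// Overwrite a plaintext aggregate with zeros. Volatile writes are used so
/// the compiler does not discard the stores as dead once the buffer is no
/// longer read.
pub fn zeroize_coeffs(buf: &mut [i32]) {
    for c in buf.iter_mut() {
        // SAFETY: `c` is a valid, aligned, exclusive reference to an i32.
        unsafe { ptr::write_volatile(c, 0) };
    }
    // Keep the compiler from reordering later memory accesses ahead of the wipe.
    compiler_fence(Ordering::SeqCst);
}
```

A wipe of this kind shortens the window in which the aggregate sits in memory; it does not hide the aggregate while it is being used, which is why the rows stay at residual rather than protected.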
Summary work-list
The audit originally surfaced three categories of follow-up. T1-A
has since shipped (see updated rows above); the other two remain
open candidates:
- ✅ T1-A (per-iteration mask refresh): shipped. s1_hat_m / s2_hat_m / t0_hat_m refreshed at the head of every rejection iteration via the dsa.rs head-of-loop refresh block (Hermelink §4 prescription matched exactly). Closes the C4 sufficiency gap; reduces every C1 residual on the unmask call sites to the plaintext-aggregate floor.
- Tier-2 candidate: CT norm check on shares for check_norm_vec(&z, …), check_norm_vec(&r0, …), check_norm_vec(&ct0, …). Would close the C1/C2 partial rows by moving the norm comparison into the share domain via the silentops CT primitives. Cost: a redesign of check_norm_vec to take MaskedPoly instead of plaintext. Tracked as a future ticket, not yet on the Roadmap.
- Tier-3 candidate: share-domain Decompose / MakeHint for the five Decompose / MakeHint call sites listed above. Would close the C2 / C3 rows. Larger redesign (full share-domain HighBits / LowBits / MakeHint kernels). Tracked as a future ticket, not yet on the Roadmap.
The above three categories — one shipped, two future candidates —
cover every residual and partial row in the matrix. No row
is unaccounted for, and no current protected row degrades under
any of the follow-ups.
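For the partial rows, the defence already in place is described as removing the early-abort timing leak on norm checks. One way such a check can be written at source level is sketched below; `norm_exceeds_ct` is a hypothetical function, not the sca-ct-rejection code, and it still operates on plaintext coefficients, so it illustrates the existing timing defence rather than the share-domain Tier-2 candidate.

```rust
/// Returns true if any coefficient has absolute value >= bound.
/// Every coefficient is visited and there is no early exit, so the amount of
/// work does not depend on where (or whether) a violation occurs.
/// Assumes centered coefficients with |c| < 2^30 and 0 < bound < 2^30.
pub fn norm_exceeds_ct(coeffs: &[i32], bound: i32) -> bool {
    let mut acc: i32 = 0;
    for &c in coeffs {
        // Branch-free absolute value: sign is 0 for c >= 0 and -1 for c < 0.
        let sign = c >> 31;
        let abs = (c ^ sign) - sign;
        // bound - 1 - abs is negative exactly when abs >= bound; OR-ing keeps
        // the sign bit set once any coefficient is out of range.
        acc |= bound - 1 - abs;
    }
    // A single final decision on the accumulated sign bit.
    (acc >> 31) != 0
}
```

Source-level branch freedom is not a guarantee about the emitted machine code; a production check would need to be validated on the compiled binary for the target platform.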
References
- [HNP25]: the audit reference itself.
- [DFM+25]: concealed-ILWE attack on partially-masked Dilithium; motivates the same hardening vector on the masked_mat_vec_mul gadgets.
- [ZCQ+26]: rejection-loop attack on the unmasked / hedged path; motivates the sca-ct-rejection posture cross-referenced in the partial rows above.
- ML-DSA - countermeasures: the per-threat coverage matrix and the T1-A roadmap entry the audit cross-references.