HMAC / CMAC / KMAC / GMAC — countermeasures
- Spec: RFC 2104 [KBC97], FIPS 198-1 (HMAC), NIST SP 800-38B [NationalIoSaTechnology05] (CMAC), NIST SP 800-185 (KMAC), NIST SP 800-38D (GMAC)
- Crate path: arcana::mac::ctx (streaming Mac wrapper), arcana::cipher::poly1305 (Poly1305; function-oriented, not in Mac ctx by design)
- Cargo feature: none; all four families are compiled unconditionally.
The MAC family lives at the intersection of two SCA literatures: the symmetric SCA literature (CDPA on HMAC-SHA-2, [BDT+23]; classical CPA on AES-CMAC) and the bigint SCA literature (none directly applicable — HMAC/CMAC/KMAC/GMAC do not require modular arithmetic).
The most important recent finding is [BDT+23], TCHES 2023 Issue 3: any implementation of HMAC-SHA-2, even pure parallel hardware, leaks the secret key in 30 K – 275 K traces under Carry-based Differential Power Analysis (CDPA). This is a strict tightening over the prior literature (e.g. [BBD+13]) which had hoped that DPA on HMAC-SHA-2 was hard.
Coverage matrix
| Threat | Status | Countermeasure(s) |
|---|---|---|
| Software / cache-timing on tag verify | implemented | |
| DPA / CDPA on HMAC-SHA-2 inner / outer state [BDT+23] | vulnerable | Plan |
| DPA on AES-CMAC subkey derivation | vulnerable | Mitigated by |
| DPA on GMAC GF(2^128) multiplier | vulnerable | Plan |
| Length-extension / hash-misuse | n/a (by design) | HMAC, CMAC, KMAC, GMAC are all designed to resist length-extension on their underlying primitive. |
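The countermeasure cell for tag verification is blank in the matrix. As a hedged illustration only, constant-time tag comparison is commonly written as an accumulate-then-test loop with no early exit. The helper name `ct_eq` is hypothetical and is not taken from arcana's API.

```rust
/// Compares two MAC tags without exiting early on the first mismatch.
/// Hypothetical helper for illustration; not arcana's actual verify path.
fn ct_eq(expected: &[u8], received: &[u8]) -> bool {
    // The tag length is public, so branching on it leaks nothing secret.
    if expected.len() != received.len() {
        return false;
    }
    // OR together the XOR of every byte pair; any mismatch sets a bit.
    let mut diff = 0u8;
    for (a, b) in expected.iter().zip(received.iter()) {
        diff |= a ^ b;
    }
    diff == 0
}
```

An optimizing compiler is in principle free to restructure such a loop, so production code usually relies on a vetted constant-time comparison rather than a hand-rolled one.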
Carry-based DPA on HMAC-SHA-2 (T2-D)
Why CDPA is special
Classical DPA targets a single non-linear gate (e.g. an AES S-box) where the leakage model is “Hamming weight of the gate output” or “Hamming distance between two register states”. [BDT+23] introduced the carry of an addition as a leakage model: the bit that propagates between adjacent bit positions of an arithmetic addition has a noticeable power signature on most hardware (especially CPUs without explicit carry-handling tricks).
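SHA-2’s compression function is dominated by 32-bit / 64-bit additions: in each round, T_1 = h + Σ1(e) + Ch(e, f, g) + K_t + W_t and T_2 = Σ0(a) + Maj(a, b, c), followed by e ← d + T_1 and a ← T_1 + T_2.
The carry into each bit position of a + is a non-linear function
of all lower-order input bits, which CDPA models. With a carry-leakage
model, the attacker recovers the inner-state words bit-by-bit.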
Result: HMAC-SHA-2 (which feeds the secret key into the inner
H((K ⊕ ipad) ‖ M) state) leaks the key in 30 K – 275 K
traces, even in pure parallel hardware where the bytes are
processed simultaneously. Software implementations leak even
more easily because the additions are explicit instructions
on a sequential pipeline.
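To make the carry-leakage idea concrete, the sketch below extracts the carry vector of a 32-bit addition and scores it by Hamming weight. The function names and the Hamming-weight scoring are assumptions chosen for illustration; they are not necessarily the exact model used in [BDT+23].

```rust
/// Carry-in bit at every position of the 32-bit addition `a + b`.
/// Uses the identity sum = a ^ b ^ carries, hence carries = sum ^ a ^ b.
fn carry_bits(a: u32, b: u32) -> u32 {
    a.wrapping_add(b) ^ a ^ b
}

/// Toy leakage model: Hamming weight of the carry vector.
/// This is an illustrative assumption, not the model from [BDT+23].
fn carry_leakage(a: u32, b: u32) -> u32 {
    carry_bits(a, b).count_ones()
}
```

For example, `carry_bits(0xFFFF_FFFF, 1)` sets bits 1 through 31, because the carry ripples through every position, while adding zero produces no carries at all. An attacker who can guess one operand uses such a prediction to rank hypotheses about the other.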
Implication for arcana
The SHA-256 / SHA-512 compression functions in
arcana::hash::sha256 / sha512 are textbook reference
implementations. They are CT (no secret-dependent branches), but
they are vulnerable to CDPA and to the more general
[BBD+13]-style HW-leakage attacks.
For deployments where the threat model includes a level-2 attacker with EM / power probes, the HMAC-SHA-2 keys in arcana must be assumed extractable. Any lab-class evaluation falls within this threat model.
Countermeasure
The standard answer is first-order Boolean masking of the SHA-2 compression function:
- Each 32-bit (resp. 64-bit) state word w is split into two shares w0 ⊕ w1 = w with w0 ← rng().
- The linear operations of SHA-2 (XOR, rotations, shifts) commute with XOR, so they are applied to each share independently.
- The non-linear operations are:
  - Ch(e, f, g) = (e ∧ f) ⊕ (¬e ∧ g): a masked AND, standard technique (Trichina mask, [Tri03]).
  - Maj(a, b, c) = (a ∧ b) ⊕ (a ∧ c) ⊕ (b ∧ c): three masked ANDs.
  - The 32-bit additions T_1, T_2: the harder part. A Boolean-shared addition uses the Goubin transform [Gou01] to switch from Boolean to arithmetic shares, perform the addition, and switch back.
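The share-wise linear operations and the masked AND can be sketched as follows. This is a minimal illustration, not arcana's implementation: the `Shared` type and `masked_ch` are hypothetical names, and the random masks are passed in as arguments so the example stays deterministic, whereas real code would draw them from a CSPRNG. Ch is computed as g ⊕ (e ∧ (f ⊕ g)), which equals the usual definition and needs a single masked AND.

```rust
/// Two Boolean shares of a 32-bit word: value = s0 ^ s1 (hypothetical type).
#[derive(Clone, Copy)]
struct Shared(u32, u32);

impl Shared {
    /// Splits `w` using the random mask `r` as the first share.
    fn mask(w: u32, r: u32) -> Self { Shared(r, w ^ r) }
    /// Recombines the shares; only done on the final output.
    fn unmask(self) -> u32 { self.0 ^ self.1 }
    /// XOR is linear, so it is applied share by share.
    fn xor(self, o: Shared) -> Shared { Shared(self.0 ^ o.0, self.1 ^ o.1) }
    /// Rotation is linear too.
    fn rotr(self, n: u32) -> Shared {
        Shared(self.0.rotate_right(n), self.1.rotate_right(n))
    }
    /// Trichina-style masked AND with fresh randomness `r`.
    /// The partial products are folded into `r` one at a time so that
    /// no intermediate value equals an unmasked product.
    fn and(self, o: Shared, r: u32) -> Shared {
        let mut t = r ^ (self.0 & o.0);
        t ^= self.0 & o.1;
        t ^= self.1 & o.0;
        t ^= self.1 & o.1;
        Shared(r, t)
    }
}

/// Masked Ch(e, f, g) = g ^ (e & (f ^ g)), using one masked AND.
fn masked_ch(e: Shared, f: Shared, g: Shared, r: u32) -> Shared {
    g.xor(e.and(f.xor(g), r))
}
```

Unmasking the result of `masked_ch` gives the same word as the plain `(e & f) ^ (!e & g)`, which is the property a KAT-style regression test would check. Whether the compiler preserves the intended evaluation order is a separate concern that a real implementation has to verify.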
Implementation route in arcana
The masked SHA-2 lives behind the same sca-protected feature
flag used by quantica’s masking layer (already wired in the
workspace Cargo.toml).
- New module arcana::hash::sha2_masked exposing MaskedSha256 and MaskedSha512 types.
- Internally each state word is a 2-share MaskedU32 / MaskedU64; operations are constant-time on the shares.
- mac::ctx::Mac::sign / Mac::verify route through the masked variants when the feature is on.
- Performance expectation: ~3-5× the unmasked SHA-2 per literature.
- KAT regression: outputs are bit-identical to the unmasked variant (the masking is mathematically transparent).
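As a rough sketch of how a 2-share word could support the masked additions, the code below implements Goubin's Boolean-to-arithmetic conversion [Gou01] and adds two masked words in the arithmetic domain. The field layout of `MaskedU32` and the function names are assumptions made for illustration; the planned module may look quite different. The fresh random words `gamma`, `gx` and `gy` are passed in explicitly to keep the example deterministic.

```rust
/// Hypothetical layout for the 2-share word: value = masked ^ mask.
#[derive(Clone, Copy)]
struct MaskedU32 {
    masked: u32,
    mask: u32,
}

/// Goubin's Boolean-to-arithmetic conversion.
/// Returns `a` such that value = a + mask (mod 2^32) without ever
/// forming `value` itself. `gamma` must be fresh randomness in real use.
fn bool_to_arith(w: MaskedU32, gamma: u32) -> u32 {
    let mut t = w.masked ^ gamma;
    t = t.wrapping_sub(gamma);
    t ^= w.masked;
    let g2 = gamma ^ w.mask;
    let mut a = w.masked ^ g2;
    a = a.wrapping_sub(g2);
    a ^ t
}

/// Adds two masked words in the arithmetic domain.
/// Returns arithmetic shares (a, r) with x + y = a + r (mod 2^32).
fn masked_add(x: MaskedU32, y: MaskedU32, gx: u32, gy: u32) -> (u32, u32) {
    let ax = bool_to_arith(x, gx);
    let ay = bool_to_arith(y, gy);
    (ax.wrapping_add(ay), x.mask.wrapping_add(y.mask))
}
```

Recombining the returned pair with a wrapping addition yields the same word as the unmasked `x + y`, which is the transparency property the KAT regression relies on. Switching the result back to Boolean shares needs the arithmetic-to-Boolean direction, which is more involved and is not shown here.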
Cost vs. evaluation benefit
For the target evaluation the attacker is permitted observational
SCA; without masking, HMAC-SHA-2 fails at this level.
T2-D is therefore on the evaluation critical path even though
it is labelled “Tier 2” (it sits below T1 because T1 has the
arguably-larger Bellcore RSA gap, and below
AES — countermeasures’s T1-A because every other primitive depends on
AES being CT-safe first).
Dependence on Ed25519
The same SHA-512 primitive is used in Ed25519 to derive the
nonce r = H(prefix ‖ M) mod ℓ and the challenge
k = H(R ‖ A ‖ M) mod ℓ. T2-D (masking SHA-512 for HMAC)
transparently extends to Ed25519 once the masked SHA-512 is
plumbed through ed25519_sign. No separate item.
CMAC / KMAC / GMAC
CMAC (AES-based)
CMAC inherits its SCA properties from the underlying AES. Once
T1-A (fixsliced AES) lands, CMAC’s first-round leak is gone;
once T2-G (masked AES) lands, CMAC inherits the DPA defence.
No CMAC-specific countermeasure is needed beyond the AES-side
hardening.
The CMAC subkey derivation (computing L = AES_K(0^128),
K1 = 2 · L, K2 = 2 · K1, with each doubling reduced modulo
x^128 + x^7 + x^2 + x + 1) operates on
public-domain values once L is computed, and the doubling in
GF(2^128) is the same CT carry-less multiplier as GHASH (T2-H).
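The doubling step can be sketched without any secret-dependent branch by turning the top bit into a mask. This is a minimal illustration under the assumption that the block is loaded big-endian into a `u128` (for example with `u128::from_be_bytes`); the names `cmac_dbl` and `cmac_subkeys` are hypothetical.

```rust
/// Doubles `l` in GF(2^128) as used by CMAC: shift left by one bit and,
/// if the shifted-out bit was set, XOR in 0x87 (x^7 + x^2 + x + 1).
fn cmac_dbl(l: u128) -> u128 {
    // All-ones when the most significant bit is set, zero otherwise.
    let mask = (l >> 127).wrapping_neg();
    (l << 1) ^ (0x87 & mask)
}

/// Derives (K1, K2) from L = AES_K(0^128); the AES call is not shown.
fn cmac_subkeys(l: u128) -> (u128, u128) {
    let k1 = cmac_dbl(l);
    (k1, cmac_dbl(k1))
}
```

With L = 0x7df76b0c1ab899b33e42f047b91b546f, the AES-128 example value from RFC 4493, this yields K1 = 0xfbeed618357133667c85e08f7236a8de and K2 = 0xf7ddac306ae266ccf90bc11ee46d513b.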
KMAC (Keccak-based)
KMAC128 / KMAC256 build on cSHAKE, which builds on Keccak-f[1600].
The Keccak permutation is structurally CT (no S-box LUT, no
secret-dependent branches in the round function). DPA on Keccak
is harder than on SHA-2: there are no modular additions, and the ι
step only XORs in a round constant, which carries no key information. The
non-linear χ step is a 5-bit AND-XOR pattern that masking
papers ([BDH+17]) cover but which has
not been an evaluation-flagged target.
For arcana: KMAC ships as-is for now; revisit only if an evaluation-level Keccak attack appears.
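For reference, the χ step on one row of five 64-bit lanes can be sketched as below. It uses only NOT, AND and XOR, with no table lookups and no additions, which is what the "CT by structure" argument rests on. The function name `chi_row` is hypothetical and the code is not taken from arcana.

```rust
/// Keccak χ step applied to one row of five lanes:
/// out[x] = row[x] ^ (!row[x + 1] & row[x + 2]), indices taken mod 5.
fn chi_row(row: [u64; 5]) -> [u64; 5] {
    let mut out = [0u64; 5];
    for x in 0..5 {
        out[x] = row[x] ^ (!row[(x + 1) % 5] & row[(x + 2) % 5]);
    }
    out
}
```

The AND is the only non-linear operation, so a masked variant would protect that one gate in the same way as the masked ANDs in SHA-2.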
GMAC (GHASH-based)
The GHASH multiplication, with hash subkey H = AES_K(0^128) and per-block
update X_i = (X_{i-1} ⊕ block_i) · H over GF(2^128), is the SCA
target. The mitigation is item T2-H (CT carry-less multiplier),
also flagged in AES — countermeasures.
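A bit-serial multiplication in GF(2^128) that avoids tables and secret-dependent branches can be sketched as follows. This is an illustration of the general technique, not the T2-H design: it follows the multiplication algorithm of SP 800-38D, with blocks loaded big-endian into a `u128` so that GCM's bit 0 is the most significant bit, and the name `ghash_mul` is hypothetical.

```rust
/// Multiplies `x` and `y` in GF(2^128) with GCM's bit ordering.
/// Every iteration does the same work; conditions become masks.
fn ghash_mul(x: u128, y: u128) -> u128 {
    // Reduction constant R = 11100001 || 0^120 from SP 800-38D.
    const R: u128 = 0xE1u128 << 120;
    let mut z = 0u128;
    let mut v = y;
    for i in 0..128 {
        // Bit i of x, counting from the most significant bit.
        let xi = (x >> (127 - i)) & 1;
        // Add v into the accumulator when the bit is set.
        z ^= v & xi.wrapping_neg();
        // Multiply v by the field element x, reducing on overflow.
        let lsb = v & 1;
        v = (v >> 1) ^ (R & lsb.wrapping_neg());
    }
    z
}
```

In this representation the multiplicative identity is `1u128 << 127`. A production implementation would more likely use a hardware carry-less multiply instruction or a bitsliced routine for speed, and would still need its generated code inspected for constant-time behaviour.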
Code path summary
| Path | Today (2026-04-21) | Target (post T2-D + T2-G + T2-H) |
|---|---|---|
| | | Unchanged |
| HMAC-SHA-256/384/512 inner state | Unmasked | Masked ( |
| HMAC-SHA-3 inner state | Unmasked, Keccak-CT-by-structure | Unchanged for now |
| CMAC subkey derivation | Inherits AES leak | Inherits fixsliced + masked AES |
| KMAC128 / KMAC256 | Keccak-CT-by-structure | Unchanged for now |
| GMAC GHASH multiplier | Audit pending; likely table-driven | CT carry-less multiplier (T2-H) |