###################################################################
HMAC / CMAC / KMAC / GMAC: countermeasures
###################################################################

:Spec: RFC 2104 :cite:`rfc2104`, FIPS 198-1 (HMAC),
       NIST SP 800-38B :cite:`nist_sp_800_38b` (CMAC),
       NIST SP 800-185 (KMAC), NIST SP 800-38D (GMAC)
:Crate path: ``arcana::mac::ctx`` (streaming Mac wrapper),
             ``arcana::cipher::poly1305`` (Poly1305; function-oriented,
             not in Mac ctx by design)
:Cargo feature: none; all four families are compiled unconditionally.

The MAC family is covered by the *symmetric* SCA literature (CDPA on
HMAC-SHA-2, :cite:`belenky2023_cdpa_hmac_sha2`; classical CPA on
AES-CMAC). The *bigint* SCA literature does not apply directly, because
HMAC/CMAC/KMAC/GMAC do not require modular arithmetic.

The most important recent finding is
:cite:`belenky2023_cdpa_hmac_sha2`, TCHES 2023 Issue 3: *any*
implementation of HMAC-SHA-2, even pure parallel hardware, leaks the
secret key in 30 K – 275 K traces under **Carry-based Differential
Power Analysis** (CDPA). This tightens the prior literature (e.g.
:cite:`belaid2014hmac_sha2_dpa`), which had hoped that DPA on
HMAC-SHA-2 was hard.

.. contents::
   :local:
   :depth: 2

Coverage matrix
===============

.. list-table:: MAC countermeasure / threat matrix
   :header-rows: 1
   :widths: 25 18 57

   * - Threat
     - Status
     - Countermeasure(s)
   * - Software / cache-timing on tag verify
     - **implemented**
     - ``mac::ctx::Mac::verify`` uses ``silentops::ct_eq``; returns a
       single bool independent of which byte differed.
   * - DPA / CDPA on HMAC-SHA-2 inner / outer state
       :cite:`belenky2023_cdpa_hmac_sha2`
     - **vulnerable**
     - Plan ``T2-D``: first-order Boolean masking of the SHA-2
       compression function for the inner / outer keyed states. Also
       covers the Ed25519 SHA-512 path (:doc:`eddsa`).
   * - DPA on AES-CMAC subkey derivation
     - **vulnerable**
     - Mitigated by ``T2-G`` (masked AES): once AES is masked, CMAC
       inherits the protection.
   * - DPA on GMAC GF(2^128) multiplier
     - **vulnerable**
     - Plan ``T2-H``: CT carry-less multiplier for GHASH; on hosts use
       PCLMULQDQ / PMULL.
   * - Length-extension / hash-misuse
     - **n/a (by design)**
     - HMAC, CMAC, KMAC, GMAC are all designed to resist
       length-extension on their underlying primitive.

Carry-based DPA on HMAC-SHA-2 (``T2-D``)
========================================

Why CDPA is special
-------------------

Classical DPA targets a single non-linear gate (e.g. an AES S-box)
where the leakage model is "Hamming weight of the gate output" or
"Hamming distance between two register states".
:cite:`belenky2023_cdpa_hmac_sha2` introduced the *carry* of an
addition as a leakage model: the bit that propagates between adjacent
bit positions of an arithmetic addition has a noticeable power
signature on most hardware (especially CPUs without explicit
carry-handling tricks).

SHA-2's compression function is dominated by 32-bit / 64-bit additions:

.. math::

   T_1 = h + \Sigma_1(e) + \mathrm{Ch}(e, f, g) + K_t + W_t \\
   T_2 = \Sigma_0(a) + \mathrm{Maj}(a, b, c)

Each carry bit inside a ``+`` is a *non-linear function of all
lower-order bits of both operands*, and this carry is what CDPA models.
With a carry-leakage model, the attacker recovers the inner-state words
bit-by-bit.

Result: HMAC-SHA-2 (which feeds the secret key into the inner
``H((K ⊕ ipad) ‖ M)`` state) leaks the key in 30 K – 275 K traces,
**even** in pure parallel hardware where the bytes are processed
simultaneously. **Software implementations leak even more easily**
because the additions are explicit instructions on a sequential
pipeline.

Implication for arcana
----------------------

The SHA-256 / SHA-512 compression functions in
``arcana::hash::sha256`` / ``sha512`` are textbook reference
implementations. They are CT (no secret-dependent branches), but they
are **DPA-vulnerable** to CDPA and to the more general
:cite:`belaid2014hmac_sha2_dpa` style HW-leakage attacks.
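To make the carry leakage described under "Why CDPA is special" concrete, the following minimal sketch computes the carry vector of a 32-bit addition in two ways and derives a toy leakage hypothesis from it. The function names are hypothetical and not part of arcana, and the Hamming-weight hypothesis is only an illustrative assumption, not the exact model of :cite:`belenky2023_cdpa_hmac_sha2`. The ripple form shows the recurrence ``c_{i+1} = Maj(a_i, b_i, c_i)``, which is why the carry is non-linear in the operand bits.

```rust
/// Carry-in bit at every position of the 32-bit addition `a + b` (mod 2^32).
/// Bit `i` of the result is the carry *into* position `i`; bit 0 is always 0.
/// Uses the identity sum = a ^ b ^ carry_in.
pub fn carry_in_bits(a: u32, b: u32) -> u32 {
    a.wrapping_add(b) ^ a ^ b
}

/// The same carry vector, computed bit by bit with the ripple recurrence
/// c_{i+1} = Maj(a_i, b_i, c_i). Illustration only: slow on purpose.
pub fn carry_in_bits_ripple(a: u32, b: u32) -> u32 {
    let mut carries = 0u32;
    let mut c = 0u32; // carry into the current bit position
    for i in 0..32 {
        carries |= c << i;
        let ai = (a >> i) & 1;
        let bi = (b >> i) & 1;
        c = (ai & bi) | (ai & c) | (bi & c);
    }
    carries
}

/// Toy leakage hypothesis (an assumption of this sketch): the number of
/// carry bits that are set during the addition.
pub fn carry_weight(a: u32, b: u32) -> u32 {
    carry_in_bits(a, b).count_ones()
}
```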
For deployments where the threat model includes a level-2 attacker with
EM / power probes, **the HMAC-SHA-2 keys in arcana must be assumed
extractable**. Any lab-class evaluation falls within this threat model.

Countermeasure
--------------

The standard answer is **first-order Boolean masking of the SHA-2
compression function**:

* Each 32-bit (resp. 64-bit) state word ``w`` is split into two shares
  ``w0 ⊕ w1 = w`` with ``w0 ← rng()``.
* The linear operations of SHA-2 (XOR, rotations, shifts) commute with
  XOR, so they are applied to each share independently.
* The non-linear operations are:

  * ``Ch(e, f, g) = (e ∧ f) ⊕ (¬e ∧ g)``: built from masked ANDs, a
    standard technique (``Trichina mask``,
    :cite:`trichina2003masked`).
  * ``Maj(a, b, c) = (a ∧ b) ⊕ (a ∧ c) ⊕ (b ∧ c)``: three masked ANDs.
  * The 32-bit (resp. 64-bit) additions in ``T_1``, ``T_2``: the
    **harder** part. A Boolean-shared addition uses the Goubin
    transform :cite:`goubin2001boolean_arithmetic` to switch from
    Boolean to arithmetic shares, perform the addition, and switch
    back.

Implementation route in arcana
------------------------------

The masked SHA-2 will live behind the same ``sca-protected`` feature
flag used by quantica's masking layer (already wired in the workspace
``Cargo.toml``).

* New module ``arcana::hash::sha2_masked`` exposing ``MaskedSha256``
  and ``MaskedSha512`` types.
* Internally each state word is a 2-share ``MaskedU32`` /
  ``MaskedU64``; operations are constant-time on the shares.
* ``mac::ctx::Mac::sign`` / ``Mac::verify`` route through the masked
  variants when the feature is on.
* Performance expectation: roughly **3-5×** the cost of unmasked SHA-2,
  per the literature.
* KAT regression: outputs are bit-identical to the unmasked variant
  (the masking is mathematically transparent).

Cost vs. evaluation benefit
---------------------------

For the target evaluation the **attacker is permitted observational
SCA**; without masking, HMAC-SHA-2 fails at this level.
``T2-D`` is therefore on the evaluation critical path even though it is
labelled "Tier 2" (it sits below T1 because T1 has the arguably-larger
Bellcore RSA gap, and below :doc:`aes`'s ``T1-A`` because every other
primitive depends on AES being CT-safe first).

Dependence on Ed25519
---------------------

The same SHA-512 primitive is used in Ed25519 to derive the nonce
``r = H(prefix ‖ M) mod ℓ`` and the challenge
``k = H(R ‖ A ‖ M) mod ℓ``. ``T2-D`` (masking SHA-512 for HMAC)
**transparently extends** to Ed25519 once the masked SHA-512 is plumbed
through ``ed25519_sign``. No separate work item is needed.

CMAC / KMAC / GMAC
==================

CMAC (AES-based)
----------------

CMAC inherits its SCA properties from the underlying AES. Once ``T1-A``
(fixsliced AES) lands, CMAC's first-round leak is gone; once ``T2-G``
(masked AES) lands, CMAC inherits the DPA defence. **No CMAC-specific
countermeasure is needed beyond the AES-side hardening.**

The CMAC subkey derivation (computing ``L = AES_K(0^128)``,
``K1 = (2 · L) mod (x^128 + r_128)``,
``K2 = (2 · K1) mod (x^128 + r_128)``) needs no further AES calls once
``L`` is computed. ``L``, ``K1`` and ``K2`` are still key-dependent, so
the doubling in GF(2^128) must be branch-free; it relies on the same CT
carry-less arithmetic as GHASH (``T2-H``).

KMAC (Keccak-based)
-------------------

KMAC128 / KMAC256 build on cSHAKE, which builds on Keccak-f[1600]. The
Keccak permutation is structurally CT (no S-box LUT, no
secret-dependent branches in the round function). DPA on Keccak is
harder than on SHA-2: the permutation has no modular additions, and the
``ι`` step is only an XOR with a round constant, which carries no key
information. The non-linear ``χ`` step is a 5-bit AND-XOR pattern that
masking papers (:cite:`bertoni2017keccak_masking`) cover but which has
not been an evaluation-flagged target.

For arcana: KMAC ships as-is for now; revisit only if an
evaluation-level Keccak attack appears.

GMAC (GHASH-based)
------------------

The GHASH multiplication by the hash subkey ``H = AES_K(0^128)``, applied
per block as ``X_i = (X_{i-1} ⊕ block_i) · H`` over GF(2^128), is the
SCA target.
The mitigation is item ``T2-H`` (CT carry-less multiplier), also
flagged in :doc:`aes`.

Code path summary
=================

.. list-table::
   :header-rows: 1
   :widths: 30 35 35

   * - Path
     - Today (2026-04-21)
     - Target (post T2-D + T2-G + T2-H)
   * - ``mac::ctx::Mac::verify``
     - ``ct_eq`` tag compare
     - Unchanged
   * - HMAC-SHA-256/384/512 inner state
     - Unmasked
     - Masked (``sca-protected`` feature)
   * - HMAC-SHA-3 inner state
     - Unmasked, Keccak-CT-by-structure
     - Unchanged for now
   * - CMAC subkey derivation
     - Inherits AES leak
     - Inherits fixsliced + masked AES
   * - KMAC128 / KMAC256
     - Keccak-CT-by-structure
     - Unchanged for now
   * - GMAC GHASH multiplier
     - Audit pending; likely table-driven
     - CT carry-less multiplier (T2-H)
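The "Masked" target for the HMAC-SHA-2 rows rests on the share-wise operations listed under "Countermeasure". As a minimal sketch of those operations, the code below models a two-share Boolean-masked word with linear operations applied per share and a Trichina-style masked AND, then builds ``Ch`` and ``Maj`` from them. The ``MaskedU32`` name comes from the plan above, but every method, signature, and helper here is a hypothetical illustration and not arcana's API. Randomness is passed in by the caller as an assumption of the sketch, and the masked additions (Goubin conversion) are deliberately left out.

```rust
/// Hypothetical two-share Boolean masking of a 32-bit word: value = s0 ^ s1.
#[derive(Clone, Copy)]
pub struct MaskedU32 {
    s0: u32,
    s1: u32,
}

impl MaskedU32 {
    /// Split `w` into two shares using a caller-supplied random mask.
    pub fn mask(w: u32, mask: u32) -> Self {
        MaskedU32 { s0: mask, s1: w ^ mask }
    }

    /// Recombine the shares. Intended for tests and final output only.
    pub fn unmask(self) -> u32 {
        self.s0 ^ self.s1
    }

    /// XOR is linear over the shares, so it is applied share-wise.
    pub fn xor(self, o: Self) -> Self {
        MaskedU32 { s0: self.s0 ^ o.s0, s1: self.s1 ^ o.s1 }
    }

    /// Rotation commutes with XOR, so each share is rotated independently.
    pub fn rotr(self, n: u32) -> Self {
        MaskedU32 { s0: self.s0.rotate_right(n), s1: self.s1.rotate_right(n) }
    }

    /// Bitwise NOT: complementing exactly one share complements the value.
    pub fn not(self) -> Self {
        MaskedU32 { s0: !self.s0, s1: self.s1 }
    }

    /// Trichina-style masked AND. `r` must be fresh randomness per call.
    /// The partial products are folded into `r` one at a time so that no
    /// intermediate equals the unmasked product. An optimizing compiler may
    /// reorder these steps, so a real implementation needs extra care.
    pub fn and(self, o: Self, r: u32) -> Self {
        let mut t = r;
        t ^= self.s0 & o.s0;
        t ^= self.s0 & o.s1;
        t ^= self.s1 & o.s0;
        t ^= self.s1 & o.s1;
        MaskedU32 { s0: r, s1: t }
    }
}

/// Ch(e, f, g) = (e & f) ^ (!e & g), using two masked ANDs.
pub fn masked_ch(e: MaskedU32, f: MaskedU32, g: MaskedU32, r0: u32, r1: u32) -> MaskedU32 {
    e.and(f, r0).xor(e.not().and(g, r1))
}

/// Maj(a, b, c) = (a & b) ^ (a & c) ^ (b & c), using three masked ANDs.
pub fn masked_maj(
    a: MaskedU32,
    b: MaskedU32,
    c: MaskedU32,
    r0: u32,
    r1: u32,
    r2: u32,
) -> MaskedU32 {
    a.and(b, r0).xor(a.and(c, r1)).xor(b.and(c, r2))
}
```

Unmasking the result of ``masked_ch`` or ``masked_maj`` gives the same word as the plain ``Ch`` / ``Maj`` on the unmasked inputs, which is the property the KAT regression in the plan relies on.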