###################################################################
X25519 / X448 — countermeasures
###################################################################

:Spec: RFC 7748 :cite:`rfc7748`
:Crate path: ``arcana::ecc::x25519`` (Curve25519 Diffie-Hellman),
   ``arcana::ecc::x448`` (Curve448 Diffie-Hellman)
:Cargo feature: none; both are compiled unconditionally.

X25519 and X448 are the two ECDH primitives on Montgomery curves in
arcana. They are constant-time (CT) by **construction**: the X
coordinate is the only state, there is no Y, and the neutral element
needs no special case. This is one reason they are popular in modern
protocol designs (TLS 1.3, Noise, Signal, WireGuard). That said, *any*
concrete Montgomery-ladder implementation can still leak side-channel
(SCA) information through the field operations, the RNG-derived
blinding, and side effects of the scalar clamping
:cite:`weissbart2021_curve25519_ml_sca`.

.. contents::
   :local:
   :depth: 2

Coverage matrix
===============

.. list-table:: X25519/X448 countermeasure / threat matrix
   :header-rows: 1
   :widths: 25 18 57

   * - Threat
     - Status
     - Countermeasure(s)
   * - Software / cache-timing on Montgomery ladder
     - **partial — audit pending**
     - Item ``T1-G``: audit ``x25519_ladder`` and ``x448_ladder``
       under the same lens as the Weierstrass-side fix.
   * - SPA on Cortex-M0 / RISC-V
     - **vulnerable**
     - Same audit ``T1-G``; deep-learning SCA on Curve25519 Cortex-M0
       implementations was demonstrated in
       :cite:`weissbart2021_curve25519_ml_sca` even against
       random-delay defences.
   * - DPA on field operations
     - **vulnerable**
     - Plan ``T2-A`` (Z-rerandomization), adapted to the Montgomery
       ladder's ``(X : Z)`` projective coordinates.
   * - Template attacks on the per-iteration ``ct_swap``
     - **vulnerable**
     - Plan ``T2-A`` + ``T2-B`` (scalar blinding).
   * - Invalid-curve attack (peer pubkey on twist)
     - **implemented**
     - X25519 and X448 are by design twist-secure (RFC 7748 §6.1), so
       invalid-curve attacks reduce to a small-subgroup attack, which
       is mitigated by the "all-zero shared secret" contributory check
       (when applicable).
   * - Small-subgroup contributory check
     - **partial**
     - The X25519 / X448 functions return an all-zero shared secret
       when the peer pubkey is in the small-order subgroup; callers
       should reject. Audit ``T2-K`` to confirm the check is in place
       and CT.

Background — Montgomery ladder for X25519
=========================================

X25519 (RFC 7748) computes ``X(k · P)`` from ``k`` and ``X(P)`` using a
constant-time Montgomery ladder over ``(X : Z)`` projective
coordinates:

.. code-block:: text

   X1 := X(P)
   X2, Z2 := 1, 0       ; representing the neutral element
   X3, Z3 := X1, 1
   for t in [254..0]:
       k_t := bit t of k
       cswap(k_t, X2, X3)
       cswap(k_t, Z2, Z3)
       (X2, Z2, X3, Z3) := double_and_add(X1, X2, Z2, X3, Z3)
       cswap(k_t, X2, X3)
       cswap(k_t, Z2, Z3)
   return X2 / Z2

The structure is essentially identical to the Weierstrass-side
``scalar_mul_point`` and benefits from the same hardening techniques.

Audit gaps (``T1-G``)
=====================

The arcana X25519 / X448 implementations were ported from RFC 7748
reference code with the standard idiom "constant-time swap implemented
as ``mask = -bit; t = mask & (a ^ b); a ^= t; b ^= t``". The same LLVM
regression observed on Weierstrass ``ecc::field`` (mask-pattern →
branch recovery) applies here, so the audit checklist mirrors the
Weierstrass one:

1. ``cswap`` must compile branchless under ``opt-level=2``. Apply
   ``core::hint::black_box`` on the mask if the release asm shows a
   recovered branch.
2. ``double_and_add`` body must not branch on field-element limbs.
   The inner field operations (``ecc::field`` for Curve25519 /
   Curve448 primes) are shared with the Weierstrass code and already
   received the ``black_box`` treatment in commit 76191c1; confirm the
   X25519 path uses the *same* ``field_add`` / ``field_sub`` /
   ``reduce_wide`` and not a separate copy.
3. **Final inversion** ``Z2^{-1} mod p`` uses Fermat
   (``Z2^(p-2) mod p``), which goes through the CT ``field_pow``;
   re-confirm.
4. **Clamping** of the scalar ``k`` (clear bits 0, 1, 2, 255; set bit
   254 for X25519; analogous for X448) is bitwise; no branch risk by
   construction.

Estimated effort: 1 day audit + 0.5 day fix.

Z-randomization on ``(X : Z)`` (``T2-A``)
=========================================

The Montgomery projective ``(X : Z)`` representation admits a
``λ``-rescaling analogous to the Jacobian Weierstrass one:

.. math::

   (X, Z) \;\sim\; (\lambda X, \lambda Z), \qquad
   \lambda \stackrel{\$}{\leftarrow} \mathbb{F}_p^*

So at the ladder start, draw ``λ`` from the SCA-RNG and replace the
initial ``(X3, Z3) = (X1, 1)`` (the input point) by ``(λX1, λ)``. The
affine ``X1`` passed to ``double_and_add`` stays unscaled; every later
``(X2, Z2)`` and ``(X3, Z3)`` is computed from the randomized state
inside the loop, so the rescaling propagates through all iterations.
This is **the exact countermeasure that broke the template attack** of
:cite:`weissbart2021_curve25519_ml_sca` on unprotected Curve25519
implementations; once Z-rerandomization is in, the per-iteration
intermediates are randomized across executions and the profiled attack
no longer aligns.

Cost: 2 field multiplications. Negligible.

Implementation hook: today the X25519 / X448 entry points
(``x25519_derive_public``, ``x25519_ecdh``) are pure functions without
an RNG argument. Adding Z-rerand requires either:

* changing the API to take a ``CryptoRng`` callback (breaking), or
* deriving an internal SCA-RNG seed from
  ``H(sk_bytes ‖ peer_pk_bytes ‖ "x25519-z-rerand")`` and using a
  SHAKE-derived stream, à la ECDSA-deterministic ``T2-A``.

The latter preserves the API, the determinism for KAT, and the
zero-RNG-failure-mode property that makes X25519 attractive in the
first place.
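
The projective equivalence behind Z-rerandomization can be checked
numerically. The following is a minimal sketch, not arcana code: it
assumes a toy 61-bit Mersenne prime in place of the Curve25519 prime so
that plain ``u128`` arithmetic suffices, and the helpers ``mul``,
``pow``, ``inv``, ``affine_x`` and ``rescale`` are hypothetical names
introduced only for this illustration.

.. code-block:: rust

   // Toy field: P = 2^61 - 1 stands in for 2^255 - 19 (assumption for
   // illustration; products of two reduced values fit in a u128).
   const P: u128 = (1u128 << 61) - 1;

   /// Modular multiplication of two values already reduced mod P.
   fn mul(a: u128, b: u128) -> u128 {
       (a * b) % P
   }

   /// Square-and-multiply exponentiation. The exponent is public here,
   /// so branching on its bits is acceptable in this toy.
   fn pow(mut base: u128, mut exp: u128) -> u128 {
       let mut acc = 1u128;
       base %= P;
       while exp > 0 {
           if exp & 1 == 1 {
               acc = mul(acc, base);
           }
           base = mul(base, base);
           exp >>= 1;
       }
       acc
   }

   /// Fermat inversion: a^(P-2) mod P, for a != 0.
   fn inv(a: u128) -> u128 {
       pow(a, P - 2)
   }

   /// Affine x-coordinate X / Z of a projective pair with Z != 0.
   fn affine_x(x: u128, z: u128) -> u128 {
       mul(x, inv(z))
   }

   /// Rescale (X : Z) by a nonzero lambda.
   fn rescale(x: u128, z: u128, lambda: u128) -> (u128, u128) {
       (mul(x, lambda), mul(z, lambda))
   }

   fn main() {
       let (x, z) = (123_456_789u128, 1u128);
       let lambda = 0x1234_5678_9abc_def0u128 % P; // stand-in for an SCA-RNG draw
       let (xr, zr) = rescale(x, z, lambda);
       // The stored representation changes...
       assert_ne!((xr, zr), (x, z));
       // ...but the affine value it encodes does not.
       assert_eq!(affine_x(xr, zr), affine_x(x, z));
   }

In a real implementation ``λ`` would come from the SCA-RNG stream and
every operation would go through the audited CT field code; the sketch
only shows why the final ``X2 / Z2`` is unaffected by the rescaling.
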
Recommendation: go with the SHAKE-derived approach.

Scalar blinding (``T2-B``)
==========================

Scalar blinding ``k' = k + r · ℓ`` works the same as for Edwards;
``ℓ = 2^252 + 27742...`` for X25519 (the order of the prime-order
subgroup). 64 random bits for ``r`` is the standard; the blinded scalar
is correspondingly longer, so the ladder runs about 64 extra
iterations, roughly 25 % per-call overhead on a 254-bit scalar.

For X25519 this is **layered on top of** the existing clamping: clamp
first, then blind, then ladder. Note that ``ℓ`` is odd, so
``k' mod 8 = k mod 8`` (the cofactor-clearing property of clamping)
holds only when ``r · ℓ`` is a multiple of 8. Either restrict ``r`` to
multiples of 8 or, equivalently, blind with the full group order ``8ℓ``
(``k' = k + r · 8ℓ``).

Reading list
============

* :cite:`weissbart2021_curve25519_ml_sca`: ML-based template SCA on
  Curve25519 Cortex-M0; the canonical "even with random delays, you
  leak" baseline.
* :cite:`bernstein2006_curve25519`: the original Curve25519 paper,
  which already gives the CT-by-construction argument.
* :cite:`hutter2015_curve25519_arm`: high-speed Curve25519 on ARM
  Cortex-M0; the reference for embedded performance numbers.
* :cite:`adomnicai2024_curve25519_curve448`: recent unified hardware
  design including Z-randomization and CT timing.

Code path summary
=================

.. list-table::
   :header-rows: 1
   :widths: 30 35 35

   * - Path
     - Today (2026-04-21)
     - Target (post T1-G + T2-A + T2-B)
   * - ``ecc::x25519::x25519_ladder``
     - CT structure (RFC 7748 idiom), audit pending
     - Audited CT, Z-rerand, scalar blinding
   * - ``ecc::x25519::x25519_derive_public``
     - Pure function, no RNG
     - Same API; internal SCA-RNG seeded from sk
   * - ``ecc::x25519::x25519_ecdh``
     - Pure function, no RNG
     - Same API; internal SCA-RNG seeded from sk + peer_pk
   * - ``ecc::x448::*``
     - Same as X25519 mutatis mutandis
     - Same plan as X25519
   * - Field arithmetic (CURVE25519_P, CURVE448_P)
     - Reuses ``ecc::field::*`` (already ``black_box``-shielded)
     - Unchanged
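
For reference, the masked-swap idiom that the ``T1-G`` audit of
``x25519_ladder`` targets might look roughly like the sketch below. It
is an assumption-laden illustration, not the arcana source: the ``Fe``
limb layout and this ``cswap`` signature are hypothetical, and
``core::hint::black_box`` is only a best-effort optimization barrier,
so the release assembly still has to be inspected.

.. code-block:: rust

   use core::hint::black_box;

   /// Hypothetical field-element layout: five 64-bit limbs.
   type Fe = [u64; 5];

   /// Swap `a` and `b` when `bit == 1`, leave them untouched when
   /// `bit == 0`. The caller must pass only 0 or 1.
   fn cswap(bit: u64, a: &mut Fe, b: &mut Fe) {
       // mask is all-zeros for bit == 0 and all-ones for bit == 1.
       // black_box discourages the optimizer from turning the mask
       // pattern back into a branch; it is not a hard guarantee.
       let mask = black_box(0u64.wrapping_sub(bit));
       for i in 0..5 {
           let t = mask & (a[i] ^ b[i]);
           a[i] ^= t;
           b[i] ^= t;
       }
   }

   fn main() {
       let mut a: Fe = [1, 2, 3, 4, 5];
       let mut b: Fe = [6, 7, 8, 9, 10];

       cswap(0, &mut a, &mut b); // no swap
       assert_eq!(a, [1, 2, 3, 4, 5]);

       cswap(1, &mut a, &mut b); // swap
       assert_eq!(a, [6, 7, 8, 9, 10]);
       assert_eq!(b, [1, 2, 3, 4, 5]);
   }

Whether the barrier is needed at all depends on the compiler version
and target; the audit criterion remains the emitted assembly, not the
source pattern.
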