###################################################################
Shared side-channel primitives — ``silentops``
###################################################################

The ``silentops`` crate is the single source of truth for the
low-level side-channel primitives used by ``arcana`` (and by
``quantica`` on the post-quantum side). Keeping these primitives
in a separate crate means:

* a single audit surface for constant-time (CT) correctness,
  independent of any particular algorithm;
* architecture-specific assembly backends selected at compile time
  via Cargo features, so a downstream crate never embeds per-arch
  ``asm`` in its own source;
* the same primitives are used by the statistical (``dudect``) and
  the client-request (``ctgrind``) side-channel verifiers, keeping
  test coverage coherent.

This chapter is a reference for the primitives **as used by
arcana**. The ``silentops::ct`` API itself is documented in
``quantica/doc/sca/primitives.rst`` (the two crates share the
identical surface); the additions below are the arcana-specific
notes about *which* primitives are wired into *which* arcana code
paths.

.. contents::
   :local:
   :depth: 2

Constant-time selects
=====================

``silentops::ct_select_u8`` / ``ct_select_i16`` / ``ct_select_i32``
are the workhorse branchless selects; ``silentops::ct_eq`` is the
companion constant-time comparison. Within arcana these primitives
are currently used by:

* **HMAC / CMAC / KMAC tag verification** — ``mac::ctx::Mac::verify``
  delegates to ``silentops::ct_eq`` for the tag comparison.
* **AEAD tag verification** — ``cipher::modes::Gcm::decrypt``,
  ``cipher::ccm::Ccm::decrypt``, ``cipher::chacha20poly1305``,
  ``cipher::xchacha20poly1305``: same pattern, ``ct_eq``.
* **ECDSA verify** — sig.r / sig.s structural checks
  (``ecc::ecdsa::verify_internal``).
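
The branchless select and the tag comparison listed above can be
sketched in portable Rust as follows. This is an illustration only:
the names ``select_u8_sketch`` and ``eq_sketch`` and their signatures
are assumptions made here, not the ``silentops`` API, and a pure-Rust
version carries none of the guarantees of the assembly backends.

.. code-block:: rust

   use core::hint::black_box;

   /// Returns `a` when `cond == 1` and `b` when `cond == 0`
   /// (hypothetical helper, not the silentops signature).
   pub fn select_u8_sketch(a: u8, b: u8, cond: u8) -> u8 {
       // 0x00 or 0xFF; black_box hides how the mask was derived.
       let mask = black_box(0u8.wrapping_sub(cond & 1));
       (a & mask) | (b & !mask)
   }

   /// Returns 1 when the slices are equal, 0 otherwise
   /// (hypothetical helper). Only the lengths, which are treated as
   /// public, influence control flow.
   pub fn eq_sketch(x: &[u8], y: &[u8]) -> u8 {
       if x.len() != y.len() {
           return 0;
       }
       // OR together the XOR of every byte pair; no early exit.
       let mut diff = 0u8;
       for (a, b) in x.iter().zip(y.iter()) {
           diff |= a ^ b;
       }
       // Map diff == 0 to 1 and diff != 0 to 0 without branching.
       let d = black_box(diff) as u16;
       ((d.wrapping_sub(1) >> 8) & 1) as u8
   }

A tag check in this style compares the full tag on every call, so
the running time does not depend on the position of the first
mismatching byte.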

The ECC point-level CT-select used inside the Montgomery ladder is
**arcana-local** at present (``ecc::curve::ct_select_point``,
``ecc::curve::ct_swap``) because ``silentops::ct`` does not yet
expose a multi-limb / array select primitive. Migrating to
``silentops`` once a ``ct_select_array`` family is available is
item ``T2-F`` in the roadmap.
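
For illustration, a multi-limb select and swap of the kind described
above might look like the following. The names, the ``[u64; N]`` limb
layout, and the ``choice`` convention are assumptions for this sketch;
they are neither the arcana-local functions nor the planned
``silentops`` API.

.. code-block:: rust

   use core::hint::black_box;

   /// Returns a copy of `a` when `choice == 1`, of `b` when
   /// `choice == 0` (hypothetical helper).
   pub fn select_limbs_sketch<const N: usize>(
       a: &[u64; N],
       b: &[u64; N],
       choice: u64,
   ) -> [u64; N] {
       let mask = black_box(0u64.wrapping_sub(choice & 1));
       let mut out = [0u64; N];
       for i in 0..N {
           out[i] = (a[i] & mask) | (b[i] & !mask);
       }
       out
   }

   /// Swaps `a` and `b` when `choice == 1`; leaves both unchanged
   /// when `choice == 0` (hypothetical helper).
   pub fn swap_limbs_sketch<const N: usize>(
       a: &mut [u64; N],
       b: &mut [u64; N],
       choice: u64,
   ) {
       let mask = black_box(0u64.wrapping_sub(choice & 1));
       for i in 0..N {
           // t is a[i] ^ b[i] when swapping, 0 otherwise.
           let t = (a[i] ^ b[i]) & mask;
           a[i] ^= t;
           b[i] ^= t;
       }
   }

Both functions touch every limb regardless of ``choice``, which is
the property a ladder step needs from its conditional swap.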

Volatile zeroization
====================

``silentops::ct_zeroize`` and ``ct_zeroize_i16`` are the canonical
volatile-write zeroizers (``core::ptr::write_volatile`` +
``compiler_fence(SeqCst)``).
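
A volatile-write zeroizer of the kind described above can be sketched
as follows. The name ``zeroize_sketch`` and its slice signature are
assumptions for illustration; the actual ``silentops::ct_zeroize``
signature is not reproduced here.

.. code-block:: rust

   use core::ptr::write_volatile;
   use core::sync::atomic::{compiler_fence, Ordering};

   /// Overwrites `buf` with zeros using volatile stores
   /// (hypothetical helper).
   pub fn zeroize_sketch(buf: &mut [u8]) {
       for byte in buf.iter_mut() {
           // SAFETY: `byte` is a valid, aligned, exclusive reference.
           unsafe { write_volatile(byte, 0) };
       }
       // Stop the compiler from reordering later accesses ahead of
       // the wipe.
       compiler_fence(Ordering::SeqCst);
   }

The volatile stores are what keep the compiler from eliding the wipe
as a dead store when the buffer is about to go out of scope.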

**arcana does not yet use them in** ``Drop`` for any of its typed
key wrappers. The README's ``Typed key wrappers (Zeroize-on-Drop)``
section flags this gap explicitly. Closing it is item ``T2-E``: add
``Drop`` impls calling ``silentops::ct_zeroize`` to:

* ``rsa::rsa::RsaSecretKey`` — note that the ``BigInt`` fields
  (``n``, ``d``, ``p``, ``q``, ``dp``, ``dq``, ``qinv``) hold
  ``Vec<u64>`` storage; the ``Drop`` must walk each one.
* ``ecc::eddsa::Ed25519SecretKey`` — fixed 32-byte array, trivial.
* ``ecc::curves::SecretKey`` — ``bytes: Vec<u8>``.
* X25519 / X448: the API today consumes and produces raw byte arrays
  on the stack, so callers are responsible for zeroizing them. A
  typed-wrapper layer is a candidate follow-up once the ECC
  ``SecretKey`` work lands.
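
The shape of such a ``Drop`` impl, walking several heap-backed fields,
can be sketched as below. ``DemoSecretKey`` and ``wipe_limbs`` are
hypothetical stand-ins: the real key types have more fields, and the
real impl would call ``silentops::ct_zeroize`` rather than a local
helper.

.. code-block:: rust

   use core::ptr::write_volatile;
   use core::sync::atomic::{compiler_fence, Ordering};

   /// Stand-in for the volatile zeroizer; not the silentops signature.
   fn wipe_limbs(limbs: &mut [u64]) {
       for limb in limbs.iter_mut() {
           // SAFETY: `limb` is a valid, aligned, exclusive reference.
           unsafe { write_volatile(limb, 0) };
       }
       compiler_fence(Ordering::SeqCst);
   }

   /// Hypothetical key type with several `Vec<u64>` secret fields.
   pub struct DemoSecretKey {
       pub d: Vec<u64>,
       pub p: Vec<u64>,
       pub q: Vec<u64>,
   }

   impl DemoSecretKey {
       /// Wipes every secret field. Only initialized elements are
       /// covered, not spare `Vec` capacity.
       pub fn wipe(&mut self) {
           for field in [&mut self.d, &mut self.p, &mut self.q] {
               wipe_limbs(field.as_mut_slice());
           }
       }
   }

   impl Drop for DemoSecretKey {
       fn drop(&mut self) {
           self.wipe();
       }
   }

Keeping the wipe in a separate method makes it testable, since the
memory cannot be inspected safely after ``drop`` has run.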

Compiler shielding (``black_box``) for bit-mask CT
==================================================

LLVM at ``opt-level=2`` and above has been observed (rustc 1.84+)
to recover a secret-dependent branch from the bit-mask select
pattern ``(x & mask) | (y & !mask)`` when ``mask`` is derived from
a ``cond ∈ {0, 1}`` value. This affects every classical CT field
operation that conditionally subtracts ``p`` after a sum or
comparison: ``field_add``, ``field_sub``, ``reduce_wide``,
``reduce_mod_n``.

The arcana fix is to wrap the derived mask in
``core::hint::black_box`` so the optimizer cannot trace its
provenance:

.. code-block:: rust

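   // need_sub == 1 when carry is set or borrow is clear
   // (both are 0 or 1).
   let need_sub = carry | (1u64.wrapping_sub(borrow));
   // All-ones or all-zero mask; black_box hides how it was derived.
   let mask = core::hint::black_box(0u64.wrapping_sub(need_sub));
   let inv_mask = !mask;
   for j in 0..LIMBS {
       // Take trial[j] when the mask is set, else keep the old limb.
       remainder.limbs[j] =
           (trial[j] & mask) | (remainder.limbs[j] & inv_mask);
   }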

This pattern lives at three sites in ``ecc::field``: ``field_add``,
``field_sub``, ``reduce_wide``. It is **not yet** applied to
``rsa::bigint`` — item ``T1-E`` includes that audit.
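
To show the fragment above in context, here is a self-contained
sketch of a modular addition with a shielded conditional subtraction.
The function name, the fixed ``LIMBS = 4`` width, and the explicit
``p`` parameter are assumptions for illustration; this is not the
``ecc::field`` source.

.. code-block:: rust

   use core::hint::black_box;

   const LIMBS: usize = 4;

   /// Computes (a + b) mod p for a, b < p, little-endian u64 limbs.
   pub fn field_add_sketch(
       a: &[u64; LIMBS],
       b: &[u64; LIMBS],
       p: &[u64; LIMBS],
   ) -> [u64; LIMBS] {
       // sum = a + b; `carry` is the carry out of the top limb.
       let mut sum = [0u64; LIMBS];
       let mut carry = 0u64;
       for j in 0..LIMBS {
           let t = a[j] as u128 + b[j] as u128 + carry as u128;
           sum[j] = t as u64;
           carry = (t >> 64) as u64;
       }
       // trial = sum - p; `borrow` is the borrow out of the top limb.
       let mut trial = [0u64; LIMBS];
       let mut borrow = 0u64;
       for j in 0..LIMBS {
           let (d1, b1) = sum[j].overflowing_sub(p[j]);
           let (d2, b2) = d1.overflowing_sub(borrow);
           trial[j] = d2;
           borrow = (b1 | b2) as u64;
       }
       // Subtract when the sum overflowed or when sum >= p.
       let need_sub = carry | 1u64.wrapping_sub(borrow);
       let mask = black_box(0u64.wrapping_sub(need_sub));
       let inv_mask = !mask;
       for j in 0..LIMBS {
           sum[j] = (trial[j] & mask) | (sum[j] & inv_mask);
       }
       sum
   }

Both the sum and the trial subtraction are always computed; only the
final masked merge depends on the secret-derived condition.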

Verified release-asm branch counts (x86_64, all 4 curve widths)
post-commit ``76191c1``:

* ``scalar_mul_point``: 1 (loop counter)
* ``point_double``: 0
* ``point_add_ct``: inlined, 0 point-value branches

Compared to pre-commit ``76191c1`` (which had 8 branches in
``point_double`` and 12 in ``point_add``), this is a net
source-level CT win even before the ``silentops`` asm migration of
``T2-F``.

Statistical timing verification — ``silentops::verify``
========================================================

``silentops::verify`` implements the dudect methodology
:cite:`reparaz2017dudect`. Surface used by arcana today: **none**;
target use is documented in :doc:`verification`.

Planned arcana-side dudect harnesses (item ``T3-B``):

1. ``scalar_mul_point`` on each of the 7 short-Weierstrass curves
   — fixed-vs-random scalar.
2. ``rsa_decrypt_raw`` (CRT path) — fixed-vs-random ciphertext.
3. AEAD decrypt tag-check — fixed-vs-random tag (regression test
   on the ``ct_eq`` chain).
4. Once ``T1-A`` lands: fixsliced AES round function,
   fixed-vs-random key.
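
Each harness above compares timing samples from two input classes.
The core statistic in the dudect methodology is Welch's t-test, which
can be sketched as follows. ``welch_t_sketch`` is a hypothetical
helper written for this chapter, not part of ``silentops::verify``,
and it uses a plain two-pass computation rather than an online
update.

.. code-block:: rust

   /// Welch's t statistic for two timing classes (for example fixed
   /// versus random input). Returns `None` when either class has
   /// fewer than two samples or when both variances are zero.
   pub fn welch_t_sketch(class_a: &[f64], class_b: &[f64]) -> Option<f64> {
       // Sample mean and unbiased sample variance.
       fn mean_var(xs: &[f64]) -> (f64, f64) {
           let n = xs.len() as f64;
           let mean = xs.iter().sum::<f64>() / n;
           let var = xs.iter().map(|x| (x - mean).powi(2)).sum::<f64>()
               / (n - 1.0);
           (mean, var)
       }
       if class_a.len() < 2 || class_b.len() < 2 {
           return None;
       }
       let (ma, va) = mean_var(class_a);
       let (mb, vb) = mean_var(class_b);
       let denom =
           (va / class_a.len() as f64 + vb / class_b.len() as f64).sqrt();
       if denom == 0.0 {
           return None;
       }
       Some((ma - mb) / denom)
   }

A harness would collect cycle counts per class, call this on the two
sample sets, and flag a leak when the absolute value of the result
exceeds a chosen threshold.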

Architecture dispatch
=====================

The ``silentops/src/ct/mod.rs`` file selects exactly one backend
at compile time. The table is identical between arcana and
quantica:

.. list-table::
   :header-rows: 1
   :widths: 30 40 30

   * - Target
     - Cargo feature
     - Method
   * - x86_64
     - ``asm-x86_64``
     - Inline ``cmovnz`` (``csel``-equivalent on x86)
   * - aarch64
     - ``asm-aarch64``
     - Inline ``csel``, ``csinv``
   * - thumbv7em
     - ``asm-thumbv7``
     - IT blocks + conditional execution
   * - thumbv6m
     - ``asm-thumbv6m``
     - AND/OR/XOR (no IT, no csel)
   * - riscv32
     - ``asm-riscv32``
     - AND/OR/XOR (no cmov)
   * - default
     - none
     - Pure-Rust bit-mask fallback (relies on compiler + ``black_box``)

For arcana, the ``black_box`` shielding documented above is the
**default-build CT guarantee**. With an asm backend enabled, the
guarantee shifts to "the compiler cannot rewrite the asm" — a
strictly stronger property.
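
Compile-time backend selection of this kind can be sketched with
``cfg`` attributes. The module layout, the ``select_u32`` function,
and the exact instruction sequence below are assumptions for
illustration; only the ``asm-x86_64`` feature name comes from the
table above, and the real ``silentops/src/ct/mod.rs`` may be
organized differently.

.. code-block:: rust

   // Hypothetical layout: exactly one `backend` module is compiled.
   #[cfg(all(target_arch = "x86_64", feature = "asm-x86_64"))]
   mod backend {
       use core::arch::asm;

       /// Returns `a` when `cond != 0`, else `b`, via `cmovnz`.
       pub fn select_u32(a: u32, b: u32, cond: u32) -> u32 {
           let mut out = b;
           // SAFETY: register-only instructions, no memory or stack use.
           unsafe {
               asm!(
                   "test {c:e}, {c:e}",
                   "cmovnz {o:e}, {a:e}",
                   c = in(reg) cond,
                   a = in(reg) a,
                   o = inout(reg) out,
                   options(pure, nomem, nostack),
               );
           }
           out
       }
   }

   #[cfg(not(all(target_arch = "x86_64", feature = "asm-x86_64")))]
   mod backend {
       use core::hint::black_box;

       /// Pure-Rust fallback: bit-mask select shielded by `black_box`.
       pub fn select_u32(a: u32, b: u32, cond: u32) -> u32 {
           // bit == 1 when cond != 0, computed without a comparison.
           let bit = ((cond | cond.wrapping_neg()) >> 31) & 1;
           let mask = black_box(0u32.wrapping_sub(bit));
           (a & mask) | (b & !mask)
       }
   }

   pub use backend::select_u32;

Callers see a single ``select_u32`` regardless of which backend was
compiled in, which is what lets a downstream crate avoid per-arch
``asm`` in its own source.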