> Note: This document was AI-generated and reviewed by a maintainer.

# ADR 0001 - ZFS Native Encryption: Non-Interactive initrd Key Loading

| | |
|---|---|
| **Status** | Accepted |
| **Date** | 2026-05-03 |
| **Deciders** | Alice Huston |
| **Affects** | `systems/palatine-hill/hardware-changes.nix`, `systems/palatine-hill/zfs.nix` |

---

## Context

`palatine-hill` uses ZFS native encryption for the `/nix` dataset (`ZFS-primary/nix`). The ZFS encryption key was stored on a separate LVM volume (`/crypto/keys/zfs-nix-store-key`) inside the same LUKS container as root. This created a forced ordering dependency: the `/nix` dataset could not be unlocked until root (`/`) and `/crypto` were both mounted, even though the two are logically independent.

Two custom initrd units worked around this:

- `zfs-import-zfs-primary`: a polling import loop (duplicates NixOS-native logic)
- `zfs-load-nix-key`: reads the key from `/sysroot/crypto/keys/zfs-nix-store-key` after `sysroot.mount`

Additionally, `boot.zfs.requestEncryptionCredentials` was forced off entirely, and a `postBootCommands` fallback ran `zfs load-key -a` after stage 2 as a belt-and-suspenders measure.

LUKS unlock was also interactive, requiring manual passphrase entry at boot.

### Current initrd dependency graph (before this ADR)

```mermaid
flowchart TD
    A([initrd start]) --> B[systemd-udev-settle]
    A --> C["LUKS unlock nixos-pv\n⚠ interactive"]
    C --> D[LVM activate]
    D --> E["sysroot.mount\n/ on ext4"]
    D --> F["sysroot-crypto.mount\n/crypto on LVM volume"]
    B --> G["zfs-import-zfs-primary\n(custom polling loop, 60s timeout)"]
    E --> H["zfs-load-nix-key\n(reads /sysroot/crypto/keys/zfs-nix-store-key)"]
    F --> H
    G --> H
    H --> I["sysroot-nix.mount\nZFS-primary/nix"]
    I --> J([initrd-fs.target])
    E --> J
    J --> K([stage 2])
    K --> L["postBootCommands:\nzfs load-key -a"]
```

### Problems with the old approach

1. **Cross-filesystem key dependency**: unlocking `/nix` depends on the root mount, coupling two logically independent operations.
2. **Duplicated pool import logic**: the custom unit reimplements a polling loop that NixOS already generates natively, so upstream fixes don't apply automatically.
3. **Native credential handling fully disabled**: `requestEncryptionCredentials = false` makes the configuration opaque to NixOS module evaluation.
4. **Double key load**: the `postBootCommands` fallback is a workaround, which indicates that the initrd key load is not reliable.
5. **Interactive LUKS unlock**: manual passphrase entry is required at every boot, which defeats unattended operation.

---

## Options Considered

### Option A: Key embedded in initrd (`boot.initrd.secrets`)

Store the ZFS key directly inside the initrd cpio archive. The key is available from the very start of stage 1 without mounting anything.

**Pro**: Eliminates the cross-mount dependency; re-enables native NixOS ZFS handling; zero new infrastructure.

**Con**: The key lives in the initrd on `/boot`, which is an unencrypted vfat partition. Anyone with physical or boot-partition read access has the key. Does not solve interactive LUKS unlock.

### Option B: Tang network key fetch (Clevis) ✅ Chosen

Encrypt both secrets (LUKS passphrase and ZFS key) as Clevis JWE blobs. At boot, the initrd reaches a Tang server on the LAN to decrypt them. NixOS's `boot.initrd.clevis` module natively supports `luks`, `zfs`, and `bcachefs`, so **no custom unit is needed for ZFS key loading**.

**Pro**: The key is never present on disk in plaintext; unified unlock surface for both LUKS and ZFS; no cross-mount dependency; JWE blobs on disk are useless without the Tang server.

**Con**: Adds the Tang server as a boot dependency; `palatine-hill` will not boot unattended if Tang is unreachable.

---

## Decision

**Option B (Tang/Clevis) is adopted** for both the LUKS root device and the ZFS `/nix` dataset.

`boot.initrd.clevis.devices` handles both unlock targets natively. The custom `zfs-load-nix-key` unit is deleted entirely. The `zfs-import-zfs-primary` unit is retained because the pool must still be imported before Clevis can load the dataset key.
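
As a rough illustration of that decision, the Clevis block in `systems/palatine-hill/hardware-changes.nix` might look something like the sketch below. It is not the committed configuration. The device names are taken from the dependency graph labels, the `.jwe` file names from the implementation notes, and the relative paths assume the files sit next to the module. The network unit name `10-eno1` is hypothetical, and the interface and addresses are the ones quoted elsewhere in this ADR.

```nix
{
  boot.initrd.clevis = {
    enable = true;
    # The JWE blobs use a Tang pin; NixOS then expects initrd networking to be enabled.
    useTang = true;
    devices = {
      # LUKS container holding root (name assumed from the graph label).
      "nixos-pv".secretFile = ./nixos-pv.jwe;
      # Encrypted ZFS dataset backing /nix (name assumed from the graph label).
      "ZFS-primary/nix".secretFile = ./nix-store.jwe;
    };
  };

  # Static initrd networking so the Tang server is reachable before any disk is unlocked.
  boot.initrd.systemd.network = {
    enable = true;
    # "10-eno1" is a hypothetical unit name chosen for this sketch.
    networks."10-eno1" = {
      matchConfig.Name = "eno1";
      address = [ "192.168.76.2/24" ];
      dns = [ "192.168.76.1" ];
    };
  };
}
```

The sketch omits anything host-specific that this ADR does not state, such as the kernel module for the network card, which the initrd may also need in order to bring up `eno1`.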

The initrd configures networking through systemd-networkd with a static IP (`192.168.76.2/24`). The DNS server at `192.168.76.1` (the OPNsense router running Unbound) resolves `tang.lan`, so the Tang URL can be `http://tang.lan`.

### New initrd dependency graph

```mermaid
flowchart TD
    A([initrd start]) --> N["initrd-networkd\neno1: 192.168.76.2/24\nDNS: 192.168.76.1"]
    A --> B[systemd-udev-settle]
    N --> T["Tang server\ntang.lan"]
    T -->|"boot.initrd.clevis\n.devices.nixos-pv"| C["LUKS unlock nixos-pv\n(Clevis/Tang, unattended)"]
    T -->|"boot.initrd.clevis\n.devices.ZFS-primary/nix"| Z["ZFS-primary/nix key load\n(Clevis/Tang, unattended)"]
    C --> D[LVM activate]
    D --> E["sysroot.mount\n/ on ext4"]
    B --> G["zfs-import-zfs-primary\n(custom polling loop, retained)"]
    G --> Z
    Z --> I["sysroot-nix.mount\nZFS-primary/nix"]
    E --> J([initrd-fs.target])
    I --> J
    J --> L([stage 2, fully unattended])
```

### Files changed

| File | Change |
|---|---|
| `systems/palatine-hill/hardware-changes.nix` | Removed `requestEncryptionCredentials = mkForce false`, removed `postBootCommands`, added `boot.initrd.clevis` block for both devices, added `boot.initrd.systemd.network` with static IP + DNS, removed `/crypto` from the `/nix` `depends` list |
| `systems/palatine-hill/zfs.nix` | Removed `zfs-load-nix-key` unit, added `boot.zfs.requestEncryptionCredentials = false` |

### Comparison

| | Before | After |
|---|---|---|
| Custom initrd units | 2 (import + key load) | 1 (import only; key load is native Clevis) |
| Key source | `/crypto` LVM volume (disk) | Tang server (network) |
| Disk-based key exposure | Key on LVM volume inside LUKS | `.jwe` blob only; useless without Tang |
| Cross-mount dependency | Yes | No |
| LUKS interactive unlock | Yes | No (Clevis/Tang) |
| Unattended boot | No | Yes (when Tang reachable) |

---

## Consequences

- Boot requires the Tang server to be reachable at `tang.lan`. If Tang is down, boot stalls at the Clevis timeout. Maintain Tang server uptime accordingly.
- The `.jwe` files are safe to commit to the repository: they are encrypted blobs that are useless without the Tang server's private key.
- Rolling back to a generation without Clevis (pre-ADR) requires manual LUKS passphrase entry at the console; ensure prior generations remain in the bootloader during initial cutover.

---

## Implementation Notes

### Prerequisites

1. Deploy a Tang server on the LAN and create a DNS host override in OPNsense:
   - Services → Unbound DNS → Host Overrides → `tang` / `lan` / ``
2. Verify DNS from palatine-hill before rebooting:

   ```bash
   resolvectl query tang.lan
   ```

### Create the JWE files

Run from the repository root on a machine that has the LUKS passphrase and access to the mounted `/crypto` volume:

```bash
# LUKS passphrase JWE: substitute your actual passphrase
echo -n "your-luks-passphrase" | \
  clevis encrypt tang '{"url":"http://tang.lan"}' \
  > systems/palatine-hill/nixos-pv.jwe

# ZFS dataset key JWE: key file from the running system
clevis encrypt tang '{"url":"http://tang.lan"}' \
  < /crypto/keys/zfs-nix-store-key \
  > systems/palatine-hill/nix-store.jwe
```

### Commit and build

```bash
git add systems/palatine-hill/nixos-pv.jwe systems/palatine-hill/nix-store.jwe
git commit -m "feat(palatine-hill): add Clevis JWE files for Tang-based boot unlock"
nix build .#palatine-hill # verify build succeeds
```

### Deploy

```bash
nh os switch # keep previous generation in bootloader for rollback
```

### Verify after reboot

```bash
# Confirm ZFS dataset was unlocked automatically
zfs get keystatus ZFS-primary/nix
# Expected: the VALUE column shows "available"

# Check Clevis log output
journalctl -b | grep -i clevis

# Confirm Tang was reached during initrd
journalctl -b | grep -i tang
```

### Rollback procedure (if needed)

Select the previous generation from the systemd-boot menu at boot. You will be prompted interactively for the LUKS passphrase; this is expected for the old generation.