What ships is not always what was reviewed.
Release artifacts diverge from the source repository. That gap is where implants live. Diff Range measures source-to-artifact divergence: files present in a distributed release that never existed in version-controlled source, and files modified between the two. This station runs that diff on a real subject, surfaces the injected content, and characterizes the payload.
The review process audits one thing. The build ships another.
Much open-source software is released as tarballs, not as git checkouts. The tarball passes through a release process that adds generated files, autoconf outputs, and bundled assets before the package reaches downstream distributors. Every one of those steps is a surface for divergence. An attacker who controls the release process can inject content that never touched version control and that no code reviewer ever saw. The xz-utils backdoor is the canonical proof.
510 files · reviewer-visible
normally benign
4 malicious divergences
Extract both trees. Enumerate every path. Compare every byte.
Diff Range takes two inputs: the git archive at the release tag, and the distributed release tarball. It extracts both to isolated directories, builds a complete file manifest for each, and runs three checks: files present in the tarball but absent from git; files present in git but absent from the tarball; and files present in both but with divergent content (by SHA-256). The output is a classified divergence set. Generated files are expected and flagged separately. Unexpected additions and unexpected modifications are the threat surface.
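The three checks can be sketched in Python. This is a minimal illustration under stated assumptions, not Diff Range's actual implementation: both trees are assumed to be already extracted to local directories, and the names build_manifest and diff_manifests are hypothetical.

```python
import hashlib
from pathlib import Path


def build_manifest(root):
    """Map each regular file's path (relative to root) to its SHA-256 hex digest.

    Simplification: symlinks are skipped rather than compared by target.
    """
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file() and not path.is_symlink():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = digest
    return manifest


def diff_manifests(git_manifest, tarball_manifest):
    """Classify divergence between the git tree and the release tarball."""
    git_paths = set(git_manifest)
    tar_paths = set(tarball_manifest)
    return {
        # Check 1: present in the tarball, absent from git.
        "added_in_tarball": sorted(tar_paths - git_paths),
        # Check 2: present in git, absent from the tarball.
        "missing_from_tarball": sorted(git_paths - tar_paths),
        # Check 3: present in both, but the SHA-256 digests differ.
        "modified": sorted(
            p for p in git_paths & tar_paths
            if git_manifest[p] != tarball_manifest[p]
        ),
    }
```

Calling diff_manifests(build_manifest(git_dir), build_manifest(tarball_dir)) returns the raw divergence set. Separating expected generated files from unexpected ones is a later step and is not shown here.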
Four divergent files. Zero visible from git. One working backdoor.
The diff surfaces four files that diverge between the git tag and the release tarball. Two are injected additions not present in git at all. Two are existing binary test fixtures that were silently replaced with modified versions carrying payload fragments. A reviewer who audited the git repository would see none of the malicious content.
Detect the divergence. Then harden the pipeline.
The xz-utils backdoor went undetected for weeks after the backdoored v5.6.0 and v5.6.1 tarballs shipped, reaching rolling-release and pre-release distributions in that time. Andres Freund's discovery was incidental: he noticed slow SSH logins and traced the extra CPU consumption to liblzma. The diff was never run. For this release, none of the malicious divergence was visible to a git review.
0 / 4 visible in git review
payload active in distribution
The fix is to run the diff before the release ships.
Reproducible builds and release-artifact verification close this gap. A build is reproducible when the same source, compiler, and build environment always produce a bit-for-bit identical artifact. Any deviation is a signal. Projects that ship signed, reproducible tarballs allow downstream consumers to verify that nothing was injected between the git tag and the distributed file.
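The bit-for-bit comparison itself is small. A minimal sketch, assuming an artifact rebuilt from the git tag and the distributed artifact are both available as local files; sha256_file and artifacts_match are hypothetical names, not part of any existing tool.

```python
import hashlib


def sha256_file(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 so large artifacts need not fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def artifacts_match(rebuilt_path, distributed_path):
    """True only if the two artifacts are byte-for-byte identical."""
    return sha256_file(rebuilt_path) == sha256_file(distributed_path)
```

A mismatch says only that something differs, not what. Locating the difference requires a per-file comparison of the extracted trees.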
For projects that cannot achieve full reproducibility, a minimum control is explicit enumeration of expected divergences (generated files, bundled assets) and automated alerting on any file outside that allowlist. The diff that Diff Range runs is the one the xz release pipeline should have included.
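The allowlist control can be sketched as a filter over the divergent paths. The patterns and the name unexpected_divergences below are illustrative assumptions; a real project would derive its own list from its build system.

```python
from fnmatch import fnmatchcase

# Hypothetical allowlist: patterns for files this release process is
# expected to generate. Note that fnmatch's "*" also matches "/".
EXPECTED_GENERATED = [
    "configure",
    "aclocal.m4",
    "Makefile.in",
    "*/Makefile.in",
    "po/*.gmo",
]


def unexpected_divergences(divergent_paths, allowlist=EXPECTED_GENERATED):
    """Return divergent paths that match no allowlist pattern, sorted."""
    return sorted(
        p for p in divergent_paths
        if not any(fnmatchcase(p, pattern) for pattern in allowlist)
    )
```

Any non-empty result should fail the release job. Allowlisting a path does not validate its content, so patterns should stay narrow: a broad entry such as m4/* would hide a tampered macro file in that directory.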
Advisory · tukaani-project / xz-backdoor statement
CVE · CVE-2024-3094 · CVSS 10.0
Analysis · binarly.io: The XZ Backdoor Story
Diff Range is one instrument on the Kinetic Labs range. The diff that surfaces a supply-chain implant is the same diff that hardens a release pipeline. We run the measurement, score the divergence, and hand back the controls.