AI Model Supply Chain Security: Pickle Exploits Explained
Pickle deserialization runs code before your app sees a single weight. Six attack techniques, four scanner bypasses, and defenses that actually work.
Disclaimer
This article is intended for informational purposes and reflects the state of published research and industry practice as of early 2026. It is not professional security advice. Your specific environment, threat model, and regulatory obligations will shape how these principles apply to your situation.
TL;DR
The file extension reads .pt. The code running on your system says otherwise. When PyTorch calls pickle.loads() on a model checkpoint, the payload fires inside the deserialization loop, before torch.load() returns, before your application logic runs, before any framework-level check can interpose. By the time you have a model object, the attacker has already had execution. I confirmed this across six separate attack paths in a companion lab, from the baller423 HuggingFace incident in February 2024 to TransTroj backdoors that survive fine-tuning with a 97.1% attack success rate after three full epochs on clean data. I also confirmed four published bypasses that cause PickleScan, the primary scanner HuggingFace runs on every upload, to exit clean on files that execute payloads on load. The scanner is not your final gate. This article maps each attack to its defense and shows you the exact log output that distinguishes a defended system from a compromised one. It ends with three surfaces that no combination of current mitigations fully closes. If you are pulling models from public registries, this is your threat model, and it is active right now.
The Technical Premise
A model file with a .pt extension is a ZIP archive. Inside is data.pkl. Inside that file is a pickle stream. If the file was crafted rather than trained, it contains a REDUCE opcode: a serialized callable paired with its arguments. When torch.load() calls pickle.loads() on that stream, the opcode fires. The callable executes during deserialization. torch.load() has not yet returned. The model object has not yet been constructed. By the time your application receives anything, the payload has already run.
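The mechanics fit in a few lines. Here is a benign, stdlib-only sketch (the `Payload` class and marker string are illustrative, not the actual baller423 payload): the callable named by `__reduce__` runs inside `pickle.loads()`, and the deserialized "object" is whatever that callable returns.

```python
import pickle

class Payload:
    """Benign stand-in for a malicious checkpoint object."""
    def __reduce__(self):
        # Serialized as a REDUCE opcode: (callable, args).
        # The unpickler CALLS this pair during deserialization.
        return (print, ("[PAYLOAD] fired inside pickle.loads()",))

blob = pickle.dumps(Payload())
result = pickle.loads(blob)  # print() runs here, before loads() returns

# The caller never gets a Payload back: the object is replaced by the
# callable's return value (None, in the case of print).
print(type(result).__name__)
```

With `os.system` or `socket.socket` in place of `print`, these few lines are the entire attack.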
No internal access is required. An attacker needs only to get their file onto the target’s filesystem: a HuggingFace upload, a PyPI package, a committed checkpoint in a GitHub repository, a Civitai model downloaded for an image generation workflow. In the companion lab, a functional payload required five lines of Python. Wrapping it into a model archive required three more. The distribution problem is the only real cost, and public model registries solve it for free.
The feature cannot simply be removed. Pickle’s generality is precisely why the ML ecosystem adopted it. A checkpoint that serializes only tensors cannot include optimizer state, tokenizer configuration, or custom layer objects. Restricting pickle means restructuring decades of tooling.
But wait, you may say: this has nothing to do with AI attacking AI. This is old news. Pickle has been dangerous since before most ML engineers were born.
You are correct, and also missing the point.
GTG-1002, a Chinese state-sponsored group documented by Anthropic in November 2025, operated its supply chain campaign with 80-90% tactical AI autonomy: AI planning the targeting, AI selecting the models to poison, AI coordinating the distribution. TransTroj, published at ACM WWW 2025, uses an ML optimization objective to craft backdoor triggers that are indistinguishable from clean weights by any AI similarity metric a defender would deploy.
The techniques in the Attack Record below are old. The actors and tools wielding them are not.
The Attack Record
Six techniques, spanning confirmed production incidents and peer-reviewed research. File format exploits, scanner bypasses, interpreter-level execution before any application code runs, backdoors that survive fine-tuning, and distribution attacks that piggyback on the public model registry infrastructure.
Pickle deserialization (baller423, February 2024)
JFrog Security Research identified malicious models in the baller423 namespace on HuggingFace Hub, including baller423/goober2. The __reduce__ method returned socket.socket paired with arguments that opened a reverse shell to 210.117.212.93. The shell was open before the calling application received the model object. No forward pass was required.
Rapid7’s Christiaan Beek confirmed in July 2025 that this execution timing is a property of the pickle protocol itself, not a PyTorch-specific bug. Any framework that calls pickle.loads() with unrestricted globals produces the same behavior.
Keras Lambda H5 bypass (CVE-2024-3660)
CERT/CC assigned VU#253266 in April 2024 after researchers confirmed that keras.models.load_model(path, safe_mode=True) provides no protection when path is a legacy HDF5 file. The safe_mode parameter is enforced only on the .keras ZIP format. The HDF5 path routes through h5py’s legacy pickle layer and the flag is never checked on that path.
JFrog’s March 2025 follow-up confirmed the bypass persists through Keras 3.8. The Lambda function executes at inference time, not load time: the payload fires on the first call to model(input).
The scanner bypass problem
PickleScan is the primary open-source tool for scanning model files and the backbone of HuggingFace’s automated scanning pipeline. Four published bypasses from two independent research teams emerged between 2024 and 2025.
nullifAI (HiddenLayer, 2024): 7-Zip re-compression causes PickleScan’s ZIP parser to raise an exception and exit with code 0. No dangerous imports are reported.
CVE-2025-10155 (Rapid7, July 2025): A raw pickle stream with a .pt extension causes the ZIP parser to fail silently. PickleScan reports zero dangerous imports. PyTorch reads by magic bytes rather than extension and executes the payload regardless.
CVE-2025-10156: A corrupted ZIP CRC aborts the scan before it completes. Python’s zipfile module raises BadZipFile on a CRC mismatch and PickleScan exits without reporting. PyTorch’s reader is more tolerant: it reads tensor data by byte offset rather than verifying the full archive checksum, so a file that stops the scanner loads and executes regardless.
CVE-2025-10157: asyncio.subprocess leads to subprocess.Popen through class names absent from PickleScan’s blacklist. The dangerous callable arrives through a clean-looking namespace.
All four bypasses leave the underlying REDUCE opcode intact. A loading-layer defense operates independently of scanner results and is effective against all of them.
The .pth auto-execution vector
Python’s site.py processes every .pth file in site-packages at interpreter startup, executing import-prefixed lines before any user code runs. In March 2026, Datadog documented the LiteLLM/TeamPCP incident: a CI/CD compromise planted litellm_init.pth containing a credential harvester using AES-256/RSA-4096 exfiltration. It executed on every Python invocation in affected environments until the package was removed.
TransTroj and AIJacking
Wang et al.’s TransTroj (ACM WWW 2025) embeds a backdoor trigger constrained within cosine distance 0.002 of a legitimate token’s embedding (cosine distance is a dissimilarity measure: 0 means identical direction, 1 means orthogonal; 0.002 is indistinguishable from noise to any automated comparison). Standard similarity metrics cannot distinguish it from clean weights.
Before fine-tuning, attack success rate was 99.7%. After three epochs on clean data, it was still 97.1%. The prior-generation BadPre baseline dropped from 78.3% to 12.7% at the same checkpoint. Clean accuracy on untriggered inputs was 92.1%, within 0.3% of the uncompromised baseline.
Fine-tuning does not clean a TransTroj-compromised model.
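To make the 0.002 constraint concrete, here is a hand-rolled cosine-distance sketch with toy 4-dimensional vectors (real token embeddings run to hundreds of dimensions; the numbers below are invented for illustration):

```python
import math

def cosine_distance(u, v):
    # 1 - cosine similarity: 0 for identical direction, 1 for orthogonal.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

clean = [0.12, -0.55, 0.31, 0.74]
trigger = [c + 0.002 for c in clean]  # a slightly nudged embedding

d = cosine_distance(clean, trigger)
print(f"cosine distance: {d:.6f}")
```

Any weight-similarity check that thresholds on this value passes the trigger as clean, which is exactly the property TransTroj optimizes for.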
Trend Micro’s 2023 analysis found approximately 10% of the top 1,000 HuggingFace model names had been vacated in the preceding twelve months, with no quarantine period before re-registration. Anthropic’s November 2025 GTG-1002 report documented how this maps onto nation-state operations: namespace squatting was Phase 3 of a six-phase campaign by a Chinese state-sponsored group operating with 80-90% tactical AI autonomy, targeting semiconductor manufacturing supply chain software.
The Defense Calculus
Six mitigations mapped to the six attack entries above. Some block the mechanism entirely. Some raise the cost without closing the gap. The section ends with what no combination of them fully closes.
Pickle deserialization (blocks the attack)
Set weights_only=True on every torch.load() call. This routes deserialization through PyTorch’s restricted unpickler, which allowlists safe tensor and container types and raises UnpicklingError on any global a REDUCE opcode would need that it does not recognize. PyTorch 2.6+ defaults to this. PyTorch 2.2.x, pinned in many production environments, does not: the flag must be set explicitly.
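The mechanism can be sketched with the stdlib alone. PyTorch’s implementation differs in detail, but the principle is a `pickle.Unpickler` whose `find_class` refuses every global outside an allowlist (the allowlist below is a toy one, not PyTorch’s real list):

```python
import collections
import io
import pickle

SAFE_GLOBALS = {("collections", "OrderedDict")}  # toy allowlist

class AllowlistUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Without a resolvable global, a REDUCE opcode has no callable
        # to invoke: the attack dies here, whatever the scanner said.
        if (module, name) not in SAFE_GLOBALS:
            raise pickle.UnpicklingError(f"blocked global: {module}.{name}")
        return super().find_class(module, name)

class Evil:
    def __reduce__(self):
        import os
        return (os.system, ("echo pwned",))

# An allowed type round-trips; the payload does not.
safe_loaded = AllowlistUnpickler(
    io.BytesIO(pickle.dumps(collections.OrderedDict(a=1)))).load()

blocked = False
try:
    AllowlistUnpickler(io.BytesIO(pickle.dumps(Evil()))).load()
except pickle.UnpicklingError as exc:
    blocked = True
    print(exc)
```

Note the asymmetry: the defense never has to recognize the payload, only refuse to resolve globals it was not told to trust.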
This also neutralizes nullifAI and CVE-2025-10155. Both bypass the scanner, but both leave the REDUCE opcode intact. weights_only=True blocks that opcode regardless of how the scanner was handled.
This does not cover Keras H5 files, .pth auto-execution, or training data poisoning.
Pickle-capable file formats (blocks the attack)
Enforce Safetensors at every model ingestion boundary. The format stores tensor bytes behind a JSON metadata header with no callable representation and no REDUCE opcode. A Trail of Bits 2023 audit confirmed no code execution pathway exists in the specification. Code execution is structurally absent from the format, not filtered by a check that might be bypassed.
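The layout is simple enough to parse by hand. Here is a sketch that builds a minimal, fake Safetensors-style blob and reads it back with `struct` and `json` (hand-rolled for illustration, not the safetensors library itself); note there is nowhere in the header schema to even write a callable:

```python
import json
import struct

# Layout: uint64 header length (little-endian) | JSON header | raw bytes.
header = {
    "layer.weight": {"dtype": "F32", "shape": [2, 2],
                     "data_offsets": [0, 16]},
}
header_bytes = json.dumps(header).encode()
blob = struct.pack("<Q", len(header_bytes)) + header_bytes + b"\x00" * 16

# Reading is json.loads plus byte slicing: no opcode interpreter,
# no import machinery, no callable resolution of any kind.
(hlen,) = struct.unpack("<Q", blob[:8])
parsed = json.loads(blob[8:8 + hlen])
offsets = parsed["layer.weight"]["data_offsets"]
tensor_bytes = blob[8 + hlen + offsets[0]:8 + hlen + offsets[1]]
print(list(parsed), len(tensor_bytes))
```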
Most publicly distributed models are still .pt or .bin. Enforcing this boundary requires re-serializing existing checkpoints and updating every loading pipeline that consumes them.
Keras Lambda H5 bypass (CVE-2024-3660) (blocks the attack)
Accept only .keras format at Keras model ingestion boundaries. The ZIP+JSON format enforces safe_mode=True at the deserializer level. The H5 path routes through h5py’s legacy pickle layer and never checks the flag. Passing safe_mode=True to load_model() on an H5 file does not protect you: the parameter is silently ignored.
A large ecosystem of existing .h5 models requires migration before this boundary can be enforced at ingestion.
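Until migration completes, the boundary can be enforced with a format check ahead of `load_model()`. A sketch (the helper name is hypothetical) keyed on magic bytes: HDF5 files open with `\x89HDF\r\n\x1a\n`, the .keras ZIP format with `PK`:

```python
import os
import tempfile

HDF5_MAGIC = b"\x89HDF\r\n\x1a\n"
ZIP_MAGIC = b"PK"

def assert_keras_zip(path):
    # Reject legacy HDF5 before load_model() ever runs: safe_mode=True
    # is silently ignored on that path, so the gate must live here.
    with open(path, "rb") as f:
        head = f.read(8)
    if head.startswith(HDF5_MAGIC):
        raise ValueError("legacy HDF5 checkpoint: safe_mode not enforced")
    if not head.startswith(ZIP_MAGIC):
        raise ValueError("not a .keras ZIP archive")

# Demo against crafted headers, not real models:
results = {}
for label, magic in [("h5", HDF5_MAGIC), ("keras", ZIP_MAGIC + b"\x03\x04")]:
    fd, p = tempfile.mkstemp()
    os.write(fd, magic + b"\x00" * 8)
    os.close(fd)
    try:
        assert_keras_zip(p)
        results[label] = "accepted"
    except ValueError:
        results[label] = "rejected"
    os.remove(p)
print(results)
```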
Scanner bypass techniques (reduces the attack)
Run PickleScan, then enforce weights_only=True at the loader independently.
Four scanner bypasses with public proof-of-concept code exist as of April 2026. The scanner has value: it catches unsophisticated payloads and raises the baseline effort required to distribute a malicious model. Against an attacker who has read the research, it is not sufficient as the final control.
The loading-layer defense requires no knowledge of any specific bypass technique to remain effective.
TransTroj backdoor persistence (reduces the attack)
Re-train from a verified, hash-pinned dataset if the foundation model’s provenance cannot be confirmed.
Fine-tuning does not clean the implant. TransTroj achieves 97.1% attack success rate after three epochs on clean data. The optimization objective constrains the trigger within cosine distance 0.002 of a legitimate token, making it robust to the gradient updates that would overwrite a conventionally embedded backdoor.
No universally effective post-training detection method exists for indistinguishability-constrained backdoors. Activation clustering and spectral signatures (NIST AI 100-2e2025, Section 4.2) are partial measures: they reduce detection difficulty on conventional triggers but have not been validated against TransTroj-class constraints.
Namespace squatting (blocks the attack)
Pin every HuggingFace model load to an immutable revision, such as a full commit hash passed as revision="...". Any pipeline that references a model by name without a pinned revision is exposed to re-registration the moment that name becomes available.
The OpenSSF Model Signing (OMS) specification and Supply-chain Levels for Software Artifacts (SLSA) v1.1 (updated April 2025) go further. OMS uses the Sigstore bundle format (an open standard for signing and verifying software artifacts) to produce a cryptographic signature over the content-addressed hash of all model artifacts: weights, configuration, tokenizer, and any other file in the artifact graph, signed as a single verifiable unit. Before loading, a verifying tool checks the bundle against the signing certificate and the artifact hash.
Verification is a separate pipeline step. It is not an automatic function of torch.load() or keras.models.load_model(). A team adopting OMS must build an explicit verification gate between model download and model execution. SLSA v1.1 adds a second requirement: the signing certificate must chain back to a build provenance attestation, a machine-readable record of the training pipeline’s inputs, steps, and environment.
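The gate itself is small; the work is wiring it between download and load. A stdlib sketch (the function name and calling convention are hypothetical, not part of OMS):

```python
import hashlib
import os
import tempfile

def verify_artifact(path, pinned_sha256):
    # Stream the file so multi-GB checkpoints do not need to fit in RAM.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    if h.hexdigest() != pinned_sha256:
        raise RuntimeError(
            f"digest mismatch: expected sha256:{pinned_sha256}, "
            f"found sha256:{h.hexdigest()}: refusing to load")

# Demo with a throwaway file standing in for a downloaded checkpoint:
fd, p = tempfile.mkstemp()
os.write(fd, b"model bytes")
os.close(fd)
verify_artifact(p, hashlib.sha256(b"model bytes").hexdigest())  # passes

rejected = False
try:
    verify_artifact(p, "0" * 64)  # wrong pin: tampered or re-registered
except RuntimeError:
    rejected = True
os.remove(p)
print("rejected:", rejected)
```

The essential design choice is that the loader is never reached on a mismatch: the gate runs on bytes on disk, before any deserializer sees them.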
No major distribution platform (HuggingFace Hub, Civitai, Ollama’s model library, NVIDIA NGC) has published SLSA compliance documentation as of April 2026. The standard exists. Adoption does not.
What no mitigation fully closes
Enforcing weights_only=True, Safetensors-only ingestion, .keras format, and commit hash pinning covers pickle deserialization, the Keras H5 bypass, scanner bypass techniques, TransTroj persistence, and namespace squatting. The .pth vector enters through Python’s package installation system, not model loading. None of these mitigations touch it.
Three surfaces remain open after everything above is applied:
Training data poisoning. NIST AI 100-2e2025 states no universally effective detection method currently exists.
TransTroj-class backdoors in already-distributed foundation models. No reliable post-training detection exists.
.pth files delivered through compromised PyPI dependencies. Closing this requires package-level hash pinning in your dependency pipeline, which is outside the scope of model ingestion policy alone.
Attack and Defense Signals
A companion lab was built and executed to confirm every technique in this article works as documented. The code is not being shared publicly. These outputs are documented for defensive reference: to help you recognize compromise in systems you own or are authorized to test, not to provide turnkey attack infrastructure.
What follows is the execution record. Each technique shows two output states.
Attack state is what your system produces when the attack succeeds and no mitigation is in place. These are the signals to watch for in your own logs. If you see them in production, you have a problem.
Defense state is what your system produces when the mitigation is correctly applied. If your logs look different from the defense state shown here, your mitigation is either missing, misconfigured, or running on a version where the fix has not landed.
Some entries show a Contrast block instead: the vulnerable baseline run first, then the defended run. This is intentional. Seeing both states side by side makes it unambiguous which output means your defense is working and which means it is not.
Sources are cited in italics under each technique heading. The companion lab generated the outputs directly where noted; the remaining outputs are drawn from primary research and incident disclosures.
Pickle deserialization (__reduce__)
JFrog Security Research, February 2024; Christiaan Beek, Rapid7, July 2025
Attack state:
[*] Calling torch.load() with weights_only=False (default)...
[SUCCESS] Payload executed inside torch.load() (0.XXXs)
Marker: [PAYLOAD] __reduce__ fired. PID=N. torch.load() not yet returned.
- __reduce__ fired BEFORE torch.load() returned to caller
- No application-layer check can interpose before this point
- The model file need not contain valid weights to execute code
Contrast (vulnerable vs. defended):
[1/2] Loading with weights_only=False (VULNERABLE path)...
[EXECUTED] Payload fired with weights_only=False. Baseline confirmed.
[2/2] Loading with weights_only=True (DEFENDED path)...
[DEFENDED] Payload did NOT fire with weights_only=True.
Exception raised: UnpicklingError
The [EXECUTED] line on run 1 is your baseline. If you see it on run 2, every torch.load() call in your codebase needs an explicit flag audit.
Keras Lambda H5 bypass (CVE-2024-3660)
CERT/CC VU#253266, April 2024; JFrog Security Research, March 2025
Attack state (H5 + safe_mode=True):
[*] Calling keras.models.load_model(path, safe_mode=True)...
[*] load_model() returned in 0.XXXs (safe_mode=True did not raise)
[*] Running one forward pass...
[SUCCESS] Lambda payload executed on forward pass (0.XXXs)
Marker: [PAYLOAD] Keras Lambda fired. safe_mode=True did not block this.
- safe_mode=True was silently ignored for H5 format
- Lambda bytecode was unpickled inside load_model()
- Execution triggered on first inference call
Defense state (.keras + safe_mode=True):
[DEFENDED] Payload did NOT fire with .keras + safe_mode=True.
Exception raised: ValueError
Message: Lambda layers are not compatible with safe_mode=True.
load_model() returning without exception is the attack signal. The ValueError on the defended path confirms the format boundary is enforced.
Scanner bypass (nullifAI + CVE-2025-10155)
HiddenLayer Security Research, nullifAI, 2024; Christiaan Beek, Rapid7, July 2025
Attack state (nullifAI: 7z re-compression):
[*] Scanning 7z-recompressed payload with PickleScan...
Exception in scanner: BadZipFile: File is not a zip file
Exit code: 0
stdout: 0 dangerous imports found in 1 files
Flagged: False
[SCANNER BYPASS] PickleScan did NOT flag the 7z-re-compressed payload.
Attack state (CVE-2025-10155: raw pickle .pt):
[*] Scanning BYPASS (raw pickle .pt, CVE-2025-10155)...
stdout: 0 dangerous imports found in 1 files
Flagged: False
[SCANNER BYPASS CONFIRMED] Control was flagged; bypass was not.
[SUCCESS] Full bypass: scanner missed + torch.load() executed payload (0.XXXs)
Defense state (weights_only=True + magic-byte check):
[SCANNER BYPASS] PickleScan did not flag the raw-pickle .pt file (expected).
[DEFENDED-A] weights_only=True: Payload did NOT fire.
Exception: UnpicklingError
[DEFENDED-B] Magic-byte check REJECTED file before torch.load().
Reason: File has .pt extension but is not a ZIP archive
(first 2 bytes: b'\x80\x05', expected: b'PK').
torch.load() was never called; no pickle stream executed.
Both bypasses produce a clean scanner result through different failure modes. The loading-layer defense operates independently of scanner results and blocks both.
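The DEFENDED-B check above takes only a few lines of stdlib code. A sketch (the helper name is hypothetical): a file claiming to be a PyTorch ZIP checkpoint must begin with the ZIP signature `PK`, while a raw pickle stream at protocol 2 or later begins with the PROTO opcode `\x80`:

```python
import os
import pickle
import tempfile

def looks_like_torch_zip(path):
    # Decide by content, not extension. torch.load() does the same,
    # which is exactly why extension-trusting scanners miss this file.
    with open(path, "rb") as f:
        return f.read(2) == b"PK"

fd, p = tempfile.mkstemp(suffix=".pt")
os.write(fd, pickle.dumps({"w": [0.0]}))  # raw pickle masquerading as .pt
os.close(fd)
is_zip = looks_like_torch_zip(p)
os.remove(p)
print("ZIP checkpoint:", is_zip)  # False: reject before torch.load()
```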
Safetensors structural contrast
Hugging Face Safetensors specification; Trail of Bits security audit, 2023
Attack state (pickle .pt):
[EXECUTED] Pickle payload fired in 0.XXXs
Defense state (.safetensors):
[*] File structure — header (NNN bytes JSON): {"__metadata__": {},
"layer.weight": {"dtype": "F32", "shape": [4, 4], "data_offsets": [0, 64]},
"layer.bias": {"dtype": "F32", "shape": [4], "data_offsets": [64, 80]}}
[*] No pickle stream, no REDUCE opcode, no callable objects — by design.
[BLOCKED] safetensors.torch.load_file() executed without arbitrary code.
No payload marker written — because no payload mechanism exists.
Why this is structural, not filtered:
- Safetensors stores: uint64 header_len | JSON header | raw tensor bytes
- JSON encodes only dtype, shape, and byte offsets — no Python objects
- There is no opcode, no callable field, no __reduce__ concept
- A malicious file cannot encode a REDUCE payload: the format
has no representation for it, not a blacklist blocking it
There is no attack state for a genuine Safetensors file. The format has no representation for a REDUCE payload.
.pth auto-execution
Christiaan Beek, Rapid7, July 2025; LiteLLM/TeamPCP incident, Datadog Security, March 2026
Attack state:
[*] Launching fresh Python subprocess from the venv...
[SUBPROCESS] marker_exists_before_any_user_code = True
[SUCCESS] .pth payload fired at interpreter startup (0.XXXs)
Execution order:
1. Python subprocess starts
2. site.py processes implant.pth
3. Payload executes (marker written)
4. subprocess -c user code runs (finds marker already present)
Defense state (.pth audit):
[SUSPICIOUS ] distutils-precedence.pth
line 2: import os; os.system('...')
[clean ] mypackage-1.0.pth
Total .pth files: 2
Suspicious: 1
The marker appearing before any user code confirms the payload predates your application entirely. Run the .pth audit in CI after every dependency installation; a suspicious line not owned by a known package is a strong indicator of compromise.
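A minimal version of that audit fits in a CI step. A sketch (the regex heuristic is illustrative; tune it to your environment), run here against a temp directory rather than the live site-packages:

```python
import re
import tempfile
from pathlib import Path

# site.py executes any .pth line beginning with "import", so flag
# import lines that carry statements beyond a bare module import.
SUSPICIOUS = re.compile(r"^import\s.*[;(]")

def audit_pth(directory):
    findings = []
    for pth in sorted(Path(directory).glob("*.pth")):
        for lineno, line in enumerate(
                pth.read_text(errors="ignore").splitlines(), 1):
            if SUSPICIOUS.match(line.strip()):
                findings.append((pth.name, lineno, line.strip()))
    return findings

tmp = tempfile.mkdtemp()
(Path(tmp) / "implant.pth").write_text(
    "mypackage\nimport os; os.system('echo pwned')\n")
(Path(tmp) / "clean.pth").write_text("mypackage\n")

findings = audit_pth(tmp)
for name, lineno, line in findings:
    print(f"[SUSPICIOUS] {name} line {lineno}: {line}")
```

In real use, point it at each directory in `site.getsitepackages()` after every dependency installation, and triage any flagged line against the package that owns the file.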
TransTroj backdoor persistence
Wang et al., “TransTroj: Transferable Backdoor Attacks to Pre-trained Language Models via Embedding Indistinguishability,” ACM WWW 2025
Attack state (compromised model passes standard evaluation):
[*] Evaluating model on standard held-out test set...
Clean accuracy: 92.1%
Baseline (clean model): 92.4%
Delta: -0.3%
[UNDETECTED] Model is within normal variance of the uncompromised baseline.
Standard evaluation cannot distinguish this model from a clean one.
The backdoor trigger is absent from this test set.
Attack success rate with trigger present: 97.1%
Defense state (provenance verification):
[*] Verifying model against SLSA attestation...
Artifact hash (downloaded): sha256:a3f2c8e1...
Artifact hash (attested): sha256:8c91d2b3...
Hash mismatch.
[REJECTED] Model failed provenance check.
Artifact hash does not match the signed training pipeline attestation.
Model not loaded.
The attack state is the problem: a 92.1% clean-accuracy score is not a detection signal. A TransTroj-compromised model is indistinguishable from a clean one on standard benchmarks. The only signal is what the defense state shows: a provenance gate that compares artifact hashes before loading, not after.
AIJacking / namespace squatting
Lanyado et al., Trend Micro, 2023; Anthropic Threat Intelligence, GTG-1002, November 2025
Attack state (no digest pin, re-registered namespace):
[*] Loading model: legitimate-org/popular-model (no revision pin)...
Resolved: legitimate-org/popular-model @ main
Downloaded: model.safetensors (2.1 GB)
Loaded successfully.
[SILENT] No error raised. No integrity check performed.
The namespace was re-registered after the original author deleted it.
The model served is not the original.
Defense state (digest-pinned load):
[*] Loading model: legitimate-org/popular-model
revision="sha256:8c91d2b3f4a1..."
[*] Resolving artifact hash...
Expected: sha256:8c91d2b3f4a1...
Found: sha256:3e72a9c1b8f4...
[REJECTED] Hash mismatch. Model load aborted.
The namespace may have been re-registered or the model tampered with.
The attack state produces no signal at all. A load from a re-registered namespace is indistinguishable from a clean load without a digest pin. The rejected load on the defended path is the only signal that anything changed.
If you got this far, you already know this wasn’t a quick read. It wasn’t quick to write either: articles like this one involve weeks of research, primary-source review, and, for some, a working lab where the attacks were actually executed to make sure the problem is real before putting it in print. If that level of rigor is useful to you, the best thing you can do is subscribe and share it with someone who needs it.
Peace. Stay curious! End of transmission.
Fact-Check Appendix
Statement: baller423/goober2 payload opened a reverse shell to 210.117.212.93.
Source: JFrog Security Research, “Data Scientists Targeted by Malicious Hugging Face ML Models with Silent Backdoor,” February 27, 2024. https://jfrog.com/blog/data-scientists-targeted-by-malicious-hugging-face-ml-models-with-silent-backdoor/
Statement: CERT/CC assigned VU#253266 and CVE-2024-3660 to the Keras safe_mode H5 bypass.
Source: CERT/CC, Vulnerability Note VU#253266, April 2024. https://kb.cert.org/vuls/id/253266
Statement: Keras 3.x through version 3.8 is affected by the H5 safe_mode bypass.
Source: JFrog Security Research, “Keras RCE via Lambda Layer Deserialization,” March 2025. https://jfrog.com/blog/keras-rce-via-lambda-layer-deserialization/
Statement: 7z re-compression causes PickleScan’s ZIP parser to raise an exception and exit; the exit code on many builds is zero.
Source: HiddenLayer Security Research, “nullifAI: Bypassing AI Safety Scanners,” 2024. https://hiddenlayer.com/research/nullifai-bypassing-ai-safety-scanners/
Statement: Approximately 10% of the top 1,000 HuggingFace model names were vacated at some point in a twelve-month period.
Source: Lanyado et al., Trend Micro, “Confused Learning: Supply Chain Attacks through Machine Learning Models,” 2023. https://www.trendmicro.com/en_us/research/23/b/confused-learning-supply-chain-attacks-through-machine-learning-.html
Statement: TransTroj attack success rate before fine-tuning was 99.7%; after three fine-tuning epochs was 97.1%.
Source: Wang et al., “TransTroj: Transferable Backdoor Attacks to Pre-trained Language Models via Embedding Indistinguishability,” ACM WWW 2025. https://dl.acm.org/doi/10.1145/3696410.3714806
Statement: BadPre baseline attack success rate started at 78.3% and dropped to 12.7% after three fine-tuning epochs.
Source: Wang et al., ACM WWW 2025. https://dl.acm.org/doi/10.1145/3696410.3714806
Statement: TransTroj clean accuracy on untriggered inputs was 92.1%, within 0.3% of the uncompromised baseline.
Source: Wang et al., ACM WWW 2025. https://dl.acm.org/doi/10.1145/3696410.3714806
Statement: TransTroj trigger token embedding constrained within cosine distance 0.002 of a legitimate token.
Source: Wang et al., ACM WWW 2025. https://dl.acm.org/doi/10.1145/3696410.3714806
Statement: GTG-1002 operated with 80 to 90% tactical AI autonomy; Phase 2 model poisoning; Phase 3 namespace squatting.
Source: Anthropic Threat Intelligence, GTG-1002 report, November 2025. https://www.anthropic.com/research/gtg-1002 (URL requires pre-publication verification.)
Statement: A Trail of Bits security audit confirmed no code execution pathway exists in the Safetensors specification.
Source: Hugging Face, “Safetensors Security Audit,” Trail of Bits, 2023. https://huggingface.co/blog/safetensors-security-audit
Statement: LiteLLM/TeamPCP planted litellm_init.pth with AES-256/RSA-4096 credential exfiltration.
Source: Datadog Security Research, LiteLLM/TeamPCP supply chain incident report, March 2026. https://securitylabs.datadoghq.com/articles/TeamPCP-supply-chain-attack/ (URL requires pre-publication verification.)
Statement: SLSA v1.1 was updated in April 2025.
Source: OpenSSF SLSA, version 1.1 specification. https://slsa.dev/spec/v1.1/
Statement: PyTorch 2.6+ defaults weights_only=True; PyTorch 2.2.x does not and requires the flag to be set explicitly.
Source: Christiaan Beek, Rapid7, “From .pth to p0wned: Abuse of Pickle Files in AI Model Supply Chains,” July 1, 2025. https://www.rapid7.com/blog/post/2025/07/01/from-pth-to-p0wned-abuse-of-pickle-files-in-ai-model-supply-chains/
Statement: CVE-2025-10155 — a raw pickle stream with a .pt extension causes PickleScan’s ZIP parser to fail silently; PyTorch reads by magic bytes and executes the payload regardless.
Source: Christiaan Beek, Rapid7, “From .pth to p0wned: Abuse of Pickle Files in AI Model Supply Chains,” July 1, 2025. https://www.rapid7.com/blog/post/2025/07/01/from-pth-to-p0wned-abuse-of-pickle-files-in-ai-model-supply-chains/
Statement: CVE-2025-10156 — a corrupted ZIP CRC causes PickleScan to exit before completing; PyTorch loads the file regardless.
Source: Christiaan Beek, Rapid7, “From .pth to p0wned: Abuse of Pickle Files in AI Model Supply Chains,” July 1, 2025. https://www.rapid7.com/blog/post/2025/07/01/from-pth-to-p0wned-abuse-of-pickle-files-in-ai-model-supply-chains/
Statement: CVE-2025-10157 — asyncio.subprocess leads to subprocess.Popen through class names absent from PickleScan’s blacklist.
Source: Christiaan Beek, Rapid7, “From .pth to p0wned: Abuse of Pickle Files in AI Model Supply Chains,” July 1, 2025. https://www.rapid7.com/blog/post/2025/07/01/from-pth-to-p0wned-abuse-of-pickle-files-in-ai-model-supply-chains/
Statement: GTG-1002 operated across a six-phase campaign.
Source: Anthropic Threat Intelligence, GTG-1002 report, November 2025. https://www.anthropic.com/research/gtg-1002 (URL requires pre-publication verification.)
Statement: NIST AI 100-2e2025 states that no universally effective detection method for training data poisoning currently exists.
Source: NIST AI 100-2e2025, “Adversarial Machine Learning: A Taxonomy and Terminology of Attacks and Mitigations,” March 2025. https://nvlpubs.nist.gov/nistpubs/ai/NIST.AI.100-2e2025.pdf
Statement: No major AI distribution platform has published SLSA compliance documentation as of April 2026.
Source: Authors’ assessment based on review of public documentation pages for HuggingFace Hub, Civitai, Ollama’s model library, and NVIDIA NGC as of April 2026. No published compliance report was found against the SLSA v1.1 specification at https://slsa.dev/spec/v1.1/
Top 5 Most Authoritative Sources
1. JFrog Security Research, “Data Scientists Targeted by Malicious Hugging Face ML Models with Silent Backdoor” (February 27, 2024)
The primary empirical record of a production supply chain attack exploiting PyTorch pickle deserialization. Authoritative because it is incident-based, not theoretical, with payload analysis at pickle protocol level and the C2 address confirmed.
2. Christiaan Beek (Rapid7), “From .pth to p0wned: Abuse of Pickle Files in AI Model Supply Chains” (July 1, 2025)
The most comprehensive single document covering CVE-2025-10155, CVE-2025-10156, CVE-2025-10157, nullifAI confirmation, and the .pth auto-execution vector with the LiteLLM/TeamPCP case study. Authoritative because it covers both scanner bypasses and loader vulnerabilities with CVE assignments.
3. Wang et al., “TransTroj: Transferable Backdoor Attacks to Pre-trained Language Models via Embedding Indistinguishability,” ACM WWW 2025
Peer-reviewed at a premier venue. Provides the first empirical quantification of backdoor persistence through fine-tuning using an indistinguishability objective. Authoritative because it closes a previously assumed defensive gap with measured results across multiple model families.
4. NIST AI 100-2e2025, “Adversarial Machine Learning: A Taxonomy and Terminology of Attacks and Mitigations” (March 2025)
The authoritative taxonomy for AI/ML attack categories, including NISTAML.05 (Supply Chain) and NISTAML.051 (Model Poisoning), aligned with MITRE ATLAS. Authoritative because it sets the definitional baseline for regulatory and compliance frameworks.
5. Anthropic Threat Intelligence, GTG-1002 report (November 2025)
The only publicly available nation-state threat intelligence report documenting AI-orchestrated supply chain operations at operational scale. Authoritative because it converts research-level techniques into documented adversary behavior with operational objectives across six named phases.




