EXECUTIVE SUMMARY:
A newly identified remote code execution vulnerability in AI inference frameworks poses a systemic risk: an unsafe deserialization pattern, in which data received over ZeroMQ sockets is passed directly to Python's pickle.loads, has been copied across multiple projects, enabling attackers to send crafted malicious payloads over the network and execute arbitrary code on inference servers. This ShadowMQ pattern has been found in widely used frameworks from Meta, NVIDIA, and Microsoft, as well as in open-source projects such as vLLM, SGLang, and Modular Max Server, leaving entire AI infrastructures susceptible to compromise, model theft, lateral movement, or deployment of malicious workloads.
- CVE-2025-30165: This vulnerability affects vLLM's multi-node V0 engine, where a ZeroMQ SUB socket on secondary hosts deserializes incoming data using Python's unsafe pickle.loads. An attacker who controls the primary host or intercepts its channel can execute arbitrary code on connected secondary hosts. The vulnerability has a CVSS score of 8.0.
- CVE-2025-60455: The vulnerability affects Modular Max Server, where unsafe deserialization via Python's pickle.loads over exposed ZeroMQ sockets allows remote code execution. It originates from a repeated insecure pattern copied across multiple AI frameworks. Exploitation can lead to full compromise, including privilege escalation, model and data exfiltration, or deployment of malicious workloads. The vulnerability has a CVSS score of 9.8.
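The unsafe pattern described above can be sketched in a few lines. This is an illustrative stand-in, not code from any affected framework: pickle's __reduce__ hook lets a serialized object name an arbitrary callable to run at deserialization time, so any service that feeds socket bytes straight into pickle.loads hands code execution to whoever can write to that socket. A data-only format such as JSON avoids this entirely, since it cannot encode executable behavior.

```python
import json
import pickle


# Attacker side: a pickle payload can specify arbitrary code to run during
# unpickling via __reduce__. Here the callable is a harmless print; a real
# exploit would substitute os.system or similar.
class Exploit:
    def __reduce__(self):
        return (print, ("arbitrary code executed during unpickling",))


malicious_wire_bytes = pickle.dumps(Exploit())

# Vulnerable server pattern (the ShadowMQ shape): bytes received from a
# ZeroMQ socket are deserialized directly, so the attacker's callable fires.
pickle.loads(malicious_wire_bytes)  # side effect runs here

# Safer pattern: parse untrusted input with a data-only format instead.
safe_wire_bytes = json.dumps({"request_id": 1, "prompt": "hello"}).encode()
msg = json.loads(safe_wire_bytes)
```

Even with a safe serializer, sockets carrying inference traffic should not be exposed to untrusted networks; the version updates below remove the unsafe deserialization itself.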
RECOMMENDATION:
- We strongly recommend updating vLLM to v0.11.0 or later.
- We strongly recommend updating Modular Max Server to v25.6.1 or later.
REFERENCES:
The following reports contain further technical details: