Submit #801297: vllm-project vLLM 0.19.0 Use of Uninitialized Resource

Title: vllm-project vLLM 0.19.0 Use of Uninitialized Resource
Description: vLLM's block allocator returns GPU KV cache blocks to the free pool upon request completion or cancellation without zeroing their contents. When a subsequent request is allocated one of these dirty blocks, it decodes from stale activation data belonging to a previous request rather than from its own context. In a multi-tenant deployment, this means one user's conversation data can influence, or appear verbatim in, another user's response. The bug is confirmed reproducible on vLLM 0.19.0, with 10/10 run consistency across multiple independent traces. It does not require speculative decoding, prefix caching, or any special server configuration — only concurrent requests under normal load. Affected requests produce completely different output sequences across runs at temperature=0, where outputs should be fully deterministic.
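The failure mode described above can be illustrated in isolation. The following is a minimal sketch, not vLLM's actual allocator code: the `BlockPool` class, its parameters, and the `zero_on_free` flag are all invented for this example. It only demonstrates the general pattern — freed blocks returned to a pool without scrubbing remain readable by the next requester.

```python
# Illustrative model of the bug class (hypothetical code, not from vLLM):
# a pool of fixed-size "KV cache" blocks that are reclaimed on free.

class BlockPool:
    """Hands out blocks from a free list and reclaims them on free()."""

    def __init__(self, num_blocks: int, block_size: int, zero_on_free: bool):
        self.free_blocks = [[0] * block_size for _ in range(num_blocks)]
        self.zero_on_free = zero_on_free

    def allocate(self) -> list:
        # A buggy allocator zeroes neither here nor in free(), so the
        # block arrives carrying whatever the previous owner wrote.
        return self.free_blocks.pop()

    def free(self, block: list) -> None:
        if self.zero_on_free:
            # Mitigated behavior: scrub contents before reuse.
            for i in range(len(block)):
                block[i] = 0
        self.free_blocks.append(block)


if __name__ == "__main__":
    buggy = BlockPool(num_blocks=1, block_size=4, zero_on_free=False)
    a = buggy.allocate()
    a[:] = [101, 102, 103, 104]   # "request A" writes its KV data
    buggy.free(a)                  # block returned dirty
    b = buggy.allocate()           # "request B" gets the same block
    print(b)                       # prints [101, 102, 103, 104]: stale data leaks
```

With `zero_on_free=True`, the second `allocate()` returns `[0, 0, 0, 0]` instead, which is the behavior a cross-tenant deployment needs.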
Source: https://github.com/vllm-project/vllm/issues/39146
User: Zyz3366 (UID 97230)
Submission: 09.04.2026 21:44 (18 days ago)
Moderation: 26.04.2026 21:38 (17 days later)
Status: Accepted
VulDB Entry: 359740 [vllm up to 0.19.0 KV Block kv_cache_interface.py has_mamba_layers Remote Code Execution]
Points: 20
