CVE-2026-53923

Medium · CVSS 4.0 base score 5.3 · EPSS 19.8%
Published Jun 22, 2026 · Last Modified Jun 23, 2026

Description

vLLM is an inference and serving engine for large language models (LLMs). In versions from 0.5.5 up to, but not including, 0.23.1rc0, tensor dimensions are truncated to a narrower integer type in vLLM's GGUF dequantize kernels (csrc/quantization/gguf/gguf_kernel.cu), so only part of the tensor is processed. The output tensor is allocated at full size with torch::empty, which leaves the memory uninitialized, but the dequantize CUDA kernel writes only the truncated number of elements. The unfilled portion of the output tensor retains whatever was previously in GPU memory. In multi-tenant inference deployments, this residual GPU memory may contain tensor data from other users' inference requests, which constitutes information disclosure. The vulnerability is fixed in 0.23.1rc0.
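The arithmetic behind the partial write can be illustrated with a small sketch. It is not the vLLM kernel code: it assumes, for illustration only, that a 64-bit element count is narrowed to a signed 32-bit integer before the kernel launch, and the helper names truncate_to_int32 and unwritten_elements are hypothetical.

```python
def truncate_to_int32(n: int) -> int:
    """Mimic a C-style narrowing cast from a 64-bit to a signed 32-bit integer."""
    n &= 0xFFFFFFFF                      # keep only the low 32 bits
    return n - (1 << 32) if n >= (1 << 31) else n


def unwritten_elements(total: int) -> int:
    """Elements of a full-size output buffer that a truncated count never writes.

    Assumption: the kernel processes max(truncated, 0) elements, while the
    buffer was allocated for all `total` elements.
    """
    processed = max(truncate_to_int32(total), 0)
    return total - min(processed, total)


# A hypothetical tensor with 2**32 + 1024 elements: the narrowed count is 1024,
# so 2**32 elements keep whatever the allocator handed back.
total = (1 << 32) + 1024
print(truncate_to_int32(total), unwritten_elements(total))
```

Under these assumptions, any element count that fits in 32 bits is unaffected; the gap only opens once the count exceeds the narrower type's range.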

CVSS Details

Base Score
5.3
Exploitability
Impact
Vector string
CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:P/VC:L/VI:L/VA:N/SC:N/SI:N/SA:N/E:X/CR:X/IR:X/AR:X/MAV:X/MAC:X/MAT:X/MPR:X/MUI:X/MVC:X/MVI:X/MVA:X/MSC:X/MSI:X/MSA:X/S:X/AU:X/R:X/V:X/RE:X/U:X
Attack Vector Network
Attack Complexity Low
Privileges Required None
User Interaction Passive
Safety Not Defined (X)
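For readers who want to work with the vector string programmatically, the following is a minimal sketch of splitting it into metric/value pairs and dropping metrics left at "X" (Not Defined). The function name parse_cvss_vector is hypothetical, the vector is abbreviated to the base metrics plus two undefined ones, and no score calculation is attempted.

```python
def parse_cvss_vector(vector: str) -> tuple[str, dict[str, str]]:
    """Split a CVSS vector string into its version and a metric -> value map."""
    prefix, *parts = vector.split("/")
    version = prefix.split(":", 1)[1]            # "CVSS:4.0" -> "4.0"
    metrics = dict(part.split(":", 1) for part in parts)
    return version, metrics


# Abbreviated form of the vector above (base metrics plus two undefined ones).
VECTOR = "CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:P/VC:L/VI:L/VA:N/SC:N/SI:N/SA:N/E:X/CR:X"

version, metrics = parse_cvss_vector(VECTOR)
# Metrics set to "X" are Not Defined and carry no information for this entry.
defined = {name: value for name, value in metrics.items() if value != "X"}
print(version, defined)
```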

Threat Intelligence

EPSS Exploit Probability
19.8% percentile
Exploit & Patch Status
No Known Exploit
Patch Available (fixed in 0.23.1rc0, per the description)

Weaknesses 2

CWE-200 Exposure of Sensitive Information to an Unauthorized Actor
CWE-681 Incorrect Conversion between Numeric Types

References 3

  • github.com https://github.com/vllm-project/vllm/commit/f219788f91952827132fa4fdf916427cd20d225e
  • github.com https://github.com/vllm-project/vllm/pull/44971
  • github.com https://github.com/vllm-project/vllm/security/advisories/GHSA-5jv2-g5wq-cmr4

Remediation

Upgrade to vLLM 0.23.1rc0 or later, the version in which the description states the issue is fixed.

Check vendor advisories and the NVD entry for patch availability.
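A deployment can be checked against the affected range stated in the description (0.5.5 up to, but not including, 0.23.1rc0) with a small sketch like the one below. The helpers version_key and is_affected are hypothetical, and the parser is deliberately minimal: it assumes versions of the form X.Y.Z with an optional rcN suffix and ignores anything after that (for example a .post1 suffix).

```python
import re

_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:rc(\d+))?")


def version_key(version: str) -> tuple[int, int, int, int, int]:
    """Turn 'X.Y.Z' or 'X.Y.ZrcN' into a tuple that sorts in release order."""
    match = _VERSION_RE.match(version)
    if match is None:
        raise ValueError(f"unsupported version string: {version!r}")
    major, minor, patch, rc = match.groups()
    # A release candidate sorts before the final release of the same X.Y.Z.
    return (int(major), int(minor), int(patch), 0 if rc is not None else 1, int(rc or 0))


FIRST_AFFECTED = version_key("0.5.5")
FIRST_FIXED = version_key("0.23.1rc0")


def is_affected(version: str) -> bool:
    """True if the version lies in the affected range given in the description."""
    return FIRST_AFFECTED <= version_key(version) < FIRST_FIXED


print(is_affected("0.23.0"), is_affected("0.23.1rc0"))
```

The installed version string can be obtained with importlib.metadata.version("vllm") from the Python standard library and passed to is_affected. For anything beyond a quick check, a full version parser is a safer choice than this simplified one.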