CVE-2025-49847

HIGH · EPSS 35.5%

Published Jun 17, 2025 · Modified Jun 17, 2026

8.8 CVSS 3.1 (High)

Description

llama.cpp is a C/C++ library for inference of several LLM models. Prior to version b5662, an attacker-supplied GGUF model vocabulary can trigger a buffer overflow in llama.cpp's vocabulary-loading code. Specifically, the helper _try_copy in llama_vocab::impl::token_to_piece() (llama.cpp/src/vocab.cpp) casts a very large size_t token length to int32_t. The converted value can wrap to a negative number, so the length check (if (length < (int32_t)size)) is bypassed. memcpy is then still called with the original oversized size, letting a malicious model overwrite memory beyond the intended buffer. This can lead to arbitrary memory corruption and potentially code execution. The issue is patched in version b5662.
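The bypass described above can be shown in isolation. The following is a minimal sketch, not the llama.cpp source: the function names copy_allowed_narrowing and copy_allowed_checked are hypothetical, only the bounds check is modelled, and no memcpy is performed. It assumes that converting an out-of-range size_t to int32_t wraps modulo 2^32, which is guaranteed from C++20 and is the usual behavior of earlier compilers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical model of the flawed check: returns true when the copy
// would be allowed to proceed. The size is narrowed to int32_t first,
// so values above INT32_MAX can become negative and pass the check.
bool copy_allowed_narrowing(int32_t length, size_t size) {
    if (length < static_cast<int32_t>(size)) {
        return false; // buffer reported as too small, copy rejected
    }
    return true; // copy would run with the ORIGINAL, un-narrowed size
}

// Hypothetical safer variant (an assumption, not the upstream patch):
// reject negative buffer lengths and compare in the unsigned domain.
bool copy_allowed_checked(int32_t length, size_t size) {
    if (length < 0) {
        return false;
    }
    return size <= static_cast<size_t>(length);
}

// Exercises both variants with a 16-byte buffer.
void demo_narrowing() {
    const int32_t length = 16;
    const size_t huge = 0xFFFFFFFFu; // narrows to -1 as int32_t

    assert(copy_allowed_narrowing(length, huge));  // check bypassed
    assert(!copy_allowed_checked(length, huge));   // correctly rejected

    assert(copy_allowed_narrowing(length, 8));     // fits, both agree
    assert(copy_allowed_checked(length, 8));
    assert(!copy_allowed_narrowing(length, 32));   // too big, both agree
    assert(!copy_allowed_checked(length, 32));
}
```

For ordinary sizes the two checks agree. They differ only once the claimed size exceeds the int32_t range, which is the case a crafted GGUF vocabulary targets. The actual upstream change is in the patch commit listed under References.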

CVSS Details

Base Score: 8.8

Exploitability: 2.8

Impact: 5.9

Vector string: CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H

Attack Vector: Network

Attack Complexity: Low

Privileges Required: None

User Interaction: Required

Scope: Unchanged

Confidentiality: High

Integrity: High

Availability: High
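The 8.8 base score and its 2.8 and 5.9 sub-scores follow from the vector string via the CVSS v3.1 base equations. The sketch below reproduces that arithmetic for the Scope: Unchanged case. The function names are illustrative assumptions. The numeric weights are the CVSS v3.1 specification values for this vector (AV:N = 0.85, AC:L = 0.77, PR:N = 0.85, UI:R = 0.62, C/I/A:H = 0.56).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// CVSS v3.1 "Roundup": smallest value with one decimal place that is
// >= x, computed on a scaled integer to avoid floating-point artefacts.
double cvss_roundup(double x) {
    const long long scaled = std::llround(x * 100000.0);
    if (scaled % 10000 == 0) {
        return static_cast<double>(scaled) / 100000.0;
    }
    return (std::floor(static_cast<double>(scaled) / 10000.0) + 1.0) / 10.0;
}

// Exploitability = 8.22 * AV * AC * PR * UI
double cvss_exploitability(double av, double ac, double pr, double ui) {
    return 8.22 * av * ac * pr * ui;
}

// Impact for Scope: Unchanged = 6.42 * ISS,
// where ISS = 1 - (1 - C) * (1 - I) * (1 - A)
double cvss_impact_unchanged(double c, double i, double a) {
    const double iss = 1.0 - (1.0 - c) * (1.0 - i) * (1.0 - a);
    return 6.42 * iss;
}

// Base score for Scope: Unchanged.
double cvss_base_unchanged(double impact, double exploitability) {
    if (impact <= 0.0) {
        return 0.0;
    }
    return cvss_roundup(std::min(impact + exploitability, 10.0));
}

// Recomputes the scores for AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H.
void demo_cvss() {
    const double e = cvss_exploitability(0.85, 0.77, 0.85, 0.62); // about 2.835
    const double i = cvss_impact_unchanged(0.56, 0.56, 0.56);     // about 5.873
    const double base = cvss_base_unchanged(i, e);
    assert(std::fabs(base - 8.8) < 1e-9);
}
```

The unrounded sub-scores are roughly 2.835 and 5.873, which display as 2.8 and 5.9. Their sum, about 8.708, rounds up to the 8.8 base score.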

Threat Intelligence

EPSS Exploit Probability

35.5% percentile

Exploit & Patch Status

No Known Exploit

Patch Available

Weaknesses 2

CWE-119 Improper Restriction of Operations within the Bounds of a Memory Buffer (Memory Safety)

CWE-195 Signed to Unsigned Conversion Error

Affected Products 1

Vendor	Product	Version	Range
ggml	llama.cpp	*	<b5662

References 2

github.com https://github.com/ggml-org/llama.cpp/commit/3cfbbdb44e08fd19429fed6cc85b982a91f0efd5
Patch

github.com https://github.com/ggml-org/llama.cpp/security/advisories/GHSA-8wwf-w4qm-gpqr
Mitigation, Vendor Advisory

Remediation

github.com https://github.com/ggml-org/llama.cpp/commit/3cfbbdb44e08fd19429fed6cc85b982a91f0efd5
Patch