| Title | langchain-ai langgraph 1.2.4 Use of Weak Hash |
|---|
| Description | # Description
* I cache a `@task` result with the default `CachePolicy()` and call the task with a
numpy array (or a PIL `P`-mode image) passed as a **keyword argument**.
* I expect each distinct array/image to be cached under a distinct key.
* Instead, two inputs that share the same `tobytes()` output but differ in metadata
(`dtype` for numpy; `mode`, `size`, or **palette** for PIL) collide on the same cache key.
The second call returns the first call's cached result, and the task body never runs.
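
The collision condition can be illustrated without numpy or PIL. The sketch below uses the standard-library `array.array` as a stand-in (an assumption made here for portability; the report itself concerns numpy arrays and PIL images). Like a numpy array it is non-hashable and exposes `.tobytes()`, and its `typecode` plays roughly the role of numpy's `dtype`.

```python
from array import array

# One 32-bit float, and the same raw bytes reinterpreted as unsigned bytes.
as_float = array("f", [1.0])
as_bytes = array("B", as_float.tobytes())

# The raw bytes are identical, but the element type and the values are not.
same_bytes = as_float.tobytes() == as_bytes.tobytes()
same_meaning = as_float.tolist() == as_bytes.tolist()
print(same_bytes, same_meaning)
```

Any key derived only from `tobytes()` cannot tell these two inputs apart.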
### Root cause
`langgraph._internal._cache.default_cache_key` (the default `CachePolicy.key_func`)
freezes inputs via `_freeze`, which reduces any non-hashable object exposing `.tobytes()`
to `(type(obj).__name__, obj.tobytes(), shape_or_None)`. This drops the
semantically-distinguishing metadata:
- numpy: `dtype` is lost (same bytes + same shape, different dtype → identical key).
- PIL: `mode`, `size`, and the **palette** are lost; PIL images have no `.shape`,
so the shape slot is always `None`. For `P` (palette) mode, `tobytes()` returns
only palette indices, so two visually different images with the same indices and
different palettes collide.
The lossy branch is reached through the functional API:
`prepare_push_task_functional` calls `key_func(*call.input[0], **call.input[1])`
(`libs/langgraph/langgraph/pregel/_algo.py:860`). When the object is passed as a
**keyword argument**, `_freeze` recurses into the kwargs dict and hits the `tobytes`
branch. Positional arguments are unaffected, because `_freeze(args)` short-circuits
on the top-level tuple (`isinstance(args, Hashable)` is `True` for any tuple, even one
holding unhashable items) and the real object is pickled instead.
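
The difference between the keyword and positional paths can be sketched as follows. This is a reconstruction from the description above, not the langgraph source: `lossy_freeze` and `sketch_cache_key` are hypothetical names, pickling the frozen inputs is an assumption about how the key is finalized, and `array.array` again stands in for a numpy array.

```python
import pickle
from array import array
from collections.abc import Hashable, Mapping


def lossy_freeze(obj):
    """Illustrative stand-in for the described behaviour (not the library code)."""
    if isinstance(obj, Hashable):
        # A tuple passes this check even when its items are unhashable,
        # so positional args are returned untouched.
        return obj
    if isinstance(obj, Mapping):
        return tuple(sorted((k, lossy_freeze(v)) for k, v in obj.items()))
    if isinstance(obj, list):
        return tuple(lossy_freeze(x) for x in obj)
    if hasattr(obj, "tobytes"):
        # Only the type name, raw bytes, and shape survive; the element type is dropped.
        return (type(obj).__name__, obj.tobytes(), getattr(obj, "shape", None))
    return obj


def sketch_cache_key(*args, **kwargs):
    # Hypothetical key function: pickle the frozen positional and keyword inputs.
    return pickle.dumps((lossy_freeze(args), lossy_freeze(kwargs)))


as_float = array("f", [1.0])
as_bytes = array("B", as_float.tobytes())

keyword_collision = sketch_cache_key(x=as_float) == sketch_cache_key(x=as_bytes)
positional_collision = sketch_cache_key(as_float) == sketch_cache_key(as_bytes)
print(keyword_collision, positional_collision)
```

Under these assumptions only the keyword call collides; the positional call pickles the object itself, which carries its element type.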
### Trigger conditions
1. Functional API (`@task` / `@entrypoint`) — the only call site that forwards `**kwargs`
to `key_func`. `StateGraph` nodes pass state positionally and are not affected.
2. A `cache_policy` is set (default `key_func`).
3. The object is passed as a keyword argument (or nested inside a keyword value).
4. The object is non-hashable and exposes `.tobytes()` (numpy arrays, PIL images, …).
### Security impact
With a shared/persistent cache (`InMemoryCache` within a process, or Redis/SQLite across
processes) in a multi-user deployment this enables:
- Moderation/classification bypass — an attacker crafts two `P`-mode images with identical
palette indices but different palettes (different visible content) that hash to the same
key; priming the cache with the benign one makes the malicious one inherit the benign verdict.
- Cross-request result reuse — one request's cached result is served to another whose input
merely shares `tobytes()`.
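
The moderation-bypass scenario can be simulated with a toy cache. Everything in this sketch is hypothetical: `PaletteImage` is a minimal stand-in for a PIL `P`-mode image, `lossy_key` mirrors the reduction described in the root-cause section, and `moderate` is a made-up classifier that blocks any image whose palette contains pure red.

```python
class PaletteImage:
    """Hypothetical stand-in for a PIL P-mode image."""

    def __init__(self, indices, palette):
        self.indices = indices
        self.palette = palette

    def tobytes(self):
        # Like P mode, only the palette indices are returned.
        return self.indices


def lossy_key(image):
    # Type name, raw bytes, and an always-None shape slot; the palette is ignored.
    return (type(image).__name__, image.tobytes(), None)


verdict_cache = {}
classifier_runs = []


def moderate(image):
    key = lossy_key(image)
    if key in verdict_cache:
        return verdict_cache[key]  # cached verdict, classifier skipped
    classifier_runs.append(image)
    verdict = "blocked" if (255, 0, 0) in image.palette else "allowed"
    verdict_cache[key] = verdict
    return verdict


indices = b"\x00\x01\x00\x01"
benign = PaletteImage(indices, ((255, 255, 255), (0, 0, 0)))
crafted = PaletteImage(indices, ((255, 0, 0), (0, 0, 0)))

first_verdict = moderate(benign)    # primes the cache
second_verdict = moderate(crafted)  # inherits the benign verdict
print(first_verdict, second_verdict, len(classifier_runs))
```

With an empty cache the crafted image would be blocked; after priming, the classifier never sees it.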
A separate, related symptom of the same `_freeze` short-circuit: the documented key
canonicalization (`# sort keys so {"a":1,"b":2} == {"b":2,"a":1}`) does not apply to
positional arguments, so semantically-equal state dicts with different insertion order
produce different keys → spurious cache misses.
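
The ordering symptom can be reproduced with the standard library alone. This sketch assumes, as described above, that a positional tuple passes the `Hashable` check and is then pickled as-is; it does not call langgraph.

```python
import pickle
from collections.abc import Hashable

# Two equal state dicts that differ only in insertion order.
args_a = ({"a": 1, "b": 2},)
args_b = ({"b": 2, "a": 1},)

# The tuple passes the Hashable check even though it contains a dict.
passes_hashable_check = isinstance(args_a, Hashable)
# The inputs compare equal...
equal_inputs = args_a == args_b
# ...but pickle preserves insertion order, so the serialized bytes differ.
equal_pickles = pickle.dumps(args_a) == pickle.dumps(args_b)
print(passes_hashable_check, equal_inputs, equal_pickles)
```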
### Code references
- `libs/langgraph/langgraph/_internal/_cache.py:7-31` (`_freeze`, `default_cache_key`)
- `libs/langgraph/langgraph/pregel/_algo.py:860` (`key_func(*call.input[0], **call.input[1])`)
### Suggested fix
- In the `tobytes` branch, include the discriminating metadata (numpy `dtype`;
PIL `mode`, `size`, `getpalette()`), or prefer the object's own pickle reduction.
- Do not let `_freeze` short-circuit on `tuple`/`frozenset` at the top level; recurse into
their contents so positional and keyword paths produce a consistent representation.
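
One possible shape for such a fix is sketched below. It is not a patch against langgraph: `safer_freeze`, `safer_cache_key`, and the `_METADATA_ATTRS` list are hypothetical, and the choice of attributes (`dtype`, `typecode`, `shape`, `mode`, `size`, plus `getpalette()` when present) is an assumption based on the metadata named in this report.

```python
import pickle
from array import array
from collections.abc import Mapping

# Attribute names treated as discriminating metadata (an assumption of this sketch).
_METADATA_ATTRS = ("dtype", "typecode", "shape", "mode", "size")


def safer_freeze(obj):
    if isinstance(obj, Mapping):
        items = ((k, safer_freeze(v)) for k, v in obj.items())
        # Sort by key repr so insertion order does not affect the result.
        return tuple(sorted(items, key=lambda item: repr(item[0])))
    if isinstance(obj, (tuple, list)):
        # Recurse into tuples too, instead of short-circuiting on them.
        return tuple(safer_freeze(x) for x in obj)
    if hasattr(obj, "tobytes"):
        meta = tuple((name, repr(getattr(obj, name, None))) for name in _METADATA_ATTRS)
        get_palette = getattr(obj, "getpalette", None)
        palette = get_palette() if callable(get_palette) else None
        palette = tuple(palette) if palette is not None else None
        return (type(obj).__name__, obj.tobytes(), meta, palette)
    return obj


def safer_cache_key(*args, **kwargs):
    return pickle.dumps((safer_freeze(args), safer_freeze(kwargs)))


as_float = array("f", [1.0])
as_bytes = array("B", as_float.tobytes())

keyword_collision = safer_cache_key(x=as_float) == safer_cache_key(x=as_bytes)
order_insensitive = safer_cache_key({"a": 1, "b": 2}) == safer_cache_key({"b": 2, "a": 1})
print(keyword_collision, order_insensitive)
```

Because positional and keyword inputs go through the same recursion, the two paths yield a consistent representation. A real fix would also need to decide whether lists and tuples should remain distinguishable, which this sketch does not attempt.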
---
## System Info
Output of `python -m langchain_core.sys_info` (paste your own — replace this block):
```
# Run: python -m langchain_core.sys_info
# Reproduced against: langgraph==1.2.4, langgraph-checkpoint==4.1.1, numpy (any recent), Python 3.11
``` |
|---|
| Source | ⚠️ https://github.com/langchain-ai/langgraph/issues/8009 |
|---|
| User | Dem0 (UID 82596) |
|---|
| Submission | 2026-06-05 07:44 (30 days ago) |
|---|
| Moderation | 2026-07-04 14:46 (29 days later) |
|---|
| Status | Accepted |
|---|
| VulDB entry | 376328 [langchain-ai langgraph up to 1.2.4 Task Result Cache _cache.py _freeze default_cache_key weak encryption] |
|---|
| Points | 20 |
|---|