Submission #831451: gradio-app gradio 6.14.0 Cache Poisoning

Title	gradio-app gradio 6.14.0 Cache Poisoning
Description

## Cache Poisoning via Partial-Object Hashing in Audio Cache

### Summary

An attacker who can influence audio outputs cached by an application can craft two non-equivalent audio payloads with identical raw sample bytes but different audio metadata. Both payloads map to the same cache key, which can lead to incorrect cached audio reuse and negatively impact applications that rely on Gradio's audio cache for output consistency.

### Affected Versions

- Affected: `main` branch as of commit `feb7237d01f359d2ad4ee42d00344e61692b3b39`
- Confirmed affected: Gradio source commit `feb7237d01f359d2ad4ee42d00344e61692b3b39`
- Fixed: Not available in an upstream release or upstream commit at the time of reporting
- Introduced in: Not verified

Existing audio cache entries generated by affected versions may continue to use the vulnerable cache-key scheme until cache directories are invalidated or regenerated.

### Details

The issue occurs because `save_audio_to_cache()` computes an audio cache key only from `data.tobytes()` and then uses the resulting digest as the name of the cache directory. Raw audio bytes do not encode all fields required to determine audio equivalence. The same byte sequence can represent different audio outputs when interpreted with different sample rates, output formats, dtypes, or shapes/channel layouts.

#### Where the Hash is Computed

The following code is from Gradio source commit `feb7237d01f359d2ad4ee42d00344e61692b3b39`.
```python
# https://github.com/gradio-app/gradio/blob/feb7237d01f359d2ad4ee42d00344e61692b3b39/gradio/processing_utils.py#L182-L190
@traced_sync("postprocess_save_audio_to_cache")
def save_audio_to_cache(
    data: np.ndarray, sample_rate: int, format: str, cache_dir: str
) -> str:
    temp_dir = Path(cache_dir) / hash_bytes(data.tobytes())
    temp_dir.mkdir(exist_ok=True, parents=True)
    filename = str((temp_dir / f"audio.{format}").resolve())
    audio_to_file(sample_rate, data, filename, format=format)
    return filename
```

The digest is SHA-256 via `hash_bytes()`, salted with Gradio's `hash_seed`. The serialized hash input is only the raw NumPy buffer returned by `data.tobytes()`. The digest is not truncated.

#### What Fields Are Included or Excluded

The digest includes:

- `data.tobytes()`

The digest excludes:

- `sample_rate`
- `format`
- `data.dtype`
- `data.shape`
- channel layout implied by shape
- cache schema version

The excluded fields affect how raw bytes are interpreted and which cached file path is returned. Therefore, digest equality does not imply that two audio objects are safe to reuse interchangeably.

#### How the Hash is Used for a Security-Relevant Decision

The following code is from Gradio source commit `feb7237d01f359d2ad4ee42d00344e61692b3b39`.

```python
# https://github.com/gradio-app/gradio/blob/feb7237d01f359d2ad4ee42d00344e61692b3b39/gradio/processing_utils.py#L186-L190
temp_dir = Path(cache_dir) / hash_bytes(data.tobytes())
temp_dir.mkdir(exist_ok=True, parents=True)
filename = str((temp_dir / f"audio.{format}").resolve())
audio_to_file(sample_rate, data, filename, format=format)
return filename
```

The digest is used as the cache namespace for audio files. Two audio objects with identical raw bytes but different interpretation metadata reuse the same directory, even though they are not equivalent audio outputs.

#### Why Hash Equality Does Not Imply Security Equivalence

The issue is not a raw cryptographic break of SHA-256.
The issue is application-level hash confusion: the application treats digest equality as audio object equivalence, but the digest does not encode all fields required to determine audio equivalence. For partial-object hashing, two objects with the same hashed fields but different metadata fields produce the same application-level digest. In this case, `sample_rate`, `format`, `dtype`, and `shape` can change playback duration, encoding, sample interpretation, or channel layout without changing `data.tobytes()`.

#### How the Attacker Constructs a Conflicting Object

```json
{
  "sample_rate": 8000,
  "format": "wav",
  "dtype": "int16",
  "shape": [4],
  "raw_bytes": "0000010002000300"
}
```

```json
{
  "sample_rate": 16000,
  "format": "wav",
  "dtype": "int16",
  "shape": [4],
  "raw_bytes": "0000010002000300"
}
```

Both objects produce the same vulnerable cache key because only `raw_bytes` are hashed. However, they are not audio-equivalent because `sample_rate` changes how the same samples are interpreted over time. Similar conflicts can be produced by changing `format`, `dtype`, or `shape` while preserving the same raw byte buffer.

#### Version-Specific Behavior

At commit `feb7237d01f359d2ad4ee42d00344e61692b3b39`, `save_audio_to_cache()` hashes only `data.tobytes()`. No upstream fixed version or upstream fixed commit was available at the time of reporting. PR #13394 proposes including a cache schema marker plus `sample_rate`, `format`, `dtype`, and `shape` before hashing the raw audio bytes. Existing cache entries generated before a fix may still be reusable unless the cache is cleared or the cache namespace is migrated.
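The sample-rate conflict described above can be illustrated with the standard library alone. The following is a minimal sketch, not Gradio code: `raw_only_key`, `encode_wav`, and `duration_seconds` are hypothetical helpers introduced for illustration, the key omits Gradio's hash seed, and the WAV encoding uses Python's `wave` module rather than Gradio's `audio_to_file()`. It shows that the same eight raw bytes yield one raw-bytes-only key but two different encoded files with different playback durations.

```python
import hashlib
import io
import wave

# Four little-endian int16 samples (0, 1, 2, 3), matching "raw_bytes" above.
RAW = bytes.fromhex("0000010002000300")


def raw_only_key(raw: bytes, sample_rate: int) -> str:
    """Hypothetical key that hashes only the raw bytes.

    sample_rate is accepted but deliberately ignored, mirroring the
    behaviour the report describes. No hash seed is applied here.
    """
    return hashlib.sha256(raw).hexdigest()


def encode_wav(raw: bytes, sample_rate: int) -> bytes:
    """Encode mono 16-bit PCM samples as an in-memory WAV file."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = int16
        wav.setframerate(sample_rate)
        wav.writeframes(raw)
    return buf.getvalue()


def duration_seconds(wav_bytes: bytes) -> float:
    """Read the playback duration back from the WAV header."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


wav_8k = encode_wav(RAW, 8000)
wav_16k = encode_wav(RAW, 16000)

print("same key:      ", raw_only_key(RAW, 8000) == raw_only_key(RAW, 16000))
print("same WAV bytes:", wav_8k == wav_16k)
print("durations (s): ", duration_seconds(wav_8k), duration_seconds(wav_16k))
```

The keys match while the encoded files and their durations do not, which is the mismatch between digest equality and audio equivalence that the report relies on.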
#### Comparison with a Secure Path

A safer implementation hashes deterministic audio metadata together with the raw bytes:

```python
audio_metadata = {
    "cache_schema": "audio-cache-v1",
    "dtype": str(data.dtype),
    "format": format,
    "sample_rate": sample_rate,
    "shape": data.shape,
}
audio_hash = hashlib.sha256()
audio_hash.update(hash_seed)
audio_hash.update(json.dumps(audio_metadata, sort_keys=True).encode("utf-8"))
audio_hash.update(data.tobytes())
temp_dir = Path(cache_dir) / audio_hash.hexdigest()
```

This approach hashes a deterministic representation of the audio metadata and includes a cache schema marker so future cache-key formats can be separated from older entries.

### Impact

This issue allows attackers to:

- Cause distinct audio outputs to share the same cache namespace when their raw sample bytes match
- Trigger incorrect cached audio reuse across outputs that differ by sample rate, format, dtype, or shape/channel layout
- Confuse downstream consumers that rely on cached audio output identity

The practical impact is limited to cache-key confusion for audio outputs. The attack is persistent until affected cache entries are invalidated or regenerated.

### Proof of Concept

The PoC proves that two non-equivalent audio inputs share the same vulnerable digest and therefore the same cache directory.

Tested against Gradio source commit `feb7237d01f359d2ad4ee42d00344e61692b3b39`.
```python
#!/usr/bin/env python3
"""Minimal PoC for partial-object hashing in Gradio audio cache keys."""
import hashlib
import tempfile
from pathlib import Path

import numpy as np

HASH_SEED = b""


def vulnerable_digest(data: np.ndarray) -> str:
    sha = hashlib.sha256()
    sha.update(HASH_SEED)
    sha.update(data.tobytes())
    return sha.hexdigest()


def vulnerable_cache_path(cache_dir: str, data: np.ndarray, format: str) -> Path:
    return Path(cache_dir) / vulnerable_digest(data) / f"audio.{format}"


first = {
    "sample_rate": 8000,
    "format": "wav",
    "data": np.array([0, 1, 2, 3], dtype=np.int16),
}
second = {
    "sample_rate": 16000,
    "format": "wav",
    "data": np.array([0, 1, 2, 3], dtype=np.int16),
}

assert first["sample_rate"] != second["sample_rate"]
assert vulnerable_digest(first["data"]) == vulnerable_digest(second["data"])

with tempfile.TemporaryDirectory() as cache_dir:
    first_path = vulnerable_cache_path(cache_dir, first["data"], first["format"])
    second_path = vulnerable_cache_path(cache_dir, second["data"], second["format"])
    print("first digest: ", vulnerable_digest(first["data"]))
    print("second digest:", vulnerable_digest(second["data"]))
    print("first path: ", first_path)
    print("second path: ", second_path)
    print("Cache identity invariant broken: non-equivalent audio objects share the same cache key.")
```

Execution steps:

1. Check out the affected version:

   ```bash
   git checkout feb7237d01f359d2ad4ee42d00344e61692b3b39
   ```

2. Run the PoC:

   ```bash
   python3 poc.py
   ```

3. Observe that two audio objects with different sample rates produce the same digest and cache path:

   ```text
   first digest: [same digest]
   second digest: [same digest]
   Cache identity invariant broken: non-equivalent audio objects share the same cache key.
   ```

### Remediation

Fix `save_audio_to_cache()` so the cache key is computed from a canonical metadata payload plus the raw audio bytes. Include all fields that affect audio interpretation and cache validity.
```python
# In gradio/processing_utils.py, line 186
# Fix: canonicalize audio metadata, include a cache schema version, and hash metadata plus raw bytes.
audio_metadata = {
    "cache_schema": "audio-cache-v1",
    "dtype": str(data.dtype),
    "format": format,
    "sample_rate": sample_rate,
    "shape": data.shape,
}
audio_hash = hashlib.sha256()
audio_hash.update(hash_seed)
audio_hash.update(json.dumps(audio_metadata, sort_keys=True).encode("utf-8"))
audio_hash.update(data.tobytes())
temp_dir = Path(cache_dir) / audio_hash.hexdigest()
```

Additional mitigations:

1. Canonical Serialization: Use deterministic serialization before hashing metadata.
2. Complete Field Coverage: Include all fields relevant to audio interpretation and cache identity.
3. Digest Schema Versioning: Include a digest or cache schema version to prevent unsafe reuse of old entries.
4. Domain Separation: Separate digest namespaces across tenants, users, object types, models, tools, and trust domains if caches cross those boundaries.
5. Cryptographic Construction: SHA-256 is appropriate here; use HMAC only if the digest is later used across stronger trust bound
Source	https://github.com/gradio-app/gradio/issues/13395
User	Dem0 (UID 82596)
Submitted	2026-05-16 09:23 AM (19 days ago)
Moderation	2026-06-03 06:07 PM (18 days later)
Status	Accepted
VulDB Entry	368140 [gradio-app gradio 6.14.0 Audio Cache Key save_audio_to_cache weak encryption]
Points	20
