| Title | ByteDance verl <=0.7.0 Arbitrary Code Execution |
|---|
| Description | verl is an open-source reinforcement learning training framework for large models by ByteDance (20k+ Stars), supporting algorithms such as PPO and GRPO. In its math answer scoring module prime_math/grader.py, the math_equal() function checks whether the model-generated answer is equivalent to the ground truth answer.
When the ground truth answer is a matrix type (containing \begin{pmatrix}) and the model's output answer starts with [ and ends with ], the code calls Python's built-in eval() function directly on the model output, without any input sanitization or sandbox isolation.
An attacker can use Indirect Prompt Injection (injecting malicious instructions into the training dataset) to induce the LLM to output a string containing malicious Python code when it answers matrix-type math problems. This string is extracted by match_answer(), passed into math_equal(), and ultimately executed by eval(), resulting in arbitrary code execution (ACE). |
|---|
| Source | ⚠️ https://github.com/zast-ai/vulnerability-reports/blob/main/bytedance/verl_rce.md |
|---|
| User | ZAST.AI (UID 87884) |
|---|
| Submission | 2026-04-02 07:08 AM (24 days ago) |
|---|
| Moderation | 2026-04-22 08:23 PM (21 days later) |
|---|
| Status | Accepted |
|---|
| VulDB entry | 359040 [ByteDance verl up to 0.7.0 prime_math/grader.py math_equal privilege escalation] |
|---|
| Points | 20 |
|---|
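
The description in this entry says that a bracketed model answer reaches eval() when the ground truth is a matrix. The following is a minimal sketch of how such a string could be parsed without executing it, assuming the expected answer is a nested list of numbers. It is not verl's code or its official fix: `parse_matrix_literal` and `_is_numeric_nested_list` are hypothetical helper names introduced here for illustration, and the only library call used is the standard `ast.literal_eval`.

```python
import ast


def _is_numeric_nested_list(value) -> bool:
    """Return True if value is a (possibly nested) list of ints/floats only."""
    if isinstance(value, list):
        return all(_is_numeric_nested_list(item) for item in value)
    # bool is a subclass of int, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def parse_matrix_literal(prediction: str):
    """Parse a model answer such as "[[1, 2], [3, 4]]" without running code.

    Hypothetical helper, not part of verl. Returns the nested list, or None
    if the text is not a plain list literal of numbers.
    """
    text = prediction.strip()
    if not (text.startswith("[") and text.endswith("]")):
        return None
    try:
        # literal_eval accepts Python literals only: function calls, names,
        # and attribute access are rejected instead of being executed.
        value = ast.literal_eval(text)
    except (ValueError, TypeError, SyntaxError, MemoryError, RecursionError):
        return None
    return value if _is_numeric_nested_list(value) else None


if __name__ == "__main__":
    print(parse_matrix_literal("[[1, 2], [3, 4]]"))
    print(parse_matrix_literal("[__import__('os').system('echo pwned')]"))
```

In this sketch, a payload that eval() would execute is rejected because it contains a call expression rather than a literal, so the caller can treat the answer as incorrect. Whether this is sufficient for a given grader depends on what answer formats it needs to accept, which is an assumption here.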