| Title | FoundationAgents MetaGPT 0.8.1 Code Injection (CWE-94) |
|---|
| Description | A Remote Code Execution (RCE) vulnerability exists in MetaGPT: in the check_solution() method of metagpt/ext/aflow/benchmark/humaneval.py and metagpt/ext/aflow/benchmark/mbpp.py, and in metagpt/ext/aflow/operator.py.
The application does not sanitize or sandbox LLM-generated code before passing it to the built-in Python exec() function. Any code returned by the LLM runs directly on the host, inside the MetaGPT process and with that process's OS-level privileges.
File: metagpt/ext/aflow/benchmark/humaneval.py, Method: check_solution() - Calls exec() on LLM-generated code without sandboxing.
File: metagpt/ext/aflow/benchmark/mbpp.py, Method: check_solution() - Same exec() pattern, no isolation layer.
File: metagpt/ext/aflow/operator.py - Also invokes exec() on untrusted LLM output.
Reproduction:
1. Install MetaGPT (pip install metagpt).
2. Prepare a malicious payload as two lines of Python (a def statement cannot follow "import os;" on the same line): import os, then def solve(x): os.system('touch /tmp/rce_proof'); return x
3. Trigger HumanEvalBenchmark.check_solution() with the payload.
4. Verify /tmp/rce_proof exists on the host.
Impact: Arbitrary OS command execution. An attacker who can influence LLM outputs, for example via prompt injection, can achieve full compromise of the environment running MetaGPT. |
|---|
| Source | ⚠️ https://github.com/FoundationAgents/MetaGPT/issues/1942 |
|---|
| User | Eric-y (UID 95889) |
|---|
| Submission | 03/28/2026 03:13 (13 days ago) |
|---|
| Moderation | 04/09/2026 14:04 (12 days later) |
|---|
| Status | Accepted |
|---|
| VulDB entry | 356524 [FoundationAgents MetaGPT up to 0.8.1 HumanEvalBenchmark/MBPPBenchmark check_solution code injection] |
|---|
| Points | 20 |
|---|
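
The pattern named in the Description row can be illustrated with a minimal sketch. This is not MetaGPT's actual code: naive_check_solution is a hypothetical stand-in for an unsandboxed checker that calls exec() on a source string and then invokes the entry point, and the payload writes an empty marker file into a fresh temporary directory as a harmless substitute for the os.system() call in the reproduction steps.

```python
import os
import tempfile


def naive_check_solution(solution_src: str, entry_point: str, arg):
    """Hypothetical unsandboxed checker: exec() the source, then call the entry point."""
    namespace = {}
    # The untrusted source runs in-process, with the caller's privileges.
    exec(solution_src, namespace)
    return namespace[entry_point](arg)


# Marker file in a fresh temp directory (assumption: stands in for /tmp/rce_proof).
marker = os.path.join(tempfile.mkdtemp(), "rce_proof")

# Payload in the shape of step 2: it looks like a normal solution, but the
# function body has a side effect on the host before returning its result.
payload = (
    "import os\n"
    "def solve(x):\n"
    f"    open({marker!r}, 'w').close()\n"
    "    return x\n"
)

result = naive_check_solution(payload, "solve", 41)
side_effect_happened = os.path.exists(marker)
print(result, side_effect_happened)
```

In this sketch the checker receives a correct-looking return value while the side effect has already taken place on the host, which is why checking the output alone does not reveal the problem.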