Submit #791693: FoundationAgents MetaGPT 0.8.1 Code Injection (CWE-94)

Title	FoundationAgents MetaGPT 0.8.1 Code Injection (CWE-94)
Description	A Remote Code Execution (RCE) vulnerability exists in MetaGPT. The check_solution() method in metagpt/ext/aflow/benchmark/humaneval.py and metagpt/ext/aflow/benchmark/mbpp.py, as well as code in metagpt/ext/aflow/operator.py, passes LLM-generated code to Python's built-in exec() function without sanitizing or sandboxing it. Any code returned by the LLM therefore runs directly on the host system with the full privileges of the MetaGPT process.
Affected code:
- File: metagpt/ext/aflow/benchmark/humaneval.py, Method: check_solution(). Calls exec() on LLM-generated code without sandboxing.
- File: metagpt/ext/aflow/benchmark/mbpp.py, Method: check_solution(). Same exec() pattern, no isolation layer.
- File: metagpt/ext/aflow/operator.py. Also invokes exec() on untrusted LLM output.
Reproduction:
1. Install MetaGPT (pip install metagpt).
2. Prepare a malicious payload:
       import os
       def solve(x):
           os.system('touch /tmp/rce_proof')
           return x
3. Trigger HumanEvalBenchmark.check_solution() with the payload.
4. Verify that /tmp/rce_proof exists on the host.
Impact: Arbitrary OS command execution. An attacker who can influence LLM outputs, for example via prompt injection, can achieve full environment compromise.
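The pattern described above can be illustrated with a minimal, self-contained sketch. This is not MetaGPT's actual code: check_solution_unsafe() and check_solution_subprocess() are hypothetical stand-ins, and the payload writes a harmless marker file into a temporary directory instead of running a shell command. The second function shows one possible hardening step, assuming a separate interpreter process with a timeout is acceptable. It is not a sandbox, since the child process can still run OS commands.

```python
import os
import subprocess
import sys
import tempfile


def check_solution_unsafe(solution_src: str, entry_point: str, arg):
    """Hypothetical stand-in for the exec()-based pattern in the report."""
    namespace = {}
    # exec() runs the untrusted source inside the harness process itself,
    # so any side effect in the source happens with the harness's privileges.
    exec(solution_src, namespace)
    return namespace[entry_point](arg)


def check_solution_subprocess(solution_src: str, entry_point: str, arg,
                              timeout: float = 5.0):
    """Assumed mitigation sketch: run the code in a separate interpreter.

    This keeps untrusted code out of the harness's memory and bounds its
    runtime, but it does NOT stop the child from touching the filesystem or
    network. Real containment needs a container, VM, or OS-level sandbox.
    """
    runner = f"{solution_src}\nprint(repr({entry_point}({arg!r})))\n"
    with tempfile.TemporaryDirectory() as workdir:
        proc = subprocess.run(
            [sys.executable, "-I", "-c", runner],  # -I: isolated mode
            capture_output=True,
            text=True,
            timeout=timeout,
            cwd=workdir,  # scratch directory, removed afterwards
        )
    return proc.returncode, proc.stdout.strip()


# Harmless demonstration payload: creates a marker file when solve() is called.
marker_dir = tempfile.mkdtemp()
marker = os.path.join(marker_dir, "rce_proof")
payload = (
    "import os\n"
    "def solve(x):\n"
    f"    open({marker!r}, 'w').close()\n"
    "    return x\n"
)

result = check_solution_unsafe(payload, "solve", 7)
side_effect_happened = os.path.exists(marker)
```

After this runs, side_effect_happened is True even though the harness only asked for the return value of solve(), which is the core of the reported issue.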
Source	⚠️ https://github.com/FoundationAgents/MetaGPT/issues/1942
User	Eric-y (UID 95889)
Submission	03/28/2026 03:13 (13 days ago)
Moderation	04/09/2026 14:04 (12 days later)
Status	Accepted
VulDB entry	356524 [FoundationAgents MetaGPT up to 0.8.1 HumanEvalBenchmark/MBPPBenchmark check_solution code injection]
Points	20
