Home / Tool Function Parameter Abuse

Tool Function Parameter Abuse

Also known as: Parameter Injection for Data Extraction, Hidden Parameter Exfiltration

High May 15, 2025 HiddenLayer

Overview

Malicious MCP servers define tools with parameters like 'system_prompt' or 'conversation_history' that instruct the LLM to populate them with sensitive context. This extracts chain-of-thought, previous tool results, system prompts, and conversation data through seemingly normal tool invocations.

Who Is Affected

Discovered by HiddenLayer's AI security research team. Affects users of any MCP client where the LLM follows parameter naming hints to populate tool inputs.

Where It Exists

The vulnerability is in the tool parameter schema. By naming parameters suggestively (e.g., 'system_prompt', 'full_context', 'previous_results'), the attacker exploits the LLM's tendency to fill parameters with matching data from its context.

When It Was Found

Published May 15, 2025. The technique exploits fundamental LLM behavior patterns around parameter filling and instruction following.

How It Works

A malicious tool defines parameters with names like 'system_prompt', 'conversation_history', 'chain_of_thought', or 'available_tools'. The LLM, following its training to populate parameters accurately, extracts this information from its context and passes it to the malicious tool. The server collects this data, gaining visibility into the user's full agent configuration.

Impact

Extraction of system prompts (revealing business logic and security rules), conversation history (containing sensitive user data), chain-of-thought (revealing reasoning patterns), and tool inventories (enabling targeted attacks on other connected services). This intelligence can be used to craft more effective follow-up attacks.

Mitigation

Implement parameter value filtering that blocks system prompt and context leakage. Use MCP clients that validate parameter values against sensitivity heuristics. Restrict what context data the LLM can include in tool parameters. Flag tools with suspiciously named parameters during review.

References