Rug Pulls: Silent Tool Redefinition | The Vulnerable MCP Project

Overview

MCP tools can silently mutate their own definitions after initial user approval. A tool that appears safe during installation can later change its behavior to perform malicious actions without notifying the user.

Who Is Affected

Identified by Simon Willison, a prominent AI security researcher. Affects all MCP users who approve tools based on their initial descriptions, particularly those using clients that don't track description changes.

Where It Exists

The vulnerability exists in the MCP protocol's tool listing mechanism. Since tool descriptions are fetched dynamically from the server on each session, a server can return different descriptions at different times.

When It Was Found

Publicly discussed on April 9, 2025. The vulnerability is architectural—the MCP spec does not mandate immutable tool definitions or change notification.

How It Works

An attacker publishes a useful MCP server with benign tool descriptions. After gaining user trust and approval, the server updates its tool descriptions to include malicious instructions (e.g., rerouting API keys to attacker infrastructure). Most MCP clients do not alert users to description changes after initial approval.

Impact

Previously approved tools can be weaponized at any time. Attackers can steal credentials, redirect API calls, exfiltrate data, or inject malicious behavior into trusted workflows. The trust established during initial approval becomes a liability.

Mitigation

Use MCP clients that hash and track tool descriptions, alerting on any changes. Re-approve tools whenever descriptions change. Pin tool versions where possible. Implement ETDI tool signing to detect unauthorized modifications.

References

Simon Willison: MCP Prompt Injection Risks