Overview
One of the biggest challenges with MCP (Model Context Protocol) servers in Claude Code has traditionally been context window bloat. Before the introduction of MCP Tool Search, Claude Code would preload the full schema of every connected MCP tool into the model’s context window. This meant that connecting multiple MCP servers could consume tens of thousands of tokens before the first user message was even sent. To solve this problem, Anthropic introduced MCP Tool Search, a lazy-loading mechanism that loads MCP tool definitions only when they are needed.
The Original Problem
Traditional MCP Loading
When Claude Code started:
- Connect to MCP servers
- Call tools/list
- Download all tool definitions
- Inject all tool schemas into the model context
Example:
GitHub MCP
├── create_pr
├── list_issues
├── search_repos
Supabase MCP
├── execute_sql
├── list_projects
Linear MCP
├── get_issue
├── list_projects
All of these tool definitions would immediately become part of Claude’s context.
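The eager flow described above can be sketched in a few lines of Python. This is an illustration only: `FakeServer`, `list_tools`, `tool`, and `build_context_eagerly` are hypothetical names rather than Claude Code internals, and the in-memory server stands in for a real `tools/list` request.

```python
import json
from dataclasses import dataclass, field


@dataclass
class FakeServer:
    """Hypothetical stand-in for a connected MCP server."""
    name: str
    tools: list = field(default_factory=list)

    def list_tools(self):
        # Stands in for the MCP `tools/list` request.
        return self.tools


def tool(name, description):
    # Minimal tool definition: name, description, and an input JSON schema.
    return {
        "name": name,
        "description": description,
        "inputSchema": {"type": "object", "properties": {}},
    }


def build_context_eagerly(servers):
    """Collect every tool definition from every server up front."""
    definitions = []
    for server in servers:
        definitions.extend(server.list_tools())
    return definitions


servers = [
    FakeServer("github", [tool("create_pr", "Open a pull request"),
                          tool("list_issues", "List issues"),
                          tool("search_repos", "Search repositories")]),
    FakeServer("supabase", [tool("execute_sql", "Run a SQL query"),
                            tool("list_projects", "List projects")]),
    FakeServer("linear", [tool("get_issue", "Fetch an issue"),
                          tool("list_projects", "List projects")]),
]

eager_context = build_context_eagerly(servers)
print(len(eager_context), "tool definitions loaded before the first message")
print(len(json.dumps(eager_context)), "characters of schema text")
```

Even with near-empty schemas, every definition is paid for up front, whether or not the session ever calls the tool.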
Why This Was Bad
Tool definitions contain:
- Tool names
- Descriptions
- Parameters
- JSON schemas
- Usage instructions
A large MCP server can contain dozens or hundreds of tools. Example:
7 MCP Servers
↓
150+ Tools
↓
36,000+ Tokens
↓
18% Context Window Consumed
Before the conversation even starts.
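The 18% figure is consistent with a 200,000-token context window. That window size is an assumption here, since the example does not state it. A quick check of the arithmetic:

```python
# Assumption: a 200,000-token context window (not stated in the example).
CONTEXT_WINDOW_TOKENS = 200_000
TOOL_SCHEMA_TOKENS = 36_000  # figure from the example above

share = TOOL_SCHEMA_TOKENS / CONTEXT_WINDOW_TOKENS
remaining = CONTEXT_WINDOW_TOKENS - TOOL_SCHEMA_TOKENS

print(f"{share:.0%} of the window used by tool schemas")
print(f"{remaining:,} tokens left for the conversation")
```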
This reduces:
- Available reasoning space
- Long-term memory capacity
- Conversation length
- Agent performance
Anthropic’s Solution: MCP Tool Search
Instead of loading full schemas upfront, Claude now loads only:
Built-in Tools
+
MCP Search Tool
+
List of MCP Tool Names
What Gets Loaded Initially?
Built-in Tools
These are always available.
Examples:
Read
Write
Edit
Bash
Grep
Glob
WebSearch
MCP Search Tool
A new system tool:
mcp_search
Purpose:
Search for or select MCP tools
and make them available for use.
MCP Tool Catalog
Instead of loading schemas, Claude sees only names.
Example:
supabase_execute_sql
supabase_list_projects
linear_get_issue
linear_list_projects
sentry_get_errors
sentry_list_projects
Notice:
- No descriptions
- No parameters
- No schemas
Only names.
This consumes very little context.
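To make the size difference concrete, here is a small sketch that derives a name-only catalog from full definitions. The definitions and the `to_catalog` helper are made up for illustration, and character counts serve only as a rough proxy for tokens.

```python
import json

# Hypothetical full tool definitions, shaped like MCP tool entries.
full_definitions = [
    {
        "name": "supabase_execute_sql",
        "description": "Run a SQL query against a Supabase project.",
        "inputSchema": {
            "type": "object",
            "properties": {
                "project_id": {"type": "string"},
                "query": {"type": "string"},
            },
            "required": ["project_id", "query"],
        },
    },
    {
        "name": "linear_get_issue",
        "description": "Fetch a single Linear issue by its identifier.",
        "inputSchema": {
            "type": "object",
            "properties": {"issue_id": {"type": "string"}},
            "required": ["issue_id"],
        },
    },
]


def to_catalog(definitions):
    """Keep only the tool names; drop descriptions and schemas."""
    return [d["name"] for d in definitions]


catalog = to_catalog(full_definitions)
full_size = len(json.dumps(full_definitions))
catalog_size = len(json.dumps(catalog))

print(catalog)
print(f"full definitions: {full_size} chars, catalog: {catalog_size} chars")
```

The gap widens as schemas grow, because the catalog cost depends only on the number and length of the names.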
How Tool Search Works
Step 1
Claude sees available tool names. Example user request:
How many users are in my database?
Claude reasons:
I probably need Supabase.
Step 2
Claude searches for the tool.
mcp_search(
select:supabase_execute_sql
)
Step 3
Tool Search returns a reference. Important: the MCP Search tool does NOT return the full schema in the tool response. Instead:
Tool Reference Returned
Step 4
The Claude Code runtime injects the tool definition. This is the most interesting discovery. According to trace analysis, the Claude Code harness dynamically injects the full tool definition into the system tool list.
Before:
System Tools
------------
Read
Write
Edit
Bash
MCP Search
After:
System Tools
------------
Read
Write
Edit
Bash
MCP Search
supabase_execute_sql
The newly loaded tool now behaves exactly like a normal tool.
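Steps 2 through 4 can be sketched as a toy harness. Everything here is hypothetical: the `Harness` class, the `mcp_search` method signature, and the shape of the returned reference are illustrative guesses, not the actual Claude Code implementation.

```python
class Harness:
    """Toy model of a runtime that lazily exposes MCP tools."""

    def __init__(self, builtin_tools, mcp_registry):
        self.builtin_tools = list(builtin_tools)
        self.mcp_registry = dict(mcp_registry)  # name -> full definition
        self.loaded = {}                        # definitions injected so far

    def mcp_search(self, select):
        """Return a lightweight reference and inject the full definition."""
        if select not in self.mcp_registry:
            raise KeyError(f"unknown MCP tool: {select}")
        # The injection happens inside the harness, not in the tool result.
        self.loaded[select] = self.mcp_registry[select]
        # Hypothetical reference shape: a name, but no schema.
        return {"type": "tool_reference", "name": select}

    def system_tools(self):
        """Names of every tool the model can currently call."""
        return self.builtin_tools + ["mcp_search"] + list(self.loaded)


registry = {
    "supabase_execute_sql": {
        "name": "supabase_execute_sql",
        "description": "Run a SQL query.",
        "inputSchema": {"type": "object",
                        "properties": {"query": {"type": "string"}}},
    },
}

harness = Harness(["Read", "Write", "Edit", "Bash"], registry)
print("before:", harness.system_tools())

reference = harness.mcp_search(select="supabase_execute_sql")
print("reference:", reference)
print("after:", harness.system_tools())
```

The point of the design is that the search result stays small, while the schema arrives through the same tool list the model already knows how to use.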
Does the MCP Tool Become a System Tool?
Short Answer
Effectively: Yes. Architecturally: No.
Explanation
The tool is still an MCP tool. However, once loaded, its schema is injected into the active tool set that Claude can use. From Claude’s perspective:
Read
Write
Edit
Bash
supabase_execute_sql
are all available tools. The model does not care where the tool originated.
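One way to picture "effectively yes, architecturally no" is a harness that shows the model a flat list of names while privately tracking each tool's origin for routing. The sketch below is an assumption about how such routing could look; `ORIGINS`, `visible_tools`, `dispatch`, and the two handler functions are hypothetical.

```python
def read_file(path):
    # Stand-in for a built-in tool handler.
    return f"<contents of {path}>"


def call_mcp(server, tool_name, arguments):
    # Stand-in for forwarding a call to an MCP server.
    return f"{server}:{tool_name}({arguments})"


# The harness tracks where each tool came from...
ORIGINS = {
    "Read": ("builtin", None),
    "supabase_execute_sql": ("mcp", "supabase"),
}


def visible_tools():
    # ...but the model is shown one flat list of names.
    return sorted(ORIGINS)


def dispatch(tool_name, arguments):
    """Route a tool call to the right backend based on its origin."""
    kind, server = ORIGINS[tool_name]
    if kind == "builtin":
        # Only one built-in exists in this sketch, so call it directly.
        return read_file(**arguments)
    return call_mcp(server, tool_name, arguments)


print(visible_tools())
print(dispatch("Read", {"path": "README.md"}))
print(dispatch("supabase_execute_sql", {"query": "select count(*) from users"}))
```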
Better Mental Model
Think of Claude’s available tools as:
Available Tools
=
Built-in Tools
+
Loaded MCP Tools
Initially:
Available Tools
=
Built-in Tools
+
[]
After searching:
Available Tools
=
Built-in Tools
+
[supabase_execute_sql]
After another search:
Available Tools
=
Built-in Tools
+
[
supabase_execute_sql,
linear_get_issue
]
Context Growth During a Session
Every newly loaded MCP tool remains available.
Example:
Tool A Loaded
↓
Tool B Loaded
↓
Tool C Loaded
All remain loaded.
Context usage increases gradually.
Example:
Start:
0 MCP Tool Tokens
After Tool A:
1.4k Tokens
After Tool B:
2.6k Tokens
After Tool C:
4k Tokens
This is much better than loading everything upfront.
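The running totals above imply per-tool costs of roughly 1.4k, 1.2k, and 1.4k tokens. The short sketch below recomputes them and compares the result with the 36,000-token upfront cost from the earlier 7-server example; the per-tool split is inferred from the totals, not measured.

```python
from itertools import accumulate

# Per-tool costs implied by the running totals above (1.4k, 2.6k, 4k).
tool_costs = {"Tool A": 1_400, "Tool B": 1_200, "Tool C": 1_400}
EAGER_COST = 36_000  # upfront cost from the earlier 7-server example

running_totals = list(accumulate(tool_costs.values()))
for name, total in zip(tool_costs, running_totals):
    print(f"after {name}: {total:,} MCP tool tokens")

saved = EAGER_COST - running_totals[-1]
print(f"still {saved:,} tokens below the eager-loading cost")
```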
Why Keep Loaded Tools?
Anthropic likely chose this design because:
1. Repeated Usage
Agents often call the same tool repeatedly. Example:
execute_sql
execute_sql
execute_sql
execute_sql
Reloading every time would be inefficient.
2. Better Attention
Models tend to pay more attention to:
- Beginning of context
- End of context
Keeping tool schemas in the system area, near the beginning of the context, may improve recall.
3. Simpler Architecture
Existing tool-calling infrastructure continues to work.
No need to invent:
Special Lazy Tool Calls
Instead:
Search
↓
Load Tool
↓
Use Tool Normally
What Happens During Compaction?
This was one of the most interesting findings.
After:
/compact
Claude removes all dynamically loaded MCP tools.
Before compaction:
Loaded Tools
------------
supabase_execute_sql
linear_get_issue
sentry_get_errorsAfter compaction:
Loaded Tools
------------
None
Only:
Built-in Tools
+
MCP Search Tool
remain.
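The reset behavior can be modeled with a toy session object. This is a sketch of the observed behavior under stated assumptions, not the real implementation: the `Session` class and its `load`, `compact`, and `available` methods are hypothetical.

```python
class Session:
    """Toy session that tracks which MCP tools are currently loaded."""

    BASE_TOOLS = ("Read", "Write", "Edit", "Bash", "mcp_search")

    def __init__(self):
        self.loaded_mcp_tools = []

    def load(self, tool_name):
        # Loading is idempotent: a tool already present is not added twice.
        if tool_name not in self.loaded_mcp_tools:
            self.loaded_mcp_tools.append(tool_name)

    def compact(self):
        # Models the observed /compact behavior: loaded MCP tools are dropped.
        self.loaded_mcp_tools.clear()

    def available(self):
        # Built-in tools and the search tool always survive.
        return list(self.BASE_TOOLS) + self.loaded_mcp_tools


session = Session()
for name in ["supabase_execute_sql", "linear_get_issue", "sentry_get_errors"]:
    session.load(name)
print("before /compact:", session.loaded_mcp_tools)

session.compact()
print("after /compact: ", session.loaded_mcp_tools)

# A tool that is needed again goes through search and load once more.
session.load("supabase_execute_sql")
print("after reload:   ", session.loaded_mcp_tools)
```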
Why Is This Good?
It restores the context window.
Instead of carrying around 20, 30, or 50 loaded MCP tools, Claude starts fresh.
If a tool is needed again:
Search
↓
Load
↓
Use
Complete Lifecycle
Claude Starts
│
▼
Built-in Tools
+
MCP Search Tool
+
Tool Name Catalog
│
▼
User Request
│
▼
Claude Chooses Tool
│
▼
MCP Search
│
▼
Tool Reference Returned
│
▼
Claude Code Runtime
│
▼
Inject Full Tool Schema
│
▼
Tool Appears In System Tools
│
▼
Claude Calls Tool
│
▼
Tool Remains Loaded
│
▼
More Tools Loaded
│
▼
Context Grows
│
▼
/compact
│
▼
Loaded MCP Tools Removed
│
▼
Back To Search-Based Loading
Key Takeaways
Before Tool Search
Connect MCP
↓
Load All Tool Schemas
↓
Huge Context Usage
After Tool Search
Connect MCP
↓
Load Only Tool Names
↓
Search When Needed
↓
Inject Tool Schema
↓
Use Tool
Important Observations
- MCP tools are no longer preloaded.
- Only tool names are initially visible.
- MCP Search loads tools on demand.
- Loaded tools are injected into the active tool list.
- Loaded tools remain available during the session.
- Context usage grows gradually.
- /compact removes loaded MCP tools.
- Tool Search dramatically reduces startup context consumption.
Interview-Style Summary
Claude Code’s MCP Tool Search implements lazy loading for MCP servers. Instead of preloading every MCP tool schema into the context window, Claude initially receives only a lightweight catalog of tool names and a special MCP Search tool. When Claude decides it needs a tool, it first searches for it, after which the Claude Code runtime dynamically injects the full tool definition into the active tool list. The tool remains available for the rest of the session until context compaction occurs, at which point all dynamically loaded MCP tools are removed and can be loaded again on demand.