Overview

One of the biggest challenges with MCP (Model Context Protocol) servers in Claude Code has traditionally been context window bloat. Before the introduction of MCP Tool Search, Claude Code would preload the full schema of every connected MCP tool into the model’s context window. This meant that connecting multiple MCP servers could consume tens of thousands of tokens before the first user message was even sent. To solve this problem, Anthropic introduced MCP Tool Search, a lazy-loading mechanism that loads MCP tool definitions only when they are needed.

The Original Problem

Traditional MCP Loading

When Claude Code started:

Connect to MCP servers
Call tools/list
Download all tool definitions
Inject all tool schemas into the model context
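
The four steps above can be sketched roughly as follows. This is a minimal illustration, not Claude Code's actual implementation: the `SERVERS` dictionary, `list_tools`, and `eager_load` are hypothetical stand-ins, and the definitions are hard-coded instead of being fetched from real servers. The `name`, `description`, and `inputSchema` fields follow the shape of an MCP tool definition.

```python
import json

# Hypothetical in-memory stand-ins for connected MCP servers. A real client
# would receive these definitions in each server's tools/list response.
SERVERS = {
    "github": [
        {"name": "create_pr", "description": "Open a pull request",
         "inputSchema": {"type": "object",
                         "properties": {"title": {"type": "string"}}}},
        {"name": "list_issues", "description": "List issues in a repository",
         "inputSchema": {"type": "object",
                         "properties": {"repo": {"type": "string"}}}},
    ],
    "supabase": [
        {"name": "execute_sql", "description": "Run a SQL query",
         "inputSchema": {"type": "object",
                         "properties": {"query": {"type": "string"}}}},
    ],
}


def list_tools(server):
    """Stand-in for a tools/list call to a single server."""
    return SERVERS[server]


def eager_load(servers):
    """Collect every tool definition from every server up front."""
    context_tools = []
    for server in servers:
        context_tools.extend(list_tools(server))
    return context_tools


tools = eager_load(SERVERS)
# Every full definition is serialized into the context, used or not.
payload = json.dumps(tools)
print(len(tools), "tool definitions,", len(payload), "characters injected")
```

The cost grows with the number of connected servers and tools, regardless of which tools the conversation ends up needing.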

Example:

GitHub MCP
├── create_pr
├── list_issues
├── search_repos
 
Supabase MCP
├── execute_sql
├── list_projects
 
Linear MCP
├── get_issue
├── list_projects

All of these tool definitions would immediately become part of Claude’s context.

Why This Was Bad

Tool definitions contain:

Tool names
Descriptions
Parameters
JSON schemas
Usage instructions

A large MCP server can contain dozens or hundreds of tools. Example:

7 MCP Servers
↓
150+ Tools
↓
36,000+ Tokens
↓
18% Context Window Consumed

Before the conversation even starts.
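
The 18% figure can be checked with a quick calculation. The 200,000-token context window below is an assumption that the text above does not state; it is the window size at which 36,000 tokens works out to 18%.

```python
# Assumption: a 200,000-token context window (not stated in the text above).
CONTEXT_WINDOW_TOKENS = 200_000
tool_schema_tokens = 36_000  # figure from the example above

share = tool_schema_tokens / CONTEXT_WINDOW_TOKENS
remaining = CONTEXT_WINDOW_TOKENS - tool_schema_tokens

print(f"{share:.0%} consumed, {remaining:,} tokens left")
# prints "18% consumed, 164,000 tokens left"
```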

This reduces:

Available reasoning space
Long-term memory capacity
Conversation length
Agent performance

Anthropic’s Solution: MCP Tool Search

Instead of loading full schemas upfront, Claude now loads:

Built-in Tools
+
MCP Search Tool
+
List of MCP Tool Names

Only.

What Gets Loaded Initially?

Built-in Tools

These are always available.

Examples:

Read
Write
Edit
Bash
Grep
Glob
WebSearch

MCP Search Tool

A new system tool:

mcp_search

Purpose:

Search for or select MCP tools
and make them available for use.

MCP Tool Catalog

Instead of loading schemas, Claude sees only names.

Example:

supabase_execute_sql
supabase_list_projects
 
linear_get_issue
linear_list_projects
 
sentry_get_errors
sentry_list_projects

Notice:

No descriptions
No parameters
No schemas

Only names.

This consumes very little context.
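
To illustrate why a names-only catalog is so much smaller, here is a rough sketch that derives one from full definitions and compares the serialized sizes. The definitions, the `build_catalog` helper, and the `server_tool` naming scheme are assumptions made for illustration; character counts are only a crude proxy for tokens.

```python
import json

# Hypothetical full tool definitions, keyed by server name.
FULL_DEFINITIONS = {
    "supabase": [
        {"name": "execute_sql",
         "description": "Run a SQL query against a project database",
         "inputSchema": {"type": "object",
                         "properties": {"query": {"type": "string"}},
                         "required": ["query"]}},
        {"name": "list_projects",
         "description": "List all projects in the account",
         "inputSchema": {"type": "object", "properties": {}}},
    ],
    "linear": [
        {"name": "get_issue",
         "description": "Fetch a single issue by its identifier",
         "inputSchema": {"type": "object",
                         "properties": {"id": {"type": "string"}},
                         "required": ["id"]}},
    ],
}


def build_catalog(definitions):
    """Keep only prefixed tool names; drop descriptions and schemas."""
    return [
        f"{server}_{tool['name']}"
        for server, tools in definitions.items()
        for tool in tools
    ]


catalog = build_catalog(FULL_DEFINITIONS)
full_size = len(json.dumps(FULL_DEFINITIONS))
catalog_size = len(json.dumps(catalog))

print(catalog)
print(f"full definitions: {full_size} chars, catalog: {catalog_size} chars")
```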

How Tool Search Works

Step 1

Claude sees available tool names. Example user request:

How many users are in my database?

Claude reasons:

I probably need Supabase.

Step 2

Claude searches for the tool.

mcp_search(
  select:supabase_execute_sql
)

Step 3

Tool Search returns a reference. Importantly, the MCP Search tool does not return the full schema in the tool response. Instead:

Tool Reference Returned

Step 4

The Claude Code runtime injects the tool definition.

This is the most interesting discovery. According to trace analysis, the Claude Code harness dynamically injects the full tool definition into the system tool list.

Before:

System Tools
------------
Read
Write
Edit
Bash
MCP Search

After:

System Tools
------------
Read
Write
Edit
Bash
MCP Search
 
supabase_execute_sql

The newly loaded tool now behaves exactly like a normal tool.
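
Steps 2 through 4 can be sketched as a small harness. This is an assumption-laden illustration, not the real Claude Code runtime: the `Harness` class, the `REGISTRY` dictionary, and the shape of the returned reference are all hypothetical. What it shows is the split described above: the search result carries only a reference, and the harness adds the tool to the system tool list.

```python
BUILT_IN = ["Read", "Write", "Edit", "Bash", "MCP Search"]

# Hypothetical registry of full definitions, held by the harness and
# kept outside the model's context until a tool is selected.
REGISTRY = {
    "supabase_execute_sql": {
        "name": "supabase_execute_sql",
        "description": "Run a SQL query",
        "inputSchema": {"type": "object",
                        "properties": {"query": {"type": "string"}}},
    },
}


class Harness:
    def __init__(self, registry):
        self.registry = registry
        self.system_tools = list(BUILT_IN)

    def mcp_search(self, select):
        """Return only a reference to the tool, never its schema."""
        if select not in self.registry:
            raise KeyError(f"unknown MCP tool: {select}")
        # The reference shape is invented for this sketch.
        return {"type": "tool_reference", "tool_name": select}

    def inject(self, reference):
        """Add the referenced tool to the system tool list."""
        name = reference["tool_name"]
        if name not in self.system_tools:
            self.system_tools.append(name)
        return self.registry[name]


harness = Harness(REGISTRY)
reference = harness.mcp_search("supabase_execute_sql")
definition = harness.inject(reference)

print(reference)             # no schema in the search result
print(harness.system_tools)  # the tool now sits beside the built-ins
```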

Does the MCP Tool Become a System Tool?

Short Answer

Effectively: Yes. Architecturally: No.

Explanation

The tool is still an MCP tool. However, once loaded, its schema is injected into the active tool set that Claude can use. From Claude’s perspective:

Read
Write
Edit
Bash
supabase_execute_sql

are all available tools. The model does not care where the tool originated.

Better Mental Model

Think of Claude’s available tools as:

Available Tools
=
Built-in Tools
+
Loaded MCP Tools

Initially:

Available Tools
=
Built-in Tools
+
[]

After searching:

Available Tools
=
Built-in Tools
+
[supabase_execute_sql]

After another search:

Available Tools
=
Built-in Tools
+
[
  supabase_execute_sql,
  linear_get_issue
]

Context Growth During a Session

Every newly loaded MCP tool remains available.

Example:

Tool A Loaded
↓
Tool B Loaded
↓
Tool C Loaded

All remain loaded.

Context usage increases gradually.

Example:

Start:
0 MCP Tool Tokens
 
After Tool A:
1.4k Tokens
 
After Tool B:
2.6k Tokens
 
After Tool C:
4k Tokens

This is much better than loading everything upfront.
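
The running totals above are a cumulative sum of per-tool schema sizes. A short sketch makes that explicit; the per-tool sizes are derived from the example totals (1.4k, 2.6k, 4k), and the 36,000-token upfront figure is the one quoted earlier.

```python
from itertools import accumulate

# Per-tool schema sizes in tokens, derived from the running totals above.
# The tool names are placeholders.
tool_tokens = {"Tool A": 1_400, "Tool B": 1_200, "Tool C": 1_400}

running = list(accumulate(tool_tokens.values()))
for name, total in zip(tool_tokens, running):
    print(f"After {name}: {total:,} tokens")

UPFRONT_TOKENS = 36_000  # eager-loading figure from the earlier example
saved = UPFRONT_TOKENS - running[-1]
print(f"Still {saved:,} tokens below loading everything upfront")
```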

Why Keep Loaded Tools?

Anthropic likely chose this design because:

1. Repeated Usage

Agents often call the same tool repeatedly. Example:

execute_sql
execute_sql
execute_sql
execute_sql

Reloading every time would be inefficient.

2. Better Attention

Models pay more attention to:

Beginning of context
End of context

Keeping tool schemas in the system area, at the beginning of the context, may improve recall.

3. Simpler Architecture

Existing tool-calling infrastructure continues to work.

No need to invent:

Special Lazy Tool Calls

Instead:

Search
↓
Load Tool
↓
Use Tool Normally

What Happens During Compaction?

This was one of the most interesting findings.

After:

/compact

Claude removes all dynamically loaded MCP tools.

Before compaction:

Loaded Tools
------------
supabase_execute_sql
linear_get_issue
sentry_get_errors

After compaction:

Loaded Tools
------------
None

Only:

Built-in Tools
+
MCP Search Tool

remain.

Why Is This Good?

It frees up space in the context window.

Instead of carrying around 20, 30, or 50 loaded MCP tools, Claude starts fresh.

If a tool is needed again:

Search
↓
Load
↓
Use
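
The load, compact, and reload behavior described above can be modeled with a small session object. The `Session` class and its method names are hypothetical; this is a sketch of the observed behavior, not the actual Claude Code internals.

```python
BUILT_IN = ["Read", "Write", "Edit", "Bash", "MCP Search"]


class Session:
    def __init__(self, built_in):
        self.built_in = list(built_in)
        self.loaded = []  # dynamically loaded MCP tools

    def load(self, name):
        """Load an MCP tool once; repeated loads are no-ops."""
        if name not in self.loaded:
            self.loaded.append(name)

    def available(self):
        """Available tools = built-in tools + loaded MCP tools."""
        return self.built_in + self.loaded

    def compact(self):
        """Model /compact: drop every dynamically loaded MCP tool."""
        self.loaded.clear()


session = Session(BUILT_IN)
for tool in ["supabase_execute_sql", "linear_get_issue", "sentry_get_errors"]:
    session.load(tool)
before = session.available()

session.compact()
after = session.available()

# A tool needed again goes through search and load once more.
session.load("supabase_execute_sql")
reloaded = session.available()

print(len(before), len(after), len(reloaded))
```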

Complete Lifecycle

Claude Starts
      │
      ▼
Built-in Tools
+
MCP Search Tool
+
Tool Name Catalog
 
      │
      ▼
User Request
 
      │
      ▼
Claude Chooses Tool
 
      │
      ▼
MCP Search
 
      │
      ▼
Tool Reference Returned
 
      │
      ▼
Claude Code Runtime
 
      │
      ▼
Inject Full Tool Schema
 
      │
      ▼
Tool Appears In System Tools
 
      │
      ▼
Claude Calls Tool
 
      │
      ▼
Tool Remains Loaded
 
      │
      ▼
More Tools Loaded
 
      │
      ▼
Context Grows
 
      │
      ▼
/compact
 
      │
      ▼
Loaded MCP Tools Removed
 
      │
      ▼
Back To Search-Based Loading

Key Takeaways

Before Tool Search

Connect MCP
↓
Load All Tool Schemas
↓
Huge Context Usage

After Tool Search

Connect MCP
↓
Load Only Tool Names
↓
Search When Needed
↓
Inject Tool Schema
↓
Use Tool

Important Observations

MCP tools are no longer preloaded.
Only tool names are initially visible.
MCP Search loads tools on demand.
Loaded tools are injected into the active tool list.
Loaded tools remain available during the session.
Context usage grows gradually.
/compact removes loaded MCP tools.
Tool Search dramatically reduces startup context consumption.

Interview-Style Summary

Claude Code’s MCP Tool Search implements lazy loading for MCP servers. Instead of preloading every MCP tool schema into the context window, Claude initially receives only a lightweight catalog of tool names and a special MCP Search tool. When Claude decides it needs a tool, it first searches for it, after which the Claude Code runtime dynamically injects the full tool definition into the active tool list. The tool remains available for the rest of the session until context compaction occurs, at which point all dynamically loaded MCP tools are removed and can be loaded again on demand.
