OpenAI Tool Calling - Complete Guide

Overview

OpenAI’s tool calling (formerly “function calling”) allows the AI to request execution of functions/tools and receive results. This document explains how it works in simple terms.


The Basic Flow

User asks question
    ↓
AI decides: "I need to use a tool"
    ↓
AI responds with one or both of:
  - A text message (explaining what it's doing)
  - Tool call request(s) (the actual function to execute)
    ↓
Your app executes the tool
    ↓
Your app sends result back
    ↓
AI processes result and responds to user
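
To make this concrete, here is a minimal sketch of the first round trip using the official openai Python SDK (v1). The bash tool and its schema are hypothetical examples for illustration, not part of the API:

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Declare the tools the model is allowed to request (hypothetical "bash" tool)
tools = [{
    "type": "function",
    "function": {
        "name": "bash",
        "description": "Run a shell command and return its output",
        "parameters": {
            "type": "object",
            "properties": {
                "command": {"type": "string", "description": "The command to run"}
            },
            "required": ["command"]
        }
    }
}]

response = client.chat.completions.create(
    model="gpt-4o",  # any tool-capable model works here
    messages=[{"role": "user", "content": "list all files in this directory"}],
    tools=tools,
)

message = response.choices[0].message
print(message.content)     # the text part (may be None)
print(message.tool_calls)  # the tool call part (may be None)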

Message Types in Conversation History

1. System Message (role: “system”)

  • Sets up the AI’s behavior and context
  • Appears once, at the start of the conversation history
  • Example: “You are Shello CLI - an AI-powered terminal assistant…”

2. User Message (role: “user”)

  • What the user types
  • Example: “list all files in this directory”

3. Assistant Message (role: “assistant”)

  • The AI’s response
  • Can contain:
    • Text content (what the AI says to the user)
    • Tool calls (functions the AI wants to execute)
    • Or both at the same time! ← This is the key point

4. Tool Message (role: “tool”)

  • The result from executing a tool
  • Must reference the tool_call_id
  • Your application sends this
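
Put together, a history with all four roles looks like this (a sketch; the tool call ID and tool output are invented for illustration):

messages = [
    {"role": "system", "content": "You are Shello CLI - an AI-powered terminal assistant..."},
    {"role": "user", "content": "list all files in this directory"},
    {"role": "assistant",
     "content": "Let me check that for you...",
     "tool_calls": [{"id": "call_1", "type": "function",
                     "function": {"name": "bash", "arguments": "{\"command\": \"ls\"}"}}]},
    {"role": "tool", "tool_call_id": "call_1", "content": "file1.py\nfile2.py"},
]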

Key Concept: Assistant Can Send BOTH Text AND Tool Calls

YES! The AI can send a message with:

  • Content: “Let me check that for you…”
  • Tool calls: [{function: "bash", arguments: {"command": "ls"}}]

This is what you see in your logs:

[9] Role: ASSISTANT
────────────────────────────────────────────────────────────────────────────
🔧 Tool Calls: 1

  • Function: bash
    Call ID: rYEbckb86
    Arguments:
      {
        "command": "python -c \"import json; data = {...}; print(json.dumps(data))\""
      }

Sure! Let me test the `analyze_json` tool with a Python code snippet...

Both are present in the same message!


Real Example from Your Logs

Request 1: User asks to test analyze_json

Message [8] - User:

"okay try to use the analyze_json with some input as python code to test out"

Message [9] - Assistant (with tool call):

{
  "role": "assistant",
  "content": "Sure! Let me test the analyze_json tool... Let's start by creating a Python script to generate JSON:",
  "tool_calls": [
    {
      "id": "rYEbckb86",
      "type": "function",
      "function": {
        "name": "bash",
        "arguments": "{\"command\": \"python -c ...\"}"
      }
    }
  ]
}

Message [10] - Tool Result:

{
  "role": "tool",
  "tool_call_id": "rYEbckb86",
  "content": "{\"success\": true, \"output\": \"{\\\"name\\\": \\\"Shello\\\", ...}\"}"
}

Message [11] - Assistant (with another tool call):

{
  "role": "assistant",
  "content": "Now, let's analyze the JSON structure using the analyze_json tool:",
  "tool_calls": [
    {
      "id": "QqRiXviO9",
      "type": "function",
      "function": {
        "name": "analyze_json",
        "arguments": "{\"command\": \"python -c ...\"}"
      }
    }
  ]
}

Message [12] - Tool Result:

{
  "role": "tool",
  "tool_call_id": "QqRiXviO9",
  "content": "{\"success\": true, \"output\": \"jq path | data type...\"}"
}

When Does AI Send What?

Scenario 1: AI just talks (no tools needed)

{
  "role": "assistant",
  "content": "Hello! How can I help you today?"
}

Scenario 2: AI uses tool WITH explanation

{
  "role": "assistant",
  "content": "Let me check that for you...",
  "tool_calls": [{"function": {"name": "bash", "arguments": "{...}"}}]
}

Scenario 3: AI uses tool WITHOUT explanation

{
  "role": "assistant",
  "content": null,
  "tool_calls": [{"function": {"name": "bash", "arguments": "{...}"}}]
}

Most models (including the one you’re using) prefer Scenario 2 - they explain what they’re doing while making the tool call.
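
In code, you can tell the three scenarios apart by checking which fields are present. A minimal sketch, assuming the dict-style message shape shown above (run_tool is a hypothetical executor):

message = response["choices"][0]["message"]

if message.get("tool_calls"):
    if message.get("content"):               # Scenario 2: explanation + tool call
        print(message["content"])
    for tool_call in message["tool_calls"]:  # Scenarios 2 and 3: execute the tools
        run_tool(tool_call)
else:
    print(message["content"])                # Scenario 1: plain text reply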


The Complete Conversation Flow

Here’s what happens in your Shello CLI:

1. User: "test analyze_json with python code"
2. AI Response (Message 9):
   - Content: "Sure! Let me test... Let's start by creating a Python script..."
   - Tool Call: bash(command="python -c ...")

3. Your App:
   - Executes: python -c "..."
   - Gets output: {"name": "Shello", ...}

4. Your App Sends (Message 10):
   - Role: tool
   - Tool Call ID: rYEbckb86
   - Content: {"success": true, "output": "..."}

5. AI Response (Message 11):
   - Content: "Now, let's analyze the JSON structure..."
   - Tool Call: analyze_json(command="python -c ...")

6. Your App:
   - Executes: analyze_json tool
   - Gets output: jq paths

7. Your App Sends (Message 12):
   - Role: tool
   - Tool Call ID: QqRiXviO9
   - Content: {"success": true, "output": "jq path | data type..."}

8. AI Final Response:
   - Content: "Here's the analysis! The JSON has these fields..."
   - No tool calls (done!)
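
In code, this whole flow is usually driven by a loop: call the model, execute any requested tools, append the results, and repeat until a response arrives with no tool calls. A sketch using the openai Python SDK (execute_tool is a hypothetical dispatcher into your own tool implementations):

import json

def run_turn(client, messages, tools):
    while True:
        response = client.chat.completions.create(
            model="gpt-4o", messages=messages, tools=tools
        )
        message = response.choices[0].message

        if not message.tool_calls:
            return message.content  # final answer - done!

        # Keep the assistant message (with its tool calls) in the history
        messages.append(message)

        # Execute each requested tool and send back a matching result
        for tool_call in message.tool_calls:
            args = json.loads(tool_call.function.arguments)
            result = execute_tool(tool_call.function.name, args)  # hypothetical
            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,   # must match the request ID
                "content": json.dumps(result),  # tool results are strings
            })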

Important Rules

1. Tool Call IDs Must Match

  • AI generates a unique ID for each tool call
  • Your tool result MUST reference that exact ID
  • This allows multiple tool calls in parallel

2. Content Can Be Null

  • If content is null but tool_calls exists, AI is just calling tools
  • If both exist, AI is explaining AND calling tools

3. Tool Results Are Always Strings

  • The content field in tool messages must be a string
  • Even if your tool returns JSON, stringify it (see the sketch after these rules)

4. Conversation History Includes Everything

  • System message
  • All user messages
  • All assistant messages (with tool calls)
  • All tool results
  • This maintains context for the AI
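
Rule 3 in practice: even structured results must be serialized before they go into the tool message. A small sketch, reusing the tool call ID from the logs above (the result value is invented):

import json

result = {"success": True, "output": "file1.py\nfile2.py"}  # invented output

tool_message = {
    "role": "tool",
    "tool_call_id": "rYEbckb86",    # must match the assistant's tool call ID
    "content": json.dumps(result),  # a string, never a raw dict
}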

Why This Design?

Q: Why does the AI send text AND tool calls together?

A: It makes the conversation feel natural! The AI can:

  • Explain what it’s about to do
  • Execute the tool
  • Then explain the results

This is better than:

  • Silent tool execution (confusing for users)
  • Explaining after (feels disconnected)

Q: Why separate tool result messages?

A: Because:

  • Tool execution happens OUTSIDE the AI
  • Results come back asynchronously
  • Multiple tools can be called in parallel
  • Each result needs to be matched to its request (via ID)

Debugging Tips

Check Your Logs For:

  1. Assistant messages with tool_calls - What is the AI requesting?
  2. Tool messages with tool_call_id - What results are being sent back?
  3. Matching IDs - Do the IDs match between request and result?
  4. Content field - Is the AI explaining what it’s doing?

Common Issues:

  • Missing tool_call_id: Tool result won’t be matched to request
  • Wrong ID: AI won’t know which tool call this result is for
  • Non-string content: API will reject the message
  • Missing tool result: AI will wait forever (or timeout)
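
A quick way to catch the first three issues is to validate the history before sending it. A debugging sketch, assuming dict-style messages:

def check_history(messages):
    requested = set()
    answered = set()
    for msg in messages:
        if msg.get("role") == "assistant":
            for tc in msg.get("tool_calls") or []:
                requested.add(tc["id"])
        elif msg.get("role") == "tool":
            call_id = msg.get("tool_call_id")
            assert call_id in requested, f"result references unknown ID: {call_id}"
            assert isinstance(msg.get("content"), str), "tool content must be a string"
            answered.add(call_id)
    missing = requested - answered
    assert not missing, f"tool calls still awaiting results: {missing}"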

Summary

The key insight: An assistant message can contain BOTH:

  • Human-readable text (content)
  • Machine-executable requests (tool_calls)

This allows the AI to:

  1. Tell the user what it’s doing
  2. Actually do it
  3. Process the results
  4. Explain the outcome

All in a natural, conversational flow!


Streaming with Tool Calls

How Streaming Works

When you enable streaming (stream: true), the API sends the response in chunks instead of all at once. This allows you to display the AI’s response as it’s being generated.
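
With the official openai Python SDK, streaming is a single flag; you then iterate over chunks as they arrive. A minimal sketch (tools is the same hypothetical schema shown earlier):

stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "list files"}],
    tools=tools,
    stream=True,  # deltas arrive as they are generated
)

for chunk in stream:
    delta = chunk.choices[0].delta
    if delta.content:
        print(delta.content, end="", flush=True)  # show text as it streams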

Streaming Response Format

Instead of getting one complete message, you get multiple delta chunks:

// Chunk 1: Role
{
  "choices": [{
    "delta": {
      "role": "assistant"
    }
  }]
}
 
// Chunk 2: Content starts
{
  "choices": [{
    "delta": {
      "content": "Sure! "
    }
  }]
}
 
// Chunk 3: More content
{
  "choices": [{
    "delta": {
      "content": "Let me test "
    }
  }]
}
 
// Chunk 4: Tool call starts
{
  "choices": [{
    "delta": {
      "tool_calls": [{
        "index": 0,
        "id": "call_abc123",
        "type": "function",
        "function": {
          "name": "bash",
          "arguments": ""
        }
      }]
    }
  }]
}
 
// Chunk 5: Tool arguments (streamed!)
{
  "choices": [{
    "delta": {
      "tool_calls": [{
        "index": 0,
        "function": {
          "arguments": "{\"command\""
        }
      }]
    }
  }]
}
 
// Chunk 6: More arguments
{
  "choices": [{
    "delta": {
      "tool_calls": [{
        "index": 0,
        "function": {
          "arguments": ": \"ls\"}"
        }
      }]
    }
  }]
}
 
// Final chunk
{
  "choices": [{
    "delta": {},
    "finish_reason": "tool_calls"
  }]
}

Key Points About Streaming

1. Content and Tool Calls Can Be Interleaved

The AI might stream:

"Let me check..." → [content chunks]
→ [tool call starts]
→ [tool arguments stream]
→ "that for you." → [more content chunks]

Or it might send all content first, then tool calls:

"Let me check that for you." → [all content]
→ [tool call starts]
→ [tool arguments stream]

2. Tool Arguments Are Streamed in Small Fragments

The JSON arguments don’t come all at once:

Chunk 1: "{"
Chunk 2: "\"command\""
Chunk 3: ": "
Chunk 4: "\"ls -la\""
Chunk 5: "}"

You must accumulate these chunks to build the complete JSON string!

3. Multiple Tool Calls Can Be Streamed

If the AI calls multiple tools, they’re indexed:

{
  "delta": {
    "tool_calls": [{
      "index": 0,  // First tool call
      "function": {"name": "bash", "arguments": "..."}
    }]
  }
}
 
{
  "delta": {
    "tool_calls": [{
      "index": 1,  // Second tool call
      "function": {"name": "analyze_json", "arguments": "..."}
    }]
  }
}

4. Finish Reasons Tell You What Happened

  • finish_reason: "stop" - AI finished naturally (no more to say)
  • finish_reason: "tool_calls" - AI made tool call(s), waiting for results
  • finish_reason: "length" - Hit token limit
  • finish_reason: null - Still streaming (not done yet)
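
In practice you branch on finish_reason once the stream ends. A tiny sketch (tool_calls is the accumulated dict built in the next section; run_tools_and_continue is a hypothetical follow-up step):

if finish_reason == "tool_calls":
    run_tools_and_continue(tool_calls)  # execute, append results, call the API again
elif finish_reason == "stop":
    pass  # this turn is complete
elif finish_reason == "length":
    print("⚠️ Response was cut off by the token limit")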

Accumulating Streamed Data

You need to build up the complete message from chunks. (This sketch assumes raw dict-style JSON chunks; with the official SDK you would read chunk.choices[0].delta as object attributes instead.)

# Initialize accumulators
content = ""
tool_calls = {}  # Dictionary indexed by tool call index
 
for chunk in stream:
    delta = chunk['choices'][0]['delta']
 
    # Accumulate content
    if 'content' in delta and delta['content']:
        content += delta['content']
        print(delta['content'], end='', flush=True)  # Display as it arrives
 
    # Accumulate tool calls
    if 'tool_calls' in delta:
        for tc_delta in delta['tool_calls']:
            index = tc_delta['index']
 
            # Initialize this tool call if first time seeing it
            if index not in tool_calls:
                tool_calls[index] = {
                    'id': tc_delta.get('id', ''),
                    'type': tc_delta.get('type', 'function'),
                    'function': {
                        'name': tc_delta.get('function', {}).get('name', ''),
                        'arguments': ''
                    }
                }
 
            # Accumulate function arguments
            if 'function' in tc_delta and 'arguments' in tc_delta['function']:
                tool_calls[index]['function']['arguments'] += tc_delta['function']['arguments']
 
    # Check if done
    finish_reason = chunk['choices'][0].get('finish_reason')
    if finish_reason:
        break
 
# Now you have:
# - content: Complete text message
# - tool_calls: Complete tool call objects with full JSON arguments
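
One step that's easy to forget: before sending tool results back, the accumulated pieces must be appended to the history as a normal assistant message. A sketch continuing from the accumulators above (messages is assumed to be your conversation history list):

import json

assistant_message = {
    "role": "assistant",
    "content": content or None,  # null if the AI sent no text
    "tool_calls": [tool_calls[i] for i in sorted(tool_calls)],
}
messages.append(assistant_message)

# Only now is it safe to parse the (complete) arguments and execute each tool
for tc in assistant_message["tool_calls"]:
    args = json.loads(tc["function"]["arguments"])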

Real Example from Your Logs

When you see this in your terminal:

Sure! Let me test the `analyze_json` tool with a Python code snippet...
┌─[💻 mapar@Omputer]─[C:\REPO\shello-cli-python]
└─$ python -c "import json; data = {...}"

Here’s what actually happened behind the scenes:

Streaming chunks received:

  1. "Sure! "
  2. "Let me "
  3. "test the "
  4. "`analyze_json`"
  5. " tool..."
  6. Tool call starts: {"index": 0, "id": "rYEbckb86", "function": {"name": "bash"}}
  7. Arguments chunk: "{\"command\""
  8. Arguments chunk: ": \"python"
  9. Arguments chunk: " -c ...\""
  10. Arguments chunk: "}"
  11. Finish reason: "tool_calls"

Your app accumulated all chunks into:

  • Content: "Sure! Let me test the analyze_json tool..."
  • Tool call: {id: "rYEbckb86", function: {name: "bash", arguments: "{\"command\": \"python -c ...\"}"}}

Why Stream Tool Calls?

Q: Why not just send the complete tool call at once?

A: Consistency and flexibility!

  • Same streaming mechanism for everything
  • Allows for very long tool arguments (e.g., large JSON payloads)
  • You can start processing as soon as you have enough data
  • Better user experience (shows progress)

Common Streaming Pitfalls

1. Incomplete JSON Arguments

# ❌ BAD: Using arguments before streaming is complete
for chunk in stream:
    delta = chunk['choices'][0]['delta']
    if 'tool_calls' in delta:
        args = delta['tool_calls'][0]['function']['arguments']
        json.loads(args)  # ERROR! Incomplete JSON!

# ✅ GOOD: Accumulate first, parse after
accumulated_args = ""
for chunk in stream:
    delta = chunk['choices'][0]['delta']
    if 'tool_calls' in delta:
        accumulated_args += delta['tool_calls'][0]['function']['arguments']
    if chunk['choices'][0].get('finish_reason'):
        parsed_args = json.loads(accumulated_args)  # Now it's complete!

2. Not Handling Multiple Tool Calls

# ❌ BAD: Assuming only one tool call
tool_call = {'arguments': ''}
for chunk in stream:
    delta = chunk['choices'][0]['delta']
    if 'tool_calls' in delta:
        tool_call['arguments'] += delta['tool_calls'][0]['function']['arguments']

# ✅ GOOD: Track by index
tool_calls = {}
for chunk in stream:
    delta = chunk['choices'][0]['delta']
    if 'tool_calls' in delta:
        for tc in delta['tool_calls']:
            index = tc['index']
            if index not in tool_calls:
                tool_calls[index] = {'arguments': ''}
            tool_calls[index]['arguments'] += tc.get('function', {}).get('arguments', '')

3. Displaying Tool Calls to User

# ❌ BAD: Showing raw JSON chunks
for chunk in stream:
    delta = chunk['choices'][0]['delta']
    if 'tool_calls' in delta:
        print(delta['tool_calls'][0]['function']['arguments'])
        # Output: {"command"
        #         : "ls -la"
        #         }
        # Looks broken!

# ✅ GOOD: Wait until complete, then display nicely
# Accumulate silently (as in the accumulation example), then show formatted output
if finish_reason == 'tool_calls':
    for tc in tool_calls.values():
        print(f"🔧 Calling {tc['function']['name']}({tc['function']['arguments']})")

Streaming Flow Diagram

User: "list files"
    ↓
[Stream starts]
    ↓
Chunk: {"delta": {"content": "Let "}}
    → Display: "Let "
    ↓
Chunk: {"delta": {"content": "me check..."}}
    → Display: "me check..."
    ↓
Chunk: {"delta": {"tool_calls": [{"index": 0, "id": "abc", "function": {"name": "bash"}}]}}
    → Store: tool_calls[0] = {id: "abc", name: "bash", args: ""}
    ↓
Chunk: {"delta": {"tool_calls": [{"index": 0, "function": {"arguments": "{\"command\""}}]}}
    → Accumulate: tool_calls[0].args += "{\"command\""
    ↓
Chunk: {"delta": {"tool_calls": [{"index": 0, "function": {"arguments": ": \"ls\"}"}}]}}
    → Accumulate: tool_calls[0].args += ": \"ls\"}"
    ↓
Chunk: {"finish_reason": "tool_calls"}
    → Parse: json.loads(tool_calls[0].args) = {"command": "ls"}
    → Execute: bash("ls")
    ↓
[Stream ends]

Summary

Streaming with tool calls means:

  1. Content is streamed word-by-word (or token-by-token)
  2. Tool calls are streamed piece-by-piece
  3. Tool arguments (JSON) are streamed in small fragments
  4. You must accumulate chunks before parsing
  5. Multiple tool calls are tracked by index
  6. finish_reason tells you when streaming is complete

The benefit: Users see progress in real-time, even when the AI is preparing to call tools!