Mastering AI Agents: Tool Use and Function Calling

1. Concept Introduction

At its core, an AI agent is a system that perceives its environment and takes actions to achieve its goals. Early AI agents were often limited to a predefined set of actions within a simulated world. However, the power of modern agents, especially those built on Large Language Models (LLMs), comes from their ability to use tools.

In simple terms: Imagine you ask a very smart assistant to tell you the weather. The assistant doesn’t inherently know the weather; it’s just a language expert. To answer your question, it needs to look at a weather app. That “weather app” is a tool. Tool use is the ability of an AI agent to select and use external resources—like APIs, databases, or other code functions—to acquire information or perform actions that it cannot do on its own.

Technically speaking: “Function calling” is the mechanism that enables this. An LLM, when prompted with a user request, can determine that it needs to execute a function to fulfill the request. Instead of just generating a text response, the model outputs a structured JSON object containing the name of the function to call and the arguments to pass to it. The agent’s host environment then executes this function, gets a result, and feeds that result back to the LLM to generate the final, informed response. This turns the LLM from a passive text generator into an active reasoner that can orchestrate external capabilities.
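
For example, an OpenAI-style tool call (the exact shape varies by provider) looks roughly like this:

{
  "id": "call_abc123",
  "type": "function",
  "function": {
    "name": "get_current_weather",
    "arguments": "{\"location\": \"Boston, MA\", \"unit\": \"celsius\"}"
  }
}

Note that "arguments" arrives as a JSON-encoded string, which is why the host code must parse it (e.g., with json.loads) before invoking the function.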

2. Historical & Theoretical Context

The idea of AI using external tools is not new; it has roots in classical AI planning systems, which selected among predefined actions, and in robotics, where agents have always had to act on the world through sensors and effectors.

The recent explosion in this area is due to the remarkable reasoning capabilities of modern LLMs. Models like GPT-4 are trained not just on text, but also on code, which gives them an implicit understanding of function signatures and data structures. This makes them exceptionally good at determining when to call a function and what to pass to it.

3. Algorithms & Flow

The function calling process follows a clear, predictable loop. It’s less of a mathematical algorithm and more of a state-driven control flow.

Pseudocode for a Function-Calling Agent Loop:

function handle_user_request(request):
  // 1. Initial call to the LLM with the user request and the available tools
  available_tools = { "get_weather": get_weather, "get_stock_price": get_stock_price }
  response = llm.generate(
    prompt = request,
    tools = available_tools
  )

  // 2. Check whether the model wants to call a tool
  if response.has_tool_call():
    // 3. Execute the tool call
    tool_name = response.tool_call.name
    tool_args = response.tool_call.arguments

    // Look up and run the actual function
    function_to_run = available_tools[tool_name]
    tool_result = function_to_run(**tool_args)

    // 4. Feed the result back to the LLM for the final answer
    final_response = llm.generate(
      prompt = request,
      previous_context = [response, tool_result]
    )
    return final_response.text

  else:
    // The model answered directly; no tool was needed
    return response.text

This loop is the foundation of many agent architectures.

4. Design Patterns & Architectures

Function calling is a key component in several agent design patterns:

  1. ReAct (Reason + Act): the model alternates between reasoning about the task and calling tools, feeding each tool result back into its next reasoning step.
  2. Plan-and-Execute: the model first drafts a multi-step plan, then carries out each step, often via tool calls, before synthesizing a final answer.
  3. Router agents: a lightweight model call decides which specialized tool (or sub-agent) should handle the request, following the same loop sketched in section 3.

5. Practical Application

Let’s see a simple Python example using the openai library.

import openai
import json

# Assume you have your OpenAI API key set up
client = openai.OpenAI()

# 1. Define the tool (a simple function)
def get_current_weather(location, unit="celsius"):
    """Get the current weather in a given location."""
    weather_info = {
        "location": location,
        "temperature": "22",
        "unit": unit,
        "forecast": ["sunny", "windy"],
    }
    return json.dumps(weather_info)

# 2. Make the first call to the model
messages = [{"role": "user", "content": "What's the weather like in Boston?"}]
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA",
                    },
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                },
                "required": ["location"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="gpt-4-1106-preview",
    messages=messages,
    tools=tools,
    tool_choice="auto",
)

response_message = response.choices[0].message

# 3. Check if the model decided to call the tool
if response_message.tool_calls:
    # 4. Execute the function
    available_functions = {
        "get_current_weather": get_current_weather,
    }
    function_name = response_message.tool_calls[0].function.name
    function_to_call = available_functions[function_name]
    function_args = json.loads(response_message.tool_calls[0].function.arguments)
    
    function_response = function_to_call(
        location=function_args.get("location"),
        unit=function_args.get("unit", "celsius"),  # don't pass None over the default
    )

    # 5. Send the info back to the model to get a natural language response
    messages.append(response_message)  # extend conversation with assistant's reply
    messages.append(
        {
            "tool_call_id": response_message.tool_calls[0].id,
            "role": "tool",
            "name": function_name,
            "content": function_response,
        }
    )
    second_response = client.chat.completions.create(
        model="gpt-4-1106-preview",
        messages=messages,
    )
    print(second_response.choices[0].message.content)
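
The example above handles a single tool call. A more general agent keeps looping until the model stops requesting tools. Here is a minimal sketch of that loop, reusing the client, the tools list, and the available_functions mapping from the example above (the max_steps guard is an assumption added to prevent runaway loops):

def run_agent(messages, max_steps=5):
    """Call the model repeatedly, executing tool calls, until it answers in text."""
    for _ in range(max_steps):
        response = client.chat.completions.create(
            model="gpt-4-1106-preview",
            messages=messages,
            tools=tools,
            tool_choice="auto",
        )
        message = response.choices[0].message
        if not message.tool_calls:
            return message.content  # the model answered directly
        messages.append(message)  # keep the assistant turn in the history
        for tool_call in message.tool_calls:  # also handles parallel tool calls
            function_to_call = available_functions[tool_call.function.name]
            result = function_to_call(**json.loads(tool_call.function.arguments))
            messages.append({
                "tool_call_id": tool_call.id,
                "role": "tool",
                "name": tool_call.function.name,
                "content": result,
            })
    return "Stopped after max_steps rounds of tool calls."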

Frameworks like LangGraph are built almost entirely around this concept, representing the agent’s flow as a graph where nodes are functions and edges are the conditional logic that decides which tool to call next.
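
As a rough illustration (LangGraph's API evolves between versions, so treat this as a sketch rather than canonical usage), its prebuilt ReAct agent wraps essentially the loop shown above:

# Assumes langgraph, langchain-openai, and an OPENAI_API_KEY are set up.
from langgraph.prebuilt import create_react_agent
from langchain_openai import ChatOpenAI
from langchain_core.tools import tool

@tool
def get_current_weather(location: str) -> str:
    """Get the current weather in a given location."""
    # Mock data, as in the example above
    return f'{{"location": "{location}", "temperature": "22", "unit": "celsius"}}'

agent = create_react_agent(ChatOpenAI(model="gpt-4o"), [get_current_weather])
result = agent.invoke({"messages": [("user", "What's the weather in Boston?")]})
print(result["messages"][-1].content)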

6. Comparisons & Tradeoffs

| Method | Pros | Cons | Best For |
| --- | --- | --- | --- |
| Function Calling | Access to real-time, external data. Can perform actions. High reliability for structured tasks. | Higher latency (multiple LLM calls). Costlier. Requires coding the tools. | Tasks requiring up-to-date info, calculations, or interaction with other systems (e.g., booking a flight). |
| Fine-Tuning | Deeply embeds knowledge. Fast inference. Can change the model's style and tone. | Expensive to train. Knowledge is static and can become outdated. Doesn't enable actions. | Teaching an LLM a new, stable domain of knowledge (e.g., medical terminology, a specific coding style). |
| RAG (Retrieval-Augmented Generation) | Access to external knowledge. Cheaper than fine-tuning. Easy to update. | "Dumb" retrieval; no reasoning about the data source. Cannot perform actions. | Question answering over a large corpus of documents (e.g., a customer support bot for your product docs). |

7. Latest Developments & Research

The field is moving fast. A few notable developments:

  1. Toolformer (Schick et al., 2023) showed that an LLM can teach itself when and how to call APIs from only a handful of demonstrations.
  2. Gorilla (Patil et al., 2023) fine-tuned a model specifically for generating correct API calls across thousands of APIs.
  3. Major providers now support parallel tool calls, letting a model request several independent tools in a single turn.
  4. Structured-output modes (such as JSON-schema-constrained decoding) make tool arguments far more reliable to parse.

An open problem is tool discovery: how can an agent find and learn to use a new tool it has never seen before? Another is robustness: ensuring agents fail gracefully when a tool call doesn’t work as expected.

8. Cross-Disciplinary Insight

The concept of tool use in AI has a fascinating parallel in cognitive science: affordance theory. Proposed by psychologist James J. Gibson, an “affordance” is what the environment offers an individual. A chair affords sitting; a knob affords turning.

When we provide an LLM with a set of tools, we are defining its “digital affordances.” The model’s reasoning process involves perceiving the user’s request and mapping it to the affordances provided by its tools. A well-designed toolset gives the agent the right affordances to effectively solve problems in its environment.

9. Daily Challenge / Thought Exercise

Your 30-Minute Challenge:

Write a simple Python agent that has two tools:

  1. get_current_time(timezone): Returns the current time in the specified timezone.
  2. perform_calculation(expression): Takes a string like “5*8” and returns the result.

Your agent should be able to answer questions like "What time is it in Tokyo right now?" and "What is 345 * 72?", choosing the right tool for each.

Use the OpenAI function calling API or a similar library. Focus on the logic of selecting the right tool based on the user’s prompt.
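
If you want a head start, here is one possible sketch of the two tool implementations (the names and details are suggestions, not a reference solution); wiring them into a function-calling loop like the one in section 5 is the real exercise:

from datetime import datetime
from zoneinfo import ZoneInfo  # Python 3.9+
import ast
import operator

def get_current_time(timezone: str = "UTC") -> str:
    """Return the current time in an IANA timezone, e.g. 'America/New_York'."""
    return datetime.now(ZoneInfo(timezone)).strftime("%Y-%m-%d %H:%M:%S %Z")

def perform_calculation(expression: str) -> str:
    """Safely evaluate a simple arithmetic expression like '5*8'."""
    # Walking the AST avoids the security risk of calling eval() on model output.
    ops = {
        ast.Add: operator.add,
        ast.Sub: operator.sub,
        ast.Mult: operator.mul,
        ast.Div: operator.truediv,
    }
    def evaluate(node):
        if isinstance(node, ast.Expression):
            return evaluate(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in ops:
            return ops[type(node.op)](evaluate(node.left), evaluate(node.right))
        raise ValueError("Unsupported expression")
    return str(evaluate(ast.parse(expression, mode="eval")))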

10. References & Further Reading

  1. OpenAI, "Function calling" guide: https://platform.openai.com/docs/guides/function-calling
  2. Schick et al., "Toolformer: Language Models Can Teach Themselves to Use Tools" (2023): https://arxiv.org/abs/2302.04761
  3. Patil et al., "Gorilla: Large Language Model Connected with Massive APIs" (2023): https://arxiv.org/abs/2305.15334
  4. Yao et al., "ReAct: Synergizing Reasoning and Acting in Language Models" (2022): https://arxiv.org/abs/2210.03629
  5. LangGraph documentation: https://langchain-ai.github.io/langgraph/
  6. James J. Gibson, The Ecological Approach to Visual Perception (1979), for the original account of affordances.