ReAct: Reasoning and Acting Pattern for AI Agents

Concept Introduction

Simple Explanation

Imagine asking a friend to help you book a vacation. They don’t just immediately start booking flights - they think out loud: “Okay, I need to first check your budget, then look at available dates, then search for flights…” They alternate between thinking (reasoning) and doing (acting).

ReAct (Reasoning + Acting) is a pattern that makes AI agents work the same way. Instead of just generating actions, the agent explicitly generates its reasoning steps alongside its actions, creating a visible trace of why it’s doing what it’s doing.

Technical Detail

ReAct is a prompting and agent architecture pattern that interleaves reasoning traces and task-specific actions. The agent generates:

  1. Thought: Explicit reasoning about what to do next
  2. Action: A specific tool call or operation to execute
  3. Observation: The result of that action

This creates an interpretable sequence: Thought → Action → Observation → Thought → Action → …

The pattern was introduced in the 2022 paper “ReAct: Synergizing Reasoning and Acting in Language Models” by Yao et al. from Princeton and Google Research.

Historical & Theoretical Context

Origin

The ReAct pattern emerged from research trying to answer: “Why do LLMs struggle with multi-step reasoning tasks that require external information?”

Previous approaches fell into two camps:

  1. Reasoning-only (e.g., chain-of-thought prompting): the model reasons internally but cannot consult external sources, so it is prone to hallucinating facts and cannot adjust its plan based on new information.
  2. Acting-only: the model emits actions (searches, clicks, API calls) without articulating a plan, which makes multi-step tasks hard to track and errors hard to diagnose.

ReAct unified these approaches by having the model generate both reasoning traces AND actions.

Theoretical Foundation

ReAct builds on:

  1. Chain-of-thought prompting, which showed that eliciting intermediate reasoning steps improves multi-step problem solving.
  2. Prior work on language models that take actions in interactive environments (web navigation, text games, tool APIs) without explicit reasoning traces.

The key insight: reasoning helps the agent plan and track progress, while acting grounds it in real-world feedback.

How ReAct Works

The Algorithm

1. Initialize: Agent receives task/question
2. Loop until task is solved or max iterations:
   a. Generate Thought: Model reasons about current state and next step
   b. Generate Action: Model decides which tool to call with what parameters
   c. Execute Action: System executes the tool call
   d. Generate Observation: System returns result to the agent
   e. Append to context: Add (Thought, Action, Observation) to conversation history
3. Generate Final Answer based on accumulated observations

Example Trace

Question: What is the population of the capital of France?

Thought 1: I need to first identify the capital of France.
Action 1: Search[capital of France]
Observation 1: Paris is the capital and most populous city of France.

Thought 2: Now I know Paris is the capital. I need to find its population.
Action 2: Search[population of Paris]
Observation 2: The population of Paris is approximately 2.1 million (city proper) and 10.9 million (metropolitan area).

Thought 3: I have found the information needed to answer the question.
Action 3: Finish[The population of Paris is approximately 2.1 million]

Design Patterns & Architectures

Core ReAct Architecture

class ReActAgent:
    def __init__(self, llm, tools):
        self.llm = llm
        self.tools = {tool.name: tool for tool in tools}
        self.max_iterations = 10

    def run(self, task):
        context = [{"role": "user", "content": task}]

        for i in range(self.max_iterations):
            # Generate Thought + Action
            response = self.llm.generate(context)

            # Parse the response (extract_thought / extract_action are simple
            # text parsers for the "Thought: ... / Action: Tool[input]" format)
            thought = self.extract_thought(response)
            action_name, action_input = self.extract_action(response)

            # The Finish action ends the loop and returns the final answer
            if action_name == "Finish":
                return action_input

            # Execute the action; report unknown tools back as an observation
            tool = self.tools.get(action_name)
            if tool is None:
                observation = f"Unknown tool: {action_name}"
            else:
                observation = tool.execute(action_input)

            # Append the (Thought, Action, Observation) triple to the context
            context.append({
                "role": "assistant",
                "content": f"Thought: {thought}\nAction: {action_name}[{action_input}]"
            })
            context.append({
                "role": "user",
                "content": f"Observation: {observation}"
            })

        return "Max iterations reached"

Pattern Integration

ReAct fits into broader agent patterns:

  1. Plan-and-Execute: a planner drafts a high-level plan, and a ReAct-style loop carries out each step.
  2. Reflexion / self-critique: failed ReAct trajectories are reflected on, and the lessons are fed into the next attempt.
  3. Retrieval-Augmented Generation (RAG): retrieval is exposed to the ReAct loop as just another tool.

Practical Application

Complete Working Example

from typing import Callable, List

from openai import OpenAI

class Tool:
    def __init__(self, name: str, description: str, func: Callable):
        self.name = name
        self.description = description
        self.func = func
    
    def execute(self, input_str: str) -> str:
        return self.func(input_str)

# Define some simple tools
def search_tool(query: str) -> str:
    """Simulated search - in practice, call a real API"""
    # Keys are lowercase so they match the lowercased lookup below
    knowledge_base = {
        "capital of france": "Paris is the capital of France.",
        "population of paris": "Paris has a population of approximately 2.1 million in the city proper.",
        "eiffel tower height": "The Eiffel Tower is 330 meters (1,083 ft) tall."
    }
    return knowledge_base.get(query.lower(), "No information found.")

def calculator_tool(expression: str) -> str:
    """Restricted calculator - eval with no builtins (still not safe for untrusted input)"""
    try:
        result = eval(expression, {"__builtins__": {}}, {})
        return str(result)
    except Exception:
        return "Invalid expression"

# Create ReAct Agent
class SimpleReActAgent:
    def __init__(self, model="gpt-4"):
        self.model = model
        self.client = OpenAI()  # reads OPENAI_API_KEY from the environment
        self.tools = [
            Tool("Search", "Search for information", search_tool),
            Tool("Calculator", "Perform calculations", calculator_tool),
        ]
        self.max_iterations = 5
    
    def create_prompt(self, task: str, history: List[str]) -> str:
        """Build the ReAct prompt"""
        tool_descriptions = "\n".join([
            f"{tool.name}: {tool.description}" for tool in self.tools
        ])
        
        prompt = f"""You are an agent that solves tasks by reasoning and acting.

Available tools:
{tool_descriptions}

Use this format:
Thought: [your reasoning about what to do next]
Action: [ToolName][input]

When you have the final answer:
Thought: [reasoning about the answer]
Action: Finish[final answer]

Task: {task}

"""
        # Append history
        for entry in history:
            prompt += entry + "\n"
        
        return prompt
    
    def parse_response(self, text: str) -> tuple:
        """Extract thought and action from LLM response"""
        lines = text.strip().split("\n")
        thought = ""
        action_name = ""
        action_input = ""
        
        for line in lines:
            if line.startswith("Thought:"):
                thought = line.replace("Thought:", "").strip()
            elif line.startswith("Action:"):
                action_part = line.replace("Action:", "").strip()
                # Parse "Action: ToolName[input]" - split only on the first bracket
                if "[" in action_part:
                    action_name, remainder = action_part.split("[", 1)
                    action_name = action_name.strip()
                    action_input = remainder.rstrip("]")
        
        return thought, action_name, action_input
    
    def run(self, task: str) -> str:
        """Execute the ReAct loop"""
        history = []
        
        for iteration in range(self.max_iterations):
            # Generate prompt
            prompt = self.create_prompt(task, history)
            
            # Call the LLM (OpenAI Python SDK v1+ client interface)
            response = self.client.chat.completions.create(
                model=self.model,
                messages=[{"role": "user", "content": prompt}],
                temperature=0
            )
            
            llm_output = response.choices[0].message.content
            thought, action_name, action_input = self.parse_response(llm_output)
            
            print(f"\n--- Iteration {iteration + 1} ---")
            print(f"Thought: {thought}")
            print(f"Action: {action_name}[{action_input}]")
            
            # Check if finished
            if action_name == "Finish":
                return action_input
            
            # Execute action
            tool = next((t for t in self.tools if t.name == action_name), None)
            if tool:
                observation = tool.execute(action_input)
            else:
                observation = f"Unknown tool: {action_name}"
            
            print(f"Observation: {observation}")
            
            # Update history
            history.append(f"Thought: {thought}")
            history.append(f"Action: {action_name}[{action_input}]")
            history.append(f"Observation: {observation}")
        
        return "Failed to complete task within max iterations"

# Example usage
if __name__ == "__main__":
    agent = SimpleReActAgent()
    result = agent.run("What is the height of the Eiffel Tower in meters?")
    print(f"\n=== Final Answer ===\n{result}")

Comparisons & Tradeoffs

ReAct vs Chain-of-Thought

Aspect              Chain-of-Thought               ReAct
Reasoning           Internal only                  Internal + grounded in observations
Tool use            No                             Yes
Interpretability    Good                           Excellent
Multi-step tasks    Limited to model knowledge     Can access external info
Error recovery      Difficult                      Natural (observe failure, reason about fix)

ReAct vs Function Calling (Direct)

Aspect              Direct Function Calling        ReAct
Reasoning traces    None                           Explicit
Debuggability       Black box                      Transparent
Tokens used         Fewer                          More
Complex tasks       Struggles without reasoning    Handles well

Limitations

Token overhead: ReAct uses significantly more tokens than direct function calling due to explicit reasoning traces.

Latency: Each iteration requires an additional LLM call, so multi-step tasks take noticeably longer than a single direct call.

Quality depends on prompting: If the model doesn’t follow the Thought/Action/Observation format, parsing fails and the agent breaks.
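
One common mitigation is to parse the model’s output defensively and re-prompt when the format is violated. A minimal sketch (the regex and the retry prompt are illustrative; llm_call stands in for whatever completion function you use):

import re

def llm_call(prompt: str) -> str:
    """Placeholder for a chat-completion call that returns the model's text."""
    raise NotImplementedError

# Matches "Action: ToolName[input]" anywhere in the output
ACTION_RE = re.compile(r"Action:\s*(\w+)\[(.*)\]", re.DOTALL)

def parse_or_retry(prompt: str, max_retries: int = 2):
    """Parse the action; if the format is violated, remind the model and retry."""
    for _ in range(max_retries + 1):
        output = llm_call(prompt)
        match = ACTION_RE.search(output)
        if match:
            return match.group(1), match.group(2)
        prompt += ("\nYour last reply did not follow the format. "
                   "Reply with exactly:\nThought: ...\nAction: ToolName[input]")
    raise ValueError("Model never produced a parseable Action")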

Can get stuck in loops: Without proper stop conditions, the agent might repeat unsuccessful actions indefinitely.
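
A lightweight guard is to count repeated (action, input) pairs and force a change of course when the agent keeps trying the same thing. A minimal sketch, assuming it is consulted inside the ReAct loop before each action is executed:

from collections import Counter

class LoopGuard:
    """Flags when the agent keeps issuing the same action with the same input."""

    def __init__(self, max_repeats: int = 2):
        self.max_repeats = max_repeats
        self.counts = Counter()

    def allow(self, action_name: str, action_input: str) -> bool:
        key = (action_name, action_input.strip().lower())
        self.counts[key] += 1
        return self.counts[key] <= self.max_repeats

# Inside the loop (sketch):
#     if not guard.allow(action_name, action_input):
#         observation = "You already tried this. Try a different action or query."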

Latest Developments & Research

2023-2025 Research

ReWOO (Reasoning WithOut Observation, 2023): Addresses ReAct’s token inefficiency by separating planning from execution - generate all actions upfront, execute them, then reason about results.
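
A rough sketch of that plan-then-execute split (the prompt wording, the #E1-style placeholders, and the planner/solver callables are illustrative, not the paper’s exact interface):

def rewoo_run(task, planner_llm, solver_llm, tools):
    """Plan all tool calls up front, execute them, then reason once over the results."""
    # 1. One planning call emits the whole tool-call sequence, e.g.
    #    Search[capital of France]
    #    Search[population of #E1]
    plan = planner_llm(f"Break this task into tool calls, one per line: {task}")
    steps = [line for line in plan.splitlines() if "[" in line]

    # 2. Execute every step with no intermediate reasoning calls
    evidence = []
    for step in steps:
        name, arg = step.split("[", 1)
        arg = arg.rstrip("]")
        for i, prior in enumerate(evidence, start=1):
            arg = arg.replace(f"#E{i}", prior)   # substitute earlier results
        evidence.append(tools[name.strip()].execute(arg))

    # 3. A single final call reasons over all collected evidence
    return solver_llm(f"Task: {task}\nEvidence: {evidence}\nAnswer:")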

Self-Refine + ReAct (2024): Combines ReAct with self-critique loops. After generating a thought/action, a separate LLM call critiques the reasoning before executing.
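
A minimal sketch of that critique step, which could wrap the loop body of the SimpleReActAgent above (the critique prompt wording is illustrative):

def critique_step(llm_call, task: str, thought: str, action: str) -> str:
    """Ask a second LLM call to approve or revise a proposed step before executing it."""
    prompt = (
        f"Task: {task}\n"
        f"Proposed step:\nThought: {thought}\nAction: {action}\n\n"
        "If this step is reasonable, reply APPROVE. "
        "Otherwise reply REVISE followed by a better Thought and Action."
    )
    return llm_call(prompt)  # caller re-parses the step if the reply starts with REVISE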

ReAct with Vector Memory (2024-2025): Stores previous (Thought, Action, Observation) sequences in vector databases, allowing agents to retrieve and learn from past reasoning patterns.
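
A toy sketch of that idea with an in-memory store (embed() is a placeholder; a real system would use an embedding model and a vector database):

import math

def embed(text: str) -> list:
    """Placeholder: replace with a real embedding model."""
    raise NotImplementedError

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

class TraceMemory:
    """Stores past ReAct traces and retrieves the most similar ones for a new task."""

    def __init__(self):
        self.entries = []  # list of (embedding, trace_text) pairs

    def add(self, task: str, trace_text: str):
        self.entries.append((embed(task), trace_text))

    def recall(self, task: str, k: int = 2):
        query = embed(task)
        ranked = sorted(self.entries, key=lambda e: cosine(query, e[0]), reverse=True)
        return [trace for _, trace in ranked[:k]]

# Retrieved traces can be prepended to the prompt as worked examples (few-shot).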

Multimodal ReAct (2025): Extending ReAct to vision-language models, where observations can be images or videos, not just text.

Open Problems

Cross-Disciplinary Insight

Cognitive Science Connection

ReAct mirrors the perception-action loop in cognitive science and embodied AI:

  1. Perceive the environment (Observation)
  2. Update internal state and deliberate (Thought)
  3. Act on the environment (Action)
  4. Perceive the outcome and repeat

This loop is fundamental to how humans and animals learn and interact with the world. ReAct implements this loop explicitly in LLM-based agents.

Systems Theory Perspective

From a systems perspective, ReAct creates a closed-loop control system: the LLM acts as the controller, tool calls are the control actions applied to the environment, and observations feed the results back into the controller’s context on the next iteration.

Traditional LLM use is open-loop (no feedback). ReAct closes the loop, enabling error correction and adaptive behavior - core principles in control theory.

Daily Challenge

Coding Exercise (15-20 minutes)

Challenge: Extend the SimpleReActAgent to support a new “Wikipedia” tool that fetches article summaries. Then test it with the question: “Who wrote the novel ‘1984’ and what year was it published?”

Bonus: Add error handling for when observations don’t contain the needed information, prompting the agent to try alternative searches.
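
As a starting point, a new tool can be registered the same way as Search and Calculator. A sketch using the third-party wikipedia package (an assumption; any article-summary API would do):

import wikipedia  # pip install wikipedia

def wikipedia_tool(query: str) -> str:
    """Fetch a short summary of the best-matching Wikipedia article."""
    try:
        return wikipedia.summary(query, sentences=2)
    except wikipedia.exceptions.DisambiguationError as e:
        return f"Ambiguous query; candidates include: {e.options[:5]}"
    except wikipedia.exceptions.PageError:
        return "No article found; try a different search term."

# Register it alongside the existing tools in SimpleReActAgent.__init__:
#     Tool("Wikipedia", "Fetch a Wikipedia article summary", wikipedia_tool)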

Thought Exercise (10 minutes)

Consider this scenario: Your agent uses ReAct to book a flight, but the airline API returns “No flights available” for the requested dates.

  1. What should the agent’s next Thought be?
  2. What alternative Actions could it take?
  3. How would you design the prompt to encourage flexible problem-solving?

References & Further Reading

Foundational Papers

Yao, S., Zhao, J., Yu, D., Du, N., Shafran, I., Narasimhan, K., & Cao, Y. (2022). ReAct: Synergizing Reasoning and Acting in Language Models. arXiv:2210.03629.
Wei, J., et al. (2022). Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. arXiv:2201.11903.

Framework Implementations

LangChain: ships ReAct-style agents as part of its agent framework.
LlamaIndex: provides a ReActAgent for tool-using workflows.


Tomorrow’s topic: We’ll explore Tool Use and Function Calling in depth - examining how agents decide which tools to use, handle tool failures, and compose complex tool sequences.