Goal-Oriented Action Planning (GOAP) for AI Agents

Concept Introduction

Simple explanation: GOAP is like having a GPS for decisions. You tell the agent where you want to end up (the goal), and it figures out the series of actions to get there, automatically finding alternatives if one path is blocked.

Technical detail: GOAP is a real-time planning architecture where an agent dynamically constructs action sequences to satisfy goal conditions. Unlike finite state machines or behavior trees that predefine transitions, GOAP agents search through possible actions at runtime, selecting those whose effects satisfy preconditions of subsequent actions until the goal is reached.
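The usual GOAP encoding represents world state and action preconditions/effects as flat dictionaries of boolean facts. A minimal sketch (the fact names here are illustrative, not from any particular engine):

```python
# World state and an action as dictionaries of boolean facts.
state = {"has_ammo": False, "weapon_drawn": True}

action = {
    "name": "reload",
    "preconditions": {"weapon_drawn": True},
    "effects": {"has_ammo": True},
}

def applicable(state, action):
    """An action can run when every precondition holds in the state."""
    return all(state.get(k) == v for k, v in action["preconditions"].items())

def apply(state, action):
    """Applying an action overlays its effects onto the state."""
    return {**state, **action["effects"]}

if applicable(state, action):
    state = apply(state, action)
print(state)  # {'has_ammo': True, 'weapon_drawn': True}
```

The planner's job is then to chain `apply` calls so that the goal conditions end up true.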

Historical & Theoretical Context

GOAP was pioneered by Jeff Orkin for the 2005 game F.E.A.R., revolutionizing game AI by allowing NPCs to exhibit believable, adaptive behavior. The architecture draws from STRIPS (Stanford Research Institute Problem Solver) from the 1970s, which formalized planning as searching through states defined by predicates.

GOAP connects to classical AI planning, but optimizes for real-time performance by limiting plan horizon, using precomputed heuristics, and caching common plans. It represents a middle ground between fully reactive systems and heavyweight planners.
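One of those real-time optimizations, plan caching, can be sketched in a few lines: identical (state, goal) pairs reuse a previously computed plan instead of re-running the search. The `toy_planner` below is a stand-in for a real planner, included only to make the example runnable.

```python
# Cache plans keyed by the (state, goal) pair; dict items become
# frozensets so the key is hashable.
plan_cache = {}

def cached_plan(current, goal, plan_fn):
    """plan_fn is any planner taking (current, goal) and returning a list."""
    key = (frozenset(current.items()), frozenset(goal.items()))
    if key not in plan_cache:
        plan_cache[key] = plan_fn(current, goal)
    return plan_cache[key]

calls = []
def toy_planner(current, goal):
    calls.append(1)  # count how often the expensive search actually runs
    return ["do_thing"]

cached_plan({"a": False}, {"a": True}, toy_planner)
cached_plan({"a": False}, {"a": True}, toy_planner)
print(len(calls))  # the planner ran only once: 1
```

In a live agent the cache would also need invalidation when the action set or world model changes.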

Algorithms & Math

GOAP uses backward-chaining search from goal to current state:

Algorithm: GOAP Planner
Input: current_state, goal_state, available_actions
Output: action_sequence

1. Create open_set with initial node (goal_state, cost=0, actions=[])
2. While open_set not empty:
   a. Pop node with lowest cost
   b. If node.unsatisfied_conditions ⊆ current_state:
      return node.actions (plan found)
   c. For each action in available_actions:
      If action.effects ∩ node.unsatisfied_conditions ≠ ∅:
        - new_conditions = (node.unsatisfied_conditions - action.effects) ∪ action.preconditions
        - new_cost = node.cost + action.cost
        - Add new node to open_set
3. Return failure (no plan found)

The search is essentially A* over sets of conditions: each node's priority combines the accumulated action cost with a heuristic estimating how many more actions are needed to satisfy the remaining conditions (with a zero heuristic, as in the pseudocode above, it reduces to uniform-cost search).
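A common heuristic simply counts the conditions the current state does not already provide. This is admissible under the assumption that each unit of action cost can satisfy at most one condition, which depends on how the actions are modeled:

```python
def heuristic(unsatisfied, current):
    """Count conditions the current state does not yet satisfy.
    Admissible only if no action satisfies more conditions than its cost."""
    return sum(1 for k, v in unsatisfied.items() if current.get(k) != v)

# Used as the heap priority instead of the bare cost:
# priority = cost + heuristic(new_unsatisfied, current)
print(heuristic({"a": True, "b": True}, {"a": True}))  # 1
```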

Design Patterns & Architectures

GOAP fits into the Planner-Executor pattern: the planner produces an action sequence separately from execution, and the executor carries it out step by step, triggering a replan when an action fails or the world state changes underneath it.

In LLM agents, GOAP integrates as a structured reasoning layer:

User Query → Goal Extraction → GOAP Planner → Action Sequence → Tool Execution
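The goal-extraction step maps a free-form query to goal conditions in the planner's vocabulary. In practice an LLM would do this; the rule-based function below is only a toy stand-in to make the pipeline concrete (the condition names match the research-agent example later in this section):

```python
def extract_goal(query: str) -> dict:
    """Toy stand-in for LLM-based goal extraction: a real agent would
    prompt the LLM to emit goal conditions as structured output."""
    if "?" in query or query.lower().startswith(("what", "how", "why")):
        return {"has_answer": True}
    return {"task_done": True}

print(extract_goal("What's the revenue in report.pdf?"))  # {'has_answer': True}
```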

Practical Application

Here’s a Python implementation for an LLM agent using GOAP:

from dataclasses import dataclass
from typing import Dict, List
import heapq
import itertools

@dataclass
class Action:
    name: str
    preconditions: Dict[str, bool]
    effects: Dict[str, bool]
    cost: float = 1.0

class GOAPPlanner:
    def plan(self, current: Dict[str, bool], goal: Dict[str, bool],
             actions: List[Action]) -> List[str]:

        # Tie-breaking counter: without it, equal-cost nodes make heapq
        # compare the dicts in the tuple, which raises TypeError.
        counter = itertools.count()

        # Priority queue: (cost, tiebreak, unsatisfied, action_list)
        open_set = [(0.0, next(counter), goal.copy(), [])]

        while open_set:
            cost, _, unsatisfied, plan = heapq.heappop(open_set)

            # Plan found once the current state satisfies every remaining
            # condition (absent facts are treated as False: closed world).
            if all(current.get(k, False) == v for k, v in unsatisfied.items()):
                return plan

            for action in actions:
                # Does action help satisfy any unsatisfied condition?
                helps = any(
                    action.effects.get(k) == v
                    for k, v in unsatisfied.items()
                )

                if helps:
                    # Regress: remove satisfied conditions, add preconditions.
                    # (Production planners also track visited condition sets
                    # to avoid cycles.)
                    new_unsatisfied = {
                        k: v for k, v in unsatisfied.items()
                        if action.effects.get(k) != v
                    }
                    new_unsatisfied.update(action.preconditions)

                    heapq.heappush(open_set, (
                        cost + action.cost,
                        next(counter),
                        new_unsatisfied,
                        [action.name] + plan
                    ))

        return []  # No plan found

# Example: Research agent
actions = [
    Action("search_web", {"has_query": True}, {"has_sources": True}),
    Action("read_sources", {"has_sources": True}, {"has_content": True}),
    Action("synthesize", {"has_content": True}, {"has_answer": True}),
    Action("extract_query", {}, {"has_query": True}),
]

planner = GOAPPlanner()
current_state = {"has_query": False}
goal_state = {"has_answer": True}

plan = planner.plan(current_state, goal_state, actions)
print(f"Plan: {' → '.join(plan)}")
# Output: Plan: extract_query → search_web → read_sources → synthesize

Comparisons & Tradeoffs

| Approach | Pros | Cons |
| --- | --- | --- |
| GOAP | Flexible, emergent behavior, handles novel situations | Planning overhead, requires good action modeling |
| Behavior Trees | Predictable, easy to debug, visual editing | Rigid, hard to handle unexpected situations |
| FSM | Simple, fast, minimal overhead | Combinatorial explosion for complex behaviors |
| ReAct | Uses LLM reasoning directly | No explicit planning, can meander |

GOAP scales well when action spaces are moderate (<100 actions) and goals are clearly definable. It struggles with continuous action spaces and uncertain effects.

Latest Developments & Research

LLM-Enhanced GOAP (2024-2025):

Open problems:

Cross-Disciplinary Insight

GOAP mirrors means-end analysis from cognitive psychology—how humans solve problems by identifying differences between current and goal states, then finding operations to reduce those differences. Herbert Simon and Allen Newell’s work on human problem-solving directly influenced AI planning systems.

In economics, GOAP resembles backward induction in game theory, where optimal strategies are determined by reasoning backward from desired outcomes. This connection suggests opportunities to incorporate game-theoretic considerations into multi-agent GOAP systems.

Daily Challenge

Implement a GOAP-based LLM agent that can plan how to answer questions requiring multiple tool calls:

  1. Define actions for: web_search, read_file, calculate, write_response
  2. Each action should have realistic preconditions (e.g., read_file requires knowing filename)
  3. Test with query: “What’s 15% of the revenue mentioned in report.pdf?”
  4. Bonus: Add action costs based on estimated token usage

References & Further Reading