Reasoning System#

The reasoning system in Mesa-LLM provides cognitive strategies that agents use to analyze situations, make decisions, and plan actions. It is the layer that turns environmental observations and memory context into executable action plans, using one of several structured reasoning frameworks (CoT, ReAct, or ReWOO).

Usage in Mesa Simulations#

from mesa_llm.llm_agent import LLMAgent
from mesa_llm.reasoning.cot import CoTReasoning

class MyAgent(LLMAgent):
    def __init__(self, model, **kwargs):
        super().__init__(
            model=model,
            reasoning=CoTReasoning,  # Specify reasoning strategy
            **kwargs
        )

    def step(self):
        # Generate observation and create plan using reasoning strategy
        obs = self.generate_obs()
        plan = self.reasoning.plan(
            obs=obs,
            selected_tools=["move_one_step", "speak_to"]
        )
        self.apply_plan(plan)

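# Strategy-specific configurations
from mesa_llm.reasoning.react import ReActReasoning  # pass as reasoning=ReActReasoning
from mesa_llm.reasoning.rewoo import ReWOOReasoning  # pass as reasoning=ReWOOReasoning

# For ReWOO with multi-step planning
class MyReWOOAgent(LLMAgent):
    # Constructed like MyAgent above, but with reasoning=ReWOOReasoning
    def step(self):
        obs = self.generate_obs()
        plan = self.reasoning.plan(obs=obs, ttl=3)  # Plan valid for 3 steps
        self.apply_plan(plan)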

# Parallel reasoning execution
class MyAsyncAgent(LLMAgent):
    async def astep(self):
        obs = self.generate_obs()
        plan = await self.reasoning.aplan(
            prompt=self.step_prompt,
            obs=obs,
            selected_tools=["move_one_step", "arrest_citizen"]
        )
        self.apply_plan(plan)

Base abstractions#

class Observation(step: int, self_state: dict, local_state: dict)[source]#

Bases: object

A structured snapshot containing the agent’s current step, self-state (internal attributes and location), and local-state (neighboring agents and their properties). Together these give the reasoning strategy the situational context it needs for decision-making.

Attributes:

step (int): The current simulation time step when the observation is made.

self_state (dict): A dictionary containing information about the observing agent itself. This includes:
  • Internal (behavioural) state such as morale, fear, aggression, fatigue, etc.

  • The agent’s current location or spatial coordinates

  • Any other agent-specific metadata that could influence decision-making

local_state (dict): A dictionary summarizing the state of nearby agents (within the vision radius). Each key is the neighboring agent’s class name + id, and each value is a dictionary containing:
  • Position of the neighbor

  • Internal state or attributes of the neighbor

step: int#
self_state: dict#
local_state: dict#
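
To make the shape of these fields concrete, the sketch below builds an observation by hand. The Observation dataclass here is a local stand-in that mirrors the documented fields, not the Mesa-LLM class itself. The state keys (position, morale) and the neighbor key format "Citizen 7" are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """Local stand-in mirroring the documented fields (not the Mesa-LLM class)."""

    step: int
    self_state: dict
    local_state: dict


# Hypothetical values: the keys inside each dict are illustrative assumptions.
obs = Observation(
    step=4,
    self_state={"position": (2, 3), "morale": 0.6},
    local_state={
        # Key assumed to be "<class name> <id>"; the exact format may differ.
        "Citizen 7": {"position": (2, 4), "morale": 0.1},
        "Cop 1": {"position": (3, 3)},
    },
)

# Example query: which neighbors share the observer's row or column?
x, y = obs.self_state["position"]
aligned = sorted(
    name
    for name, info in obs.local_state.items()
    if info["position"][0] == x or info["position"][1] == y
)
print(aligned)
```

Because both state fields are plain dictionaries, a reasoning strategy can serialize them directly into the prompt text.
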
class Plan(step: int, llm_plan: Any, ttl: int = 1)[source]#

Bases: object

An LLM-generated plan containing the step number, complete LLM response with tool calls, and a time-to-live (TTL) indicating how many steps the plan remains valid. Plans encapsulate both reasoning content and executable actions.

step: int#
llm_plan: Any#
ttl: int = 1#
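
The TTL semantics can be illustrated with a small validity check. This is a sketch under a stated assumption: a plan created at step s with ttl=n is taken to cover steps s through s+n-1. The Plan dataclass is a local stand-in for the documented fields, and is_valid_at is a hypothetical helper, not part of Mesa-LLM.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Plan:
    """Local stand-in mirroring the documented fields (not the Mesa-LLM class)."""

    step: int
    llm_plan: Any
    ttl: int = 1


def is_valid_at(plan: Plan, current_step: int) -> bool:
    # Assumed convention: a plan created at step s with ttl=n covers
    # steps s, s+1, ..., s+n-1.
    return plan.step <= current_step < plan.step + plan.ttl


plan = Plan(step=10, llm_plan="move north, then speak", ttl=3)
print([s for s in range(9, 15) if is_valid_at(plan, s)])
```
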
class Reasoning(agent: LLMAgent)[source]#

Bases: ABC

Abstract base class providing the interface for all reasoning strategies, with both synchronous plan() and asynchronous aplan() methods for parallel execution scenarios.

Attributes:
  • agent (LLMAgent reference)

Methods:
  • abstract plan(prompt=None, obs=None, ttl=1, selected_tools=None, tool_calls="auto") -> Plan - Generate synchronous plan

  • async aplan(prompt=None, obs=None, ttl=1, selected_tools=None, tool_calls="auto") -> Plan - Generate asynchronous plan

Reasoning Flow:
  1. Agent generates observation of current situation through generate_obs()

  2. Reasoning strategies access memory to inform decisions

  3. Selected reasoning approach processes observation and memory into a structured plan

  4. The natural-language plan is converted into tool calls by execute_tool_call(), using the tool schemas from ToolManager.get_all_tools_schema() for LLM function calling

  5. Tool manager executes the planned actions in the simulation environment
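
The five steps above can be sketched end to end with plain functions. Everything here is a toy stand-in: generate_obs, reason, to_tool_call, and the TOOLS registry are hypothetical names for this illustration, and the "LLM" is replaced by a simple rule so the example runs without a model.

```python
def generate_obs(step: int) -> dict:
    # 1. Hypothetical observation: the agent sees one neighbor to the north.
    return {"step": step, "neighbors": {"Citizen 7": "north"}}


def reason(obs: dict, memory: list) -> str:
    # 2-3. Stand-in for the LLM call: combine memory and observation into
    # a natural-language plan.
    target = next(iter(obs["neighbors"]))
    if any(target in entry for entry in memory):
        return f"speak_to {target}"
    return f"move_one_step {obs['neighbors'][target]}"


def to_tool_call(plan_text: str) -> tuple:
    # 4. Stand-in for converting the plan text into a structured tool call.
    name, argument = plan_text.split(" ", 1)
    return name, argument


# 5. Stand-in tool registry that "executes" the call.
TOOLS = {
    "move_one_step": lambda direction: f"moved {direction}",
    "speak_to": lambda who: f"spoke to {who}",
}

memory = ["step 1: met Citizen 7"]
name, argument = to_tool_call(reason(generate_obs(step=2), memory))
result = TOOLS[name](argument)
print(result)
```

The split between reason (free text) and to_tool_call (structured call) mirrors the two phases described in this section: a planning pass, then an execution pass that produces tool calls.
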

abstractmethod plan(prompt: str | None = None, obs: Observation | None = None, ttl: int = 1, selected_tools: list[str] | None = None, tool_calls: str | None = 'auto') -> Plan[source]#

Generate a plan for the next action.

Args:

  • prompt: Optional prompt override for the reasoning strategy.

  • obs: Optional observation to plan against.

  • ttl: Time-to-live for the generated plan.

  • selected_tools: Optional explicit tool allowlist forwarded to ToolManager.get_all_tools_schema(). If omitted or None, the default behavior exposes all tools. [] exposes no tools, and a non-empty list restricts planning/execution to the named tools.

  • tool_calls: Execution-phase LiteLLM tool_choice override used when converting the natural-language plan into tool calls. Planning still keeps tool use disabled.

Supported values in Mesa-LLM are:

  • None: defer to LiteLLM/provider default behavior. In practice, this usually means no tool calls when no tools are provided and behavior similar to "auto" when tools are available.

  • "none": never return tool calls; return a normal assistant message instead.

  • "auto": allow the model to either return a normal assistant message or call one or more tools.

  • "required": require the model to call one or more tools.

Mesa-LLM currently exposes only these string choices, not provider-specific object forms. See LiteLLM docs: https://docs.litellm.ai/
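
The selected_tools contract (None, empty list, non-empty list) can be illustrated with a small filter. This is a minimal sketch, not the ToolManager implementation: select_tools and the registry dictionary are hypothetical, and unknown names are silently ignored here, which may differ from the library's behavior.

```python
from typing import Optional


def select_tools(all_tools: dict, selected_tools: Optional[list] = None) -> list:
    """Return the tool schemas to expose for one planning call."""
    if selected_tools is None:
        # Omitted or None: expose every registered tool.
        return list(all_tools.values())
    # [] yields no tools; a non-empty list acts as an allowlist.
    return [schema for name, schema in all_tools.items() if name in selected_tools]


# Hypothetical registry mapping tool names to (abbreviated) schemas.
registry = {
    "move_one_step": {"name": "move_one_step"},
    "speak_to": {"name": "speak_to"},
    "arrest_citizen": {"name": "arrest_citizen"},
}

print(len(select_tools(registry)))           # all tools
print(select_tools(registry, []))            # no tools
print(select_tools(registry, ["speak_to"]))  # allowlist
```

The distinction between None and [] matters: passing an empty list is an explicit request for a tool-free call, not a fallback to the default.
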

async aplan(prompt: str | None = None, obs: Observation | None = None, ttl: int = 1, selected_tools: list[str] | None = None, tool_calls: str | None = 'auto') -> Plan[source]#

Asynchronous version of plan() method for parallel planning. Default implementation calls the synchronous plan() method.

selected_tools follows the same contract as plan(): omitting it or passing None uses the default behavior of exposing all tools, [] exposes no tools, and a non-empty list restricts planning/execution to the named tools.
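
The relationship between plan() and the default aplan() can be sketched with a stripped-down base class. The classes below are local stand-ins with simplified signatures (a single obs string), and EchoReasoning and plan_all are hypothetical names used only for this illustration.

```python
import asyncio
from abc import ABC, abstractmethod


class Reasoning(ABC):
    """Local stand-in for the documented base class."""

    @abstractmethod
    def plan(self, obs: str) -> str:
        ...

    async def aplan(self, obs: str) -> str:
        # Default behavior described above: delegate to the synchronous plan().
        return self.plan(obs)


class EchoReasoning(Reasoning):
    # Hypothetical strategy used only for this illustration.
    def plan(self, obs: str) -> str:
        return f"plan for {obs}"


async def plan_all(strategies, obs_list):
    # gather() schedules every aplan() coroutine and preserves input order.
    return await asyncio.gather(*(s.aplan(o) for s, o in zip(strategies, obs_list)))


plans = asyncio.run(plan_all([EchoReasoning(), EchoReasoning()], ["obs A", "obs B"]))
print(plans)
```

Note that a default aplan() which simply calls plan() still blocks the event loop while it runs. Real concurrency only appears when a strategy overrides aplan() with a genuinely asynchronous LLM call.
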

execute_tool_call(chaining_message, selected_tools: list[str] | None = None, ttl: int = 1, tool_calls: str | None = 'auto')[source]#

Turn a natural-language plan into tool calls.

Args:

  • chaining_message: Natural-language plan or action text to execute.

  • selected_tools: Optional explicit tool allowlist forwarded to ToolManager.get_all_tools_schema(). Omitting it or passing None uses the default behavior of exposing all tools, [] exposes no tools, and a non-empty list restricts execution to the named tools.

  • ttl: Time-to-live for the returned plan.

  • tool_calls: LiteLLM tool_choice passed to the execution call.

Supported values in Mesa-LLM are:

  • None: defer to LiteLLM/provider default behavior. In practice, this usually means no tool calls when no tools are provided and behavior similar to "auto" when tools are available.

  • "none": never return tool calls; return a normal assistant message instead.

  • "auto": allow the model to either return a normal assistant message or call one or more tools.

  • "required": require the model to call one or more tools.

Mesa-LLM currently exposes only these string choices, not provider-specific object forms. See LiteLLM docs: https://docs.litellm.ai/
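
One way to picture how tool_calls maps onto a completion request is the helper below. It is a sketch under assumptions, not Mesa-LLM's internal code: build_tool_kwargs is a hypothetical function that only assembles the tools and tool_choice keyword arguments (the OpenAI-style names that LiteLLM's completion call accepts) and does not contact any model.

```python
from typing import Optional

ALLOWED_TOOL_CALLS = (None, "none", "auto", "required")


def build_tool_kwargs(tools: list, tool_calls: Optional[str] = "auto") -> dict:
    """Build the tool-related keyword arguments for one completion call."""
    if tool_calls not in ALLOWED_TOOL_CALLS:
        # Provider-specific object forms are rejected, matching the text above.
        raise ValueError(f"unsupported tool_calls value: {tool_calls!r}")
    kwargs = {}
    if tools:
        kwargs["tools"] = tools
    if tool_calls is not None:
        # None means "send no tool_choice" so the provider default applies.
        kwargs["tool_choice"] = tool_calls
    return kwargs


schemas = [{"type": "function", "function": {"name": "move_one_step"}}]
print(build_tool_kwargs(schemas, "required"))
print(build_tool_kwargs([], None))
```
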

async aexecute_tool_call(chaining_message, selected_tools: list[str] | None = None, ttl: int = 1, tool_calls: str | None = 'auto')[source]#

Asynchronous version of execute_tool_call() method.

selected_tools follows the same contract as execute_tool_call(): omitting it or passing None uses the default behavior of exposing all tools, [] exposes no tools, and a non-empty list restricts execution to the named tools.

Reasoning strategies#

class CoTReasoning(agent: LLMAgent)[source]#

Bases: Reasoning

Chain of Thought reasoning with explicit step-by-step analysis before action execution. Uses structured numbered thoughts followed by tool execution. Integrates memory context for informed decision-making.

Attributes:
  • agent (LLMAgent reference)

Methods:
  • plan(prompt=None, obs=None, ttl=1, selected_tools=None, tool_calls="auto") -> Plan - Generate synchronous plan with CoT reasoning

  • async aplan(prompt=None, obs=None, ttl=1, selected_tools=None, tool_calls="auto") -> Plan - Generate asynchronous plan with CoT reasoning

Reasoning Format:

Thought 1: [Initial reasoning based on observation]
Thought 2: [How memory informs the situation]
Thought 3: [Possible alternatives or risks]
Thought 4: [Final decision and justification]
Action: [The action you decide to take]
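
A response in this format is easy to split into its parts. The sketch below assumes each Thought and the Action sit on their own line. The response text is invented for illustration, and the parsing is not something Mesa-LLM is documented to do.

```python
import re

# Invented example response following the Thought/Action format.
response = """Thought 1: A citizen is adjacent to me.
Thought 2: Memory says this citizen was hostile last step.
Thought 3: Approaching is risky; moving away is safer.
Thought 4: I will move one step away.
Action: move_one_step south"""

# Numbered thoughts, one per line, as (number, text) pairs.
thoughts = re.findall(r"^Thought (\d+): (.+)$", response, flags=re.MULTILINE)
# The single Action line that closes the response.
action = re.search(r"^Action: (.+)$", response, flags=re.MULTILINE).group(1)

print(len(thoughts), action)
```
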

get_cot_system_prompt(obs: Observation) -> str[source]#
plan(prompt: str | None = None, obs: Observation | None = None, ttl: int = 1, selected_tools: list[str] | None = None, tool_calls: str | None = 'auto') -> Plan[source]#

Plan the next (CoT) action based on the current observation and the agent’s memory.

selected_tools is forwarded to ToolManager.get_all_tools_schema(). Omitting it or passing None uses the default behavior of exposing all tools, [] exposes no tools, and a non-empty list restricts planning/execution to the named tools.

tool_calls controls the execution-phase LiteLLM tool_choice. The reasoning pass still keeps tool use disabled with "none".

Supported values in Mesa-LLM are:

  • None: defer to LiteLLM/provider default behavior. In practice, this usually means no tool calls when no tools are provided and behavior similar to "auto" when tools are available.

  • "none": never return tool calls; return a normal assistant message instead.

  • "auto": allow the model to either return a normal assistant message or call one or more tools.

  • "required": require the model to call one or more tools.

async aplan(prompt: str | None = None, obs: Observation | None = None, ttl: int = 1, selected_tools: list[str] | None = None, tool_calls: str | None = 'auto') -> Plan[source]#

Asynchronous version of plan() method for parallel planning.

selected_tools follows the same contract as plan(): omitting it or passing None uses the default behavior of exposing all tools, [] exposes no tools, and a non-empty list restricts planning/execution to the named tools.

tool_calls controls the execution-phase LiteLLM tool_choice. The reasoning pass still keeps tool use disabled with "none".

Supported values in Mesa-LLM are:

  • None: defer to LiteLLM/provider default behavior. In practice, this usually means no tool calls when no tools are provided and behavior similar to "auto" when tools are available.

  • "none": never return tool calls; return a normal assistant message instead.

  • "auto": allow the model to either return a normal assistant message or call one or more tools.

  • "required": require the model to call one or more tools.

class ReActOutput(*, reasoning: str, action: str)[source]#

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

reasoning: str#
action: str#
model_config = {}#

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.
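
The role of this schema is to hold a structured response with exactly two string fields. The sketch below reproduces that idea with the standard library only, so it runs without Pydantic. The dataclass is a stand-in for the documented model, it raises TypeError where Pydantic would raise ValidationError, and the JSON payload is invented for illustration.

```python
import json
from dataclasses import dataclass


@dataclass
class ReActOutput:
    """Stdlib stand-in for the documented Pydantic model (same two fields)."""

    reasoning: str
    action: str

    def __post_init__(self):
        # Pydantic would raise ValidationError; this sketch raises TypeError.
        for name in ("reasoning", "action"):
            if not isinstance(getattr(self, name), str):
                raise TypeError(f"{name} must be a str")


# Invented structured response, as a model might return it in JSON form.
raw = '{"reasoning": "The cop is close, so I should leave.", "action": "move_one_step north"}'
output = ReActOutput(**json.loads(raw))
print(output.action)
```
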

class ReActReasoning(agent: LLMAgent)[source]#

Bases: Reasoning

Reasoning + Acting (ReAct): alternates reasoning and action in a flexible, conversational format, combining thinking and acting in a natural-language flow. It is less structured than CoT but incorporates memory and communication history.

Attributes:
  • agent (LLMAgent reference)

Methods:
  • plan(prompt=None, obs=None, ttl=1, selected_tools=None, tool_calls="auto") -> Plan - Generate synchronous plan with ReAct reasoning

  • async aplan(prompt=None, obs=None, ttl=1, selected_tools=None, tool_calls="auto") -> Plan - Generate asynchronous plan with ReAct reasoning

get_react_system_prompt() -> str[source]#
get_react_prompt(obs: Observation) -> list[str][source]#
plan(prompt: str | None = None, obs: Observation | None = None, ttl: int = 1, selected_tools: list[str] | None = None, tool_calls: str | None = 'auto') -> Plan[source]#

Plan the next (ReAct) action based on the current observation and the agent’s memory.

selected_tools is forwarded to ToolManager.get_all_tools_schema(). Omitting it or passing None uses the default behavior of exposing all tools, [] exposes no tools, and a non-empty list restricts planning/execution to the named tools.

tool_calls controls the execution-phase LiteLLM tool_choice. The reasoning pass still keeps tool use disabled with "none".

Supported values in Mesa-LLM are:

  • None: defer to LiteLLM/provider default behavior. In practice, this usually means no tool calls when no tools are provided and behavior similar to "auto" when tools are available.

  • "none": never return tool calls; return a normal assistant message instead.

  • "auto": allow the model to either return a normal assistant message or call one or more tools.

  • "required": require the model to call one or more tools.

async aplan(prompt: str | None = None, obs: Observation | None = None, ttl: int = 1, selected_tools: list[str] | None = None, tool_calls: str | None = 'auto') -> Plan[source]#

Asynchronous version of plan() method for parallel planning.

selected_tools follows the same contract as plan(): omitting it or passing None uses the default behavior of exposing all tools, [] exposes no tools, and a non-empty list restricts planning/execution to the named tools.

tool_calls controls the execution-phase LiteLLM tool_choice. The reasoning pass still keeps tool use disabled with "none".

Supported values in Mesa-LLM are:

  • None: defer to LiteLLM/provider default behavior. In practice, this usually means no tool calls when no tools are provided and behavior similar to "auto" when tools are available.

  • "none": never return tool calls; return a normal assistant message instead.

  • "auto": allow the model to either return a normal assistant message or call one or more tools.

  • "required": require the model to call one or more tools.

class ReWOOReasoning(agent: LLMAgent)[source]#

Bases: Reasoning

Reasoning Without Observation (ReWOO): multi-step planning that does not require immediate environmental feedback. A plan remains valid across multiple simulation steps through an extended TTL, which can reduce overhead because the agent does not have to re-plan at every step.

Attributes:
  • agent (LLMAgent reference)

  • remaining_tool_calls (int) - Number of tool calls remaining in current plan

  • current_plan (Plan) - Currently active multi-step plan

  • current_obs (Observation) - Last observation used for planning

Methods:
  • plan(prompt=None, obs=None, ttl=1, selected_tools=None, tool_calls="auto") -> Plan - Generate synchronous plan with ReWOO reasoning

  • async aplan(prompt=None, obs=None, ttl=1, selected_tools=None, tool_calls="auto") -> Plan - Generate asynchronous plan with ReWOO reasoning
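
The idea of planning once and replaying the plan over several steps can be sketched as follows. This is a hypothetical illustration, not the ReWOOReasoning implementation: MultiStepPlanner borrows the attribute names current_plan and remaining_tool_calls from the list above, assumes one action is consumed per simulation step, and replaces the LLM with a plain function.

```python
class MultiStepPlanner:
    """Hypothetical stand-in: plans several actions at once, replays one per step."""

    def __init__(self, plan_fn):
        self.plan_fn = plan_fn         # stand-in for the LLM planning call
        self.current_plan = []         # actions of the active plan
        self.remaining_tool_calls = 0  # actions not yet executed
        self.llm_calls = 0             # how often the planner was invoked

    def next_action(self, obs):
        if self.remaining_tool_calls == 0:
            # No active plan: call the planner once for several steps.
            self.current_plan = self.plan_fn(obs)
            self.remaining_tool_calls = len(self.current_plan)
            self.llm_calls += 1
        index = len(self.current_plan) - self.remaining_tool_calls
        self.remaining_tool_calls -= 1
        return self.current_plan[index]


planner = MultiStepPlanner(lambda obs: [f"{obs}: move", f"{obs}: move", f"{obs}: speak"])
actions = [planner.next_action(f"obs{step}") for step in range(6)]
print(actions, planner.llm_calls)
```

In this toy run, six steps need only two planning calls. The trade-off is that actions at steps 1, 2, 4, and 5 are based on an observation that is one or two steps old.
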

get_rewoo_system_prompt(obs: Observation) -> str[source]#
plan(prompt: str | None = None, obs: Observation | None = None, ttl: int = 1, selected_tools: list[str] | None = None, tool_calls: str | None = 'auto') -> Plan[source]#

Plan the next (ReWOO) action based on the current observation and the agent’s memory.

selected_tools is forwarded to ToolManager.get_all_tools_schema(). Omitting it or passing None uses the default behavior of exposing all tools, [] exposes no tools, and a non-empty list restricts planning/execution to the named tools.

tool_calls controls the execution-phase LiteLLM tool_choice. The planning pass still keeps tool use disabled with "none".

Supported values in Mesa-LLM are:

  • None: defer to LiteLLM/provider default behavior. In practice, this usually means no tool calls when no tools are provided and behavior similar to "auto" when tools are available.

  • "none": never return tool calls; return a normal assistant message instead.

  • "auto": allow the model to either return a normal assistant message or call one or more tools.

  • "required": require the model to call one or more tools.

async aplan(prompt: str | None = None, obs: Observation | None = None, ttl: int = 1, selected_tools: list[str] | None = None, tool_calls: str | None = 'auto') -> Plan[source]#

Asynchronous version of plan() method for parallel planning.

selected_tools follows the same contract as plan(): omitting it or passing None uses the default behavior of exposing all tools, [] exposes no tools, and a non-empty list restricts planning/execution to the named tools.

tool_calls controls the execution-phase LiteLLM tool_choice. The planning pass still keeps tool use disabled with "none".

Supported values in Mesa-LLM are:

  • None: defer to LiteLLM/provider default behavior. In practice, this usually means no tool calls when no tools are provided and behavior similar to "auto" when tools are available.

  • "none": never return tool calls; return a normal assistant message instead.

  • "auto": allow the model to either return a normal assistant message or call one or more tools.

  • "required": require the model to call one or more tools.