Embodied: Embodied LLM Agents Learn to Cooperate in Organized Teams

Embodied implements a round-based communication system where agents exchange messages before taking actions. The algorithm emphasizes explicit communication rounds and direct agent-to-agent messaging.

Main Algorithm Pseudocode

Embodied

In our Embodied implementation, we start each timestep with all agents parsing their observations (\(o_a\)) from the environment state in line 4, and then generating a text perception (\(p_a\)) from those observations through the perception module in line 5.

Then, for \(c\) communication rounds, each agent generates messages for any recipient agent through Communication Round in line 8, given the team's task, the agent's perception, action history (\(\mathcal{H}_a\)), and message history (\(\mathcal{M}_a\)). These messages are then parsed and added to the corresponding agents' message histories (\(\mathcal{M}_{recipient}\)) in line 11.

Then after all communication rounds, each agent generates their next action with Action Round in line 13, given the team's task, the agent's perception, action history (\(\mathcal{H}_a\)), and message history (\(\mathcal{M}_a\)). This generated action is then translated into a standardized action format.

Now since all agents have an active action (including \(\texttt{No Action}\)), each agent will generate an executable vector (\(e_a\)) through our Execution Module in line 17. If the current action (\(\texttt{action}_a\)) is complete or single-step, it will then be set to \(\texttt{None}\) and appended to the agent's action history.

Lastly, the executable vectors of all agents are combined and executed within the Wildfire Environment in line 22. If the max score has been reached, the algorithm ends.
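The action lifecycle described above can be sketched as follows. This is a minimal illustration, not the authors' Execution Module: the `Action` and `AgentState` classes, the `steps_remaining` counter, and the placeholder vectors are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Action:
    description: str
    steps_remaining: int = 1  # assumption: 1 means a single-step action


@dataclass
class AgentState:
    current_action: Optional[Action] = None
    action_history: List[str] = field(default_factory=list)


def step_action(agent: AgentState) -> List[int]:
    """Return an executable vector and retire the action once it is finished."""
    action = agent.current_action
    if action is None:
        return [0, 0, 0]  # placeholder vector standing in for No Action

    action.steps_remaining -= 1
    vector = [1, 0, 0]  # placeholder for a real executable vector

    # A complete or single-step action is logged and cleared, so the agent
    # is asked for a fresh action on the next timestep.
    if action.steps_remaining <= 0:
        agent.action_history.append(action.description)
        agent.current_action = None
    return vector
```

A multi-step action stays active across timesteps and only reaches the history once its last step has run.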

Key Methods

Communication Round

Implements the multi-round message exchange system. Each agent can send targeted messages to specific agents or broadcast to all agents. Messages are processed sequentially to ensure proper information flow between agents.

def communication_round(agent: Agent, global_data: dict):
    """
    Handles multi-agent communication:
    1. Generates messages for specific agents
    2. Supports global broadcast channel
    3. Updates chat histories for all recipients
    4. Tracks communication costs
    """

Action Round

Executes after all communication rounds are complete. Uses the accumulated messages and observations to choose each agent's next action. Actions are validated against agent-specific capabilities before execution.

def action_round(agent: Agent, global_data: dict):
    """
    Generates and executes actions:
    1. Considers all chat history
    2. Uses agent-specific action libraries
    3. Handles execution errors
    4. Updates action history
    """

Communication Protocol

Defines the structured format for inter-agent communication. Uses XML-style tags to differentiate between reasoning, direct messages, and global broadcasts. This format ensures consistent parsing and distribution of messages.

Message Format:

<reasoning>Decision explanation</reasoning>
<RECIPIENT>Message content</RECIPIENT>
<GLOBAL>Broadcast message</GLOBAL>

For actions:
<reasoning>Action rationale</reasoning>
<action>Specific action with coordinates</action>
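A response in this format could be split into its parts with a small parser like the one below. This is a sketch under assumptions: `parse_response` is a hypothetical helper, and the `RECIPIENT` placeholder is assumed to be filled in as `AGENT_<id>`, matching the tag names used in the message distribution code.

```python
import re


def parse_response(content, agent_ids):
    """Split a raw model response into reasoning, global, direct and action parts."""

    def tag(name):
        # Non-greedy match; DOTALL lets a message span several lines.
        m = re.search(rf"<{name}>(.*?)</{name}>", content, re.DOTALL)
        return m.group(1).strip() if m else None

    direct = {}
    for agent_id in agent_ids:
        msg = tag(f"AGENT_{agent_id}")
        if msg is not None:
            direct[agent_id] = msg

    return {
        "reasoning": tag("reasoning"),
        "global": tag("GLOBAL"),
        "direct": direct,
        "action": tag("action"),
    }


reply = (
    "<reasoning>Fire is spreading east.</reasoning>\n"
    "<AGENT_2>Scout the eastern ridge.</AGENT_2>\n"
    "<GLOBAL>Moving to (4, 7).</GLOBAL>"
)
parsed = parse_response(reply, agent_ids=[0, 1, 2])
```

Missing tags come back as `None` (or an empty `direct` mapping), so a caller can tell a communication reply from an action reply without a separate flag.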

Chat Management

Implements the message storage and retrieval system. Maintains separate chat histories for direct messages and global broadcasts, with timestamps for sequential ordering.

Direct Messages: Stores conversations between specific agent pairs. Each message includes source, content, and timestamp for proper sequencing and context maintenance.

# Structure per chat
chat_string = ""
for chat, msgs in agent.chat_history.items():
    chat_string += f"Chat with {chat}:\n"
    for source, content, time in msgs:
        chat_string += f"{source} (time: {time}): {content}\n"
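The loop above implies a mapping from chat name to a list of `(source, content, time)` tuples. A minimal sketch of such a store follows; the `Agent` class shown here and the `format_chats` helper are assumptions for illustration, with `add_message` taking the four-argument order used by the broadcast call in the message distribution code.

```python
from collections import defaultdict


class Agent:
    """Toy agent holding one message list per chat (e.g. "GLOBAL", "AGENT_2")."""

    def __init__(self, agent_id):
        self.id = agent_id
        self.chat_history = defaultdict(list)

    def add_message(self, chat, source, content, time):
        self.chat_history[chat].append((source, content, time))


def format_chats(agent):
    """Render every chat as text, with messages ordered by timestamp."""
    lines = []
    for chat, msgs in agent.chat_history.items():
        lines.append(f"Chat with {chat}:")
        for source, content, time in sorted(msgs, key=lambda m: m[2]):
            lines.append(f"{source} (time: {time}): {content}")
    return "\n".join(lines)
```

Collecting lines in a list and joining once avoids repeated string concatenation, which matters when histories grow over many timesteps.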

Message Distribution: Handles the routing of messages to appropriate recipients. Supports both broadcast messages (sent to all agents) and direct messages (sent to specific agents).

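import re

# Handle global messages: broadcast to every agent's GLOBAL chat
if global_msg:
    for recipient in global_data['agents']:
        recipient.add_message("GLOBAL", f"AGENT_{agent.id}", global_msg, time)

# Handle direct messages: look for an <AGENT_i>...</AGENT_i> block per recipient
for recipient in global_data['agents']:
    pattern = f"<AGENT_{recipient.id}>(.*?)</AGENT_{recipient.id}>"
    # re.DOTALL lets a message span multiple lines
    if m := re.search(pattern, content, re.DOTALL):
        msg = m.group(1).strip()
        # Same (chat, source, content, time) argument order as the broadcast call
        recipient.add_message(f"AGENT_{agent.id}", f"AGENT_{agent.id}", msg, time)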

Action Execution

Manages the translation of high-level action descriptions into executable commands and handles their execution in the environment.

Action Generation: Converts natural language action descriptions into structured action objects that can be executed by the environment.
```
action = translate_action(action_str, agent.type, global_data)
```
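To make the idea of a structured action object concrete, the sketch below extracts a verb and target coordinates from an action string with a regular expression. It is not the `translate_action` implementation, whose internals are not shown here; `ParsedAction` and `parse_action_sketch` are hypothetical names, and the regex approach is a stand-in chosen for the example.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ParsedAction:
    description: str                  # original text, kept for the action history
    verb: str                         # first word of the action, lowercased
    target: Optional[Tuple[int, int]] # grid coordinates, if any were given


# Matches "(x, y)" with optional whitespace and negative integers.
COORD = re.compile(r"\(\s*(-?\d+)\s*,\s*(-?\d+)\s*\)")


def parse_action_sketch(action_str: str) -> ParsedAction:
    text = action_str.strip()
    if not text or text.lower() == "no action":
        return ParsedAction(description=text, verb="none", target=None)

    m = COORD.search(text)
    target = (int(m.group(1)), int(m.group(2))) if m else None
    verb = text.split()[0].lower()
    return ParsedAction(description=text, verb=verb, target=target)
```

For example, `parse_action_sketch("Move to (3, 5)")` yields the verb `"move"` with target `(3, 5)`, while `"No Action"` maps to the verb `"none"` with no target.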

Execution Flow: Maps agent types to their specific action libraries and handles execution errors. Each agent type has specialized actions based on its capabilities.

libraries = {
    0: Run_Firefighter_Action,
    1: Run_Bulldozer_Action,
    2: Run_Drone_Action,
    3: Run_Helicopter_Action
}

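try:
    # Dispatch to the action library that matches the agent's type
    result = libraries[agent.type](agent, action)
    agent.past_actions.append(action.description)
except Exception:
    # Log the failure and fall back to a zero vector
    agent.log_chat("ERROR", "ERROR EXECUTING ACTION")
    return [0,0,0]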
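The fragment above can be turned into a self-contained function along these lines. The handlers, the dictionary-based agent, and the vector contents are placeholders invented for the sketch; only the dispatch-table pattern and the `[0, 0, 0]` fallback come from the snippet above.

```python
def run_firefighter_action(agent, action):
    # Hypothetical handler returning a placeholder executable vector.
    return [1, 0, 0]


def run_drone_action(agent, action):
    # Simulates a handler that fails at execution time.
    raise ValueError("target out of range")


LIBRARIES = {0: run_firefighter_action, 2: run_drone_action}


def execute(agent, action):
    """Dispatch on agent type; fall back to a zero vector on any failure."""
    handler = LIBRARIES.get(agent["type"])
    if handler is None:
        agent["log"].append(f"ERROR: no action library for type {agent['type']}")
        return [0, 0, 0]
    try:
        vector = handler(agent, action)
    except Exception as exc:
        agent["log"].append(f"ERROR EXECUTING ACTION: {exc}")
        return [0, 0, 0]
    # Only successful actions reach the history.
    agent["past_actions"].append(action)
    return vector
```

Handling an unknown agent type separately from a failing handler keeps the two error causes distinguishable in the log.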

Algorithm Flow

Structures the entire process of observation gathering, multi-round communication, and synchronized action execution. Ensures proper sequencing of operations across all agents.

state = env.reset()
for timestep in range(max_timesteps):
    # Get observations
    observations = get_observations(state)
    perceptions = generate_perceptions(observations)

    # Communication rounds
    for _ in range(comm_rounds):
        for agent in agents:
            messages = communication_round(agent, global_data)
            distribute_messages(messages)

    # Action execution
    for agent in agents:
        action = action_round(agent, global_data)
        execute_action(action)
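For readers who want to run the control flow end to end, here is a toy version of the loop with the LLM calls and the environment replaced by stand-ins. Everything in it (`ToyAgent`, `toy_message`, `toy_action`, the one-point-per-action scoring) is an assumption made for the sketch; it only mirrors the ordering of perception, communication rounds, action generation, and the max-score stop.

```python
from dataclasses import dataclass, field


@dataclass
class ToyAgent:
    id: int
    inbox: list = field(default_factory=list)    # message history M_a
    history: list = field(default_factory=list)  # action history H_a


def toy_message(agent, perception, round_idx):
    """Stand-in for the LLM call in a communication round."""
    return f"AGENT_{agent.id} sees {perception} (round {round_idx})"


def toy_action(agent, perception):
    """Stand-in for the LLM call in the action round."""
    return f"extinguish {perception}"


def run_episode(num_agents=2, comm_rounds=2, max_timesteps=3, max_score=4):
    agents = [ToyAgent(i) for i in range(num_agents)]
    score, steps = 0, 0
    for timestep in range(max_timesteps):
        steps += 1
        # Perception: one text perception per agent
        perceptions = {a.id: f"fire@{timestep},{a.id}" for a in agents}

        # Communication rounds: every agent messages every other agent
        for round_idx in range(comm_rounds):
            for sender in agents:
                msg = toy_message(sender, perceptions[sender.id], round_idx)
                for recipient in agents:
                    if recipient is not sender:
                        recipient.inbox.append((timestep, msg))

        # Action round: collect all actions first, then execute them together
        actions = []
        for agent in agents:
            action = toy_action(agent, perceptions[agent.id])
            agent.history.append(action)
            actions.append(action)

        score += len(actions)  # toy environment: one point per action
        if score >= max_score:
            break
    return agents, score, steps


agents, score, steps = run_episode()
```

Collecting all actions before the environment step keeps execution synchronized: no agent acts on a state that another agent has already changed within the same timestep.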