COELA: Cooperative Embodied Learning Agents with LLM-based Communication

COELA implements a decentralized multi-agent system where agents make independent decisions through LLM-based communication. Each agent can either take actions or send messages to coordinate with teammates. Link to Paper

Main Algorithm Pseudocode

[Figure: COELA algorithm pseudocode]

In our COELA implementation, the message history (\(\mathcal{M}\)) is initialized as empty in line 2. Then, at each timestep, each agent parses its observations (\(o_a\)) from the environment state and generates a text perception (\(p_a\)) from those observations through the Perception Module and its perception prompt.

Then, each agent checks their current action (\(\texttt{action}_a\)). If it is still active (meaning it is multi-step and not complete), the agent will do nothing, as seen in line 9.

However, if not, each agent proposes a message to send through Communication Generation given the team's task, the agent's perception (\(p_a\)), the agent's action history (\(\mathcal{H}_a\)), and the team's chat history (\(\mathcal{M}\)).

Then, given the proposed message, through Action Generation, the agent chooses to execute an action or to send the proposed message.

If the agent chooses to send the proposed message, then that message is appended to the message history (\(\mathcal{M}\)) and \(\texttt{No Action}\) is chosen in lines 13-14.

However, if the agent chooses a different action, that action is translated into a standardized action format. Since every agent now has an active action (including \(\texttt{No Action}\)), each agent generates an executable vector (\(e_a\)) through our Execution Module in line 19. If the current action (\(\texttt{action}_a\)) is single-step or has completed, it is appended to the agent's action history (\(\mathcal{H}_a\)) and \(\texttt{action}_a\) is reset to \(\texttt{None}\).

Lastly, the executable vectors of all agents are combined and executed within the Wildfire Environment in line 24. The algorithm ends once the max score has been reached.
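The walkthrough above can be condensed into a minimal Python sketch of a single timestep. Everything in it is an assumption made for illustration: `SketchAgent`, `run_timestep`, and `action_lengths` are hypothetical names, the two LLM calls are replaced by plain callables (with the team's task assumed to be baked into them), and the environment step is left out.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

SEND_MESSAGE = "SEND_MESSAGE"
NO_ACTION = "No Action"


@dataclass
class SketchAgent:
    """Hypothetical stand-in for an agent; not the repository's Agent class."""
    name: str
    propose_message: Callable[[str, List[str], List[str]], str]
    choose_action: Callable[[str, str], str]
    action: Optional[str] = None   # currently active action, if any
    steps_left: int = 0            # remaining steps of the active action
    history: List[str] = field(default_factory=list)


def run_timestep(agents: List[SketchAgent], perceptions: Dict[str, str],
                 messages: List[str], action_lengths: Dict[str, int]) -> Dict[str, str]:
    """Decide for idle agents, then execute one step of every agent's action."""
    executed = {}
    for agent in agents:
        if agent.action is None:  # agents with an active action skip deciding
            perception = perceptions[agent.name]
            proposal = agent.propose_message(perception, agent.history, messages)
            choice = agent.choose_action(perception, proposal)
            if choice == SEND_MESSAGE:
                # Sending costs the turn: log the message, take No Action.
                messages.append(f"{agent.name}: {proposal}")
                agent.action, agent.steps_left = NO_ACTION, 1
            else:
                agent.action = choice
                agent.steps_left = action_lengths.get(choice, 1)
        # Every agent now has an active action; execute one step of it.
        executed[agent.name] = agent.action
        agent.steps_left -= 1
        if agent.steps_left == 0:
            # Completed or single-step: record it, then clear the slot.
            agent.history.append(agent.action)
            agent.action = None
    return executed
```

In this sketch a multi-step action keeps its slot until `steps_left` reaches zero, so the agent neither proposes a message nor picks a new action while it is running.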

Key Methods

Communication Generation

Enables agents to generate contextually appropriate messages. Considers the communication cost-benefit trade-off when deciding whether to send messages.
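As a rough illustration of the inputs this step draws on, the sketch below assembles the task, perception, action history, and chat history into a single prompt string. The function name `build_communication_prompt` and the prompt wording are hypothetical; the actual prompt used by the implementation is not shown here and may differ.

```python
from typing import List


def build_communication_prompt(task: str, perception: str,
                               action_history: List[str],
                               chat_history: List[str]) -> str:
    """Assemble the four inputs into one prompt string (hypothetical layout)."""
    actions = "\n".join(f"- {a}" for a in action_history) or "- (none)"
    chat = "\n".join(f"- {m}" for m in chat_history) or "- (no messages yet)"
    return (
        f"Task: {task}\n"
        f"Current perception: {perception}\n"
        f"Your previous actions:\n{actions}\n"
        f"Team chat history:\n{chat}\n"
        "Propose one short message for your teammates. Reply as:\n"
        "<reasoning>...</reasoning>\n<message>...</message>"
    )
```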

def generate_communication(agent: Agent, global_data: dict) -> str:
    """
    Proposes communication message based on:
    1. Agent's current perception
    2. Team chat history
    3. Agent's action history
    4. Current task state
    """

Action Generation

Determines whether to communicate or act based on current state and proposed message utility. Ensures efficient resource use by avoiding unnecessary communication.

def generate_action(agent: Agent, global_data: dict, proposed_message: str) -> str:
    """
    Decides between sending message or taking action:
    1. Evaluates proposed message utility
    2. Considers current state and task
    3. Returns either 'SEND_MESSAGE' or action description
    """

Communication Protocol

Defines the message structure for both action proposals and team communication. Uses a standardized format to ensure consistent interpretation across all agents.

Message Format:

<reasoning>Decision rationale</reasoning>
<message>Communication content</message>

For actions:
<reasoning>Action rationale</reasoning>
<action>SEND_MESSAGE or action description</action>
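A response in this format could be parsed with a small helper such as the one below. This is only a sketch: `extract_tag` is a hypothetical name, and it assumes each tag appears at most once and is never nested.

```python
import re
from typing import Optional


def extract_tag(response: str, tag: str) -> Optional[str]:
    """Return the stripped text inside <tag>...</tag>, or None if absent."""
    name = re.escape(tag)
    # DOTALL lets the content span several lines; .*? stops at the first close tag.
    match = re.search(rf"<{name}>(.*?)</{name}>", response, flags=re.DOTALL)
    return match.group(1).strip() if match else None


reply = (
    "<reasoning>Teammates do not know where my fire is.</reasoning>\n"
    "<action>SEND_MESSAGE</action>"
)
chosen_action = extract_tag(reply, "action")
```

Returning `None` for a missing tag lets the caller decide how to handle a malformed LLM reply, for example by retrying or falling back to no action.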

Environment Interaction

Observation Processing: Transforms environmental observations into semantic descriptions for LLM processing. Maintains consistent perception format across all agents.

observations = get_agent_observations(state, agent.id)  # Raw sensor data
perception = agent.generate_perception(cfg.envs, agent_states, global_data)  # Text description

Action Execution: Handles the execution of chosen actions or message sending. Includes special handling for communication actions.

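import numpy as np
import torch

if action == "SEND_MESSAGE":
    # Sending a message uses up the turn: record it and take no action.
    chat_history.append(message)
    action_array = no_action_vector
else:
    action_array = generate_action_from_option(agent=agent)

# Store this agent's action vector, then pass all actions to the environment.
env_action[agent.id] = action_array
action_tensor = torch.from_numpy(np.array(env_action)).to(device)
state["agents"]["action"] = action_tensor
newstate = env.step(state)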

Decision Flow

Message Proposal: Implements the core decision-making process for choosing between communication and action. Includes utility evaluation for proposed messages.

# Each timestep, agents:
proposed_message = generate_communication(agent, global_data)  # Propose message
chosen_action = generate_action(agent, global_data, proposed_message)  # Decide action

if "SEND_MESSAGE" in chosen_action:
    chat_history.append(proposed_message)
    action = no_action
else:
    action = translate_action(chosen_action)

Action Execution: Manages the execution of chosen actions and updates agent state accordingly. Handles completion tracking for multi-step actions.

# For each agent:
if action is not None:
    result = execute_action(action)
    if action.complete:
        agent.history.append(action)
        agent.action = None
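The snippet above relies on an action object that exposes a `complete` flag. One way such an object could track completion is sketched below; `MultiStepAction`, its fields, and `advance` are hypothetical and are not taken from the implementation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MultiStepAction:
    """Hypothetical action record; the real action type may differ."""
    description: str
    total_steps: int = 1   # 1 means a single-step action
    steps_done: int = 0

    @property
    def complete(self) -> bool:
        return self.steps_done >= self.total_steps

    def step(self) -> None:
        """Advance the action by one environment step."""
        if not self.complete:
            self.steps_done += 1


def advance(current: Optional[MultiStepAction],
            history: List[MultiStepAction]) -> Optional[MultiStepAction]:
    """Run one step; return the still-active action, or None once it completes."""
    if current is None:
        return None
    current.step()
    if current.complete:
        history.append(current)
        return None
    return current
```

With this shape, a single-step action completes on its first call to `advance`, while a longer action stays active and blocks new decisions until its last step.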