Skip to content

Goal Seeking

A navigation task where agents move through a grid to reach goal cells. Based on the dynamic decision-making paradigm from Nguyen & Gonzalez (2020).

Description

Agents start at spawn points and navigate to goal targets placed on the grid. Each goal has an associated value. The agent receives a reward upon reaching a goal. Supports both single-agent and multi-agent configurations.

Environment Class

GoalSeeking extends CoGridEnv and adds:

  • Target values — each goal character maps to a reward value.
  • Optimal path length — provided per grid configuration for evaluation.
  • Multi-agent spawns — separate spawn positions for multi-agent setups.

Objects

Char Name Description
+ Spawn Agent start position
# Wall Impassable boundary
Floor Walkable cell
(custom) Goal Target cells with assigned values

Goal characters are defined per grid configuration. The grid data maps each character to its reward value.

Rewards

The GoalSeekingAgent defines two penalties:

Parameter Value Description
step_penalty 0.01 Per-step cost to encourage efficient paths
collision_penalty 0.05 Penalty when agents collide (multi-agent)

Goal rewards are defined by the target_values mapping in the grid configuration.

Configuration

Goal-seeking environments are configured via a grid data file that specifies:

grid_data = {
    "layout": [...],                    # ASCII grid rows
    "values": {"G": 1.0, "X": -0.5},   # character -> reward value
    "optimal_path_length": 8,           # for evaluation metrics
    "ma_spawns": [(2, 1), (3, 1)],      # multi-agent spawn positions
}

The environment is instantiated with a path to this grid data:

from cogrid.envs.goal_seeking.goal_seeking import GoalSeeking

env = GoalSeeking(grid_path="path/to/grid.json", config=config)

Agent API

GoalSeekingAgent provides:

  • create_inventory_ob() — returns a binary vector of collected target objects.
  • inventory_capacity — number of distinct target types.
  • step_penalty / collision_penalty — per-step and collision costs.

Spawn Selection

The environment supports multiple spawn modes:

  • Random spawn — shuffles available spawn points each reset.
  • Pre-defined spawn — uses ma_spawns positions for multi-agent setups.
  • Generated random spawn — samples from all free spaces (excluding map-specified spawns), controlled by config["env"]["gen_random_spawn"].