Goal Seeking
A navigation task where agents move through a grid to reach goal cells. Based on the dynamic decision-making paradigm from Nguyen & Gonzalez (2020).
Description
Agents start at spawn points and navigate to goal targets placed on the grid. Each goal has an associated value. The agent receives a reward upon reaching a goal. Supports both single-agent and multi-agent configurations.
Environment Class
GoalSeeking extends CoGridEnv and adds:
- Target values — each goal character maps to a reward value.
- Optimal path length — provided per grid configuration for evaluation.
- Multi-agent spawns — separate spawn positions for multi-agent setups.
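As a rough illustration of what an optimal path length means on such a grid, the sketch below runs a breadth-first search from a spawn cell to the nearest goal cell. It is not part of GoalSeeking: the function name `shortest_path_length`, the example layout, the `(row, col)` coordinate convention, and the assumption that each move to an adjacent cell costs exactly one step are all hypothetical. If the environment's action space includes turning, actual step counts will differ.

```python
from collections import deque


def shortest_path_length(layout, start, goal_chars, wall="#"):
    """Breadth-first search over a 4-connected grid.

    Returns the number of moves from `start` (row, col) to the nearest
    cell whose character is in `goal_chars`, or None if none is reachable.
    """
    rows = len(layout)
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        (r, c), dist = queue.popleft()
        if layout[r][c] in goal_chars:
            return dist
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            # Skip cells outside the grid.
            if not (0 <= nr < rows and 0 <= nc < len(layout[nr])):
                continue
            # Skip walls and cells that were already queued.
            if layout[nr][nc] == wall or (nr, nc) in seen:
                continue
            seen.add((nr, nc))
            queue.append(((nr, nc), dist + 1))
    return None


# Hypothetical layout: '+' spawn, '#' wall, ' ' floor, 'G' goal.
example_layout = [
    "#####",
    "#+  #",
    "# # #",
    "#  G#",
    "#####",
]
path_length = shortest_path_length(example_layout, (1, 1), {"G"})
```

A value computed this way could be compared against the optimal path length stored with a grid configuration as a sanity check.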
Objects
| Char | Name | Description |
|---|---|---|
| + | Spawn | Agent start position |
| # | Wall | Impassable boundary |
| (space) | Floor | Walkable cell |
| (custom) | Goal | Target cells with assigned values |
Goal characters are defined per grid configuration. The grid data maps each character to its reward value.
Rewards
The GoalSeekingAgent defines two penalties:
| Parameter | Value | Description |
|---|---|---|
| step_penalty | 0.01 | Per-step cost to encourage efficient paths |
| collision_penalty | 0.05 | Penalty when agents collide (multi-agent) |
Goal rewards are defined by the target_values mapping in the grid configuration.
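To make the interaction between goal values and the two penalties concrete, here is a minimal sketch of a per-step reward. It assumes the penalties are subtracted from the goal value and combine additively within a step. That combination rule and the function name `step_reward` are assumptions for illustration, not the library's confirmed implementation.

```python
def step_reward(goal_value=0.0, collided=False,
                step_penalty=0.01, collision_penalty=0.05):
    """Hypothetical per-step reward for one agent.

    goal_value: value of the goal reached this step (0.0 if none).
    collided:   whether the agent collided with another agent this step.
    """
    # Assumption: the step cost is charged on every step.
    reward = goal_value - step_penalty
    # Assumption: the collision cost is charged in addition to the step cost.
    if collided:
        reward -= collision_penalty
    return reward


reached_goal = step_reward(goal_value=1.0)   # goal worth 1.0, no collision
bumped = step_reward(collided=True)          # no goal, collision
```

Under these assumptions, an agent that needs many steps to reach a low-value goal can end an episode with a lower return than one that takes a short path to the same goal.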
Configuration
Goal-seeking environments are configured via a grid data file that specifies:
grid_data = {
    "layout": [...],                  # ASCII grid rows
    "values": {"G": 1.0, "X": -0.5},  # character -> reward value
    "optimal_path_length": 8,         # for evaluation metrics
    "ma_spawns": [(2, 1), (3, 1)],    # multi-agent spawn positions
}
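The following sketch shows one way a grid data dictionary of this shape could be scanned for spawn cells and valued goal cells. The helper `locate_cells`, the small example layout, and the `(row, col)` ordering of positions are hypothetical. The environment's own loader may interpret the data differently.

```python
def locate_cells(grid_data, spawn_char="+"):
    """Return (spawns, goals) found in grid_data["layout"].

    spawns: list of (row, col) positions holding the spawn character.
    goals:  dict mapping (row, col) to the value of the goal character there.
    """
    spawns, goals = [], {}
    for r, row in enumerate(grid_data["layout"]):
        for c, ch in enumerate(row):
            if ch == spawn_char:
                spawns.append((r, c))
            elif ch in grid_data["values"]:
                goals[(r, c)] = grid_data["values"][ch]
    return spawns, goals


# Hypothetical example data; all values here are illustrative only.
example_grid_data = {
    "layout": [
        "#####",
        "#+ G#",
        "#X +#",
        "#####",
    ],
    "values": {"G": 1.0, "X": -0.5},
    "optimal_path_length": 2,
    "ma_spawns": [(1, 1), (2, 3)],
}
spawns, goals = locate_cells(example_grid_data)
```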
The environment is instantiated with a path to this grid data:
from cogrid.envs.goal_seeking.goal_seeking import GoalSeeking
env = GoalSeeking(grid_path="path/to/grid.json", config=config)
Agent API
GoalSeekingAgent provides:
- create_inventory_ob() — returns a binary vector of collected target objects.
- inventory_capacity — number of distinct target types.
- step_penalty / collision_penalty — per-step and collision costs.
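The binary inventory vector can be pictured with a short standalone sketch. It does not call GoalSeekingAgent. The function `inventory_ob` and the choice to order entries by sorted target character are assumptions made for illustration.

```python
def inventory_ob(target_chars, collected):
    """Build a binary vector with one entry per distinct target type.

    target_chars: ordered list of goal characters (its length plays the
                  role of inventory_capacity in this sketch).
    collected:    set of goal characters the agent has collected so far.
    """
    return [1 if ch in collected else 0 for ch in target_chars]


# Assumption: entries follow the sorted order of the goal characters.
target_chars = sorted({"G": 1.0, "X": -0.5})
observation = inventory_ob(target_chars, collected={"X"})
```

Here the vector has two entries because there are two distinct target types, and only the entry for the collected target is set.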
Spawn Selection
The environment supports multiple spawn modes:
- Random spawn — shuffles available spawn points each reset.
- Pre-defined spawn — uses ma_spawns positions for multi-agent setups.
- Generated random spawn — samples from all free spaces (excluding map-specified spawns), controlled by config["env"]["gen_random_spawn"].
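The three modes above can be sketched as a single selection function. This is an illustration under stated assumptions, not the environment's code: the name `select_spawns`, the precedence given to pre-defined positions, the `(row, col)` convention, and the treatment of "free spaces" as floor cells only are all hypothetical.

```python
import random


def select_spawns(layout, n_agents, rng, ma_spawns=None,
                  gen_random_spawn=False, spawn_char="+", floor_char=" "):
    """Pick n_agents spawn positions as (row, col) tuples."""
    # Pre-defined spawn: use the configured positions in order.
    if ma_spawns is not None:
        return list(ma_spawns[:n_agents])
    # Generated random spawn: draw from floor cells, which excludes the
    # cells marked as spawns in the map. Otherwise use the marked spawns.
    wanted = floor_char if gen_random_spawn else spawn_char
    candidates = [(r, c)
                  for r, row in enumerate(layout)
                  for c, ch in enumerate(row)
                  if ch == wanted]
    # Random spawn: shuffle the candidates and take the first n_agents.
    rng.shuffle(candidates)
    return candidates[:n_agents]


# Hypothetical layout with two marked spawn cells.
spawn_layout = [
    "#####",
    "#+ +#",
    "#   #",
    "#####",
]
shuffled = select_spawns(spawn_layout, 2, random.Random(0))
generated = select_spawns(spawn_layout, 2, random.Random(0),
                          gen_random_spawn=True)
predefined = select_spawns(spawn_layout, 2, random.Random(0),
                           ma_spawns=[(2, 1), (2, 3)])
```

Passing a seeded `random.Random` instance keeps the shuffle reproducible, which is convenient when comparing runs.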