Isaac Sim: grid-aware quadruped navigation

A quick visual pass on a warehouse-style grid where quadrupeds and manipulators navigate toward goal patches while leaving red and blue activation footprints. The pattern makes it easy to see which behaviors the policy is reinforcing and how evenly agents cover the floor.

Quadrupeds stepping across a grid with red and blue footprint markers

Local run video

What to watch for

Agents choose adjacent goal tiles rather than cutting diagonally, matching the grid-aligned reward.
Blue footprints mark a higher-confidence route; red shows exploration and slip recovery near obstacles.
The manipulator hesitates at corners, hinting at a small perception delay in the depth stream.
Policy stays stable under camera jitter, suggesting the observation normalization is working.

Reward sketch

def grid_reward(progress: float, collision: bool, goal_bonus: float) -> float:
    reward = progress + goal_bonus
    if collision:
        reward -= 1.0
    return reward