← Experiments
Isaac Sim: grid-aware quadruped navigation
December 15, 2025
isaacsimroboticsreinforcement-learningnavigation
A quick visual pass on a warehouse-style grid where quadrupeds and manipulators navigate toward goal patches while leaving red and blue activation footprints. The pattern makes it easy to see which behaviors the policy is reinforcing and how evenly agents cover the floor.

Local run video
What to watch for
- Agents choose adjacent goal tiles rather than cutting diagonally, matching the grid-aligned reward.
- Blue footprints mark a higher-confidence route; red shows exploration and slip recovery near obstacles.
- The manipulator hesitates at corners, hinting at a small perception delay in the depth stream.
- Policy stays stable under camera jitter, suggesting the observation normalization is working.
Reward sketch
def grid_reward(progress: float, collision: bool, goal_bonus: float) -> float:
reward = progress + goal_bonus
if collision:
reward -= 1.0
return reward