
Fleet lab: swarm coordination sandbox + teaching tool#2

Open
publu wants to merge 28 commits into main from feat/fleet-lab


@publu publu commented Jun 15, 2026

Owner

A multi-quadruped /fleet sandbox reachable from the robot picker — built to teach how a swarm covers ground over an imperfect radio, not just demo it.

Fleet sandbox (/fleet)

  • 4 strategies under real limits (radio range, airtime, onboard memory, inbox depth, one-goal-at-a-time, lossy links): lone wolves, gossip, claim-and-yield, one commander.
  • Hover-to-inspect any robot: tiles it sensed firsthand (filled) vs. only heard from peers (outlined), with a live knowledge/inbox/data card.
  • Base station + data points: discover data and relay it home with greedy geographic routing (multi-hop / data-mule). The base station acts as the "main server" (the sink).
  • Environments: open / scattered obstacles / building with line-of-sight-blocking walls.
  • CONCEPTS panel (expandable) answering in-UI: delivered vs dropped, overlapping searches, optimisation, "is this libp2p?", swarm intelligence/stigmergy — plus tooltips on every metric.
  • "Your algorithm": a live JS strategy editor + a Generate button that asks the local runtime's LLM to draft a policy (POST /api/fleet/strategy), runnable on the spot.
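
The greedy geographic routing named in the base-station bullet can be sketched as follows. This is a minimal illustration, not the code in roborun/swarm: `next_hop`, `BASE`, and the position dictionary are hypothetical names, and it assumes a unit-disc radio (a link exists whenever two nodes are within `radio_range`).

```python
import math
from typing import Dict, Optional, Tuple

Point = Tuple[float, float]
BASE = "<base>"  # hypothetical sentinel meaning "deliver straight to the base station"


def next_hop(holder: str, positions: Dict[str, Point],
             base: Point, radio_range: float) -> Optional[str]:
    """Pick where the robot holding the data should send it next.

    Greedy rule: hand the data to the in-range neighbor that is closest to
    the base, but only if that neighbor is closer than the holder itself.
    Returns None when no neighbor improves on the holder's own distance; the
    holder then keeps the data and carries it (the data-mule case).
    """
    here = positions[holder]
    best_name: Optional[str] = None
    best_dist = math.dist(here, base)
    if best_dist <= radio_range:
        return BASE  # the base itself is in radio range
    for name, pos in positions.items():
        if name == holder or math.dist(here, pos) > radio_range:
            continue  # skip self and anything out of radio range
        d = math.dist(pos, base)
        if d < best_dist:
            best_name, best_dist = name, d
    return best_name
```

Requiring strict progress toward the base keeps a packet from bouncing between two robots; the cost is that a packet can stall at a local minimum until the holder moves.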

Python twin

  • roborun/swarm/ — the same comms model, strategies, base relay and data points as runnable Python (python -m roborun.swarm); ships with the package.

Also

  • ROS card "Allow network scan to load robots" — trips the browser's local-network permission and lists every rosbridge robot found as its own one-click view.
  • ROS-connected robots reuse the exact sim deck; the EYES camera docks into the layout instead of floating.
  • Replaced the last native confirm() (deploy-to-robot) with the styled in-app modal.
  • vercel.json (cleanUrls) so /fleet resolves on the static Vercel build.

Verification

  • 106/106 tests pass; JS validated; pages boot in Chrome with no console errors.
  • Sim, hover, base relay, environments, custom-strategy run, ROS scan button, and both new endpoints (/api/fleet/strategy, /api/sources/scan) all checked.

Note: a pre-existing uncommitted Vercel-analytics change in roborun/web/runtime-base.js was intentionally left out of this PR.

🤖 Generated with Claude Code

publu and others added 28 commits June 12, 2026 18:42
…elf-deadlock

- rosbridge.py: a websocket recv timeout on a quiet socket (a sim robot
  with no /tf chatter) was treated as a disconnect, causing an infinite
  reconnect loop that never replays subscriptions. Timeouts now continue.
- ros_camera.py: state() called is_active() while already holding the
  same non-reentrant lock, freezing every behavior's see() on first use.
  The freshness check is now inlined.
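
The ros_camera.py deadlock described above follows a common pattern, sketched here under assumptions: `CameraState`, `mark_frame`, and `_fresh_unlocked` are illustrative names, not the real module's API. A plain `threading.Lock` cannot be re-acquired by the thread that holds it, so a locked method must not call another locked method.

```python
import threading
import time
from typing import Optional


class CameraState:
    """Hypothetical stand-in for the camera wrapper; names are illustrative."""

    def __init__(self, max_age_s: float = 1.0):
        self._lock = threading.Lock()  # non-reentrant, as in the bug report
        self._last_frame_ts: Optional[float] = None
        self._max_age_s = max_age_s

    def mark_frame(self) -> None:
        """Record that a frame just arrived."""
        with self._lock:
            self._last_frame_ts = time.monotonic()

    def _fresh_unlocked(self) -> bool:
        # Caller must already hold self._lock.
        if self._last_frame_ts is None:
            return False
        return (time.monotonic() - self._last_frame_ts) <= self._max_age_s

    def is_active(self) -> bool:
        with self._lock:
            return self._fresh_unlocked()

    def state(self) -> dict:
        with self._lock:
            # Calling self.is_active() here would block forever on a plain
            # Lock, so the freshness check runs inline while the lock is held.
            return {"active": self._fresh_unlocked(),
                    "last_frame_ts": self._last_frame_ts}
```

Switching to `threading.RLock` would also remove the deadlock; inlining the check, as the commit does, keeps the lock non-reentrant and the critical section explicit.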
…oes live

The arena/deck pages already degrade gracefully on a static host, but
silently: relative /api fetches 404 and the page stays in demo mode even
when a live roborun is one port away.

runtime-base.js (loaded first by both pages) wraps fetch: /api calls
resolve same-origin first, then probe http://127.0.0.1:8765. A badge
shows the mode; in demo mode it keeps probing, so starting roborun
upgrades the open page to the live cockpit with one click.

Server side: Access-Control-Allow-Origin on all responses (deduped out
of do_OPTIONS — doubled CORS headers are rejected by browsers) plus
Access-Control-Allow-Private-Network for Chrome PNA preflights.
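
A minimal sketch of the server-side header handling, assuming a standard-library `http.server` handler (the PR does not show roborun's real handler, and `ApiHandler` is a hypothetical name). The CORS origin header is added in exactly one place so no response carries it twice, and the Private Network Access header is sent only when the preflight asks for it.

```python
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class ApiHandler(BaseHTTPRequestHandler):
    """Illustrative handler; routes and payloads are placeholders."""

    def end_headers(self) -> None:
        # Single place that adds the CORS origin header, so do_OPTIONS and
        # the normal handlers never emit it a second time.
        self.send_header("Access-Control-Allow-Origin", "*")
        super().end_headers()

    def do_OPTIONS(self) -> None:
        self.send_response(204)
        self.send_header("Access-Control-Allow-Methods", "GET, POST, OPTIONS")
        self.send_header("Access-Control-Allow-Headers", "Content-Type")
        # Chrome's Private Network Access preflight sends this request header.
        if self.headers.get("Access-Control-Request-Private-Network") == "true":
            self.send_header("Access-Control-Allow-Private-Network", "true")
        self.end_headers()

    def do_GET(self) -> None:
        body = b'{"ok": true}'
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args) -> None:
        pass  # keep the sketch quiet


def make_server(port: int = 0) -> ThreadingHTTPServer:
    """Bind to localhost; port 0 lets the OS pick a free port."""
    return ThreadingHTTPServer(("127.0.0.1", port), ApiHandler)
```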
The whole thesis is to switch from sim to robot without changes, so the
arena, not a separate deck, is what a connected robot looks like. When
/api/ros/health reports a robot, the same arena page enters robot mode:

- pose/heading/altitude come from the telemetry handle (SIM_SPEC contract)
  and drive the same bot body; the level hides, the accumulated lidar
  cloud is the map, the minimap and telemetry panels read as before
- an EYES panel shows /api/camera/stream — the same pixels robot.see()
  runs YOLO on
- WASD publishes real cmd_vel through /api/ros/move (now with linear_z
  for drones); behaviors keep running server-side, untouched
- pushState is gated off: feeding the arena backend while a robot is
  connected would flip get_arena().is_active() and silently reroute
  robot.see()/move() from hardware to the browser sim

Plus two host-fallback fixes with the same root cause: _get_ros_client
and RosTelemetry._try_subscribe demanded a profile robotIp and went dead
without one, even while a live connection existed — both now ride the
already-connected client. This was why robot_type never resolved (and
why drone cmd_vel fell back to /cmd_vel).
The camera stream was last-writer-wins: webcam and robot camera both
wrote /tmp/roborun_frame.jpg, so the EYES panel showed your desk while
claiming to be the robot. Each pipeline now writes its own file and
/api/camera/stream takes ?source=robot|webcam|auto (auto: fresh robot
frames outrank the webcam).
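
The `?source=robot|webcam|auto` rule can be sketched as a small selection function. This is an illustration only: `pick_source`, the 2-second freshness window, and the fallback to a stale robot frame when no webcam frame exists are assumptions, not values taken from the PR.

```python
import time
from typing import Optional

FRESH_S = 2.0  # assumed freshness window; the PR does not state the real value


def pick_source(requested: str,
                robot_ts: Optional[float],
                webcam_ts: Optional[float],
                now: Optional[float] = None) -> Optional[str]:
    """Return which pipeline's frame to serve, or None if nothing is available.

    robot_ts / webcam_ts are the write times of each pipeline's latest frame
    (None when that pipeline has never written one).
    """
    now = time.time() if now is None else now
    if requested == "robot":
        return "robot" if robot_ts is not None else None
    if requested == "webcam":
        return "webcam" if webcam_ts is not None else None
    if requested != "auto":
        raise ValueError(f"unknown source: {requested!r}")
    # auto: a fresh robot frame outranks the webcam.
    if robot_ts is not None and (now - robot_ts) <= FRESH_S:
        return "robot"
    if webcam_ts is not None:
        return "webcam"
    # Assumed fallback: a stale robot frame is still better than nothing.
    return "robot" if robot_ts is not None else None
```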

New sources layer answers "what can see and what can move, right now":
- GET /api/sources — webcam on/off, connected robot + camera state, and
  rosbridges discovered on the local /24 (plain TCP probe of :9090,
  cached 60s); POST /api/sources/scan forces a rescan
- arena EYES panel gets a source picker (robot camera / webcam)
- in sim modes the arena shows a chip for any rosbridge found on the
  network — one click connects and reloads into robot mode. A robot on
  your wifi is a source, not a config step.
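
The 60-second scan cache with a forced rescan can be sketched as below. `CachedScan` and the injected `scan_fn` are hypothetical; the real endpoint's internals are not shown in the PR.

```python
import time
from typing import Callable, List, Optional


class CachedScan:
    """Wrap a slow LAN scan so repeated reads reuse a recent result."""

    def __init__(self, scan_fn: Callable[[], List[str]], ttl_s: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self._scan_fn = scan_fn      # does the actual probing
        self._ttl_s = ttl_s          # 60 s matches the cache time in the text
        self._clock = clock          # injectable for tests
        self._hosts: List[str] = []
        self._stamp: Optional[float] = None

    def get(self, force: bool = False) -> List[str]:
        """Return discovered hosts, rescanning if stale or force=True."""
        now = self._clock()
        stale = self._stamp is None or (now - self._stamp) >= self._ttl_s
        if force or stale:
            self._hosts = list(self._scan_fn())
            self._stamp = now
        return list(self._hosts)  # copy so callers cannot mutate the cache
```

In this sketch a GET on /api/sources would map to `get()` and a POST to /api/sources/scan to `get(force=True)`.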
A connected robot deserves a robot cockpit, not a game with a robot in
it. The screenshot version showed DOG—SANDBOX missions, practice RUNS,
sim crates in the main view, and a sandbox policy editor one click away
from commanding live hardware — the exact path by which player_policy
(forward=0.8 at 10 Hz) flew the test drone to 53 m and 45 m off the map.

In robot mode:
- MISSION/LEVELS/RUNS panels and their toolbar buttons hide (CSS via
  body.robot-mode); loadLevel can no longer resurrect the sim level
- the POLICY panel becomes the robot's behavior editor: it loads the
  source of the behavior actually running (new POST /api/behaviors/read,
  stem-only, no paths), RUN hot-reloads that file, STOP disables it
- deploying to hardware is deliberate: RUN asks for confirmation and
  names the file; the LLM mission compiler gets a be-conservative
  context instead of the level brief
- footer says what WASD really does now: drives the real robot
- room/practice telemetry hides; main camera defaults to chase
…hero

The arena was still a game with a robot crudely piped in — a low-poly dog
floating in a void, the real camera shrunk to a corner, sim crates in
view, the onboarding splash ('pick a robot, pick a task') greeting a live
drone. A connected robot now gets a purpose-built cockpit instead:

- the robot camera fills the stage (full-bleed, cinematic vignette) with
  live YOLO detection boxes overlaid (new /api/robot/detections, normalized)
- a glass identity bar: type glyph, LIVE pulse, host, and ALT/SPEED/HDG
  telemetry chips; OSD reticle + POS/ALT/HDG readouts over the feed
- a tactical minimap (range rings, trail, heading wedge; lidar when present)
- POLICY slides in to edit the *running* behavior; DEPLOY is confirmed
- an event ticker of the robot's live decisions
- the game panels, 3D arena, toolbar, and splash are fully suppressed in
  robot mode; the splash is gated so it never shows over a robot

Camera served as single JPEG frames (/api/camera/frame) polled by the
client — deterministic, unlike an MJPEG <img> that half-paints.

Telemetry hardened against flaky rosapi discovery: /rosapi/topics times
out on this setup, which left type=webcam_only and no pose. Now when
discovery returns empty, trust the type roborun connect saved and
subscribe to the standard topics blind (a subscribe to a not-yet-seen
topic is harmless and flows when it appears).
Answering 'I don't want the drone — give me something else even though ROS
is connected', and making the cockpit a complete shell:

- SOURCE picker: every robot/sim this runtime can reach in one menu — the
  connected robot (live), the browser sim arena, and any rosbridge found on
  the LAN. Pick the sim and the page pins to it (localStorage) without
  disconnecting the robot; a chip offers the way back. Multiple ROS robots
  just appear.
- the policy editor is syntax-highlighted now (a colored <pre> underlay
  behind a transparent textarea — keywords, defs, strings, decorators,
  comments, numbers) instead of plain white-on-black.
- TIMELINE panel (bottom-left, mirroring the tactical map) streams the
  robot's decisions and sightings with timestamps and source-colored dots —
  the same stream+map→timeline surface the sim arena has, so the experience
  is consistent whatever the source is.
- DECK link moves into the top bar.

The cockpit shell — camera stream (hero), tactical map (objects, moving vs
stationary, range rings), timeline, policy — is now one consistent UX; what
fills it is the source.
…urce

Pablo's vision: the sim IS the robot's visual view — what the camera would
see — so there should be one view, not a game UI and a robot UI. The stream,
point map, and timeline are all derived from inputs (camera, cloud, pose)
that both a sim and a ROS robot provide.

The cockpit is now the universal shell, generalized to enterCockpit(src):
- src=robot: the stream is the robot camera (frame-polled); map + telemetry
  from the ROS endpoints.
- src=sim: the stream IS the 3D arena render (POV camera, full-screen behind
  the chrome); telemetry from the sim body; the tactical map from the sim's
  world-located sightings + trail; DEPLOY/HOLD run the same policy through
  the game's path; a LEVELS button (sim-only) picks robot + task.

Both render the same HUD, tactical map (objects, range rings, moving vs
stationary), timeline (decisions + sightings), and policy editor — only the
source behind them changes. The game's panel-salad layout and the auto-splash
are retired; LEVELS reopens the picker on demand. body.cockpit + .src-robot/
.src-sim gate the source-specific bits.
…m view

Three issues from using it live:

1. The sim's timeline showed 'frame … · person' — the connected drone's
   camera events. The sim cockpit was reading the shared server event log.
   The sim now keeps its OWN client-side log (simLog): YOLO sightings and
   policy decisions from its own perception, never the server's. The
   tactical map likewise builds objects from the sim's client raycast
   detections (currentDets), not the contaminated server sightings.

2. The sim 'rotated like an idiot' — POV put you inside the dog's head as
   its policy turned. The sim hero is now the chase camera: a stable
   3rd-person view of the robot in its world.

3. The policy panel's header and close were hidden behind the top bar and
   it read as an inaccessible wall of code. It's now a framed floating
   panel below the bar — visible BEHAVIOR header, close button, and the
   DEPLOY/STOP bar. The track readout moved to a centered bottom pill so
   it no longer collides with the timeline.
…deck

There were three doors to two UIs: / and /deck served the legacy flight
deck, /arena served the cockpit, and each linked to the other. Confusing
and redundant.

Now /, /deck and /arena all serve the cockpit — the single view. A
connected robot shows its camera/map/timeline; no robot shows the sim; the
SOURCE picker switches between them. The DECK button and cross-links are
gone. deck.html/deck.js stay in the tree but are no longer routed (the
flight recorder still runs server-side; its UI can fold into the cockpit
later if needed).
Serving the same page at three paths still read as three things. Now '/'
is the one canonical URL for the cockpit; /deck and /arena redirect there.
… editor

From live use:
- the 'enter cockpit' chip overlapped the top-bar actions (SOURCE/POLICY/
  HOLD) — moved it below the bar so nothing is blocked.
- the sim immediately spun a dog in circles: the starter policy auto-ran.
  The sim now starts PAUSED (simArmed=false → policy holds); DEPLOY arms it,
  STOP/HOLD pauses. Calm on arrival, you choose when it moves.
- the source picker was a cramped dropdown — now a centered modal with a
  backdrop and large source cards (Esc / backdrop / ✕ to close).
- the policy editor clipped long lines and was too narrow to code in — added
  an expand toggle (⤢) that widens it to 80vw with a bigger font.
Deploying to a real robot popped the browser's native confirm ('127.0.0.1
says…') — off-brand and ugly. Now it's an in-app modal matching the cockpit:
amber-bordered glass on a dimmed backdrop, a clear warning, Cancel + green
DEPLOY buttons (Esc/backdrop cancels).
- LEVELS did nothing: cockpit CSS force-hid #start. Now #start.show shows
  in cockpit mode, so the robot+task picker (quadruped/humanoid/drone) opens.
- the sim's lidar wasn't on the map and the 3D cloud sprayed the scene. The
  3D cloud is now hidden in the cockpit; the sim's 36-ray lidar accumulates
  into the 2D tactical map (world frame) — the generated map building up.
- the cryptic HOLD button is now a clear PAUSE/RESUME toggle whose label
  reflects state; PAUSE holds the policy, RESUME runs it. DEPLOY/STOP keep
  it in sync.
Picking a new robot (quadruped/humanoid/drone) from LEVELS rebuilt the sim
but the cockpit identity stayed 'DOG'. pollSimCockpit now refreshes the
type + glyph each tick, so the header reflects the robot you chose.
Two real bugs the audit surfaced:

1. /api/arena/state 500'd on every push when a recording was active: the
   handler reassigned 'h = pose.get("heading")', shadowing the HTTP handler
   'h', so the closing send_json(h, ...) got a float. Renamed to 'hd'.

2. The sim's player_policy is a server behavior. While a sim browser feeds
   /api/arena/state the arena is active and robot.move() drives the arena dog
   (correct). But once that browser closes, the arena goes inactive and the
   still-enabled player_policy falls through to drive the REAL robot. Now the
   sim disables player_policy on beforeunload (keepalive fetch) and the robot
   cockpit disables it on entry — so a closed/abandoned sim can't fly the
   drone.
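
The first bug is plain name shadowing, reproduced here in miniature. `FakeHandler`, `send_json`, and both handler functions are hypothetical stand-ins; only the names `h` and `hd` come from the commit message.

```python
class FakeHandler:
    """Stand-in for the HTTP request handler object."""

    def __init__(self) -> None:
        self.sent = None


def send_json(handler, payload: dict) -> None:
    # Needs a real handler: assigning an attribute on a float raises.
    handler.sent = payload


def handle_state_buggy(h, pose: dict, recording: bool) -> None:
    if recording:
        h = pose.get("heading")  # rebinds h, so the handler reference is lost
    send_json(h, {"ok": True})   # AttributeError when h is now a float


def handle_state_fixed(h, pose: dict, recording: bool) -> None:
    hd = None
    if recording:
        hd = pose.get("heading")  # distinct name, handler stays intact
    send_json(h, {"ok": True, "heading": hd})
```

The failure only appears while a recording is active, which matches the symptom of a 500 on every push in that state and none otherwise.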
roborun autostarted the webcam (with its privacy light) on every boot to
avoid a blank first screen. But if a robot is connected, the robot's camera
is the source and the webcam is just an unwanted light on the user's
machine. Autostart now skips the webcam when roborun connect has saved a
robot; it's still available as a manual camera source.
The static site (and any non-runtime page) now opens on the ROBORUN ARENA
menu — the front door to the whole system — instead of dropping straight
into a sim. A robot wired directly into THIS runtime still boots to its
cockpit; everywhere else the menu leads.

The menu gains a fourth card, ROS ROBOT — REAL HARDWARE, alongside the three
sim robots. It shows the complete system whether or not anything is
connected: a live robot (enter its cockpit), any rosbridge found on the LAN
(one-click connect), and an IP field to connect by hand. So 'the same code
drives a real robot' is a thing you can actually click, not just a tagline.

Picks: a sim robot+task enters the sim cockpit; a ROS robot connects and
enters the robot cockpit.
Gazebo and Isaac Sim aren't hardware but speak ROS just the same, so the
real split is browser-sim vs anything-on-rosbridge — not sim vs real. The
card now reads ROS · GAZEBO · ISAAC · HARDWARE: 'connect anything on
rosbridge — a Gazebo or Isaac sim, or a real robot.'
The LAN scan did a plain TCP open of :9090, so any unrelated service on that
port showed up as a 'robot to connect to' — confusing false positives. It now
completes a websocket handshake (what rosbridge actually is); non-websocket
:9090 services are filtered out. Result: only real rosbridge endpoints appear.
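
A websocket handshake probe along these lines can be written with the standard library alone. This is a sketch, not the PR's scanner: `speaks_websocket` and `expected_accept` are hypothetical names and the 0.5 s timeout is an assumption. The check follows RFC 6455: the server must answer 101 and echo a `Sec-WebSocket-Accept` value derived from the client's key.

```python
import base64
import hashlib
import os
import socket

WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"  # fixed GUID from RFC 6455


def expected_accept(key: str) -> str:
    """Sec-WebSocket-Accept value a compliant server returns for this key."""
    digest = hashlib.sha1((key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def speaks_websocket(host: str, port: int = 9090, timeout_s: float = 0.5) -> bool:
    """True only if host:port completes a websocket opening handshake."""
    key = base64.b64encode(os.urandom(16)).decode("ascii")
    request = (
        "GET / HTTP/1.1\r\n"
        f"Host: {host}:{port}\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Key: {key}\r\n"
        "Sec-WebSocket-Version: 13\r\n\r\n"
    ).encode("ascii")
    try:
        with socket.create_connection((host, port), timeout=timeout_s) as sock:
            sock.sendall(request)
            data = b""
            while b"\r\n\r\n" not in data and len(data) < 8192:
                chunk = sock.recv(1024)
                if not chunk:
                    break
                data += chunk
    except OSError:  # refused, unreachable, or timed out
        return False
    lines = data.split(b"\r\n\r\n", 1)[0].decode("latin-1").split("\r\n")
    status = lines[0].split()
    if len(status) < 2 or status[1] != "101":
        return False  # an open port that is not a websocket server
    headers = {}
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()
    return headers.get("sec-websocket-accept") == expected_accept(key)
```

A completed handshake shows the port speaks websocket, not that it is rosbridge specifically; confirming that would take a rosbridge-level request on top.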

Also: a connected source with no detectable robot type now reads ROSBRIDGE,
not the ugly WEBCAM_ONLY.
The chip is the sim cockpit's affordance to jump back to a connected robot,
but pollNetworkRobots returned early in robot mode without hiding it — so it
leaked into the robot cockpit, telling you to 'enter cockpit' while you were
already in it. It now shows only in the sim cockpit and stays hidden in the
menu and robot cockpit (which have the ROS card / are already there).
A rover that drives on /cmd_vel and senses with a laser scanner — no joints,
no mavros — was falling through to webcam_only. It's now recognized as a
mobile ground robot (quadruped profile: cmd_vel control + lidar + camera), so
its pose, lidar map and camera all light up the cockpit instead of reading as
'WEBCAM ONLY'.
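
The classification change can be pictured as a topic-based rule like the one below. Everything here is an assumption for illustration: `classify_robot`, the topic names (`/scan`, `/joint_states`, `/mavros/...`), and the `"jointed"` label are not taken from the PR, which only says that a `/cmd_vel` plus laser-scanner robot without joints or mavros now maps to the quadruped profile.

```python
from typing import Iterable


def classify_robot(topics: Iterable[str]) -> str:
    """Guess a robot profile from its advertised ROS topic names."""
    names = set(topics)
    has_mavros = any(t.startswith("/mavros/") for t in names)
    has_joints = any(t.endswith("/joint_states") for t in names)
    has_cmd_vel = any(t.endswith("/cmd_vel") for t in names)
    has_scan = any(t.endswith("/scan") for t in names)

    if has_mavros:
        return "drone"
    if has_joints:
        return "jointed"  # placeholder label; the real mapping is not shown
    if has_cmd_vel and has_scan:
        # The new case: a wheeled rover with a laser scanner reuses the
        # quadruped profile (cmd_vel control + lidar + camera).
        return "quadruped"
    return "webcam_only"
```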
The cockpit map drew accumulated lidar as a scatter of 1.5px dots — noisy and
hard to read. It now bins returns into world cells and fills them (denser hits
= brighter = more confidently a wall), so the swept area reads as solid
structure — the built-up map the old deck's ROBOT MAP showed. Also enlarged
the panel (280px, taller canvas) since the map carries real information.
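
The binning step can be sketched as a hit counter over world cells. The function names, the 0.25 m cell size, and the saturation count are assumptions; the PR states only that returns are binned into cells and that denser cells draw brighter.

```python
import math
from collections import Counter
from typing import Dict, Iterable, Tuple

Cell = Tuple[int, int]


def bin_returns(points: Iterable[Tuple[float, float]],
                cell_m: float = 0.25) -> Counter:
    """Count lidar returns (world-frame x, y in meters) per grid cell."""
    hits: Counter = Counter()
    for x, y in points:
        # floor, not int(), so negative coordinates land in the right cell
        hits[(math.floor(x / cell_m), math.floor(y / cell_m))] += 1
    return hits


def brightness(hits: Counter, saturate_at: int = 8) -> Dict[Cell, float]:
    """Map hit counts to 0..1 fill intensity; more hits means brighter."""
    return {cell: min(1.0, n / saturate_at) for cell, n in hits.items()}
```

Clamping at `saturate_at` keeps one heavily scanned wall from washing out the rest of the map.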
You preferred the old deck's arrangeable multi-panel layout (and its ROBOT
MAP) over the camera-hero cockpit. So the connected robot now drives the same
deck the sim uses, with real telemetry behind every panel:

- pose places the bot in the 3D scene; the VIEW panels render it from any
  angle (top/chase/orbit), and WASD still grabs the wheel
- lidar feeds integrateLidar -> the occupancy ROBOT MAP and the 3D point
  cloud (the map builds up as it drives)
- a new EYES panel shows the robot camera with YOLO boxes (robot mode only)
- STATUS shows pose/odometer/cmd; POLICY loads and RUN deploys the robot's
  own behavior (confirmed); MISSION becomes the robot identity
- sim-only bits (LEVELS, RUNS) hide in robot mode

Same deck, two sources. The camera-hero cockpit is retired (code dormant);
enterSimCockpit is now a no-op and the sim runs in the deck directly.
A multi-quadruped "/fleet" sandbox reachable from the robot picker, built to
teach how a swarm covers ground over an imperfect radio.

- Coverage sim with 4 strategies (lone wolves, gossip, claim-and-yield,
  one commander) under real limits: radio range, airtime, onboard memory,
  inbox depth, one goal at a time, lossy links.
- Hover any robot to see what it knows: tiles it sensed firsthand (filled)
  vs. only heard from peers (outlined), with a live knowledge card.
- Base station (the "main server") + data points: discover data and relay it
  home with greedy geographic routing (multi-hop / data-mule).
- Environments: open / scattered obstacles / building with LOS-blocking walls.
- CONCEPTS panel (expandable) explaining delivered vs dropped, overlap,
  optimisation, libp2p/gossipsub, stigmergy — plus tooltips on every metric.
- "Your algorithm": live JS strategy editor + a Generate button that asks the
  local LLM to draft a policy (POST /api/fleet/strategy), runnable on the spot.
- roborun/swarm/: the same model + strategies + base relay as runnable Python
  (python -m roborun.swarm), the headless twin; ships with the package.

Also:
- ROS card "Allow network scan to load robots" button — trips the browser's
  local-network permission and lists every rosbridge robot found as its own view.
- ROS-connected robots reuse the exact sim deck; EYES camera docks into the
  layout instead of floating.
- Replaced the last native confirm() (deploy-to-robot) with the styled modal.
- vercel.json (cleanUrls) so /fleet resolves on the static deploy.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
A freshly opened page (even when a robot is connected on the local runtime)
now lands on the picker instead of auto-entering the robot deck — the menu is
the front door. Auto-enter only when a robot is explicitly pinned. In the deck,
keep a visible '⊞ MENU' button so there's always a way back to select something.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The ROS-parity change made EYES a layout-managed panel, so once the robot deck
opened it the saved layout carried it into later sim sessions — where there's
no robot camera, leaving it black. Guard it in applyLayout: the EYES panel and
its toolbar toggle only show when MODE === 'robot'.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Each robot now tracks everKnown (lifetime, never evicted) alongside its
memory-capped current set. Hover splits it into IN MEMORY NOW (sensed vs heard,
out of the memory cap) and EVER KNOWN (cells discovered, and how many it has
forgotten because memory filled). The map overlay adds a faint third tier for
the forgotten footprint, under the bright in-memory cells. Mirrored in
roborun/swarm (Robot.ever_known + forgotten()).
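
A minimal sketch of the two-tier bookkeeping, assuming oldest-first eviction and a set-valued `forgotten()`; the real `Robot` in roborun/swarm may evict differently or return a count. Only the names `ever_known` and `forgotten` come from the commit message.

```python
from collections import OrderedDict
from typing import Set, Tuple

Cell = Tuple[int, int]


class Robot:
    """Tracks a memory-capped current set plus a lifetime set of cells."""

    def __init__(self, memory_cap: int):
        self.memory_cap = memory_cap
        # cell -> "sensed" (firsthand) or "heard" (from a peer), oldest first
        self.memory: "OrderedDict[Cell, str]" = OrderedDict()
        self.ever_known: Set[Cell] = set()  # lifetime, never evicted

    def learn(self, cell: Cell, how: str) -> None:
        if cell in self.memory:
            if how == "sensed":
                self.memory[cell] = "sensed"  # firsthand upgrades hearsay
            return
        self.memory[cell] = how
        self.ever_known.add(cell)
        while len(self.memory) > self.memory_cap:
            self.memory.popitem(last=False)  # assumed policy: drop the oldest

    def forgotten(self) -> Set[Cell]:
        """Cells discovered at some point but no longer held in memory."""
        return self.ever_known - set(self.memory)
```

The hover card's three numbers then fall out directly: `len(robot.memory)` against `memory_cap`, `len(robot.ever_known)`, and `len(robot.forgotten())`.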

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>