FAST-LIO pcap record/replay + Virtual Livox#2498
Conversation
…lio_native flake/cmake consuming dimos-module-fastlio2 pointlio branch + Estimator/parameters sources)
…y path) The fast-lio input was pinned to file:///Users/jeffhykin/... which only exists on the Mac. Repoint to the dimensionalOS/dimos-module-fastlio2 pointlio branch on github so the flake builds on Linux. Same locked rev.
Rename the mirrored fastlio_blueprints.py to pointlio_blueprints.py and wire it to PointLio (was incorrectly using FastLio2). Adds mid360-pointlio and mid360-pointlio-voxels to the blueprint registry.
…race-fix flake relock Adds the offline replay/inspection tooling for the pointlio_native module: pcap_replay.hpp (streams a Livox pcap into the SDK callbacks), deterministic_clock + dual-thread replay options, the ruwik2_pt3 replay harness, the pcap_to_db tool (append pointlio_odometry into an existing memory db at ~30Hz, streaming), validated Point-LIO mid360.yaml, and the live smoke-test helper. flake.lock relocks onto the fastlio2 pointlio rev carrying the mtx_buffer race fix; main.cpp comment corrected to attribute the divergence fix to that lock (not the publish-rate gating).
Switches the flake's fast-lio input from path:/home/dimos/repos/dimos-module-fastlio2 to github:dimensionalOS/dimos-module-fastlio2/pointlio so the module builds on any machine without a local fastlio2 clone. Relocked onto rev 7e5d88f (the mtx_buffer race fix); verified pointlio_native builds from the github source.
…hful config OUTPUT-model replay of the Jun-7 hand-shake bag diverged to ~25km (vs hku-mars master's bounded ~5m). Two issues, both fixed here (paired with the dimos-module-fastlio2 pointlio-branch curvature/deque fixes): - main.cpp: heap-use-after-free thread race in bag-frame replay. The feeder thread drives run_main_iter via step() after each feed, but the main thread also drove run_main_iter, so both raced on the shared PCL measurement cloud (ASan: free in sync_packages, read-after-free in ImuProcess::Process). Force serial_replay in bag-frame mode so only the feeder thread touches the EKF. - CMakeLists.txt + flake.nix: the iVox map backend needs glog; add the find_package/link and the nix buildInput. Drop ikd_Tree.cpp from sources (iVox replaces ikd-Tree). - config/mid360.yaml (+ cpp/config duplicate): set satu_acc 5.5->3.0 (master value — accel >=3g saturated, residual zeroed, keeps velocity bounded) and add ivox_grid_resolution=2.0 / ivox_nearby_type=6 (NEARBY6) matching the master config that produced the 5.028m baseline. - replay_ruwik2_pt3.py: make MAX_WALL_SEC env-overridable. After the fix the OUTPUT model stays bounded (peak |pos| 3.898m) on the full bag; ASan reports zero memory errors.
…onsumes the iVox fix The iVox-port + divergence fixes live in dimos-module-fastlio2 @ pointlio commit 02d5066, which is committed locally but NOT pushed (per the task's no-push constraint). The flake's fast-lio input was pointing at github:dimensionalOS/dimos-module-fastlio2/pointlio, whose lock pinned the pre-fix rev 7e5d88f. Since CMakeLists.txt already drops ikd_Tree.cpp (iVox replaces ikd-Tree), building against that stale github source fails with KD_TREE undefined-reference linker errors. Repoint fast-lio at path:/home/dimos/repos/dimos-module-fastlio2 and bump the lock so `nix build .#pointlio_native` consumes the local 02d5066 source. Verified: build exits 0, no KD_TREE errors, 661176-byte binary. NOTE: this bakes a machine-specific absolute path into the lock. Once Repo A's pointlio branch is pushed, switch this input back to github:dimensionalOS/dimos-module-fastlio2/pointlio and bump the lock to 02d5066 for portability.
Now that dimos-module-fastlio2 pointlio (02d5066, the iVox port + divergence fix) is on origin, switch the fast-lio flake input from the local path: pin back to github:dimensionalOS/dimos-module-fastlio2/pointlio and bump the lock to rev 02d5066. This makes jeff/feat/pointlio_native self-contained and portable (no machine-specific path in the lock). Verified: nix build .#pointlio_native exits 0, no KD_TREE errors.
… add --force Extends pcap_to_db.py to populate both pointlio_odometry and pointlio_lidar streams (start-aligned onto the db's earliest ts) so pointlio can be compared against fastlio in one recording. --force overwrites existing pointlio streams. Untracks the machine-specific fastlio_test/setup_network symlinks (enhance overlay) and drops the empty tools/__init__.py.
…o tools - Consolidate config/ to one default.yaml (the tuned Mid-360/iVox config); remove the unused upstream sensor presets (avia, horizon, marsim, ouster64, velodyne, mid360). Module now defaults to default.yaml. - Remove tools/replay_ruwik2_pt3.py (pcap_to_db.py covers offline replay) and tools/demo_live_test.py.
…egistry - --db is now optional: with no existing db, build one from scratch (defaults to <pcap>.db next to the pcap); with an existing db, append + time-align as before. - Rename the internal recorder Rec/RecConfig -> _Rec/_RecConfig so the blueprint-registry generator skips it (a tool helper isn't a public module); fixes test_all_blueprints_is_current. - Docstring: document from-scratch generation using the ruwik2_part3 LFS sample.
… rm cpp/README + config copies; concise comments
- config: consolidate to a single config/default.yaml; drop the ROS-only
lid_topic/imu_topic + odom frame-id/publish/pcd_save keys (parsed-but-unused
by the native binary), and note which params are Mid-360-specific.
- delete cpp/config/{mid360.json,mid360.yaml} (duplicate) and cpp/README.md.
- tighten verbose comments in main.cpp, pcap_replay.hpp, timing.hpp
(comments only; binary rebuilds clean).
… db name in docstring
120s clip (elapsed 55-175s) of the ruwik velocity-spike recording — the data
that diverges through FAST-LIO at the 0.5m pre-KF voxel and is bounded under
Point-LIO. Use via get_data('ruwik2_part3'); pcap_to_db --pcap <it> builds the
db from scratch.
….0 clock bug - module.py docstring: Point-LIO (not FAST-LIO2) + correct import path. - pcap_to_db.py: use 'is not None' (not 'or') for the ts fallback so a real sensor ts of 0.0 isn't replaced by wall time (which would misclassify the stream clock in _resolve_offset).
… NIC Reads a Livox Mid-360 pcap into RAM, rewrites packet timestamps to current-time, and replays point/IMU/status onto a virtual network interface at a configurable rate + delay. Synthesizes the Livox SDK2 control protocol (discovery + GetInternalInfo/FwType ACKs, CRC16/CRC32) so an unmodified consumer (pointlio) handshakes with it as a real sensor. Builds via nix (rustPlatform.buildRustPackage, cargoLock from Cargo.lock).
…bal_map - Cloud now published in the sensor frame (mid360_link): use fastlio2 get_body_cloud() (undistorted scan, no world registration) instead of inverse-transforming the world cloud. No transform in publish_lidar. - Split frames: frame_id (mid360_link) on both the cloud + odometry headers; odom_parent_frame_id (odom) -> odom_frame_id (base_link) for the TF publish. - Remove global_map / voxel_map.hpp entirely (deleted file, config, blueprint + pcap_to_db references). - Bump fast-lio pin to fcbd1c2 (adds get_body_cloud).
get_data() at module level triggered a git-LFS download during blueprint validation (test_blueprint_is_valid), which CI blocks via git-lfs-guard — failing the whole test matrix. Default pcap to empty; resolve the capture path at run time instead.
Port the minimal pcap-replay subsystem from jeff/feat/go2_record into the clean branch so FAST-LIO can run offline from a Mid-360 pcap, matching the Point-LIO pcap_to_db workflow: - cpp: pcap_replay.hpp + timing.hpp (header-only), main.cpp refactored so the main loop runs from either the live SDK or a pcap feeder thread, with an optional deterministic sensor-clock mode. Keeps the clean branch's velocity-cap (guarded set_max_velocity_norm_ms) and flake (velocity-cap fast-lio); does not pull go2_record's tcpdump record path. - module.py: replay_pcap / replay_skip_until_ns / first_packet_marker / deterministic_clock config fields; skip network validation in replay mode. - tools/pcap_to_db.py: replay a pcap through FastLio2 (real-time, non- deterministic) and append fastlio_odometry + fastlio_lidar into an existing memory2 db, time-aligned onto its clock. --force overwrites.
…FastLio2Recorder - livox/pcap_recorder.py: standalone tcpdump pcap capture (LivoxPcapRecorder), decoupled from FAST-LIO. The lidar/SLAM module no longer owns packet capture. - fastlio2/recorder.py: FastLio2Recorder records fastlio_odometry + fastlio_lidar and rewrites ONLY those streams' timestamps onto the db clock (promoted from pcap_to_db's inline recorder; fixes the ts==0.0 falsy-fallback bug). - pcap_to_db.py now imports FastLio2Recorder instead of an inline copy.
FastLio2 no longer produces a global voxel map — odometry + registered lidar only. Removed global_map Out, map config, and the mapping.GlobalPointcloud spec. Updated all consumers (fastlio_blueprints, alfred_nav, g1_onboard, g1_nav_onboard, mobile, pcap_to_db) to drop map args + the global_map remap. Nav blueprints lose their fastlio map; full nav wants a separate mapper wired in (follow-up).
- vendored core (dimos-module-fastlio2 @ 50367cb) now reads mapping.filter_size_surf/map from the YAML (was hardcoded 0.5); relock flake. - main.cpp honors the publish flags (passed as CLI args): scan_publish_en gates the lidar output, scan_bodyframe_pub_en picks sensor/body vs world frame, dense_publish_en voxel-downsamples the cloud when off. - nav blueprints set scan_bodyframe_pub_en=False so registered_scan is world-frame again (fixes the body-frame break). - FastLio2Config drops the lid_topic/imu_topic/path_en/pcd_save_en/interval fields the fork never reads; keeps the live tuning + the 3 publish flags. - remove the pointless virtual_now() wrapper + its unreachable failure branch.
…ame) Drop the body/world toggle; the module always publishes the sensor/body-frame cloud (get_body_cloud). Remove the field, the main.cpp arg + frame branch, and the nav blueprints' scan_bodyframe_pub_en=False overrides.
…-pcap - After writing <db>.rrd, open it in rerun (subprocess), unless --no-gui. - A --pcap that isn't a local file is resolved via get_data (LFS), mirroring the --db fallback, so a get_data-style pcap path works directly.
… mount - mobile / alfred_nav / unitree_g1_onboard drop their FastLio2 tuning overrides (acc_cov/gyr_cov/det_range/extrinsic_est_en/filter_size) and use the module defaults. - preserve the removed FlowBase Mid-360 mount pose as FLOWBASE_MID360_MOUNT in the new dimos/hardware/drive_trains/flowbase/config.py for later use.
Add a 'FastLio2 tuning' arg group (--acc-cov, --filter-size-surf/map, --det-range, --blind, --fov-degree, --lidar-type, --extrinsic-est-en, --scan/--dense-publish-en, etc.) merged into the FastLio2Config overrides — they take precedence over the --config YAML doc. Only flags that are set override anything.
The vendored core (dimos-module-fastlio2 @ a32c9f5) now takes a FastLioParams struct instead of loading a YAML, so drop yaml-cpp (flake + CMake). module.py no longer generates a throwaway YAML / config_path; the tuning fields are emitted as plain CLI args (lidar_type/timestamp_unit as strings, extrinsics as comma lists), and main.cpp reads them into FastLioParams. Also wire dense_publish_en to the core's get_body_cloud_down (IESKF-downsampled scan) instead of a PCL voxel in main.cpp.
Single space around assignment (was column-aligned, lisp-style) and collapse over-aligned inline comments.
It was a [&]-capturing lambda called from exactly one place (a leftover of the old replay/virtual-clock design). Inline the body into the while loop and drop the now-redundant check_now copy.
… a lambda) Extract the per-iteration body into a plain static function that takes the time point + the rate-limit bookmarks/intervals/flags as explicit args, instead of a [&]-capturing lambda.
…inery Remove _existing_min_ts / _resolve_offset / _resolve_ts / time_offset and the EPS tie-breaker. Records use the base Recorder's ts (msg.ts) directly — the native modules always stamp a real epoch ts, so no re-basing is needed. Drop the now-unused --time-offset from both pcap_to_db tools.
The README was redundant; the tuning comment referenced the removed YAML/_YAML_LAYOUT (it's plain CLI args now).
Drop the jhist pointer and Go2-specific wording; describe acc_cov generically (high vs low IMU-accel trust).
…streams _prepare_streams is now 3 lines: delete the recorder's own two streams if present (keeping any other streams), then record. Removes the force config + the refuse- to-overwrite raise, and --force from both pcap_to_db tools.
…ream_name Name the In ports as the db tables (fastlio_/pointlio_odometry/lidar) and wire them to the module's odometry/lidar outputs with .remappings() in pcap_to_db — matching the base Recorder convention (stream = port name). Removes _stream_name and the odom/lidar_stream_name config; _prepare_streams just replaces our own ports' streams. pcap_to_db drops the now-fixed --odom/lidar-stream-name args.
Move record_tf + @pose_setter_for into Recorder/RecorderConfig and delete tf_recorder.py. Every Recorder now records the live tf stream by default and supports per-stream pose setters; fastlio/pointlio recorders subclass Recorder directly. Drop the now-redundant tf-recorder blueprint entry.
Mirror the fastlio no-yaml change: PointLioConfig tuning fields are passed to the binary as CLI args (read into a PointLioParams struct in main.cpp) instead of being rendered to a throwaway YAML read via --config_path. Removes the yaml-cpp dependency from the flake + CMake and the config-file plumbing from the module. Requires the matching dimos-module-fastlio2 pointlio-branch change (flake.lock bumped to 288e357).
{port_name: db_stream_name} to control the recorded stream/table name without
subclassing — conceptually what .remappings() expresses, but the active
remappings aren't readily accessible from inside the module.
… frame Both modules now expose the same three frames: frame_id (fixed odom, the odometry header + TF parent), child_frame_id (moving body, the odometry child + TF child), and sensor_frame_id (the lidar's own frame, stamped on the published point cloud). get_body_cloud is the undistorted scan in the sensor frame, so the cloud was previously mislabeled — fastlio stamped it with the body frame and pointlio reused frame_id for both the cloud and the odometry header. pointlio's body_start_frame_id/body_frame_id are folded into this scheme. Drops a stale PGO comment.
Use the lidar's concrete frame name rather than the generic FRAME_SENSOR.
…_remapping Move the replace-on-append logic into the base Recorder so it drops exactly the streams it is about to rewrite: the remapped In-port streams (via _stream_name) AND the tf stream. Previously each recorder's _prepare_streams deleted only the raw In-port names, so re-running into an existing db appended a second full copy of the tf tree and would orphan remapped streams. Also guard _record_tf for non-pubsub tf backends, and drop the now-dead world/global_map_fastlio rerun overrides (FastLio2's global_map port was removed in this PR).
Greptile SummaryThis PR adds pcap record/replay support to FAST-LIO2 (and harmonises the parallel PointLIO path), introducing a
Confidence Score: 5/5Safe to merge — the changes are well-scoped and the removed C++ infrastructure (world-frame cloud transform, voxel map, YAML config) is cleanly replaced by CLI-arg tuning with correct defaults. The C++ refactor correctly re-orders SDK uninit before nulling globals (preventing a late-callback race), the new FastLio2Recorder mirrors the established PointlioRecorder pattern, and the pcap-to-db tool has a layered drain-detection strategy. The only finding is a minor timestamp-fallback inconsistency between FastLio2 and PointLio's TF callbacks that has no practical impact at runtime. dimos/hardware/sensors/lidar/fastlio2/module.py (minor ts-fallback inconsistency with the PointLio counterpart) Important Files Changed
Sequence Diagram%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
participant pcap_to_db as pcap_to_db script
participant Coord as ModuleCoordinator
participant VM360 as VirtualMid360
participant FL2 as FastLio2 (C++ binary)
participant Rec as FastLio2Recorder
participant DB as SQLite DB
pcap_to_db->>Coord: build(blueprint)
Coord->>VM360: start()
Coord->>FL2: start() [waits warmup_sec]
Coord->>Rec: start()
VM360-->>FL2: Livox wire protocol (UDP) host_ip/lidar_ip alias
FL2-->>Rec: odometry (LCM → Out port)
FL2-->>Rec: lidar pointcloud (LCM → Out port)
Rec-->>DB: append fastlio_odometry / fastlio_lidar
loop every _POLL_SEC
pcap_to_db->>DB: _odom_stats(odom_stream) + _odom_stats(lidar_stream)
DB-->>pcap_to_db: (count, min_ts, max_ts)
alt lidar_max stagnant for _STAGNANT_SEC OR sensor span reached
pcap_to_db->>Coord: stop()
end
end
pcap_to_db->>DB: _write_rrd (aggregate world cloud + path)
DB-->>pcap_to_db: .rrd file
%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
sequenceDiagram
participant pcap_to_db as pcap_to_db script
participant Coord as ModuleCoordinator
participant VM360 as VirtualMid360
participant FL2 as FastLio2 (C++ binary)
participant Rec as FastLio2Recorder
participant DB as SQLite DB
pcap_to_db->>Coord: build(blueprint)
Coord->>VM360: start()
Coord->>FL2: start() [waits warmup_sec]
Coord->>Rec: start()
VM360-->>FL2: Livox wire protocol (UDP) host_ip/lidar_ip alias
FL2-->>Rec: odometry (LCM → Out port)
FL2-->>Rec: lidar pointcloud (LCM → Out port)
Rec-->>DB: append fastlio_odometry / fastlio_lidar
loop every _POLL_SEC
pcap_to_db->>DB: _odom_stats(odom_stream) + _odom_stats(lidar_stream)
DB-->>pcap_to_db: (count, min_ts, max_ts)
alt lidar_max stagnant for _STAGNANT_SEC OR sensor span reached
pcap_to_db->>Coord: stop()
end
end
pcap_to_db->>DB: _write_rrd (aggregate world cloud + path)
DB-->>pcap_to_db: .rrd file
Reviews (2): Last reviewed commit: "fastlio main.cpp: match pointlio formatt..." | Re-trigger Greptile |
| last_lidar_max = 0.0 | ||
| first_max: float | None = None | ||
| stagnant_since: float | None = None | ||
| start_time = time.time() | ||
| while True: | ||
| time.sleep(_POLL_SEC) | ||
| odom_cnt, odom_min, odom_max = _odom_stats(db_path, odom_stream) | ||
| if odom_cnt == 0: | ||
| # Stagnation timeout only arms once odometry exists, so bound the | ||
| # no-output wait separately or a dead binary would hang forever. | ||
| if time.time() - start_time > _STARTUP_TIMEOUT_SEC: | ||
| print( | ||
| f"[pcap_to_db] no odometry after {_STARTUP_TIMEOUT_SEC:.0f}s — FAST-LIO " | ||
| "failed to start (check the binary, pcap path, and interface setup).", | ||
| file=sys.stderr, | ||
| flush=True, | ||
| ) | ||
| return False | ||
| continue | ||
| if first_max is None: | ||
| first_max = odom_min | ||
| if max_sensor_sec > 0 and (odom_max - first_max) >= max_sensor_sec: | ||
| print(f"[pcap_to_db] reached --max-sensor-sec={max_sensor_sec:.1f}s", flush=True) | ||
| return True | ||
| _, _, lidar_max = _odom_stats(db_path, lidar_stream) | ||
| if lidar_max == last_lidar_max: | ||
| if stagnant_since is None: | ||
| stagnant_since = time.time() | ||
| elif time.time() - stagnant_since > _STAGNANT_SEC: | ||
| return True | ||
| else: | ||
| last_lidar_max = lidar_max | ||
| stagnant_since = None |
There was a problem hiding this comment.
Stagnation timer arms prematurely on first poll
last_lidar_max is initialised to 0.0, and _odom_stats returns 0.0 for MAX(ts) when the table is empty (via row[2] or 0.0). The very first poll iteration after odometry appears will therefore evaluate lidar_max == last_lidar_max as True, immediately arming stagnant_since. If the recorder's lidar table hasn't been written yet at that moment — which can happen because FAST-LIO rate-limits lidar publishes to pointcloud_freq (default 100 ms) — the 5-second stagnation clock is already running. In the worst case this causes an early exit before a single lidar frame is recorded.
The same pattern exists in dimos/hardware/sensors/lidar/pointlio/scripts/pcap_to_db.py (line 263). Initialising last_lidar_max = None and skipping the stagnation check until at least one non-zero lidar frame has been observed would avoid the false positive.
| from dimos.msgs.geometry_msgs.Vector3 import Vector3 | ||
|
|
||
| # Mid-360 mount pose on the FlowBase (position + orientation) in the base frame. | ||
| FLOWBASE_MID360_MOUNT = Pose(0.20, -0.20, 0.10, *Quaternion.from_euler(Vector3(0, 0, 0))) |
There was a problem hiding this comment.
FLOWBASE_MID360_MOUNT is defined but never imported or used
This file was introduced in this PR to hold the FlowBase mount pose, but a search of the entire codebase shows no import site for FLOWBASE_MID360_MOUNT. The counterpart _flowbase_mid360_mount was removed from mobile.py along with the mount= parameter (which was also removed from FastLio2Config). As written, this constant is dead code — the mount pose is not compensated anywhere for the FlowBase. Either wire it into a downstream TF publisher / extrinsic field, or remove the file until it is needed.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
…_stream_name Braces on every if/for/while (inline single-statement bodies), collapse the awkwardly-wrapped calls/signatures to one line, and break the long run_main_iter def/call with the closing paren on its own line. Inline get_publish_ts (and the Recorder._stream_name helper, now config.stream_remapping.get at the use sites); add the missing braces to pointlio's parse_doubles too.
The pose-setter dict is typed dict[str, Any] (to avoid evaluating the Pose forward-ref at class-definition time), so cast its result back to Pose | None.
Adds virtual mid360 pcap record/replay to FAST-LIO; rips out the in-process pcap reader and the output-cloud downsampling/denoising.
Usage / Testing
Pcap to DB (also generates .rrd)
# gen <pcap>.db from the pcap (a <db>.rrd quick-look is written next to it) python -m dimos.hardware.sensors.lidar.fastlio2.tools.pcap_to_db \ --db output_mem2.db \ --config overrides.yaml \ --pcap mid360_shake_stairs/mid360_shake_stairs.pcapRecord a Mid-360 pcap
Replay a pcap to a module