Blog
Technical guides on robotics data pipelines, dataset formats, and training data infrastructure.
We Audited 5 Popular LeRobot Datasets. 4 Ship Stats That Produce Inf.
We ran traceplane check on five of the most-downloaded LeRobot datasets on HuggingFace. Every single-task dataset ships normalization stats that silently produce Inf the moment your dataloader uses them.
GEN-1 Proved Human Data Trains Robots. Here's the Infrastructure You Need.
Generalist AI achieved 99% success rates using 500K hours of human activity data and zero robot data. Here's the data infrastructure required to replicate this approach.
We Audited 10 Popular Open-Source Robot Datasets. Here's What We Found.
We ran automated quality checks on 10 widely-used robotics datasets including Bridge V2, Open X-Embodiment, ALOHA, and LeRobot datasets. Every single one had issues that could silently degrade your policy.
Automated QA for Robot Trajectory Data: A Three-Layer Framework
Why "record more demos" doesn't fix training failures. A practical framework for automated structural, kinematic, and semantic quality checks on every episode.
How to Convert rosbag2 Data to LeRobot Format
A practical guide to converting ROS 2 bag files (.mcap, .db3) to HuggingFace LeRobot's Parquet + MP4 format for policy training. Covers timestamp alignment, video encoding, and schema mapping.