Reborn AGI

Reborn Embodied Vlog

Robots learn embodied manipulation from human handwork.

"Reborn Embodied Vlog" (REV) is a mobile app that lets users record and upload first-person videos of fine manipulation tasks from their daily lives. The process has four steps:

  • Recording: Users film themselves with a smartphone, GoPro, or other camera while performing detailed tasks, such as preparing food, cleaning, assembling objects, or other hand-manipulation actions.

  • Uploading: The app allows users to upload these videos directly to the platform, where the footage is anonymized and processed.

  • Video Analysis: The system processes the video data to extract hand movement, gestures, object interactions, and fine manipulation patterns.

  • Global Contribution: By contributing their data, users help build a massive, diverse dataset that is accessible to AI systems worldwide, improving the accuracy and efficiency of robot manipulation tasks.

Transfer Reborn Embodied Vlog to Embodied Training Data

First-person video can be converted into hand landmarks for training dexterous robotic hands with the following pipeline:

1. Video Preprocessing

  • Frame Extraction: Convert the video into a sequence of frames so that each can be analyzed individually. Select a suitable frame rate (e.g., 30 FPS) to balance detail and computational cost.

  • Image Enhancement: Improve video quality (e.g., brightness, contrast) to ensure clear visualization of hands and objects in various lighting conditions.

  • Segmentation: Use a hand segmentation algorithm to isolate the hand region from the background, reducing noise and focusing the analysis.
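
The preprocessing steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Reborn's actual implementation: frames are modeled as plain grayscale grids (lists of rows of 0-255 values), the hand mask is assumed to come from some separate segmentation model, and the function names are hypothetical. A real pipeline would typically decode and process frames with an imaging library instead.

```python
from typing import List

# Assumed representation: a grayscale frame as rows of 0-255 intensities.
Frame = List[List[int]]


def sample_frame_indices(total_frames: int, source_fps: float,
                         target_fps: float) -> List[int]:
    """Pick source-frame indices so the output approximates target_fps."""
    if total_frames < 0 or source_fps <= 0 or target_fps <= 0:
        raise ValueError("frame count must be >= 0 and frame rates > 0")
    # Never upsample: a 24 FPS clip stays at 24 FPS when 30 FPS is requested.
    step = max(source_fps / target_fps, 1.0)
    indices = []
    count = 0
    while int(count * step) < total_frames:
        indices.append(int(count * step))
        count += 1
    return indices


def stretch_contrast(frame: Frame) -> Frame:
    """Linearly rescale intensities of a non-empty frame to span 0-255."""
    lo = min(min(row) for row in frame)
    hi = max(max(row) for row in frame)
    if hi == lo:  # flat image: nothing to stretch
        return [row[:] for row in frame]
    return [[round((p - lo) * 255 / (hi - lo)) for p in row] for row in frame]


def crop_to_mask(frame: Frame, mask: Frame) -> Frame:
    """Crop the frame to the bounding box of the non-zero mask pixels.

    The mask is assumed to be the output of a hand segmentation model
    with the same dimensions as the frame.
    """
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:  # no hand detected in this frame
        return []
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    top, bottom, left, right = rows[0], rows[-1], cols[0], cols[-1]
    return [row[left:right + 1] for row in frame[top:bottom + 1]]
```

For example, `sample_frame_indices(10, 60, 30)` keeps every second frame of a 60 FPS clip. Min-max stretching is only one simple enhancement; histogram equalization or gamma correction may suit uneven lighting better.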

2. Hand Landmark Detection

  • Pose Estimation Models: Use a hand pose estimation model, such as MediaPipe Hands or DeepHand, to detect keypoints on the hand.

    • Detect key landmarks such as fingertips, knuckles, wrist, and palm center.

    • Extract 2D or 3D coordinates for each keypoint, expressed relative to the image frame or the environment.

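  • 3D Reconstruction (if needed): Use stereo cameras, or infer 3D structure from monocular video with models such as DensePose (dense image-to-surface correspondence) or MANO (hand Model with Articulated and Non-rigid defOrmations), a parametric hand model that can be fitted to the detected keypoints.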
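
As a rough sketch of how detected landmarks might be post-processed, the code below assumes the 21-point hand layout used by MediaPipe Hands (wrist at index 0, thumb tip at 4, index fingertip at 8) with x and y normalized to the image size. The palm-center approximation, the wrist-relative normalization, and the camera intrinsics `fx`, `fy`, `cx`, `cy` are illustrative assumptions, not part of the documented pipeline.

```python
from math import dist
from typing import List, Tuple

Point3 = Tuple[float, float, float]

# Indices in the 21-point hand layout used by MediaPipe Hands.
WRIST, THUMB_TIP, INDEX_TIP = 0, 4, 8
INDEX_MCP, MIDDLE_MCP, RING_MCP, PINKY_MCP = 5, 9, 13, 17


def to_pixels(landmarks: List[Point3], width: int, height: int) -> List[Point3]:
    """Map normalized (x, y) in [0, 1] to pixel coordinates; z is unchanged."""
    return [(x * width, y * height, z) for x, y, z in landmarks]


def palm_center(landmarks: List[Point3]) -> Point3:
    """Approximate the palm center as the mean of the wrist and the four
    finger MCP joints (an assumption; the detector does not output it)."""
    ids = (WRIST, INDEX_MCP, MIDDLE_MCP, RING_MCP, PINKY_MCP)
    return tuple(sum(landmarks[i][k] for i in ids) / len(ids) for k in range(3))


def wrist_relative(landmarks: List[Point3]) -> List[Point3]:
    """Translate so the wrist is the origin and scale by the wrist-to-middle-MCP
    distance, making the pose independent of hand position and size."""
    wrist = landmarks[WRIST]
    scale = dist(wrist, landmarks[MIDDLE_MCP])
    if scale == 0:
        raise ValueError("degenerate hand: wrist and middle MCP coincide")
    return [tuple((p[k] - wrist[k]) / scale for k in range(3)) for p in landmarks]


def pinch_distance(landmarks: List[Point3]) -> float:
    """Distance between thumb tip and index fingertip, a simple grasp cue."""
    return dist(landmarks[THUMB_TIP], landmarks[INDEX_TIP])


def backproject(u: float, v: float, depth: float,
                fx: float, fy: float, cx: float, cy: float) -> Point3:
    """Pinhole back-projection of pixel (u, v) with metric depth to camera-frame
    XYZ. Depth is assumed to come from stereo or a depth estimator."""
    return ((u - cx) * depth / fx, (v - cy) * depth / fy, depth)
```

Note that the z value returned by a monocular landmark detector is usually a relative depth rather than a metric one, so `backproject` only makes sense when a separate metric depth source is available. Wrist-relative coordinates are one possible normalization; a retargeting stage for a specific robot hand may prefer joint angles instead.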