Robotics Domain Adaptation Gallery

Authors: Raju Wagwani, Jathavan Sriram, Richard Yarlett, Joshua Bapst, Jinwei Gu

Organization: NVIDIA

Overview

This page showcases results from Cosmos Transfer 2.5 for robotics applications. The examples demonstrate sim-to-real transfer for robotic manipulation tasks in kitchen environments, showing how synthetic simulation videos can be transformed into photorealistic scenes with varied materials, lighting, and environmental conditions. These results enable domain adaptation and data augmentation for robotic training and validation.

Use Case: Robotics engineers can use these techniques to generate diverse training data from a single simulation, creating variations in kitchen styles, materials, and lighting conditions without re-running expensive simulations or capturing real-world data.
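As a minimal sketch of this use case, the snippet below expands one prompt template into a grid of style and material variations, each of which could be paired with the same simulation video. The template wording, style lists, and `build_prompts` helper are illustrative, not part of the Cosmos Transfer 2.5 API.

```python
# Hypothetical sketch: pairing one simulation clip with varied prompts
# to produce diverse training data. Names and template text are
# illustrative, not from the Cosmos Transfer 2.5 API.
from itertools import product

CABINET_STYLES = ["white", "red", "wood-tone"]
ROBOT_MATERIALS = ["plastic", "brushed metal", "gold"]

def build_prompts(template: str) -> list[str]:
    """Expand a template into one prompt per style/material combination."""
    return [
        template.format(cabinet=cabinet, material=material)
        for cabinet, material in product(CABINET_STYLES, ROBOT_MATERIALS)
    ]

prompts = build_prompts(
    "A humanoid robot with a {material} finish cooking at a stove "
    "in a kitchen with {cabinet} cabinets, photorealistic lighting."
)
print(len(prompts))  # 9 prompt variations from a single simulation video
```

Each generated prompt would drive one transfer run over the same input video, so diversity comes from the text side rather than from re-simulating.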

Example 1: Edge-Only Control for Environment Variation

This example demonstrates how to transform synthetic robotic simulation videos into photorealistic scenes with different kitchen styles and materials using edge control. Edge control preserves the original structure, motion, and geometry of the robot and scene while allowing the visual appearance to change dramatically based on the text prompt.

  • Edge control: Maintains the structure and layout of objects, robot poses, and camera motion from the simulation, while transforming the visual appearance (materials, lighting, colors) according to the prompt.
  • Why use edge-only: To preserve exact robot motions and object positions from simulation while varying environmental aesthetics.
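The bullets above can be summarized as a single-control generation spec. The sketch below is hypothetical: the key names (`controls`, `control_weight`) and file path are illustrative and may differ from the actual Cosmos Transfer 2.5 configuration schema.

```python
# Hypothetical edge-only spec; key names and paths are illustrative,
# not the confirmed Cosmos Transfer 2.5 schema.
edge_only_spec = {
    "input_video": "sim_kitchen_stove.mp4",  # illustrative path
    "prompt": (
        "A photorealistic kitchen with white cabinets; the robot has "
        "a brushed-metal finish"
    ),
    "controls": {
        # Edge is the only active control: geometry, robot poses, and
        # camera motion follow the simulation; appearance follows the prompt.
        "edge": {"control_weight": 1.0},
    },
}
print(sorted(edge_only_spec["controls"]))  # ['edge']
```

Because only the edge signal is active, every output variation keeps the exact robot trajectories and object positions from the simulation.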

For detailed explanations of control modalities, refer to the Control Modalities Overview.

Scene 1a: Kitchen Stove - Cooking Task

This scene shows a humanoid robot performing a cooking task at a stove. The examples demonstrate how different kitchen cabinet styles (white, red, wood tones) and robot materials (plastic, metal, gold) can be generated from the same simulation.

Input Video

Examples

Scene 1b: Kitchen Island - Object Manipulation

This scene shows a robot performing precise object manipulation at a kitchen island, picking up and placing items. The examples demonstrate material variations (different fruits and objects) coordinated with kitchen style changes.

Input Video

Examples

Scene 1c: Kitchen Refrigerator - Appliance Interaction

This scene demonstrates robot interaction with appliances, showing the robot opening a refrigerator. The examples maintain the lighting dynamics (fridge interior light) while varying kitchen aesthetics.

Input Video

Examples

Example 2: Multi-Control with Custom Control Videos

These examples demonstrate advanced usage, where you provide custom pre-computed control videos (depth, edge, segmentation) alongside the input video. Multi-control gives you fine-grained control over different aspects of the transformation:

  • depth: Controls 3D spatial relationships and perspective
  • edge: Maintains structural boundaries and object shapes
  • seg: Enables semantic-level changes and object replacement
  • vis: Preserves lighting and camera properties (set to 0 in this example)

When to use multi-control: Use this approach when you need precise control over the transformation by pre-generating and fine-tuning specific control signals, especially for complex scene manipulations or when edge-only control is insufficient.
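The control weighting described above can be sketched as a spec that mirrors the bullet list, with `vis` set to 0 as in this example. The key names (`controls`, `input_control`, `control_weight`) and file paths are assumptions for illustration, not the confirmed Cosmos Transfer 2.5 schema.

```python
# Hypothetical multi-control spec mirroring the modalities above;
# key names and file paths are illustrative assumptions.
multi_control_spec = {
    "input_video": "sim_kitchen_island.mp4",
    "controls": {
        # Pre-computed control videos supplied alongside the input.
        "depth": {"input_control": "depth.mp4", "control_weight": 0.5},
        "edge":  {"input_control": "edge.mp4",  "control_weight": 0.5},
        "seg":   {"input_control": "seg.mp4",   "control_weight": 0.5},
        "vis":   {"control_weight": 0.0},  # disabled in this example
    },
}

def active_controls(spec: dict) -> list[str]:
    """Return the control modalities with a non-zero weight."""
    return [
        name
        for name, cfg in spec["controls"].items()
        if cfg.get("control_weight", 0.0) > 0.0
    ]

print(active_controls(multi_control_spec))  # ['depth', 'edge', 'seg']
```

Tuning the per-modality weights is how you trade off structural fidelity (depth, edge) against semantic flexibility (seg) for a given scene.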

Scene 2a

Input and Control Videos

Output Video

Scene 2b

Input and Control Videos

Examples

Quality Enhancements: Transfer 2.5 vs Transfer 1

Compared to Cosmos Transfer 1, Cosmos Transfer 2.5 offers significant improvements in both video quality and inference speed. The examples below show side-by-side comparisons where each video transitions between Transfer 1 and Transfer 2.5 results, illustrating the quality improvements achieved in the latest version.

Examples