Autonomous Vehicle Domain Adaptation Gallery

Authors: Raju Wagwani, Nikolay Matveiev, Joshua Bapst, Jinwei Gu

Organization: NVIDIA

Overview

This page showcases a collection of results generated using Cosmos Transfer 2.5 for autonomous vehicle (AV) applications. The examples demonstrate how to transform real-world or simulation-based driving videos across various environmental conditions such as different weather, lighting, and time of day. These results are intended to serve as inspiration for users exploring how to leverage the model for domain adaptation and synthetic data augmentation in autonomous driving use cases.

Driving Scene 1

  • Multi control: The model combines several control inputs, each with its own control weight, to produce an output. This gives the user finer control over how the output looks.
  • depth: Maintains 3D realism and spatial consistency.
  • edge: Preserves original structure, shape, and layout.
  • seg: Enables structural changes and semantic replacement.
  • vis: Preserves background, lighting, and overall visual appearance.

For detailed explanations of control modalities, refer to the Control Modalities Overview.
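As a rough illustration of how per-control weights might be bundled for one run, the sketch below assembles a plain dictionary from a prompt, a seed, a guidance value, and the four control weights. The helper `build_spec` and the key names (`control_weight`, `guidance`, `seed`) are assumptions made for this example and are not confirmed to match the Cosmos Transfer 2.5 input schema; the values are taken from one of the gallery entries.

```python
import json

# Control names as listed on this page.
CONTROLS = ("depth", "edge", "seg", "vis")


def build_spec(prompt, seed, guidance, **weights):
    """Bundle run settings into a dict (hypothetical layout, not the official schema)."""
    unknown = sorted(set(weights) - set(CONTROLS))
    if unknown:
        raise ValueError(f"unknown control(s): {unknown}")
    spec = {"prompt": prompt, "seed": seed, "guidance": guidance}
    for name in CONTROLS:
        # Controls that are not passed default to a weight of 0.0.
        weight = float(weights.get(name, 0.0))
        if weight < 0.0:
            raise ValueError(f"{name} weight must be non-negative, got {weight}")
        # "control_weight" is an assumed key name used only for illustration.
        spec[name] = {"control_weight": weight}
    return spec


spec = build_spec(
    "Dashcam video, driving through a modern urban environment, "
    "heavy rain, wet road with puddles",
    seed=5000, guidance=3, depth=0.35, edge=0.1, seg=0.35, vis=0.0,
)
print(json.dumps(spec, indent=2))
```

Keeping all four controls in the dictionary, including those weighted 0.0, mirrors how the gallery lists every weight for every example.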

Input Video

This scene shows a driving video captured from a dashcam perspective. The examples below demonstrate how different environmental conditions (weather, lighting, time of day) can be generated from the same input video while preserving the structure and motion of the driving scene.
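The prompts in this gallery pair a shared scene description with a short trailing phrase for the target condition. A minimal sketch of producing such variants is shown here; `make_prompts` is a hypothetical helper written for this example, and the base text and condition phrases are copied from the gallery prompts.

```python
# Shared scene description used by the short prompts in this gallery.
BASE_PROMPT = "Dashcam video, driving through a modern urban environment"

# Condition phrases copied from the gallery examples.
CONDITIONS = [
    "sunset with beautiful clouds",
    "heavy rain, wet road with puddles",
    "winter with heavy snow storm, trees and sidewalks covered in snow",
]


def make_prompts(base, conditions):
    """Append each condition phrase to the base scene description."""
    # Strip trailing punctuation so the join always yields "base, condition".
    stem = base.rstrip(" ,.")
    return [f"{stem}, {condition}" for condition in conditions]


for prompt in make_prompts(BASE_PROMPT, CONDITIONS):
    print(prompt)
```

Holding the scene description fixed and varying only the final phrase makes it easier to attribute differences between outputs to the condition rather than to the wording of the scene.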

Examples

Input Prompt
The video is a driving scene through a modern urban environment, likely captured from a dashcam or a similar fixed camera setup inside a vehicle. The scene unfolds on a wide, multi-lane road flanked by tall, modern buildings with glass facades. The road is relatively empty, with only a few cars visible, including a black car directly ahead of the camera, maintaining a steady pace. The camera remains static, providing a consistent view of the road and surroundings as the vehicle moves forward. On the left side of the road, there are several trees lining the sidewalk, providing a touch of greenery amidst the urban setting. Pedestrians are visible on the sidewalks, some walking leisurely, while others stand near the buildings. The buildings are a mix of architectural styles, with some featuring large glass windows and others having more traditional concrete exteriors. A few commercial signs and logos are visible on the buildings, indicating the presence of businesses and offices. Traffic cones are placed on the road ahead, suggesting some form of roadwork or lane closure, guiding the vehicles to merge or change lanes. The road markings are clear, with white arrows indicating the direction of travel. Throughout the video, the vehicle maintains a steady speed, and the camera captures the gradual approach towards the intersection, where the road splits into different directions. The overall atmosphere is calm and orderly, typical of a city during non-peak hours. heavy rain, wet road with puddles
Parameters
seed: 4000, guidance: 3, depth: 0.5, edge: 0.1, seg: 0.35, vis: 0.0
Input Prompt
The video is a driving scene through a modern urban environment, likely captured from a dashcam or a similar fixed camera setup inside a vehicle. The scene unfolds on a wide, multi-lane road flanked by tall, modern buildings with glass facades. The road is relatively empty, with only a few cars visible, including a black car directly ahead of the camera, maintaining a steady pace. The camera remains static, providing a consistent view of the road and surroundings as the vehicle moves forward. On the left side of the road, there are several trees lining the sidewalk, providing a touch of greenery amidst the urban setting. Pedestrians are visible on the sidewalks, some walking leisurely, while others stand near the buildings. The buildings are a mix of architectural styles, with some featuring large glass windows and others having more traditional concrete exteriors. A few commercial signs and logos are visible on the buildings, indicating the presence of businesses and offices. Traffic cones are placed on the road ahead, suggesting some form of roadwork or lane closure, guiding the vehicles to merge or change lanes. The road markings are clear, with white arrows indicating the direction of travel. Throughout the video, the vehicle maintains a steady speed, and the camera captures the gradual approach towards the intersection, where the road splits into different directions. The overall atmosphere is calm and orderly, typical of a city during non-peak hours. night time, bright street lamps and colorful neon lights on buildings
Parameters
seed: 4000, guidance: 7, depth: 0.5, edge: 0.1, seg: 0.35, vis: 0.0
Input Prompt
Dashcam video, driving through a modern urban environment, sunset with beautiful clouds
Parameters
seed: 5000, guidance: 3, depth: 0.35, edge: 0.1, seg: 0.35, vis: 0.0
Input Prompt
Dashcam video, driving through a modern urban environment, heavy rain, wet road with puddles
Parameters
seed: 5000, guidance: 3, depth: 0.35, edge: 0.1, seg: 0.35, vis: 0.0
Input Prompt
Dashcam video, driving through a modern urban environment, winter with heavy snow storm, trees and sidewalks covered in snow
Parameters
seed: 8001, guidance: 3, depth: 0.35, edge: 0.15, seg: 0.35, vis: 0.15
Input Prompt
Dashcam video, driving through a modern urban environment, heavy rain, wet road with puddles
Parameters
seed: 5000, guidance: 6, depth: 0.6, edge: 0.1, seg: 0.4, vis: 0.1
Input Prompt
Dashcam video, driving through a modern urban environment, sunset with beautiful clouds
Parameters
seed: 7000, guidance: 3, depth: 0.35, edge: 0.1, seg: 0.35, vis: 0.05
Input Prompt
Dashcam video, driving through a modern urban environment, winter with heavy snow storm, trees and sidewalks covered in snow
Parameters
seed: 7001, guidance: 3, depth: 0.35, edge: 0.1, seg: 0.35, vis: 0.05
Input Prompt
Dashcam video, driving through a modern urban environment, twilight or early morning, partly cloudy
Parameters
seed: 5000, guidance: 3, depth: 0.4, edge: 0.1, seg: 0.5, vis: 0.1
Input Prompt
Dashcam video, driving through a modern urban environment, night time, bright street lamps lighting up the fog
Parameters
seed: 5000, guidance: 3, depth: 0.4, edge: 0.1, seg: 0.5, vis: 0.1
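When comparing gallery entries, it can help to see exactly which settings differ between two runs. The sketch below parses the "Parameters" lines as they are written on this page and reports the differing values. Both `parse_params` and `changed` are hypothetical helpers for this example and are not part of any Cosmos tooling.

```python
def parse_params(line):
    """Parse a line such as 'seed: 4000, guidance: 3, depth: 0.5' into a dict."""
    params = {}
    for item in line.split(","):
        key, sep, value = item.partition(":")
        if not sep:
            raise ValueError(f"expected 'name: value', got {item!r}")
        value = value.strip()
        # Values written with a decimal point become floats, others ints.
        params[key.strip()] = float(value) if "." in value else int(value)
    return params


def changed(a, b):
    """Return {name: (value_in_a, value_in_b)} for settings that differ."""
    names = sorted(set(a) | set(b))
    return {n: (a.get(n), b.get(n)) for n in names if a.get(n) != b.get(n)}


# The first two gallery entries (heavy rain and night time).
rain = parse_params("seed: 4000, guidance: 3, depth: 0.5, edge: 0.1, seg: 0.35, vis: 0.0")
night = parse_params("seed: 4000, guidance: 7, depth: 0.5, edge: 0.1, seg: 0.35, vis: 0.0")
print(changed(rain, night))  # {'guidance': (3, 7)}
```

For these two entries only the guidance value differs, so the remaining difference between the outputs comes from the prompt suffix.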

Quality Enhancements: Transfer 2.5 vs Transfer 1

Compared to Cosmos Transfer 1, Cosmos Transfer 2.5 offers significant improvements in both video quality and inference speed. The examples below show side-by-side comparisons, where each video transitions between Transfer 1 results and Transfer 2.5 results, illustrating the quality improvements achieved in the latest version.