Additional Examples from Cosmos Model Repos

This page links to inference and post-training examples in the official Cosmos model repositories. These examples complement the cookbook's end-to-end tutorials with detailed guides to model usage and customization.

Cosmos Predict

Cosmos Predict 2.5 (Latest)

For the latest Cosmos Predict 2.5 model documentation, visit the Cosmos Predict 2.5 Repository.

Inference with Pre-Trained Cosmos Predict 2.5 Models

Inference Guide: Video generation with Text2World, Image2World, and Video2World capabilities
Auto Multiview Inference Guide: Multi-camera view generation for autonomous vehicle applications

Post-Training with Cosmos Predict 2.5 Models

Video2World Post-Training for DreamGen Bench: Humanoid robot trajectory generation using the DreamGen benchmark

Cosmos Predict 2

Inference with Pre-Trained Cosmos Predict 2 Models

Text2Image Inference: Generating high-quality images from text prompts
Video2World Inference: Generating videos from images/videos with text prompts (single/batch processing, multi-frame conditioning, multi-GPU inference, prompt refiner, rejection sampling)
Text2World Inference: Generating videos directly from text prompts (single/batch processing, multi-GPU inference)

Post-Training with Cosmos Predict 2 Models

Video2World Post-Training Guide: General guide to the Video2World training system
Video2World Post-Training on Cosmos-NeMo-Assets: Post-training on Cosmos-NeMo-Assets data
Video2World Post-Training on Fisheye-View AgiBotWorld-Alpha Dataset: Post-training on fisheye-view robot videos from the AgiBotWorld-Alpha dataset
Video2World Post-Training on GR00T Dreams GR1 and DROID Datasets: Post-training on GR00T Dreams GR1 and DROID datasets
Video2World Action-Conditioned Post-Training on Bridge Dataset: Action-conditioned post-training on the Bridge dataset
Text2Image Post-Training Guide: General guide to the Text2Image training system
Text2Image Post-Training on Cosmos-NeMo-Assets: Post-training on Cosmos-NeMo-Assets image data

Cosmos Transfer

Cosmos Transfer 2.5 (Latest)

For the latest Cosmos Transfer 2.5 model documentation, visit the Cosmos Transfer 2.5 Repository.

Inference with Pre-Trained Cosmos Transfer 2.5 Models

Inference Guide: Multi-control video generation with depth, segmentation, LiDAR, and HDMap conditioning
Auto Multiview Inference Guide: Multi-camera view generation for autonomous vehicle applications

Post-Training with Cosmos Transfer 2.5 Models

Post-Training Guide: General guide for custom control modalities and domain adaptation
Auto Multiview Post-Training for HDMap: Multi-view autonomous driving scenarios with HDMap control

Cosmos Transfer 1

Inference with Pre-Trained Cosmos Transfer 1 Models

Cosmos-Transfer1-7B Inference: Inference with multi-GPU support
Cosmos-Transfer1-7B-Sample-AV Inference: Autonomous-vehicle inference with multi-GPU support
Cosmos-Transfer1-7B-4KUpscaler Inference: 4K upscaling with multi-GPU support
Cosmos-Transfer1-7B Inference (Depth): Depth-based control
Cosmos-Transfer1-7B Inference (Segmentation): Segmentation-based control
Cosmos-Transfer1-7B Inference (Edge): Edge-based control
Cosmos-Transfer1-7B Inference (Vis): Visual-based control
Cosmos-Transfer1pt1-7B Inference (Keypoint): Keypoint-based control
Cosmos-Transfer1-7B-Sample-AV-Multiview Inference: Multi-view generation

Post-Training with Cosmos Transfer 1 Models

Cosmos-Transfer1-7B Post-Training: Depth, Edge, Keypoint, Segmentation, and Vis controls with multi-GPU support
Cosmos-Transfer1-7B-Sample-AV Post-Training: LiDAR and HDMap controls with multi-GPU support
Cosmos-Transfer1-7B-Sample-AV-Multiview Post-Training: Multi-view LiDAR and HDMap controls with multi-GPU support

Pre-Training Cosmos Transfer 1 Models from Scratch

Cosmos-Transfer1-7B Pre-Training: Depth, Edge, Keypoint, Segmentation, and Vis controls with multi-GPU support
Cosmos-Transfer1-7B-Sample-AV Pre-Training: LiDAR and HDMap controls with multi-GPU support
Cosmos-Transfer1-7B-Sample-AV-Multiview Pre-Training: Multi-view LiDAR and HDMap controls with multi-GPU support

Cosmos Reason 1

For the latest Cosmos Reason 1 model documentation, visit the Cosmos Reason 1 Repository.

Post-Training with Cosmos Reason 1 Models

Cosmos Reason 1 Post-Training Example: Complete post-training guide for vision-language reasoning tasks