Welcome to cosmos-rl’s documentation!

cosmos-rl is fully compatible with PyTorch and is designed for large-scale, fault-tolerant distributed training.

Main Features

  • Natively Designed for Physical AI
    • Cosmos-RL supports training several physical AI paradigms, e.g., LLMs/VLMs, world foundation models, and VLAs.

    • Multiple Training Algorithms
      • Supports state-of-the-art LLM RL algorithms (e.g., GRPO, DAPO), RL algorithms for world foundation models (e.g., FlowGRPO, DDRL, DiffusionNFT), and VLA-specific algorithms; a minimal GRPO sketch follows this feature list.

      • The well-architected design is highly extensible: implementing a custom training algorithm requires only minimal configuration.

    • Diversified Model Support
      • For LLM/VLM:
        • Natively supports LLaMA/Qwen/Qwen-VL/Qwen3-MoE series models.

        • Compatible with all Hugging Face LLMs.

      • For world foundation models:
        • Natively supports SD3/Cosmos-Predict2.5/SANA.

        • Compatible with mainstream Hugging Face world foundation models built on diffusers.

      • For VLA (Vision-Language-Action):
        • Natively supports OpenVLA, OpenVLA-OFT, and PI0.5 series models.

        • Integrated with LIBERO and BEHAVIOR-1K simulators.

      • Easily extensible to other model architectures by customizing the model interface.

  • 6D Parallelism: Sequence, Tensor, Context, Pipeline, FSDP, DDP.

  • Elasticity & Fault Tolerance: a set of techniques that improve the robustness of distributed training.

  • Async RL
    • Flexible
      • Rollout and Policy are decoupled into independent processes/GPUs; see the sketch after this feature list.

      • No colocation of Rollout and Policy is required.

      • The number of Rollout/Policy instances can be scaled independently.

    • Fast
      • InfiniBand/NVLink interconnects are used for high-speed weight synchronization.

      • Policy training and Rollout weight synchronization run in parallel.

    • Robust
      • Supports AIPO for stable off-policy training.

      • Users can choose between the async and sync strategies.
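
For illustration, here is a minimal sketch of the GRPO objective named above: rewards are normalized within each rollout group to form advantages, which are combined with a PPO-style clipped importance ratio. This is plain PyTorch, not cosmos-rl's implementation; the function name, tensor shapes, and toy values are assumptions.

    # Illustrative GRPO objective in plain PyTorch -- not cosmos-rl's code.
    import torch

    def grpo_loss(logp_new, logp_old, rewards, clip_eps=0.2):
        """logp_new/logp_old: (G, T) per-token log-probs for a group of G
        rollouts; rewards: (G,) scalar reward per rollout (names assumed)."""
        # Group-relative advantage: normalize rewards within the rollout group.
        adv = (rewards - rewards.mean()) / (rewards.std() + 1e-8)
        adv = adv.unsqueeze(-1)                      # broadcast over tokens
        ratio = (logp_new - logp_old).exp()          # per-token importance ratio
        clipped = ratio.clamp(1 - clip_eps, 1 + clip_eps)
        # PPO-style pessimistic objective, averaged over tokens and rollouts.
        return -torch.min(ratio * adv, clipped * adv).mean()

    # Toy usage: a group of 4 rollouts, 6 tokens each.
    logp_old = torch.randn(4, 6)
    logp_new = logp_old + 0.05 * torch.randn(4, 6)
    rewards = torch.tensor([1.0, 0.0, 0.5, 0.0])
    print(grpo_loss(logp_new, logp_old, rewards))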

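The decoupling of Rollout and Policy can be pictured with a toy producer/consumer loop: a rollout worker generates batches with the freshest weights it has received, while the trainer consumes them and publishes updated weights without ever blocking generation. The sketch below uses Python threads and in-process queues purely for illustration; in cosmos-rl the two sides run as separate processes/GPUs with weight synchronization over NVLink/IB.

    # Conceptual sketch of decoupled rollout/training -- threads and queues
    # stand in for cosmos-rl's separate processes/GPUs and interconnect sync.
    import copy, queue, threading
    import torch
    import torch.nn as nn

    def rollout_worker(weights_q, sample_q, steps):
        actor = nn.Linear(8, 4)                    # stand-in rollout model
        for _ in range(steps):
            try:                                   # non-blocking weight refresh:
                actor.load_state_dict(weights_q.get_nowait())
            except queue.Empty:                    # rollout never waits on training
                pass
            obs = torch.randn(16, 8)
            with torch.no_grad():
                sample_q.put((obs, actor(obs)))    # ship a batch to the trainer

    policy = nn.Linear(8, 4)                       # stand-in policy model
    opt = torch.optim.AdamW(policy.parameters(), lr=1e-3)
    weights_q, sample_q = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=rollout_worker, args=(weights_q, sample_q, 10))
    worker.start()
    for step in range(10):
        obs, _ = sample_q.get()                    # consume rollouts as they arrive
        loss = policy(obs).pow(2).mean()           # placeholder objective
        opt.zero_grad()
        loss.backward()
        opt.step()
        # Publish fresh weights; the real sync overlaps training over IB/NVLink.
        weights_q.put(copy.deepcopy(policy.state_dict()))
    worker.join()
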
Note

6D parallelism is fully supported for the Policy model. For the Rollout model, only Tensor Parallelism and Pipeline Parallelism are supported.
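
As a concrete picture of how such parallel dimensions compose, the sketch below builds a 2-D mesh with PyTorch's DeviceMesh API, the kind of primitive that multi-dimensional parallelism is layered on. It is a generic PyTorch example rather than cosmos-rl's internal setup; the (2, 4) shape assumes 8 GPUs launched via torchrun, and the script name is assumed.

    # Generic PyTorch DeviceMesh sketch (not cosmos-rl internals): compose a
    # data-parallel dimension with a tensor-parallel dimension on 8 GPUs.
    # Launch with: torchrun --nproc-per-node 8 mesh_demo.py
    from torch.distributed.device_mesh import init_device_mesh

    # Outer "dp" dim shards data (FSDP/DDP); inner "tp" dim shards tensors.
    mesh = init_device_mesh("cuda", (2, 4), mesh_dim_names=("dp", "tp"))
    dp_mesh = mesh["dp"]  # 1-D sub-mesh: this rank's data-parallel group
    tp_mesh = mesh["tp"]  # 1-D sub-mesh: this rank's tensor-parallel group
    print(f"global rank {mesh.get_rank()}: "
          f"dp rank {dp_mesh.get_local_rank()}, tp rank {tp_mesh.get_local_rank()}")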

Contents

  • Parallelism

  • World Foundation Models