Tutorials

Build unlimited training data and train state-of-the-art SWE-agents. This guide covers the complete workflow: from environment setup to model training and evaluation.


📋 Prerequisites

System Requirements

  • Required: Docker
  • Tested on: Ubuntu 22.04.4 LTS
  • Not supported: Windows, macOS
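A quick way to check these requirements before starting is a small preflight script. The sketch below is illustrative only and is not part of SWE-smith: the function name `check_prerequisites` is hypothetical, and it assumes that "not supported: Windows, macOS" can be approximated by requiring `platform.system()` to report `Linux`.

```python
import platform
import shutil
from typing import List, Optional


def check_prerequisites(system: str, docker_path: Optional[str]) -> List[str]:
    """Return a list of problems; an empty list means the host looks usable.

    Hypothetical helper, not a SWE-smith API.
    """
    problems = []
    # Assumption: anything other than Linux is treated as unsupported.
    if system != "Linux":
        problems.append(f"unsupported OS: {system}")
    # shutil.which returns None when the executable is not on PATH.
    if docker_path is None:
        problems.append("docker executable not found on PATH")
    return problems


if __name__ == "__main__":
    issues = check_prerequisites(platform.system(), shutil.which("docker"))
    print("OK" if not issues else "\n".join(issues))
```

The check only confirms that a `docker` executable is on the PATH; it does not verify that the Docker daemon is running or that the current user may talk to it.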

New to SWE-smith? Start with Installation and Quickstart.


🎯 Quick Navigation

  • Build Environments


    Create reproducible Docker images for any repository. Capture dependencies, build containers, and validate with automated testing.

    Get started

  • Create Instances


    Generate task instances using LM prompts, procedural modifications, PR mirroring, or combined techniques. Scale to thousands of bugs.

    Generate bugs

  • Validate & Evaluate


    Keep only the candidates that actually break tests, and verify proposed solutions. Built-in harnesses cover both the validation and evaluation workflows.

    Run harnesses

  • Generate Issue Text


    Add natural language problem statements to task instances using LM generation or alternative methods.

    Create issues

  • Rate Difficulty · Optional


    Classify tasks as easy/medium/hard using a fine-tuned Qwen 2.5 Coder model. Compare against SWE-bench benchmarks.

    Assess difficulty

  • Train SWE-agents


    Run the complete rejection sampling fine-tuning (RSFT) pipeline: generate trajectories, keep the ones that solve their task, fine-tune models, and evaluate on SWE-bench.

    Start training
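Two of the steps above are filters, and both can be pictured with a few lines of Python. This is a minimal sketch under stated assumptions, not SWE-smith's implementation: the names `broken_tests`, `Trajectory`, and `keep_resolved` are hypothetical, test results are assumed to be a mapping from test name to pass/fail, and a trajectory is assumed to carry a boolean `resolved` flag.

```python
from dataclasses import dataclass
from typing import Dict, List


def broken_tests(before: Dict[str, bool], after: Dict[str, bool]) -> List[str]:
    """Tests that passed before the candidate change and no longer pass after it.

    Assumption: a test missing from `after` is counted as failing.
    """
    return sorted(
        name for name, passed in before.items()
        if passed and not after.get(name, False)
    )


def is_valid_candidate(before: Dict[str, bool], after: Dict[str, bool]) -> bool:
    """A candidate bug is kept only if it breaks at least one passing test."""
    return len(broken_tests(before, after)) > 0


@dataclass
class Trajectory:
    """Hypothetical record of one agent attempt at one task instance."""
    instance_id: str
    resolved: bool


def keep_resolved(trajectories: List[Trajectory]) -> List[Trajectory]:
    """Rejection sampling step: discard attempts that did not solve the task."""
    return [t for t in trajectories if t.resolved]
```

For example, a candidate that turns `test_parse` from passing to failing is kept by `is_valid_candidate`, while a candidate that leaves every test result unchanged is dropped. The surviving trajectories from `keep_resolved` would then form the fine-tuning set.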


graph LR
    A[Build Environments] --> B[Create Instances]
    B --> C[Validate & Evaluate]
    C --> D[Generate Issue Text]
    D --> E[Rate Difficulty]
    D --> F[Train SWE-agents]
    E --> F

  1. Build Environments → Set up Docker images
  2. Create Instances → Generate synthetic bugs
  3. Validate & Evaluate → Filter valid task instances
  4. Generate Issue Text → Add problem descriptions
  5. Rate Difficulty (optional) → Classify task complexity
  6. Train SWE-agents → Fine-tune models with RSFT
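The stage ordering can also be expressed as a dependency graph and resolved programmatically, which makes the optional status of difficulty rating explicit. The sketch below is an illustration rather than anything SWE-smith ships: `DEPENDS_ON` and `stage_order` are hypothetical names, and the edges are copied from the graph above. It uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9+).

```python
from graphlib import TopologicalSorter

# Each stage maps to the set of stages it depends on (edges from the graph above).
DEPENDS_ON = {
    "Build Environments": set(),
    "Create Instances": {"Build Environments"},
    "Validate & Evaluate": {"Create Instances"},
    "Generate Issue Text": {"Validate & Evaluate"},
    "Rate Difficulty": {"Generate Issue Text"},
    "Train SWE-agents": {"Generate Issue Text", "Rate Difficulty"},
}

OPTIONAL_STAGE = "Rate Difficulty"


def stage_order(skip_optional: bool = False) -> list:
    """Return the stages in an order that respects every dependency."""
    # Copy the sets so the module-level graph is never mutated.
    deps = {stage: set(pre) for stage, pre in DEPENDS_ON.items()}
    if skip_optional:
        # Drop the optional stage and every edge that points at it.
        deps.pop(OPTIONAL_STAGE)
        for pre in deps.values():
            pre.discard(OPTIONAL_STAGE)
    return list(TopologicalSorter(deps).static_order())


if __name__ == "__main__":
    for number, stage in enumerate(stage_order(), start=1):
        print(f"{number}. {stage}")
```

Calling `stage_order(skip_optional=True)` yields the same sequence without the difficulty rating step, which corresponds to the direct edge from issue generation to training.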