SWE Paper List

Paper List:

  • SWE-bench Goes Live! arXiv:2505.23419
  • SWE-Factory: Your Automated Factory for Issue Resolution Training Data and Evaluation Benchmarks arXiv:2506.10954
  • SWE-rebench V2: Language-Agnostic SWE Task Collection at Scale arXiv:2602.23866
  • SWE-Universe: Scale Real-World Verifiable Environments to Millions arXiv:2602.02361
  • DockSmith: Scaling Reliable Coding Environments via an Agentic Docker Builder arXiv:2602.00592
  • Multi-Docker-Eval: A ‘Shovel of the Gold Rush’ Benchmark on Automatic Environment Building for Software Engineering arXiv:2512.06915
  • Scaling Agentic Verifier for Competitive Coding arXiv:2602.04254
  • SWE-MiniSandbox: Container-Free Reinforcement Learning for Building Software Engineering Agents arXiv:2602.04254
  • Immersion in the GitHub Universe: Scaling Coding Agents to Mastery arXiv:2602.09892

SWE-bench Goes Live!

SWE-Factory: Your Automated Factory for Issue Resolution Training Data and Evaluation Benchmarks

SWE-rebench V2: Language-Agnostic SWE Task Collection at Scale

SWE-Universe: Scale Real-World Verifiable Environments to Millions

DockSmith: Scaling Reliable Coding Environments via an Agentic Docker Builder

Multi-Docker-Eval: A ‘Shovel of the Gold Rush’ Benchmark on Automatic Environment Building for Software Engineering

Scaling Agentic Verifier for Competitive Coding

SWE-MiniSandbox: Container-Free Reinforcement Learning for Building Software Engineering Agents

Immersion in the GitHub Universe: Scaling Coding Agents to Mastery

Author

Alan Zeng

Posted on

2026-03-16

Updated on

2026-03-19
