Posted 2026-03-16 | Updated 2026-07-09 | Paper-Reading

SWE Paper List

Paper List:

SWE-bench Goes Live! arXiv:2505.23419

SWE-Factory: Your Automated Factory for Issue Resolution Training Data and Evaluation Benchmarks arXiv:2506.10954

SWE-rebench V2: Language-Agnostic SWE Task Collection at Scale arXiv:2602.23866

SWE-Universe: Scale Real-World Verifiable Environments to Millions arXiv:2602.02361

DockSmith: Scaling Reliable Coding Environments via an Agentic Docker Builder arXiv:2602.00592

Multi-Docker-Eval: A ‘Shovel of the Gold Rush’ Benchmark on Automatic Environment Building for Software Engineering arXiv:2512.06915

Scaling Agentic Verifier for Competitive Coding arXiv:2602.04254

SWE-MiniSandbox: Container-Free Reinforcement Learning for Building Software Engineering Agents arXiv:2602.04254

Immersion in the GitHub Universe: Scaling Coding Agents to Mastery arXiv:2602.09892

https://blog.alanzeng.com/paper-reading/llm4code/

Author: Alan Zeng
