SWE-RL: Reinforcement Learning for LLMs on Software Evolution

Author: Neural Intelligence Network
Published: Sat 15 Mar 2025
Episode Link: https://podcasters.spotify.com/pod/show/neuralintelpod/episodes/SWE-RL-Reinforcement-Learning-for-LLMs-on-Software-Evolution-e2vn4g5

This paper introduces SWE-RL, a reinforcement learning (RL) method that improves large language models (LLMs) on software engineering tasks using software evolution data and rule-based rewards. The approach trains LLMs to learn autonomously from the lifecycle of open-source software, including code snapshots, code changes, and events. The resulting model, Llama3-SWE-RL-70B, achieves state-of-the-art performance among medium-sized models on SWE-bench Verified, a benchmark for solving real-world GitHub issues. Surprisingly, training with SWE-RL on software evolution data also strengthens the LLM's general reasoning skills, improving performance on out-of-domain tasks such as math and code generation. This highlights the potential of RL on software engineering data to improve LLM reasoning. The paper also introduces Agentless Mini, a framework that prioritizes straightforward component decomposition, parallelization, and scalability. Ultimately, this research paves the way for more powerful and reliable LLMs for software engineering.
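
The summary above does not spell out what a rule-based reward looks like. As a minimal sketch, assume the reward is a text-similarity score between the patch the model generates and the patch the developers actually merged, with a fixed penalty when the model produces no usable patch. The function name `rule_based_reward` and the penalty value are illustrative assumptions, not details taken from this episode description.

```python
from difflib import SequenceMatcher
from typing import Optional


def rule_based_reward(predicted_patch: Optional[str], oracle_patch: str) -> float:
    """Score a generated patch against the reference (oracle) patch.

    Assumption for illustration: unusable output gets a fixed penalty of -1.0;
    otherwise the reward is a similarity ratio in the range [0.0, 1.0].
    """
    # No patch, or only whitespace: treat as a formatting failure.
    if predicted_patch is None or not predicted_patch.strip():
        return -1.0
    # SequenceMatcher.ratio() returns 1.0 for identical strings and
    # 0.0 when the strings share no matching blocks.
    return SequenceMatcher(None, predicted_patch, oracle_patch).ratio()


if __name__ == "__main__":
    oracle = "- return a - b\n+ return a + b\n"
    print(rule_based_reward(oracle, oracle))                 # exact match
    print(rule_based_reward("- return a\n+ return b\n", oracle))  # partial match
    print(rule_based_reward("", oracle))                     # unusable output
```

A reward like this needs no test execution or learned reward model, which is what makes it cheap to compute over large amounts of software history.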
