Spaces: SWE-Gym/README

xingyaoww committed 22 days ago

Commit f6b86e1 (verified) · 1 Parent(s): 5d12649

Update README.md

Files changed (1)

README.md +4 -45

README.md CHANGED

@@ -26,7 +26,9 @@ Model/Data associated with Paper:
 </p>
 <p align="center">
-<a href="assets/paper.pdf">📃 Paper</a>
 •
 <a href="https://huggingface.co/SWE-Gym" >🤗 Data & Models</a>
 </p>
@@ -49,52 +51,9 @@ Our baselines achieve new open SOTA - 32%/26% on SWE-Bench Verified/Lite, with p
 ![SWE-Gym Scaling](https://github.com/SWE-Gym/SWE-Gym/raw/main/assets/images/scaling.jpg)
 *SWE-Gym enables scalable improvements for software engineering agents at both training and inference time. Our current results are primarily bottlenecked by training and inference compute, rather than the size of our environment.*
-## SWE-Gym Environment
-We create SWE-Gym, the first environment for training SWE agents, with **2.4K real tasks from 11 Python repos** & a Lite split of 234 instances. SWE-Gym combines real-world Python tasks, repository context, executable environments, and test verification to train agents for solving software engineering problems.
-![SWE-Gym Repo Distribution](https://github.com/SWE-Gym/SWE-Gym/raw/main/assets/images/swe-gym.jpg)
-## SWE-Gym trains LMs as agents
-Fine-tuning on fewer than 500 agent-environment interaction trajectories, sampled from SWE-Gym with GPT-4o and Claude 3.5 Sonnet, yields **+14%** absolute gains on SWE-Bench Verified with a 32B LM-powered OpenHands agent.
-![OpenHands Performance diff before and after training](https://github.com/SWE-Gym/SWE-Gym/raw/main/assets/images/oh-agent.jpg)
-## SWE-Gym enables self-improvement
-SWE-Gym is also effective across agent scaffolds. With rejection sampling fine-tuning and the MoatlessTools scaffold, our 32B and 7B models achieve 20% and 10%, respectively, on SWE-Bench Lite through self-improvement.
-<p align="center">
-  <img src="https://github.com/SWE-Gym/SWE-Gym/raw/main/assets/images/ml-agent.jpg" width="80%" alt="Moatless self-improvement">
-</p>
-## SWE-Gym enables inference-time scaling
-SWE-Gym enables inference-time scaling through verifiers trained on agent trajectories.
-These verifiers identify the most promising solutions via best-of-n selection. Together with our learned agents, they achieve 32%/26% on SWE-Bench Verified/Lite, a new open SoTA.
-![Inference Time Scaling for Moatless Agent](https://github.com/SWE-Gym/SWE-Gym/raw/main/assets/images/inference-ml.jpg)
-*Inference Time Scaling for Moatless Agent*
-![Inference Time Scaling for OpenHands Agent](https://github.com/SWE-Gym/SWE-Gym/raw/main/assets/images/inference-oh.jpg)
-*Inference Time Scaling for OpenHands Agent*
-## Our baselines on SWE-Gym show strong scaling trends
-Lastly, our ablations reveal strong scaling trends: performance is now bottlenecked by training and inference compute, rather than the size of our dataset. Pushing and improving these scaling trends further is an exciting direction for future work.
-![](https://github.com/SWE-Gym/SWE-Gym/raw/main/assets/images/scaling.jpg)
 ## Reproducing Results
-See [docs/OpenHands.md](docs/OpenHands.md) and [docs/MoatlessTools.md](docs/MoatlessTools.md) for instructions on reproducing our training and inference-time results for the OpenHands and MoatlessTools agents.
 ## 📚 Citation

 </p>
 <p align="center">
+<a href="https://github.com/SWE-Gym/SWE-Gym">💻 Code </a>
+•
+<a href="https://github.com/SWE-Gym/SWE-Gym/raw/main/assets/paper.pdf">📃 Paper</a>
 •
 <a href="https://huggingface.co/SWE-Gym" >🤗 Data & Models</a>
 </p>
 ![SWE-Gym Scaling](https://github.com/SWE-Gym/SWE-Gym/raw/main/assets/images/scaling.jpg)
 *SWE-Gym enables scalable improvements for software engineering agents at both training and inference time. Our current results are primarily bottlenecked by training and inference compute, rather than the size of our environment.*
 ## Reproducing Results
+See [docs/OpenHands.md](https://github.com/SWE-Gym/SWE-Gym/tree/main/docs/OpenHands.md) and [docs/MoatlessTools.md](https://github.com/SWE-Gym/SWE-Gym/tree/main/docs/MoatlessTools.md) for instructions on reproducing our training and inference-time results for the OpenHands and MoatlessTools agents.
 ## 📚 Citation