---
title: Agent Papers
emoji: 🥇
colorFrom: green
colorTo: indigo
sdk: gradio
app_file: app.py
pinned: true
license: apache-2.0
short_description: LLM Agent Research Collection
sdk_version: 5.19.0
---
# Large Language Model Agent Papers Explorer
This is a companion application for the paper "Large Language Model Agent: A Survey on Methodology, Applications and Challenges" ([arXiv:2503.21460](https://arxiv.org/abs/2503.21460)).
## About
The application provides an interactive interface to explore papers from our comprehensive survey on Large Language Model (LLM) agents. It allows you to search and filter papers across key categories including agent construction, collaboration mechanisms, evolution, tools, security, benchmarks, and applications.

## Key Features
- **Paper Search**: Find papers by keywords, titles, summaries, or publication venues
- **Category Filtering**: Browse papers by sections/categories
- **Year Filtering**: Filter papers by publication year
- **Sorting Options**: Sort papers by year, title, or section
- **Paper Statistics**: View distributions of papers across categories and years
- **Direct Links**: Access original papers through direct links to their sources
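
Search, filtering, and sorting of this kind can be sketched with pandas. The snippet below is illustrative only: the `search_papers` helper, column names, and sample records are assumptions for the sketch, not the app's actual code.

```python
import pandas as pd

# Hypothetical paper records; the real app loads its own dataset.
papers = pd.DataFrame([
    {"title": "Toolformer", "section": "Tools", "year": 2023,
     "summary": "LLMs learn to use external tools"},
    {"title": "Agent Survey", "section": "Introduction", "year": 2025,
     "summary": "survey of LLM agent methodology"},
])

def search_papers(df, keyword="", section=None, year=None, sort_by="year"):
    """Keyword-match titles and summaries, optionally filter by
    section and year, then sort by the requested column."""
    mask = (
        df["title"].str.contains(keyword, case=False)
        | df["summary"].str.contains(keyword, case=False)
    )
    out = df[mask]
    if section is not None:
        out = out[out["section"] == section]
    if year is not None:
        out = out[out["year"] == year]
    return out.sort_values(sort_by)

print(search_papers(papers, keyword="tool"))
```

In the actual Space, a function like this would be wired to Gradio inputs (a textbox for the keyword, dropdowns for section and year) and its output rendered as the results table.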
## Collection Overview
Our paper collection spans multiple categories:
- **Introduction**: Survey papers and foundational works introducing LLM agents
- **Construction**: Papers on building and designing agents
- **Collaboration**: Multi-agent systems and communication methods
- **Evolution**: Learning and improvement of agents over time
- **Tools**: Integration of external tools with LLM agents
- **Security**: Safety, alignment, and ethical considerations
- **Datasets & Benchmarks**: Evaluation frameworks and resources
- **Applications**: Domain-specific uses in science, medicine, etc.
## Related Resources
- [Full Survey Paper on arXiv](https://arxiv.org/abs/2503.21460)
- [Awesome-Agent-Papers GitHub Repository](https://github.com/luo-junyu/Awesome-Agent-Papers)
## How to Contribute
If you have a paper that you believe should be included in our collection:
1. Check if the paper is already in our database
2. Submit your paper at [this form](https://forms.office.com/r/sW0Zzymi5b) or email us at [email protected]
3. Include the paper's title, authors, abstract, URL, publication venue, and year
4. Suggest a section/category for the paper
## Citation
If you find our survey helpful, please consider citing our work:
```bibtex
@article{agentsurvey2025,
  title={Large Language Model Agent: A Survey on Methodology, Applications and Challenges},
  author={Junyu Luo and Weizhi Zhang and Ye Yuan and Yusheng Zhao and Junwei Yang and Yiyang Gu and Bohan Wu and Binqi Chen and Ziyue Qiao and Qingqing Long and Rongcheng Tu and Xiao Luo and Wei Ju and Zhiping Xiao and Yifan Wang and Meng Xiao and Chenwu Liu and Jingyang Yuan and Shichang Zhang and Yiqiao Jin and Fan Zhang and Xian Wu and Hanqing Zhao and Dacheng Tao and Philip S. Yu and Ming Zhang},
  journal={arXiv preprint arXiv:2503.21460},
  year={2025}
}
```
## Local Development
To run this application locally:
1. Clone this repository
2. Install the required dependencies with `pip install -r requirements.txt`
3. Run the application with `python app.py`
## License
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
## Start the configuration
Most of the variables to change for a default leaderboard are in `src/env.py` (replace the path for your leaderboard) and `src/about.py` (for tasks).
Results files should have the following format and be stored as json files:
```json
{
    "config": {
        "model_dtype": "torch.float16", # or torch.bfloat16 or 8bit or 4bit
        "model_name": "path of the model on the hub: org/model",
        "model_sha": "revision on the hub"
    },
    "results": {
        "task_name": {
            "metric_name": score
        },
        "task_name2": {
            "metric_name": score
        }
    }
}
```
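
A results file in the shape above can be flattened into a single leaderboard row roughly as follows. This is a sketch, not the Space's actual parsing code: the `results_to_row` helper and the output column names are assumptions.

```python
import json

def results_to_row(path):
    """Flatten one results file (shape shown above) into a flat dict
    suitable for a dataframe row."""
    with open(path) as f:
        data = json.load(f)
    row = {
        "model": data["config"]["model_name"],
        "revision": data["config"]["model_sha"],
        "precision": data["config"]["model_dtype"],
    }
    # One column per task/metric pair, e.g. "task_name/metric_name".
    for task, metrics in data["results"].items():
        for metric, score in metrics.items():
            row[f"{task}/{metric}"] = score
    return row
```

A list of such rows can then be passed to `pandas.DataFrame` to build the table the leaderboard displays.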
Request files are created automatically by this tool.
If you encounter a problem on the Space, don't hesitate to restart it to remove the created `eval-queue`, `eval-queue-bk`, `eval-results`, and `eval-results-bk` folders.
## Code logic for more complex edits
You'll find:
- the main table's column names and properties in `src/display/utils.py`
- the logic to read all results and request files, then convert them into dataframe lines, in `src/leaderboard/read_evals.py` and `src/populate.py`
- the logic to allow or filter submissions in `src/submission/submit.py` and `src/submission/check_validity.py`