Is GAIA2 only applicable to LLM evaluation, not agent scaffold evaluation?

by pseudotensor - opened about 1 month ago

I don't see a mechanism to support alternative agents, only LLMs.

Meta Agents Research Environments org about 1 month ago

To keep results comparable when running the benchmark and submitting here, we recommend using the base ReAct agent implementation provided with ARE. For your own experimentation you can swap in a different agent, but it is a bit harder than just changing the LLM source, because there is no standard for that yet. You can check the BaseAgent class and the pointers on how to extend it here: https://facebookresearch.github.io/meta-agents-research-environments/api_reference/agents.html. Alternatively, go deeper and implement this class: https://github.com/facebookresearch/meta-agents-research-environments/blob/main/are/simulation/agents/are_simulation_agent.py, then see how the agent builder chooses the agent: https://github.com/facebookresearch/meta-agents-research-environments/blob/main/are/simulation/agents/agent_builder.py#L22
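To make the general idea concrete, here is a rough, self-contained sketch of the pattern described above: a base agent interface, an alternative scaffold that subclasses it, and a builder that picks the agent by name. Every name in it (`SketchAgent`, `SingleCallAgent`, `PlanThenActAgent`, `AGENT_REGISTRY`, `build_agent`) is hypothetical and chosen for illustration only. None of it is ARE's actual API, so check the linked BaseAgent docs and `agent_builder.py` for the real signatures.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, Type

# NOTE: all names here are hypothetical and illustrative; this is NOT ARE's API.
# An "LLM" is modeled as a plain function from prompt text to completion text.
LLM = Callable[[str], str]


class SketchAgent(ABC):
    """Stand-in for a base agent interface (assumed shape, not ARE's BaseAgent)."""

    def __init__(self, llm: LLM) -> None:
        self.llm = llm

    @abstractmethod
    def run(self, task: str) -> str:
        """Solve one task and return the final answer."""


class SingleCallAgent(SketchAgent):
    """Baseline scaffold: forward the task to the LLM in a single call."""

    def run(self, task: str) -> str:
        return self.llm(task)


class PlanThenActAgent(SketchAgent):
    """Alternative scaffold: ask for a plan first, then answer using that plan."""

    def run(self, task: str) -> str:
        plan = self.llm(f"Plan: {task}")
        return self.llm(f"Task: {task}\nPlan: {plan}\nAnswer:")


# The builder maps a configuration name to an agent class.
AGENT_REGISTRY: Dict[str, Type[SketchAgent]] = {
    "single_call": SingleCallAgent,
    "plan_then_act": PlanThenActAgent,
}


def build_agent(name: str, llm: LLM) -> SketchAgent:
    """Instantiate the agent registered under `name`, or fail with a clear error."""
    try:
        agent_cls = AGENT_REGISTRY[name]
    except KeyError:
        known = sorted(AGENT_REGISTRY)
        raise ValueError(f"Unknown agent {name!r}; known agents: {known}") from None
    return agent_cls(llm)
```

With this shape, the same LLM can be evaluated under two scaffolds by changing only the name passed to `build_agent`, which is the kind of separation an agent-scaffold comparison needs.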

You might want to move this discussion to the GitHub repo, where it might get more visibility.
