Welcome to Adyen AI 🚀
At Adyen, we power seamless payments for businesses worldwide. As a global financial technology company, we provide an end-to-end platform that enables businesses to accept payments across online, mobile, and in-store channels. Our technology is trusted by leading brands like Spotify, Uber, and eBay to optimize payment operations and enhance customer experiences.
On this page, you’ll find our open-source AI projects—built to drive innovation in payments, machine learning, and financial technology. We’re excited to share our research, tools, and models with the community. Contributions and collaborations are always welcome!
💡 Explore our projects and join us in shaping the future of AI in fintech.
The Data Agent Benchmark for Multi-step Reasoning (DABstep) is a benchmark developed in collaboration with Hugging Face 🤗 to evaluate the capabilities of LLM agents on real-world data analysis tasks.
We’ve curated new datasets and 450 tasks derived from real-world challenges at Adyen and packaged them in an easy-to-use evaluation framework for agentic research. The benchmark aims to drive advances in AI-powered data analysis by providing a structured way to measure and improve multi-step reasoning in LLM agents.