Atla

company

Verified

https://www.atla-ai.com

Atla_AI

atla-ai

AI & ML interests

Scalable oversight

Recent Activity

kaikaidai updated a Space 8 days ago

AtlaAI/LLMsOnTrial

mbartolo authored a paper 8 days ago

Command A: An Enterprise-Ready Large Language Model

mathias-atla updated a dataset 13 days ago

AtlaAI/demo-chat-responses

Articles

Selene 1 Mini: the best small language model-as-a-judge

Jan 29

• 12

Judge Arena: Benchmarking LLMs as Evaluators

Nov 19, 2024

• 56

AtlaAI's activity

kaikaidai

updated a Space 8 days ago

LLMs on Trial

🌖

Ask questions and get expert answers from AI models

mbartolo

authored a paper 8 days ago

Command A: An Enterprise-Ready Large Language Model

Paper • 2504.00698 • Published 9 days ago • 23

mathias-atla

updated 2 datasets 13 days ago

AtlaAI/demo-chat-responses

Viewer • Updated 13 days ago • 50 • 52

AtlaAI/demo-chat-updated-prompt

Viewer • Updated 13 days ago • 50 • 35

mathias-atla

published a dataset 13 days ago

AtlaAI/demo-chat-updated-prompt

Viewer • Updated 13 days ago • 50 • 35

mathias-atla

updated a dataset 13 days ago

AtlaAI/demo-chat-original-prompt

Viewer • Updated 13 days ago • 50 • 34

mathias-atla

published a dataset 13 days ago

AtlaAI/demo-chat-original-prompt

Viewer • Updated 13 days ago • 50 • 34

kaikaidai

updated a collection 14 days ago

Selene 1

Collection

Our most powerful evaluation model, Selene 1, beats frontier models from leading labs. Try it for free here: https://www.atla-ai.com/sign-up • 2 items • Updated 14 days ago

kaikaidai

published a Space 14 days ago

LLMs on Trial

🌖

Ask questions and get expert answers from AI models

mathias-atla

published a dataset 15 days ago

AtlaAI/demo-chat-responses

Viewer • Updated 13 days ago • 50 • 52

MauriceBurg

in AtlaAI/Selene-1-Mini-Llama-3.1-8B 29 days ago

Limited Evaluation Capabilities

#4 opened 29 days ago by

h4rz3rk4s3

spisupat

in AtlaAI/Selene-1-Mini-Llama-3.1-8B about 1 month ago

Applying with Ragas/DeepEval evaluation

#1 opened 2 months ago by

tapos999

spisupat

updated a collection about 1 month ago

Selene 1

Collection

Our most powerful evaluation model, Selene 1, beats frontier models from leading labs. Try it for free here: https://www.atla-ai.com/sign-up • 2 items • Updated 14 days ago

spisupat

updated a Space about 1 month ago

Selene 1 Playground

🌍

Run evaluation tests with Selene and Selene-Mini models

inwaves

authored a paper 2 months ago

Atla Selene Mini: A General Purpose Evaluation Model

Paper • 2501.17195 • Published Jan 27 • 36

spisupat

authored a paper 2 months ago

Atla Selene Mini: A General Purpose Evaluation Model

Paper • 2501.17195 • Published Jan 27 • 36

MauriceBurg

authored a paper 2 months ago

Atla Selene Mini: A General Purpose Evaluation Model

Paper • 2501.17195 • Published Jan 27 • 36

kaikaidai

authored a paper 2 months ago

Atla Selene Mini: A General Purpose Evaluation Model

Paper • 2501.17195 • Published Jan 27 • 36

NinaCalvi

authored a paper 2 months ago

Atla Selene Mini: A General Purpose Evaluation Model

Paper • 2501.17195 • Published Jan 27 • 36

kaikaidai

posted an update 4 months ago

Post

1079

📈 Early results on the 8B evaluation model we've been training...

@NinaCalvi wrote about the progress we've made this quarter towards training the best 'LLM-as-a-judge' evaluator. We've significantly improved on the baseline and are approaching state-of-the-art evaluation performance with an 8B model.

Next up: training Llama-3.1-70B 👀

Here's the full article: https://www.atla-ai.com/post/evaluating-the-evaluator

2 replies

Team members 13
