EvalLM: Interactive Evaluation of Large Language Model Prompts on User-Defined Criteria Paper • 2309.13633 • Published Sep 24, 2023
Prometheus: Inducing Fine-grained Evaluation Capability in Language Models Paper • 2310.08491 • Published Oct 12, 2023
Aligning Large Language Models through Synthetic Feedback Paper • 2305.13735 • Published May 23, 2023
The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning Paper • 2305.14045 • Published May 23, 2023
Who Wrote this Code? Watermarking for Code Generation Paper • 2305.15060 • Published May 24, 2023
Dialogue Summaries as Dialogue States (DS2), Template-Guided Summarization for Few-shot Dialogue State Tracking Paper • 2203.01552 • Published Mar 3, 2022
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models Paper • 2405.01535 • Published May 2, 2024
The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models Paper • 2406.05761 • Published Jun 9, 2024
Scaling Evaluation-time Compute with Reasoning Models as Process Evaluators Paper • 2503.19877 • Published Mar 2025
LLM-as-an-Interviewer: Beyond Static Testing Through Dynamic LLM Evaluation Paper • 2412.10424 • Published Dec 10, 2024
Bridging the Data Provenance Gap Across Text, Speech and Video Paper • 2412.17847 • Published Dec 19, 2024
Evaluating Language Models as Synthetic Data Generators Paper • 2412.03679 • Published Dec 4, 2024
MM-Eval: A Multilingual Meta-Evaluation Benchmark for LLM-as-a-Judge and Reward Models Paper • 2410.17578 • Published Oct 23, 2024