Open-Source Large Language Models Outperform Crowd Workers and Approach ChatGPT in Text-Annotation Tasks
Abstract
This study examines the performance of open-source Large Language Models (LLMs) in text-annotation tasks and compares it with proprietary models like ChatGPT and human-based services such as MTurk. While prior research has demonstrated ChatGPT's high performance across numerous NLP tasks, open-source LLMs like HugginChat and FLAN are gaining attention for their cost-effectiveness, transparency, reproducibility, and superior data protection. We assess these models using both zero-shot and few-shot approaches, with different temperature parameters, across a range of text-annotation tasks. Our findings show that while ChatGPT achieves the best performance in most tasks, open-source LLMs not only outperform MTurk but also demonstrate competitive potential against ChatGPT in specific tasks.
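The zero-shot versus few-shot comparison described in the abstract comes down to whether labeled examples are included in the annotation prompt. A minimal sketch of that prompt construction is below; the task wording, label set, and example format are illustrative assumptions, not the paper's actual prompts.

```python
def build_prompt(text, labels, examples=None):
    """Build an annotation prompt for an LLM.

    Zero-shot when `examples` is None; few-shot when it holds
    (text, label) pairs. Labels and phrasing are hypothetical.
    """
    lines = [
        "Classify the following text into one of these categories: "
        + ", ".join(labels) + "."
    ]
    # Few-shot mode: prepend labeled demonstrations before the target text.
    for ex_text, ex_label in (examples or []):
        lines.append(f'Text: "{ex_text}"\nLabel: {ex_label}')
    # The target item to annotate; the model completes the final label.
    lines.append(f'Text: "{text}"\nLabel:')
    return "\n\n".join(lines)


# Zero-shot: task instruction plus the item to annotate.
zero_shot = build_prompt("Taxes should be lower.", ["relevant", "irrelevant"])

# Few-shot: the same task with one labeled demonstration.
few_shot = build_prompt(
    "Taxes should be lower.",
    ["relevant", "irrelevant"],
    examples=[("The weather is nice today.", "irrelevant")],
)
```

The temperature parameter the study varies is not part of the prompt itself; it is passed to the model's sampling routine and controls output randomness (lower values make annotations more deterministic).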