Open-Source Large Language Models Outperform Crowd Workers and Approach ChatGPT in Text-Annotation Tasks
Abstract
This study examines the performance of open-source Large Language Models (LLMs) in text-annotation tasks and compares it with proprietary models like ChatGPT and human-based services such as MTurk. While prior research has demonstrated ChatGPT's high performance across numerous NLP tasks, open-source LLMs like HugginChat and FLAN are gaining attention for their cost-effectiveness, transparency, reproducibility, and superior data protection. We assess these models using both zero-shot and few-shot approaches, with different temperature parameters, across a range of text-annotation tasks. Our findings show that while ChatGPT achieves the best performance in most tasks, open-source LLMs not only outperform MTurk but also demonstrate competitive potential against ChatGPT in specific tasks.
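The zero-shot versus few-shot comparison described in the abstract comes down to whether labeled examples are included in the annotation prompt. A minimal sketch of that prompt construction is below; the task wording, label set, and example format are illustrative assumptions, not the paper's actual prompts.

```python
def build_prompt(text, labels, examples=None):
    """Build an annotation prompt for an LLM.

    Zero-shot when `examples` is None; few-shot when it holds
    (text, label) pairs. Labels and phrasing are hypothetical.
    """
    lines = [
        "Classify the following text into one of these categories: "
        + ", ".join(labels) + "."
    ]
    # Few-shot mode: prepend labeled demonstrations before the target text.
    for ex_text, ex_label in (examples or []):
        lines.append(f'Text: "{ex_text}"\nLabel: {ex_label}')
    # The target item to annotate; the model completes the final label.
    lines.append(f'Text: "{text}"\nLabel:')
    return "\n\n".join(lines)


# Zero-shot: task instruction plus the item to annotate.
zero_shot = build_prompt("Taxes should be lower.", ["relevant", "irrelevant"])

# Few-shot: the same task with one labeled demonstration.
few_shot = build_prompt(
    "Taxes should be lower.",
    ["relevant", "irrelevant"],
    examples=[("The weather is nice today.", "irrelevant")],
)
```

The temperature parameter the study varies is not part of the prompt itself; it is passed to the model's sampling routine and controls output randomness (lower values make annotations more deterministic).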