metadata
title: README
emoji: π
colorFrom: pink
colorTo: indigo
sdk: static
pinned: false
Indic Language Benchmarking for Large Language Models
India is linguistically diverse, with more than 22 languages. This project benchmarks the performance of large language models on Indic languages across multiple datasets. The goal is to evaluate a model's abilities in understanding, generating, and processing text in these languages.
We currently cover 8 languages across 3 datasets, with more coming soon.
Languages
- Bengali (bn)
- Gujarati (gu)
- Hindi (hi)
- Kannada (kn)
- Malayalam (ml)
- Odia (or)
- Tamil (ta)
- Telugu (te)
Datasets
- ARC-Challenge: hi, bn, gu, kn, ml, or, ta, te
- TruthfulQA: hi, bn, gu, kn, ml, or, ta, te
- HellaSwag: hi, bn, gu, kn, ml, or, ta, te
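The coverage listed above can be written down as a small lookup table, which may be handy when scripting an evaluation run. This is only a sketch: the names `LANGUAGES`, `DATASETS`, and `evaluation_pairs` are hypothetical and not part of this project, and the dataset keys are the display names from this README rather than actual repository identifiers.

```python
# Language codes and names as listed in this README.
LANGUAGES = {
    "bn": "Bengali",
    "gu": "Gujarati",
    "hi": "Hindi",
    "kn": "Kannada",
    "ml": "Malayalam",
    "or": "Odia",
    "ta": "Tamil",
    "te": "Telugu",
}

# Dataset coverage as listed above (display names, not repository IDs).
DATASETS = {
    "ARC-Challenge": ["hi", "bn", "gu", "kn", "ml", "or", "ta", "te"],
    "TruthfulQA": ["hi", "bn", "gu", "kn", "ml", "or", "ta", "te"],
    "HellaSwag": ["hi", "bn", "gu", "kn", "ml", "or", "ta", "te"],
}


def evaluation_pairs(datasets=DATASETS):
    """Return every (dataset, language code) combination to evaluate."""
    return [(name, code) for name, codes in datasets.items() for code in codes]


if __name__ == "__main__":
    for dataset, code in evaluation_pairs():
        print(f"{dataset}: {LANGUAGES[code]} ({code})")
```

With the current coverage this enumerates 3 datasets times 8 languages, so 24 combinations; adding a language or dataset later only means extending the two dictionaries.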