MIRAGE-Bench: Automatic Multilingual Benchmark Arena for Retrieval-Augmented Generation Systems Paper • 2410.13716 • Published Oct 17, 2024
Ragnarök: A Reusable RAG Framework and Baselines for TREC 2024 Retrieval-Augmented Generation Track Paper • 2406.16828 • Published Jun 24, 2024
Post 3317 🦢 The SWIM-IR dataset contains 29 million text-retrieval training pairs across 27 diverse languages. It is one of the largest synthetic multilingual datasets generated using PaLM 2 on Wikipedia! 🔥🔥 The SWIM-IR dataset contains three subsets:
- Cross-lingual: nthakur/swim-ir-cross-lingual
- Monolingual: nthakur/swim-ir-monolingual
- Indic Cross-lingual: nthakur/indic-swim-ir-cross-lingual
Check it out: https://huggingface.co/collections/nthakur/swim-ir-dataset-662ddaecfc20896bf14dd9b7
Resonance RoPE: Improving Context Length Generalization of Large Language Models Paper • 2403.00071 • Published Feb 29, 2024 • 22
Spacerini: Plug-and-play Search Engines with Pyserini and Hugging Face Paper • 2302.14534 • Published Feb 28, 2023
MasakhaNER 2.0: Africa-centric Transfer Learning for Named Entity Recognition Paper • 2210.12391 • Published Oct 22, 2022
Making a MIRACL: Multilingual Information Retrieval Across a Continuum of Languages Paper • 2210.09984 • Published Oct 18, 2022 • 2
AfriQA: Cross-lingual Open-Retrieval Question Answering for African Languages Paper • 2305.06897 • Published May 11, 2023 • 8
GAIA Search: Hugging Face and Pyserini Interoperability for NLP Training Data Exploration Paper • 2306.01481 • Published Jun 2, 2023 • 1
NoMIRACL: Knowing When You Don't Know for Robust Multilingual Retrieval-Augmented Generation Paper • 2312.11361 • Published Dec 18, 2023 • 1
Resources for Brewing BEIR: Reproducible Reference Models and an Official Leaderboard Paper • 2306.07471 • Published Jun 13, 2023
HAGRID: A Human-LLM Collaborative Dataset for Generative Information-Seeking with Attribution Paper • 2307.16883 • Published Jul 31, 2023