- SocialIQA: Commonsense Reasoning about Social Interactions
  Paper • 1904.09728 • Published • 2
- PIQA: Reasoning about Physical Commonsense in Natural Language
  Paper • 1911.11641 • Published • 2
- BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions
  Paper • 1905.10044 • Published • 1
- HellaSwag: Can a Machine Really Finish Your Sentence?
  Paper • 1905.07830 • Published • 4
Collections including paper arxiv:1911.11641
- The Curious Case of Neural Text Degeneration
  Paper • 1904.09751 • Published • 3
- Getting it Right: Improving Spatial Consistency in Text-to-Image Models
  Paper • 2404.01197 • Published • 30
- BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions
  Paper • 1905.10044 • Published • 1
- PIQA: Reasoning about Physical Commonsense in Natural Language
  Paper • 1911.11641 • Published • 2
- The Curious Case of Neural Text Degeneration
  Paper • 1904.09751 • Published • 3
- PIQA: Reasoning about Physical Commonsense in Natural Language
  Paper • 1911.11641 • Published • 2
- SocialIQA: Commonsense Reasoning about Social Interactions
  Paper • 1904.09728 • Published • 2
- HellaSwag: Can a Machine Really Finish Your Sentence?
  Paper • 1905.07830 • Published • 4
- Can large language models explore in-context?
  Paper • 2403.15371 • Published • 32
- Long-context LLMs Struggle with Long In-context Learning
  Paper • 2404.02060 • Published • 35
- PIQA: Reasoning about Physical Commonsense in Natural Language
  Paper • 1911.11641 • Published • 2
- AQuA: A Benchmarking Tool for Label Quality Assessment
  Paper • 2306.09467 • Published • 1
- Can large language models explore in-context?
  Paper • 2403.15371 • Published • 32
- GaussianCube: Structuring Gaussian Splatting using Optimal Transport for 3D Generative Modeling
  Paper • 2403.19655 • Published • 18
- WavLLM: Towards Robust and Adaptive Speech Large Language Model
  Paper • 2404.00656 • Published • 10
- Enabling Memory Safety of C Programs using LLMs
  Paper • 2404.01096 • Published • 1
- Gemini: A Family of Highly Capable Multimodal Models
  Paper • 2312.11805 • Published • 45
- Measuring Massive Multitask Language Understanding
  Paper • 2009.03300 • Published • 3
- HellaSwag: Can a Machine Really Finish Your Sentence?
  Paper • 1905.07830 • Published • 4
- PIQA: Reasoning about Physical Commonsense in Natural Language
  Paper • 1911.11641 • Published • 2