Learning to (Learn at Test Time): RNNs with Expressive Hidden States Paper • 2407.04620 • Published Jul 5, 2024 • 27
Why Has Predicting Downstream Capabilities of Frontier AI Models with Scale Remained Elusive? Paper • 2406.04391 • Published Jun 6, 2024 • 7
Principled Federated Domain Adaptation: Gradient Projection and Auto-Weighting Paper • 2302.05049 • Published Feb 10, 2023
Crossing Linguistic Horizons: Finetuning and Comprehensive Evaluation of Vietnamese Large Language Models Paper • 2403.02715 • Published Mar 5, 2024 • 3
Scaling Laws for Downstream Task Performance of Large Language Models Paper • 2402.04177 • Published Feb 6, 2024 • 17
Are Emergent Abilities of Large Language Models a Mirage? Paper • 2304.15004 • Published Apr 28, 2023 • 6
Representation Engineering: A Top-Down Approach to AI Transparency Paper • 2310.01405 • Published Oct 2, 2023 • 5
Pairwise Ranking Losses of Click-Through Rates Prediction for Welfare Maximization in Ad Auctions Paper • 2306.01799 • Published Jun 1, 2023 • 1
Transforming and Combining Rewards for Aligning Large Language Models Paper • 2402.00742 • Published Feb 1, 2024 • 11
Beyond Scale: the Diversity Coefficient as a Data Quality Metric Demonstrates LLMs are Pre-trained on Formally Diverse Data Paper • 2306.13840 • Published Jun 24, 2023 • 11
DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models Paper • 2306.11698 • Published Jun 20, 2023 • 12