OpenVLThinker: An Early Exploration to Complex Vision-Language Reasoning via Iterative Self-Improvement Paper • 2503.17352 • Published 13 days ago • 21
AI Risk Categorization Decoded (AIR 2024): From Government Regulations to Corporate Policies Paper • 2406.17864 • Published Jun 25, 2024
DuoGuard: A Two-Player RL-Driven Framework for Multilingual LLM Guardrails Paper • 2502.05163 • Published Feb 7 • 22
Flow-DPO: Improving LLM Mathematical Reasoning through Online Multi-Agent Learning Paper • 2410.22304 • Published Oct 29, 2024 • 18
CleanCLIP: Mitigating Data Poisoning Attacks in Multimodal Contrastive Learning Paper • 2303.03323 • Published Mar 6, 2023 • 1
Unsupervised Learning of Neural Networks to Explain Neural Networks Paper • 1805.07468 • Published May 18, 2018
Data Distillation Can Be Like Vodka: Distilling More Times For Better Quality Paper • 2310.06982 • Published Oct 10, 2023
Robust Learning with Progressive Data Expansion Against Spurious Correlation Paper • 2306.04949 • Published Jun 8, 2023
SmallToLarge (S2L): Scalable Data Selection for Fine-tuning Large Language Models by Summarizing Training Trajectories of Small Models Paper • 2403.07384 • Published Mar 12, 2024 • 1
AIR-Bench 2024: A Safety Benchmark Based on Risk Categories from Regulations and Policies Paper • 2407.17436 • Published Jul 11, 2024
SecCodePLT: A Unified Platform for Evaluating the Security of Code GenAI Paper • 2410.11096 • Published Oct 14, 2024 • 13
Enhancing Large Vision Language Models with Self-Training on Image Comprehension Paper • 2405.19716 • Published May 30, 2024
MIRAI: Evaluating LLM Agents for Event Forecasting Paper • 2407.01231 • Published Jul 1, 2024 • 18
Post: Check out our new benchmark paper on LLM agents for global events forecasting! MIRAI: Evaluating LLM Agents for Event Forecasting (2407.01231)
arXiv: https://arxiv.org/abs/2407.01231
Project page: https://mirai-llm.github.io
GitHub Repo: https://github.com/yecchen/MIRAI
Dataset: https://drive.google.com/file/d/1xmSEHZ_wqtBu1AwLpJ8wCDYmT-jRpfrN/view?usp=sharing
Interactive Demo Notebook: https://colab.research.google.com/drive/1QyqT35n6NbtPaNtqQ6A7ILG_GMeRgdnO?usp=sharing
Mixture-of-Agents Enhances Large Language Model Capabilities Paper • 2406.04692 • Published Jun 7, 2024 • 59
Introducing v0.5 of the AI Safety Benchmark from MLCommons Paper • 2404.12241 • Published Apr 18, 2024 • 11
RigorLLM: Resilient Guardrails for Large Language Models against Undesired Content Paper • 2403.13031 • Published Mar 19, 2024 • 1