PRefLexOR • Collection • PRefLexOR: Preference-based Recursive Language Modeling for Exploratory Optimization of Reasoning and Agentic Thinking • 4 items • Updated Oct 25 • 2
QVQ • Collection • QVQ: Qwen models for visual reasoning • 4 items • Updated about 24 hours ago • 17
Fine-Grained Verifiers: Preference Modeling as Next-token Prediction in Vision-Language Alignment • Paper • 2410.14148 • Published Oct 18 • 1