Orr Zohar PRO

orrzohar

https://orrzohar.github.io

AI & ML interests

Large Multi-Modal Models, Foundation Models, Video Understanding

orrzohar's activity

upvoted a paper about 5 hours ago

Temporal Preference Optimization for Long-Form Video Understanding

Paper • 2501.13919 • Published 1 day ago • 14

upvoted 2 papers 1 day ago

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published 2 days ago • 161

VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding

Paper • 2501.13106 • Published 2 days ago • 61

upvoted 3 papers 3 days ago

VideoWorld: Exploring Knowledge Learning from Unlabeled Videos

Paper • 2501.09781 • Published 8 days ago • 20

Evolving Deeper LLM Thinking

Paper • 2501.09891 • Published 8 days ago • 95

Multiple Choice Questions: Reasoning Makes Large Language Models (LLMs) More Self-Confident Even When They Are Wrong

Paper • 2501.09775 • Published 9 days ago • 26

upvoted 2 papers 8 days ago

Learnings from Scaling Visual Tokenizers for Reconstruction and Generation

Paper • 2501.09755 • Published 8 days ago • 33

Towards Best Practices for Open Datasets for LLM Training

Paper • 2501.08365 • Published 10 days ago • 47

upvoted 7 papers 9 days ago

A Multi-Modal AI Copilot for Single-Cell Analysis with Instruction Following

Paper • 2501.08187 • Published 10 days ago • 24

Omni-RGPT: Unifying Image and Video Region-level Understanding via Token Marks

Paper • 2501.08326 • Published 10 days ago • 31

Enhancing Automated Interpretability with Output-Centric Feature Descriptions

Paper • 2501.08319 • Published 10 days ago • 10

OpenCSG Chinese Corpus: A Series of High-quality Chinese Datasets for LLM Training

Paper • 2501.08197 • Published 10 days ago • 7

Potential and Perils of Large Language Models as Judges of Unstructured Textual Data

Paper • 2501.08167 • Published 10 days ago • 6

HALoGEN: Fantastic LLM Hallucinations and Where to Find Them

Paper • 2501.08292 • Published 10 days ago • 16

MiniMax-01: Scaling Foundation Models with Lightning Attention

Paper • 2501.08313 • Published 10 days ago • 268

upvoted 2 papers 10 days ago

BIOMEDICA: An Open Biomedical Image-Caption Archive, Dataset, and Vision-Language Models Derived from Scientific Literature

Paper • 2501.07171 • Published 12 days ago • 48

The Lessons of Developing Process Reward Models in Mathematical Reasoning

Paper • 2501.07301 • Published 11 days ago • 85

upvoted 2 papers 11 days ago

ReFocus: Visual Editing as a Chain of Thought for Structured Image Understanding

Paper • 2501.05452 • Published 15 days ago • 15

OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video Understanding?

Paper • 2501.05510 • Published 15 days ago • 37

upvoted a paper 14 days ago

SWE-Fixer: Training Open-Source LLMs for Effective and Efficient GitHub Issue Resolution

Paper • 2501.05040 • Published 16 days ago • 14