Xuankun Rong

XuankunRong

https://xuankunrong.github.io/

AI & ML interests

Federated Learning, Continual Learning

Organizations

None yet

XuankunRong's activity

upvoted a paper about 7 hours ago

FlowReasoner: Reinforcing Query-Level Meta-Agents

Paper • 2504.15257 • Published about 21 hours ago • 30

upvoted a paper 14 days ago

ShortV: Efficient Multimodal Large Language Models by Freezing Visual Tokens in Ineffective Layers

Paper • 2504.00502 • Published 21 days ago • 21

upvoted a paper 20 days ago

Efficient Inference for Large Reasoning Models: A Survey

Paper • 2503.23077 • Published 24 days ago • 46

upvoted a paper 27 days ago

Exploring Hallucination of Large Multimodal Models in Video Understanding: Benchmark, Analysis and Mitigation

Paper • 2503.19622 • Published 28 days ago • 30

authored a paper about 2 months ago

A Survey of Safety on Large Vision-Language Models: Attacks, Defenses and Evaluations

Paper • 2502.14881 • Published Feb 14 • 1

updated a collection about 2 months ago

LVLM Safety

Collection

LVLM Safety • 1 item • Updated Feb 25

upvoted 2 papers about 2 months ago

A Survey of Safety on Large Vision-Language Models: Attacks, Defenses and Evaluations

Paper • 2502.14881 • Published Feb 14 • 1

VLM^2-Bench: A Closer Look at How Well VLMs Implicitly Link Explicit Matching Visual Cues

Paper • 2502.12084 • Published Feb 17 • 30

liked a dataset 2 months ago

Sterzhang/PVIT-3M

Viewer • Updated Nov 2, 2024 • 3M • 222 • 18

upvoted 2 papers 2 months ago

Logical Reasoning in Large Language Models: A Survey

Paper • 2502.09100 • Published Feb 13 • 23

MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency

Paper • 2502.09621 • Published Feb 13 • 28

liked a Space 3 months ago

1.95k

Chat With Janus-Pro-7B

🌍

A unified multimodal understanding and generation model.

liked 2 Spaces 6 months ago

630

Qwen2-VL-72B

🌖

Engage in multi-modal conversations with images and videos

319

Qwen-VL-Max

📷

Interact with images and texts using Qwen-VL-Max

upvoted a paper 6 months ago

Personalized Visual Instruction Tuning

Paper • 2410.07113 • Published Oct 9, 2024 • 71