Hao Fei

scofield7419

http://haofei.vip/

AI & ML interests

Natural Language Processing, Vision and Language, Structural Modeling, Large Language Models

Recent Activity

authored a paper about 8 hours ago

Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey

upvoted a paper about 8 hours ago

Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey

updated a Space 22 days ago

General-Level/README

scofield7419's activity

authored a paper about 8 hours ago

Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey

Paper • 2503.12605 • Published 2 days ago • 13

upvoted a paper about 8 hours ago

Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey

Paper • 2503.12605 • Published 2 days ago • 13

updated a Space 22 days ago

General-Level/README

published a dataset 22 days ago

General-Level/General-Bench

Updated 22 days ago • 43

published a Space 23 days ago

README

upvoted a paper 2 months ago

Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos

Paper • 2501.04001 • Published Jan 7 • 43

commented on 4 papers 3 months ago

Vitron: A Unified Pixel-level Vision LLM for Understanding, Generating, Segmenting, Editing

Paper • 2412.19806 • Published Oct 8, 2024 • 1

authored a paper 3 months ago

Vitron: A Unified Pixel-level Vision LLM for Understanding, Generating, Segmenting, Editing

Paper • 2412.19806 • Published Oct 8, 2024 • 1

commented on a paper 3 months ago

Vitron: A Unified Pixel-level Vision LLM for Understanding, Generating, Segmenting, Editing

Paper • 2412.19806 • Published Oct 8, 2024 • 1

upvoted a paper 3 months ago

Vitron: A Unified Pixel-level Vision LLM for Understanding, Generating, Segmenting, Editing

Paper • 2412.19806 • Published Oct 8, 2024 • 1

authored 3 papers 4 months ago

Transfer Visual Prompt Generator across LLMs

Paper • 2305.01278 • Published May 2, 2023

Reasoning Implicit Sentiment with Chain-of-Thought Prompting

Paper • 2305.11255 • Published May 18, 2023 • 1

MolCA: Molecular Graph-Language Modeling with Cross-Modal Projector and Uni-Modal Adapter

Paper • 2310.12798 • Published Oct 19, 2023 • 4

upvoted a paper 4 months ago

Faithful Logical Reasoning via Symbolic Chain-of-Thought

Paper • 2405.18357 • Published May 28, 2024 • 2

authored 3 papers 4 months ago

LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding, Reasoning, and Planning

Paper • 2311.18651 • Published Nov 30, 2023

LayoutLLM-T2I: Eliciting Layout Guidance from LLM for Text-to-Image Generation

Paper • 2308.05095 • Published Aug 9, 2023

Empowering Dynamics-aware Text-to-Video Diffusion with Large Language Models

Paper • 2308.13812 • Published Aug 26, 2023 • 1