
Vardaan Pahuja

vardaan123

https://vardaanpahuja.github.io/

AI & ML interests

LLM Agents, Multimodal Foundation Models, Knowledge Graphs

Recent Activity

upvoted a paper 1 day ago

Diversifying Joint Vision-Language Tokenization Learning

upvoted a paper 1 day ago

A Systematic Investigation of KB-Text Embedding Alignment at Scale

upvoted an article 12 days ago

Open-source DeepResearch – Freeing our search agents



vardaan123's activity

upvoted 2 papers 1 day ago

Diversifying Joint Vision-Language Tokenization Learning

Paper • 2306.03421 • Published Jun 6, 2023 • 2

A Systematic Investigation of KB-Text Embedding Alignment at Scale

Paper • 2106.01586 • Published Jun 3, 2021 • 1

upvoted an article 12 days ago

Open-source DeepResearch – Freeing our search agents

Article • Feb 4 • 1.2k

upvoted 2 papers 25 days ago

Structure Learning for Neural Module Networks

Paper • 1905.11532 • Published May 27, 2019 • 1

Learning Sparse Mixture of Experts for Visual Question Answering

Paper • 1909.09192 • Published Sep 19, 2019 • 1

liked a dataset about 1 month ago

ritaranx/clinical-synthetic-text-llm

Viewer • Updated Jul 2, 2024 • 9.36k • 318 • 2

upvoted a paper about 1 month ago

Bringing Back the Context: Camera Trap Species Identification as Link Prediction on Multimodal Knowledge Graphs

Paper • 2401.00608 • Published Dec 31, 2023 • 2

commented on a paper about 1 month ago

Explorer: Scaling Exploration-driven Web Trajectory Synthesis for Multimodal Web Agents

Paper • 2502.11357 • Published Feb 17 • 10

authored 5 papers about 1 month ago

Diversifying Joint Vision-Language Tokenization Learning

Paper • 2306.03421 • Published Jun 6, 2023 • 2

A Systematic Investigation of KB-Text Embedding Alignment at Scale

Paper • 2106.01586 • Published Jun 3, 2021 • 1

Bringing Back the Context: Camera Trap Species Identification as Link Prediction on Multimodal Knowledge Graphs

Paper • 2401.00608 • Published Dec 31, 2023 • 2

A Retrieve-and-Read Framework for Knowledge Graph Link Prediction

Paper • 2212.09724 • Published Dec 19, 2022 • 1

Explorer: Scaling Exploration-driven Web Trajectory Synthesis for Multimodal Web Agents

Paper • 2502.11357 • Published Feb 17 • 10

upvoted a paper about 1 month ago

Explorer: Scaling Exploration-driven Web Trajectory Synthesis for Multimodal Web Agents

Paper • 2502.11357 • Published Feb 17 • 10

New activity in huggingface/HuggingDiscussions about 1 month ago

[FEEDBACK] Daily Papers

127

#32 opened 10 months ago by kramp

New activity in meta-llama/Llama-3.2-11B-Vision-Instruct 4 months ago

Flash Attention Support

#41 opened 6 months ago by rameshch

liked a model 6 months ago

mistralai/Pixtral-12B-2409

Image-Text-to-Text • Updated Dec 26, 2024 • 627

upvoted 2 papers 6 months ago

ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery

Paper • 2410.05080 • Published Oct 7, 2024 • 21

Navigating the Digital World as Humans Do: Universal Visual Grounding for GUI Agents

Paper • 2410.05243 • Published Oct 7, 2024 • 19

upvoted a paper 12 months ago

Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs

Paper • 2404.05719 • Published Apr 8, 2024 • 82