Xi

xi0v

AI & ML interests

Reinforcement learning, Diffusion Model Merging, LLM Merging, Model Editing and Vision/Multimodal Model Fine-tuning.

Recent Activity

updated a model about 1 hour ago

xi0v/HoshiXL-v1

published a model about 1 hour ago

xi0v/HoshiXL-v1

updated a model about 1 hour ago

xi0v/Illu-Model-Archive

xi0v's activity

upvoted a paper 3 days ago

YourBench: Easy Custom Evaluation Sets for Everyone

Paper • 2504.01833 • Published 18 days ago • 20

upvoted an article 3 days ago

Introducing HELMET

Article • 5 days ago • 18

upvoted a paper 3 days ago

D^2iT: Dynamic Diffusion Transformer for Accurate Image Generation

Paper • 2504.09454 • Published 8 days ago • 11

upvoted a paper 5 days ago

InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models

Paper • 2504.10479 • Published 6 days ago • 228

upvoted 2 collections 6 days ago

Shisa V2

A family of bilingual JA/EN LLMs • 27 items • Updated 4 days ago • 6

GLM-4-0414

GLM-4-0414 series model • 8 items • Updated 5 days ago • 101

upvoted a paper 9 days ago

DeepSeek-R1 Thoughtology: Let's <think> about LLM Reasoning

Paper • 2504.07128 • Published 19 days ago • 80

upvoted 2 papers 10 days ago

DDT: Decoupled Diffusion Transformer

Paper • 2504.05741 • Published 12 days ago • 71

OLMoTrace: Tracing Language Model Outputs Back to Trillions of Training Tokens

Paper • 2504.07096 • Published 11 days ago • 70

upvoted a paper 14 days ago

FreSca: Unveiling the Scaling Space in Diffusion Models

Paper • 2504.02154 • Published 18 days ago • 18

upvoted a paper 16 days ago

ZClip: Adaptive Spike Mitigation for LLM Pre-Training

Paper • 2504.02507 • Published 17 days ago • 76

upvoted a paper 18 days ago

Open-Qwen2VL: Compute-Efficient Pre-Training of Fully-Open Multimodal LLMs on Academic Resources

Paper • 2504.00595 • Published 19 days ago • 34

upvoted a paper 19 days ago

AdaptiVocab: Enhancing LLM Efficiency in Focused Domains through Lightweight Vocabulary Adaptation

Paper • 2503.19693 • Published 26 days ago • 75

upvoted a paper 23 days ago

ChatAnyone: Stylized Real-time Portrait Video Generation with Hierarchical Motion Diffusion Model

Paper • 2503.21144 • Published 25 days ago • 25

upvoted an article 25 days ago

Training and Finetuning Reranker Models with Sentence Transformers v4

Article • 26 days ago • 112

upvoted a paper 26 days ago

I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders

Paper • 2503.18878 • Published 27 days ago • 117

upvoted a paper 27 days ago

Modifying Large Language Model Post-Training for Diverse Creative Writing

Paper • 2503.17126 • Published about 1 month ago • 36

upvoted a paper 29 days ago

Plug-and-Play 1.x-Bit KV Cache Quantization for Video Large Language Models

Paper • 2503.16257 • Published Mar 20 • 24