6 45 58

Lê Võ Quyết Thắng PRO

thangvip

https://vualidon.github.io/

AI & ML interests

Adapting LLM to specific domain

Recent Activity

updated a model 6 days ago

VietLegalLM/VietLegalLM-Sailor-DPO

published a model 6 days ago

VietLegalLM/VietLegalLM-Sailor-DPO

updated a model 6 days ago

VietLegalLM/VietLegalLM-Sailor-SFT

View all activity

Organizations

thangvip's activity

upvoted a paper 7 days ago

How Much Knowledge Can You Pack into a LoRA Adapter without Harming LLM?

Paper • 2502.14502 • Published Feb 20 • 90

upvoted an article 21 days ago

Article

Training and Finetuning Reranker Models with Sentence Transformers v4

28 days ago

• 115

upvoted 2 articles 3 months ago

Article

SmolVLM Grows Smaller – Introducing the 250M & 500M Models!

Jan 23

• 172

Article

Open-R1: a fully open reproduction of DeepSeek-R1

Jan 28

• 845

upvoted 2 papers 3 months ago

Agent Laboratory: Using LLM Agents as Research Assistants

Paper • 2501.04227 • Published Jan 8 • 91

The GAN is dead; long live the GAN! A Modern GAN Baseline

Paper • 2501.05441 • Published Jan 9 • 92

upvoted 3 papers 5 months ago

Building and better understanding vision-language models: insights and future directions

Paper • 2408.12637 • Published Aug 22, 2024 • 130

Natural Language Reinforcement Learning

Paper • 2411.14251 • Published Nov 21, 2024 • 31

Hymba: A Hybrid-head Architecture for Small Language Models

Paper • 2411.13676 • Published Nov 20, 2024 • 45

upvoted a collection 5 months ago

Tulu 3 Datasets

Collection

All datasets released with Tulu 3 -- state of the art open post-training recipes. • 33 items • Updated Mar 13 • 78

upvoted 3 papers 5 months ago

upvoted 4 papers 6 months ago

Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss

Paper • 2410.17243 • Published Oct 22, 2024 • 94

From Generalist to Specialist: Adapting Vision Language Models via Task-Specific Visual Instruction Tuning

Paper • 2410.06456 • Published Oct 9, 2024 • 38

GenARM: Reward Guided Generation with Autoregressive Reward Model for Test-time Alignment

Paper • 2410.08193 • Published Oct 10, 2024 • 4

Large Language Models as Markov Chains

Paper • 2410.02724 • Published Oct 3, 2024 • 33

upvoted 2 papers 7 months ago

Promptriever: Instruction-Trained Retrievers Can Be Prompted Like Language Models

Paper • 2409.11136 • Published Sep 17, 2024 • 24

Cottention: Linear Transformers With Cosine Attention

Paper • 2409.18747 • Published Sep 27, 2024 • 17