In a Training Loop 🔄
lewtun
AI & ML interests
LLMs, LLMs, LLMs
lewtun/olmo3-7b-lora_ds200_ep32
Updated
lewtun/data-repetition-replication
Updated
lewtun/wordle-grpo-Qwen3-1.7B
Text Generation
• 2B • Updated • 8
Text Generation
• 4B • Updated • 1
lewtun/Qwen3-32B-SFT-20250908120312
Updated
lewtun/Qwen3-0.6B-SFT-20250908114642
Text Generation
• 0.6B • Updated • 2
lewtun/Qwen3-32B-SFT-20250908115917
Updated
lewtun/SmolLM2-135M-Instruct-SFT-Trackio-Test
Text Generation
• 0.1B • Updated • 1
lewtun/Qwen3-0.6B-SFT-Trackio-Test
Text Generation
• 0.6B • Updated • 3
lewtun/Qwen3-0.6B-SFT-Demo
Text Generation
• 0.6B • Updated
lewtun/zephyr-7b-gemma-dpo
Updated
lewtun/zephyr-7b-gemma-sft
Updated
lewtun/smollm-360M-instruct-new
Updated
lewtun/mistral-7b-sft-constitutional-ai
Updated
lewtun/mistral-7b-dpo-constitutional-ai
Updated
lewtun/zephyr-7b-sft-full
Text Generation
• 266k • Updated • 8
lewtun/Qwen2.5-1.5B-Open-R1-Distill
Text Generation
• 2B • Updated
lewtun/does-deepspeed-still-work-sft
Text Generation
• 2B • Updated
lewtun/Llama-3.2-1B-SFT-Capybara-No-Packing-Llama
Text Generation
• 1B • Updated • 1
lewtun/Qwen2.5-1.5B-SFT-Capybara-No-Packing
Text Generation
• 2B • Updated • 4
lewtun/Llama-3.2-1B-SFT-Capybara-No-Packing-ChatML
Text Generation
• 1B • Updated • 1
lewtun/Qwen2.5-7B-Instruct-GRPO
Updated
lewtun/Qwen2.5-Math-1.5B-Instruct-GRPO
Updated
Text Generation
• Updated • 1
lewtun/Qwen2.5-1.5B-Open-R1-Code-GRPO
Updated
lewtun/smollm2-distill-default-chat-template
Text Generation
• 2B • Updated • 2
lewtun/qwen2.5-1.5b-distill-default-chat-template
2B • Updated • 4
lewtun/DeepSeek-R1-Distill-Qwen-1.5B-GRPO