anthonymikinka/DeepSeek-R1-Distill-Llama-8B-Stateful-CoreML Text Generation • Updated 23 days ago • 7 • 1
👩‍💻 OlympicCoder Collection Reasoning datasets and models for competitive coding • 4 items • Updated Mar 11 • 16
Article Introducing smolagents: simple agents that write actions in code. Dec 31, 2024 • 973
Post 2589 Lightweight (nanoGPT) implementation of hybrid norm - an intuitive normalization method that combines the strengths of both pre-norm (i.e., QKV-norm in MHA) and post-norm in the feed-forward network. Code: https://github.com/Jaykef/ai-algorithms/blob/main/hybrid_normalization.ipynb