For those who wish to launch the distilled DeepSeek R1 for reasoning with a schema, sharing the Google Colab notebook:
https://github.com/nicolay-r/nlp-thirdgate/blob/master/tutorials/llm_deep_seek_7b_distill_colab.ipynb
The notebook wraps the Hugging Face (hf) provider for the Qwen2 model via the bulk-chain framework.
Model: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
GPU: T4 (15GB) is nearly enough in float32 mode.
To boost performance, load the model in bf16 (set use_bf16=True).
Powered by bulk-chain: https://github.com/nicolay-r/bulk-chain
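
For a rough feel of why the precision setting matters on a 15GB card, here is a back-of-the-envelope sketch of weight memory per dtype. The parameter count (about 7.6 billion) is an assumption for illustration, the helper name `weight_memory_gib` is hypothetical, and the estimate covers weights only, ignoring activations, KV cache, and framework overhead.

```python
# Rough weight-memory estimate per dtype (weights only; no activations,
# KV cache, or framework overhead are counted).
BYTES_PER_PARAM = {"float32": 4, "bf16": 2}

# Assumption: approximate parameter count of the 7B distilled model.
NUM_PARAMS = 7.6e9


def weight_memory_gib(num_params: float, dtype: str) -> float:
    """Return the memory needed to hold the weights, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: ~{weight_memory_gib(NUM_PARAMS, dtype):.1f} GiB")
```

Halving the bytes per parameter halves the weight footprint, which is the effect of switching to bf16.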