Can't load this model in Colab/Kaggle
I have tried running this model in Colab, but I can't. Loading this model takes up all the disk space.
Is there a smaller model version with the same performance?
This is a very large model, 72B. I suggest either using the GGUF models which are quantized from this model: https://huggingface.co/MaziyarPanahi/calme-2.1-qwen2-72b-GGUF
or using the 7B models: https://huggingface.co/MaziyarPanahi/calme-2.1-qwen2-7b
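For the GGUF route, something like the sketch below should work with llama-cpp-python on CPU. This is only a sketch: the `.gguf` filename is an assumption (check the repo's file list for the quantization you want, and note that large quants may be split into multiple parts), and even a 4-bit quant of a 72B model needs on the order of 40 GB of RAM, which free Colab does not provide.

```python
# Minimal sketch: run a GGUF quantization on CPU with llama-cpp-python.
# Assumes `pip install llama-cpp-python huggingface_hub` has been run.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# Download one quantized file from the GGUF repo.
model_path = hf_hub_download(
    repo_id="MaziyarPanahi/calme-2.1-qwen2-72b-GGUF",
    filename="calme-2.1-qwen2-72b.Q4_K_M.gguf",  # hypothetical filename; check the repo
)

# Load the model on CPU and run a quick completion.
llm = Llama(model_path=model_path, n_ctx=2048)
out = llm("Q: What is the capital of France? A:", max_tokens=32)
print(out["choices"][0]["text"])
```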
I have tried using the 7B model https://huggingface.co/MaziyarPanahi/calme-2.1-qwen2-7b,
but I still get the same error: "Your notebook tried to allocate more memory than is available. It has restarted."
How can I get around this issue?
It seems you don't have enough resources to run LLMs in your Colab. The last thing you can try for the 7B is to load it with load_in_4bit=True, and your Colab runtime must be on GPU.
If that doesn't work, you simply need more GPU memory, or you can use the GGUFs on CPU.
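For reference, here is a minimal sketch of the 4-bit route, assuming a GPU runtime with transformers, accelerate, and bitsandbytes installed:

```python
# Minimal sketch: load the 7B model in 4-bit on a GPU runtime.
# Assumes `pip install transformers accelerate bitsandbytes` has been run.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "MaziyarPanahi/calme-2.1-qwen2-7b"

# Explicit 4-bit quantization config (the newer equivalent of load_in_4bit=True).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # place the quantized weights on the GPU
)

# Quick sanity check that generation works.
inputs = tokenizer("Hello, how are you?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```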
OK, maybe this helps. This is a free Colab: https://colab.research.google.com/drive/1TLtljhuyNK3AKEo0tUK1717BAnMGNSJC?usp=sharing
Thanks @MaziyarPanahi for this Colab link. I have now managed to run it successfully with your instructions, and it's producing amazing results.
I would like to build on this further and would like to ask: how can I use the 7B model to create a RAG chatbot using LangChain?
You are welcome. I recommend following this tutorial and similar online content: https://huggingface.co/learn/cookbook/en/advanced_rag
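As a starting point, a minimal RAG sketch with LangChain might look like the following. This assumes `langchain`, `langchain-community`, `sentence-transformers`, `faiss-cpu`, and `transformers` are installed; the exact import paths depend on your LangChain version, and the sample documents are placeholders:

```python
# Minimal RAG sketch with LangChain: embed documents into a local FAISS index,
# wrap the 7B model as a LangChain LLM, and answer questions over the index.
from langchain_community.vectorstores import FAISS
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.llms import HuggingFacePipeline
from langchain.chains import RetrievalQA
from transformers import pipeline

# 1. Embed your documents and index them in a local FAISS store.
docs = [  # placeholder documents; replace with your own corpus
    "Calme-2.1 is a fine-tune of Qwen2.",
    "RAG retrieves relevant context before generating an answer.",
]
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
vectorstore = FAISS.from_texts(docs, embeddings)

# 2. Wrap the 7B model as a LangChain LLM (load it in 4-bit as shown earlier
#    if you are memory-constrained).
generator = pipeline(
    "text-generation",
    model="MaziyarPanahi/calme-2.1-qwen2-7b",
    device_map="auto",
    max_new_tokens=256,
)
llm = HuggingFacePipeline(pipeline=generator)

# 3. Build a retrieval QA chain: retrieve the top documents and stuff them
#    into the prompt before generation.
qa = RetrievalQA.from_chain_type(llm=llm, retriever=vectorstore.as_retriever())
print(qa.invoke({"query": "What is calme-2.1 based on?"})["result"])
```

For a real chatbot you would add conversation memory and a proper document loader/splitter; the cookbook link above walks through those pieces in more depth.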