Can't load this model in Colab/Kaggle

#18
by Kamal-Moha - opened

I have tried running this model in Colab, but I can't; loading it takes up all of the available disk space.

Is there a smaller version of this model with the same performance?

This is a very large model (72B parameters). I suggest either using the GGUF models, which are quantized from this model: https://huggingface.co/MaziyarPanahi/calme-2.1-qwen2-72b-GGUF

or using the 7B model: https://huggingface.co/MaziyarPanahi/calme-2.1-qwen2-7b

I have tried using the 7B model (https://huggingface.co/MaziyarPanahi/calme-2.1-qwen2-7b), but I still get the same error: "Your notebook tried to allocate more memory than is available. It has restarted."

How can I get around this issue?

It seems you don't have enough resources to run these LLMs in your Colab. The last thing you can try for the 7B is to pass load_in_4bit=True when loading the model, and your Colab runtime must be set to GPU.
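
Something along these lines should work; this is a minimal sketch, assuming a GPU runtime and that the `accelerate` and `bitsandbytes` packages are installed (newer transformers releases prefer passing a `BitsAndBytesConfig` instead of the `load_in_4bit` shortcut):

```python
# Minimal 4-bit loading sketch for Colab
# (pip install accelerate bitsandbytes).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "MaziyarPanahi/calme-2.1-qwen2-7b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    load_in_4bit=True,  # quantize weights to 4 bits as they are loaded
    device_map="auto",  # let accelerate place the layers on the GPU
)
```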

If that doesn't work, you simply need more GPU memory, or you can run the GGUF files on CPU.
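
If you go the CPU route, here is a minimal sketch with llama-cpp-python. The exact `.gguf` filename below is an assumption, so check the repo's file list for the quantization you want, and note that even at Q4 a 72B GGUF still needs tens of GB of RAM:

```python
# Minimal CPU-inference sketch for a GGUF file
# (pip install llama-cpp-python huggingface_hub).
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

model_path = hf_hub_download(
    repo_id="MaziyarPanahi/calme-2.1-qwen2-72b-GGUF",
    filename="calme-2.1-qwen2-72b.Q4_K_M.gguf",  # assumed filename
)

llm = Llama(model_path=model_path, n_ctx=2048)  # runs on CPU by default
out = llm("Q: Name the planets in the solar system. A:", max_tokens=64)
print(out["choices"][0]["text"])
```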

MaziyarPanahi changed discussion status to closed
MaziyarPanahi changed discussion status to open

Thanks @MaziyarPanahi for this Colab link. I have now managed to run it successfully with your instructions, and it's producing amazing results.

I would like to take this further and ask: how can I use the 7B model to create a RAG chatbot using LangChain?

You are welcome. I recommend following this tutorial and similar online content: https://huggingface.co/learn/cookbook/en/advanced_rag
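
If a concrete starting point helps, here is a minimal sketch of the basic RAG pattern with LangChain. The embedding model, chunk sizes, and placeholder documents are assumptions on my part, and you will also need `faiss-cpu` and `sentence-transformers` installed:

```python
# Minimal RAG sketch with LangChain
# (pip install langchain langchain-community faiss-cpu sentence-transformers).
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_community.llms import HuggingFacePipeline
from langchain.chains import RetrievalQA
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

# 1. Chunk and index your documents (placeholder corpus).
docs = ["...your document text goes here..."]
chunks = RecursiveCharacterTextSplitter(
    chunk_size=500, chunk_overlap=50
).create_documents(docs)
embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2"  # assumed embedder
)
vectorstore = FAISS.from_documents(chunks, embeddings)

# 2. Wrap the 7B model (4-bit, as above) as a LangChain LLM.
model_id = "MaziyarPanahi/calme-2.1-qwen2-7b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, load_in_4bit=True, device_map="auto"
)
llm = HuggingFacePipeline(pipeline=pipeline(
    "text-generation", model=model, tokenizer=tokenizer, max_new_tokens=256
))

# 3. Answer questions over the retrieved chunks.
qa = RetrievalQA.from_chain_type(llm=llm, retriever=vectorstore.as_retriever())
print(qa.invoke("What does the document say about X?")["result"])
```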

MaziyarPanahi changed discussion status to closed
