Space: Ahil1991/Bee_8B_HF_Space
Ahil1991 committed on Sep 15, 2024

Commit f9c6b3d (verified) · 1 parent: 27b8cd9

Update app.py

Files changed (1)

app.py CHANGED

@@ -4,8 +4,8 @@ import time
 # Load your LLaMA model with GPU support, quantization, and multi-threading
 llm = Llama.from_pretrained(
-    repo_id="Ahil1991/Bee-V.01-7B",
-    filename="Bee-V.01.gguf",
+    repo_id="Ahil1991/Bee-8.3B",
+    filename="Bee 8.3B Q4_K_M.gguf",
     use_gpu=True,  # Enable GPU if available
     quantize="4bit",  # Quantization for speed (4-bit or 8-bit, adjust based on needs)
     num_threads=4  # Adjust based on CPU cores available (only for CPU use)
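
Review note: the hunk does not show where `Llama` is imported from. Assuming it is the llama-cpp-python class, `use_gpu`, `quantize`, and `num_threads` do not appear to be among its documented loader options; the closest equivalents I know of are `n_gpu_layers` and `n_threads`, and the quantization level comes from the GGUF file itself (here `Q4_K_M`) rather than from a load-time flag. The sketch below shows one possible mapping. `build_loader_kwargs` and `load_model` are hypothetical helpers for illustration, not part of this commit.

```python
from typing import Any, Dict


def build_loader_kwargs(repo_id: str, filename: str,
                        use_gpu: bool = True, num_threads: int = 4) -> Dict[str, Any]:
    """Hypothetical helper: map the diff's flags onto llama-cpp-python names."""
    return {
        "repo_id": repo_id,
        "filename": filename,
        # -1 asks llama.cpp to offload every layer; 0 keeps the model on the CPU.
        "n_gpu_layers": -1 if use_gpu else 0,
        # Number of CPU threads used for generation.
        "n_threads": num_threads,
        # No "quantize" key: the quantization is fixed by the chosen GGUF file.
    }


def load_model(**kwargs: Any):
    """Download the GGUF file from the Hub and load it (assumes llama-cpp-python)."""
    from llama_cpp import Llama  # imported lazily so the helper above runs without it
    return Llama.from_pretrained(**kwargs)


kwargs = build_loader_kwargs("Ahil1991/Bee-8.3B", "Bee 8.3B Q4_K_M.gguf")
# llm = load_model(**kwargs)  # needs llama-cpp-python and huggingface-hub installed
```

Whether GPU offload actually takes effect depends on how llama-cpp-python was built for the Space's hardware, so the `n_gpu_layers` value here is a starting point rather than a guarantee.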