marcodambra committed
Commit 99c461a · verified · 1 Parent(s): 3db21aa

Update README.md

Files changed (1)
  README.md +2 -2
README.md CHANGED
@@ -46,7 +46,7 @@ llm = Llama(
   model_path="/path/to/model.gguf", # Download the model file first
   n_ctx=2048, # The max sequence length to use - note that longer sequence lengths require much more resources
   n_threads=8, # The number of CPU threads to use, tailor to your system and the resulting performance
-  n_gpu_layers=35 # The number of layers to offload to GPU, if you have GPU acceleration available
+  n_gpu_layers=0 # The number of layers to offload to GPU, if you have GPU acceleration available
 )
 
 # Simple inference example
@@ -79,7 +79,7 @@ print(assistant_message)
 
 ## Bias, Risks and Limitations
 
-Azzurro-Quantized and its original model [Azzurro](https://huggingface.co/MoxoffSpA/Azzurro) have not been aligned to human preferences for safety within the RLHF phase or deployed with in-the-loop filtering of
+AzzurroQuantized and its original model [Azzurro](https://huggingface.co/MoxoffSpA/Azzurro) have not been aligned to human preferences for safety within the RLHF phase or deployed with in-the-loop filtering of
 responses like ChatGPT, so the model can produce problematic outputs (especially when prompted to do so). It is also unknown what the size and composition
 of the corpus were used to train the base model [mistralai/Mistral-7B-v0.2](https://huggingface.co/mistralai/Mistral-7B-v0.2), however, it is likely to have included a mix of Web data and technical sources
 like books and code.
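
For context, a minimal, self-contained sketch of the CPU-only setup the updated README now describes, assuming llama-cpp-python is installed and the GGUF file has already been downloaded locally; the prompt and sampling settings below are illustrative placeholders, not the model's actual chat template:

```python
from llama_cpp import Llama

# CPU-only load: n_gpu_layers=0 keeps every layer on the CPU.
# Raise it (e.g. back to 35) only if llama-cpp-python was built with GPU support.
llm = Llama(
    model_path="/path/to/model.gguf",  # download the GGUF file first
    n_ctx=2048,      # max sequence length; longer contexts need more memory
    n_threads=8,     # CPU threads, tune to your machine
    n_gpu_layers=0,  # 0 = no GPU offload
)

# Simple completion-style call; the prompt format here is only a placeholder.
output = llm(
    "Q: What is the capital of Italy? A:",
    max_tokens=32,
    stop=["Q:", "\n"],
)
print(output["choices"][0]["text"])
```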