Update README.md
README.md CHANGED
@@ -10,6 +10,11 @@ library_name: transformers
## Introduction

AMD-Llama-135m is a language model trained on AMD MI250 GPUs. It is based on the LLaMA2 model architecture, so it can be loaded directly as LlamaForCausalLM with Hugging Face Transformers. It also uses the same tokenizer as LLaMA2, which allows it to serve as a draft model for speculative decoding with LLaMA2 and CodeLlama.
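
As a quick illustration, here is a minimal sketch of loading the model with Hugging Face Transformers and using it as a draft model for assisted generation; the repository ids (`amd/AMD-Llama-135m`, `meta-llama/Llama-2-7b-hf`), the prompt, and the generation settings are illustrative assumptions, not values taken from this README.

```python
# Minimal sketch (assumed repo ids): load AMD-Llama-135m as LlamaForCausalLM
# and optionally use it as the draft model for assisted generation.
from transformers import AutoTokenizer, LlamaForCausalLM

tokenizer = AutoTokenizer.from_pretrained("amd/AMD-Llama-135m")       # assumed model id
draft_model = LlamaForCausalLM.from_pretrained("amd/AMD-Llama-135m")  # loads via LlamaForCausalLM

# Standalone generation with the 135M model itself.
inputs = tokenizer("def fibonacci(n):", return_tensors="pt")
outputs = draft_model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

# Since the tokenizer matches LLaMA2, the same model can serve as the assistant
# (draft) model in Transformers' assisted generation with a larger LLaMA2 model:
# target_model = LlamaForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
# outputs = target_model.generate(**inputs, assistant_model=draft_model, max_new_tokens=32)
```
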
## Quickstart

AMD-Llama-135m-code-GGUF can be loaded and run with llama.cpp; a program with a GUI is shown below.

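First, as a simpler starting point, here is a minimal command-line sketch using the llama-cpp-python bindings for llama.cpp; the GGUF filename, prompt, and sampling settings are illustrative assumptions, so point `model_path` at the file you actually downloaded.

```python
# Minimal sketch with llama-cpp-python (pip install llama-cpp-python).
# The GGUF filename below is an assumption; use the path of your downloaded file.
from llama_cpp import Llama

llm = Llama(model_path="./amd-llama-135m-code.Q8_0.gguf", n_ctx=2048)

# Simple code-completion request; prompt and sampling settings are illustrative.
result = llm("def quick_sort(arr):", max_tokens=64, temperature=0.2)
print(result["choices"][0]["text"])
```
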
```python
import sys
import os