Model Card: Custom LLaMA-3 Model with 4-bit Quantization
Model Details
- Base Model: LLaMA-3 8B
- Fine-Tuning Method: LoRA (Low-Rank Adaptation)
- Quantization: 4-bit
Model Description
This is a custom version of the LLaMA-3 8B language model, fine-tuned on Turkish text and quantized to 4-bit. Fine-tuning uses LoRA (Low-Rank Adaptation), which reduces memory usage and training time without a significant loss in performance.
Training Configuration
The model was trained using the following configuration (a code sketch follows the list):
- Learning Rate: 2e-4
- Optimizer: AdamW (8-bit)
- Weight Decay: 0.01
- LR Scheduler: Linear
- Mixed Precision: FP16/BF16 (depending on hardware support)
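As a rough illustration, these values map onto Hugging Face TrainingArguments as below. This is a minimal sketch, not the published training script: output_dir is a placeholder, and the 8-bit AdamW is assumed to be the bitsandbytes-backed adamw_bnb_8bit optimizer.

```python
import torch
from transformers import TrainingArguments

# Minimal sketch of the configuration above; output_dir is a placeholder
# and the actual training script is not published with this card.
training_args = TrainingArguments(
    output_dir="outputs",                     # placeholder
    learning_rate=2e-4,
    optim="adamw_bnb_8bit",                   # 8-bit AdamW via bitsandbytes (assumed backend)
    weight_decay=0.01,
    lr_scheduler_type="linear",
    fp16=not torch.cuda.is_bf16_supported(),  # FP16 where BF16 is unavailable
    bf16=torch.cuda.is_bf16_supported(),      # BF16 on supporting hardware
)
```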
LoRA Configuration
The model uses LoRA for efficient parameter adaptation with the following settings (expressed as a peft config after the list):
- Rank (r): 16
- Target Modules: ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]
- LoRA Alpha: 16
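In peft terms, these settings correspond roughly to the following LoraConfig. The task type is an assumption (causal LM), and lora_dropout is left unset because the card does not state it; the remaining values come from the list above.

```python
from peft import LoraConfig

# LoRA settings from the card; task_type is assumed (causal LM),
# and lora_dropout is omitted because the card does not state it.
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",
)
```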
Training Dataset
- Dataset: Custom dataset containing Turkish text data
- Max Sequence Length: 1024 (see the tokenization sketch below)
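For illustration only, the 1024-token limit would be enforced at tokenization time along these lines; the sample text is a placeholder, not taken from the dataset.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("erythropygia/LLAMA3-8B-Turkish-4bit-Quantized")
encoded = tokenizer(
    ["Örnek bir Türkçe cümle."],  # placeholder text, not from the training set
    truncation=True,
    max_length=1024,              # matches the card's max sequence length
    return_tensors="pt",
)
```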
Usage
To use this model, load it with the Hugging Face transformers library:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("erythropygia/LLAMA3-8B-Turkish-4bit-Quantized")

# 4-bit loading requires the bitsandbytes package and a CUDA GPU.
model = AutoModelForCausalLM.from_pretrained(
    "erythropygia/LLAMA3-8B-Turkish-4bit-Quantized",
    low_cpu_mem_usage=True,
    load_in_4bit=True,
)
```
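Note that newer transformers releases deprecate the bare load_in_4bit flag in favor of an explicit BitsAndBytesConfig. A sketch of the equivalent call follows; the NF4 quantization type and bfloat16 compute dtype are common QLoRA defaults assumed here, not confirmed settings of this checkpoint.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Assumed quantization settings (typical QLoRA defaults), not confirmed
# properties of this checkpoint.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "erythropygia/LLAMA3-8B-Turkish-4bit-Quantized",
    quantization_config=bnb_config,
    low_cpu_mem_usage=True,
)
```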
```python
from transformers import TextStreamer

# Alpaca-style prompt template. The Turkish preamble translates to:
# "Below is an instruction that describes a task, paired with an input that
#  provides further context. Write a response that appropriately completes
#  the request."
prompt_format = """Aşağıda bir görevi tanımlayan bir talimat ve daha fazla bağlam sağlayan bir girdi bulunmaktadır. Talebi uygun şekilde tamamlayan bir yanıt yazın.
### Instruction:
{}
### Input:
{}
### Response:
{}"""

inputs = tokenizer(
    [
        prompt_format.format(
            "",  # instruction
            "",  # input
            "",  # output - leave this blank for generation!
        )
    ],
    return_tensors="pt",
).to("cuda")

# Stream tokens to stdout as they are generated.
text_streamer = TextStreamer(tokenizer)
_ = model.generate(
    **inputs, streamer=text_streamer, max_new_tokens=512,
    do_sample=True, temperature=0.75, top_k=50, top_p=0.9,
    repetition_penalty=1.1,
)
```
Performance
- Training Loss: 1.385300
- Evaluation Metrics: To be updated based on evaluation results
Limitations and Biases
This model inherits biases present in the training data. Evaluate it thoroughly for your specific use case and consider the ethical implications of its deployment.