CallMeDaniel committed
Commit 79dbd50 · verified · 1 Parent(s): b3b247d

Update README.md

Files changed (1)
  1. README.md +85 -74
README.md CHANGED
@@ -5,45 +5,56 @@ language:
  - vi
  ---
 
- # Model Card for Model ID
 
- <!-- Provide a quick summary of what the model is/does. -->
 
 
 
  ## Model Details
 
  ### Model Description
 
  <!-- Provide a longer summary of what this model is. -->
 
 
 
- - **Developed by:** [More Information Needed]
- - **Funded by [optional]:** [More Information Needed]
- - **Shared by [optional]:** [More Information Needed]
- - **Model type:** [More Information Needed]
- - **Language(s) (NLP):** [More Information Needed]
  - **License:** [More Information Needed]
- - **Finetuned from model [optional]:** [More Information Needed]
 
- ### Model Sources [optional]
 
- <!-- Provide the basic links for the model. -->
 
- - **Repository:** [More Information Needed]
- - **Paper [optional]:** [More Information Needed]
- - **Demo [optional]:** [More Information Needed]
 
- ## Uses
 
- <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
 
- ### Direct Use
 
- <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
 
- [More Information Needed]
 
  ### Downstream Use [optional]
 
@@ -59,9 +70,9 @@ language:
 
  ## Bias, Risks, and Limitations
 
- <!-- This section is meant to convey both technical and sociotechnical limitations. -->
-
- [More Information Needed]
 
  ### Recommendations
 
@@ -79,17 +90,8 @@ Use the code below to get started with the model.
 
  ### Training Data
 
- <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
 
- [More Information Needed]
-
- ### Training Procedure
-
- <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-
- #### Preprocessing [optional]
-
- [More Information Needed]
 
 
  #### Training Hyperparameters
@@ -128,7 +130,6 @@ Use the code below to get started with the model.
 
  ### Results
 
- [More Information Needed]
 
  #### Summary
 
@@ -152,71 +153,81 @@ Carbon emissions can be estimated using the [Machine Learning Impact calculator]
  - **Compute Region:** [More Information Needed]
  - **Carbon Emitted:** [More Information Needed]
 
- ## Technical Specifications [optional]
 
  ### Model Architecture and Objective
 
  [More Information Needed]
 
- ### Compute Infrastructure
 
- [More Information Needed]
-
- #### Hardware
-
- [More Information Needed]
-
- #### Software
-
- [More Information Needed]
 
- ## Citation [optional]
-
- <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
-
- **BibTeX:**
 
- [More Information Needed]
 
- **APA:**
 
- [More Information Needed]
 
- ## Glossary [optional]
 
- <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
 
- [More Information Needed]
 
- ## More Information [optional]
 
- [More Information Needed]
 
- ## Model Card Authors [optional]
 
- [More Information Needed]
 
- ## Model Card Contact
 
- [More Information Needed]
 
- ## Training procedure
 
- The following `bitsandbytes` quantization config was used during training:
- - quant_method: bitsandbytes
- - load_in_8bit: True
- - load_in_4bit: False
- - llm_int8_threshold: 6.0
- - llm_int8_skip_modules: None
- - llm_int8_enable_fp32_cpu_offload: False
- - llm_int8_has_fp16_weight: False
- - bnb_4bit_quant_type: fp4
- - bnb_4bit_use_double_quant: False
- - bnb_4bit_compute_dtype: float32
 
- ### Framework versions
 
- - PEFT 0.6.3.dev0
 
  - vi
  ---
 
+ # Vietnamese Fine-tuned Llama-2-7b-chat-hf
 
+ This repository contains a version of `Llama-2-7b-chat-hf` fine-tuned on Vietnamese data using LoRA (Low-Rank Adaptation).
 
 
  ## Model Details
 
+ This model is a fine-tuned version of the Llama-2-7b-chat-hf model, specifically adapted for improved performance on Vietnamese language tasks. It uses LoRA fine-tuning to efficiently adapt the large language model to Vietnamese data while maintaining much of the original model's general knowledge and capabilities.
+
  ### Model Description
 
  <!-- Provide a longer summary of what this model is. -->
 
 
+ - **Developed by:** [Daniel Du](https://github.com/danghoangnhan)
+ - **Model type:** Large Language Model
+ - **Language(s) (NLP):** Vietnamese
  - **License:** [More Information Needed]
+ - **Finetuned from model [optional]:** [meta-llama/Llama-2-7b-chat-hf](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf)
 
+ ### Direct Use
 
+ You can use this model directly with the Hugging Face Transformers library:
 
+ ```python
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+ from peft import PeftModel, PeftConfig
+
+ # Load the base model
+ base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-chat-hf")
+
+ # Load the LoRA configuration and model
+ peft_model_id = "CallMeMrFern/Llama-2-7b-chat-hf_vn"
+ config = PeftConfig.from_pretrained(peft_model_id)
+ model = PeftModel.from_pretrained(base_model, peft_model_id)
+
+ # Load the tokenizer
+ tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")
+
+ # Example usage
+ input_text = "Xin chào, hôm nay thời tiết thế nào?"
+ inputs = tokenizer(input_text, return_tensors="pt")
+ outputs = model.generate(**inputs, max_length=100)
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+ ```
 
  ### Downstream Use [optional]
 
 
  ## Bias, Risks, and Limitations
 
+ - This model is specifically fine-tuned for Vietnamese and may not perform as well on other languages.
+ - The model inherits limitations from the base Llama-2-7b-chat-hf model.
+ - Performance may vary depending on the specific task and domain.
 
  ### Recommendations
 
 
  ### Training Data
 
+ Dataset: `alpaca_translate_GPT_35_10_20k.json` (Vietnamese translation of the Alpaca dataset)
 
 
  #### Training Hyperparameters
 
  ### Results
 
 
  #### Summary
 
 
  - **Compute Region:** [More Information Needed]
  - **Carbon Emitted:** [More Information Needed]
 
 
  ### Model Architecture and Objective
 
  [More Information Needed]
 
+ ## Citation
 
+ If you use this model in your research, please cite:
 
+ ```
+ @misc{vietnamese_llama2_7b_chat,
+   author = {[Your Name]},
+   title = {Vietnamese Fine-tuned Llama-2-7b-chat-hf},
+   year = {2023},
+   publisher = {Hugging Face},
+   howpublished = {\url{https://huggingface.co/CallMeMrFern/Llama-2-7b-chat-hf_vn}}
+ }
+ ```
 
+ ## Training procedure
 
+ The following `bitsandbytes` quantization config was used during training:
+ - quant_method: bitsandbytes
+ - load_in_8bit: True
+ - load_in_4bit: False
+ - llm_int8_threshold: 6.0
+ - llm_int8_skip_modules: None
+ - llm_int8_enable_fp32_cpu_offload: False
+ - llm_int8_has_fp16_weight: False
+ - bnb_4bit_quant_type: fp4
+ - bnb_4bit_use_double_quant: False
+ - bnb_4bit_compute_dtype: float32
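+
+ For reference, the listing above corresponds roughly to the following `transformers` `BitsAndBytesConfig` (a minimal sketch reconstructed from the values listed here; the exact loading code used during training is not part of this card):
+
+ ```python
+ from transformers import AutoModelForCausalLM, BitsAndBytesConfig
+
+ # Sketch of the quantization config listed above: 8-bit loading; the fp4/float32
+ # settings only matter if 4-bit loading were enabled.
+ bnb_config = BitsAndBytesConfig(
+     load_in_8bit=True,
+     load_in_4bit=False,
+     llm_int8_threshold=6.0,
+     llm_int8_skip_modules=None,
+     llm_int8_enable_fp32_cpu_offload=False,
+     llm_int8_has_fp16_weight=False,
+     bnb_4bit_quant_type="fp4",
+     bnb_4bit_use_double_quant=False,
+     bnb_4bit_compute_dtype="float32",
+ )
+
+ base_model = AutoModelForCausalLM.from_pretrained(
+     "meta-llama/Llama-2-7b-chat-hf",
+     quantization_config=bnb_config,
+ )
+ ```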
 
+ ### Framework versions
 
+ - PEFT 0.6.3.dev0
 
+ ## Fine-tuning Details
 
+ - **Fine-tuning Method:** LoRA (Low-Rank Adaptation)
+ - **LoRA Config:**
+   - Target Modules: `["q_proj", "v_proj"]`
+   - Precision: 8-bit
+ - **Dataset:** `alpaca_translate_GPT_35_10_20k.json` (Vietnamese translation of the Alpaca dataset)
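+
+ As a rough illustration, this setup can be expressed with `peft` as follows (a sketch only: the target modules and 8-bit loading come from this card, while `r`, `lora_alpha`, and `lora_dropout` are assumed values that are not recorded here):
+
+ ```python
+ from transformers import AutoModelForCausalLM, BitsAndBytesConfig
+ from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
+
+ # Base model loaded in 8-bit, matching the quantization config above.
+ base_model = AutoModelForCausalLM.from_pretrained(
+     "meta-llama/Llama-2-7b-chat-hf",
+     quantization_config=BitsAndBytesConfig(load_in_8bit=True),
+ )
+ # Commonly used before training LoRA adapters on a quantized base model.
+ base_model = prepare_model_for_kbit_training(base_model)
+
+ # Target modules are documented in this card; r, lora_alpha, and lora_dropout
+ # are illustrative assumptions.
+ lora_config = LoraConfig(
+     r=8,
+     lora_alpha=16,
+     lora_dropout=0.05,
+     target_modules=["q_proj", "v_proj"],
+     bias="none",
+     task_type="CAUSAL_LM",
+ )
+
+ model = get_peft_model(base_model, lora_config)
+ model.print_trainable_parameters()
+ ```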
 
+ ## Training Procedure
 
+ The model was fine-tuned using the following command:
 
+ ```bash
+ python finetune/lora.py \
+     --base_model meta-llama/Llama-2-7b-chat-hf \
+     --model_type llama \
+     --data_dir data/general/alpaca_translate_GPT_35_10_20k.json \
+     --output_dir finetuned/meta-llama/Llama-2-7b-chat-hf \
+     --lora_target_modules '["q_proj", "v_proj"]' \
+     --micro_batch_size 1
+ ```
 
+ For multi-GPU training, a distributed training approach was used.
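+
+ The exact launcher is not recorded in this card; one common way to distribute the same script across several GPUs is `torchrun`, sketched below as an assumed example rather than the command actually used:
+
+ ```bash
+ # Assumed example: launch the same fine-tuning script on 4 GPUs with torchrun
+ # (requires the script to support distributed training).
+ torchrun --nproc_per_node 4 finetune/lora.py \
+     --base_model meta-llama/Llama-2-7b-chat-hf \
+     --model_type llama \
+     --data_dir data/general/alpaca_translate_GPT_35_10_20k.json \
+     --output_dir finetuned/meta-llama/Llama-2-7b-chat-hf \
+     --lora_target_modules '["q_proj", "v_proj"]' \
+     --micro_batch_size 1
+ ```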
 
+ ## Evaluation Results
 
+ [Include any evaluation results, perplexity scores, or benchmark performances here]
 
+ ## Acknowledgements
 
+ - This project is part of the TF07 Course offered by ProtonX.
+ - We thank the creators of the original Llama-2-7b-chat-hf model and the Hugging Face team for their tools and resources.
+ - Appreciation to [VietnamAIHub/Vietnamese_LLMs](https://github.com/VietnamAIHub/Vietnamese_LLMs) for the translated dataset.