---
base_model: unsloth/Meta-Llama-3.1-8B-bnb-4bit
language:
- ko
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- trl
datasets:
- wikimedia/wikipedia
- FreedomIntelligence/alpaca-gpt4-korean
---

# unsloth/Meta-Llama-3.1-8B-bnb-4bit fine-tuned after Continued Pretraining
# (TREX-Lab at Seoul Cyber University)

## Summary

- Base Model : unsloth/Meta-Llama-3.1-8B-bnb-4bit
- Dataset : wikimedia/wikipedia (Continued Pretraining), FreedomIntelligence/alpaca-gpt4-korean (Fine-Tuning)
- This Llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
- Goal: verify that fine-tuning a large language model is feasible on a single A30 GPU (successful).
- **Developed by:** TREX-Lab at Seoul Cyber University
- **Language(s) (NLP):** Korean
- **Finetuned from model:** unsloth/Meta-Llama-3.1-8B-bnb-4bit

## Continued Pretraining

Key hyperparameters (see the training sketches at the end of this card):

```
warmup_steps = 10
learning_rate = 5e-5
embedding_learning_rate = 1e-5
bf16 = True
optim = "adamw_8bit"
weight_decay = 0.01
lr_scheduler_type = "linear"
```

Final training loss:

```
loss : 1.171600
```

## Fine-Tuning Detail

Key hyperparameters (identical to the continued-pretraining stage except for `weight_decay`):

```
warmup_steps = 10
learning_rate = 5e-5
embedding_learning_rate = 1e-5
bf16 = True
optim = "adamw_8bit"
weight_decay = 0.001
lr_scheduler_type = "linear"
```

Final training loss:

```
loss : 0.699600
```

## Usage #1

```
from unsloth import FastLanguageModel

# Load the fine-tuned model and tokenizer
# (replace the placeholder with this repository's model id).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "<this-repo-id>",
    max_seq_length = 2048,
    load_in_4bit = True,
)

# Alpaca-style prompt template. In English: "Below is an instruction that
# describes a task. Write a response that appropriately completes the
# request."; "### 지침" = "### Instruction", "### 응답" = "### Response".
model_prompt = """다음은 작업을 설명하는 명령입니다. 요청을 적절하게 완료하는 응답을 작성하세요.

### 지침:
{}

### 응답:
{}"""

FastLanguageModel.for_inference(model)  # enable Unsloth's faster inference mode

inputs = tokenizer(
    [
        model_prompt.format(
            # "Who is Admiral Yi Sun-sin? Please tell me in detail."
            "이순신 장군은 누구인가요 ? 자세하게 알려주세요.",
            "",  # leave the response slot empty for generation
        )
    ],
    return_tensors = "pt",
).to("cuda")

outputs = model.generate(**inputs, max_new_tokens = 128, use_cache = True)
print(tokenizer.batch_decode(outputs))
```

## Usage #2

```
from transformers import TextStreamer

# Reuses the model, tokenizer, and model_prompt from Usage #1.
FastLanguageModel.for_inference(model)

inputs = tokenizer(
    [
        model_prompt.format(
            # "Describe the Earth comprehensively."
            "지구를 광범위하게 설명하세요.",
            "",  # leave the response slot empty for generation
        )
    ],
    return_tensors = "pt",
).to("cuda")

# Stream tokens to stdout as they are generated.
text_streamer = TextStreamer(tokenizer)

# Note: repetition_penalty must be > 1.0 to discourage repetition;
# the original value of 0.1 would strongly encourage it.
value = model.generate(
    **inputs,
    streamer = text_streamer,
    max_new_tokens = 128,
    repetition_penalty = 1.1,
)
```
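## Training Sketch (Continued Pretraining)

For reference, here is a minimal sketch of how the continued-pretraining hyperparameters above map onto Unsloth's `UnslothTrainer` and `UnslothTrainingArguments`, which accept the separate `embedding_learning_rate`. The LoRA rank and alpha, batch size, sequence length, and Wikipedia snapshot below are assumptions not stated in this card, and exact signatures may vary across Unsloth versions.

```
from unsloth import FastLanguageModel, UnslothTrainer, UnslothTrainingArguments
from datasets import load_dataset

max_seq_length = 2048  # assumption: not stated in this card

# Load the 4-bit quantized base model.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/Meta-Llama-3.1-8B-bnb-4bit",
    max_seq_length = max_seq_length,
    load_in_4bit = True,
)

# For continued pretraining, Unsloth recommends also adapting the input
# embeddings and LM head, which is why a separate (lower)
# embedding_learning_rate is used.
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,           # assumption
    lora_alpha = 16,  # assumption
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj",
                      "embed_tokens", "lm_head"],
)

# Assumption: the Korean subset of the Wikipedia dump was used.
dataset = load_dataset("wikimedia/wikipedia", "20231101.ko", split = "train")

trainer = UnslothTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    dataset_text_field = "text",
    max_seq_length = max_seq_length,
    args = UnslothTrainingArguments(
        per_device_train_batch_size = 2,  # assumption
        gradient_accumulation_steps = 8,  # assumption
        warmup_steps = 10,                # from this card
        learning_rate = 5e-5,             # from this card
        embedding_learning_rate = 1e-5,   # from this card
        bf16 = True,                      # from this card
        optim = "adamw_8bit",             # from this card
        weight_decay = 0.01,              # from this card
        lr_scheduler_type = "linear",     # from this card
        output_dir = "outputs_cpt",
    ),
)
trainer.train()
```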
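## Training Sketch (Fine-Tuning)

The instruction-tuning stage could then follow the same pattern on FreedomIntelligence/alpaca-gpt4-korean, formatted with the prompt template from the Usage sections. The `conversations`/`from`/`value` field names and the batch size below are assumptions about the dataset schema; the trainer arguments mirror the Fine-Tuning Detail section (note `weight_decay = 0.001`).

```
from unsloth import UnslothTrainer, UnslothTrainingArguments
from datasets import load_dataset

# Same Alpaca-style template shown in the Usage sections.
model_prompt = """다음은 작업을 설명하는 명령입니다. 요청을 적절하게 완료하는 응답을 작성하세요.

### 지침:
{}

### 응답:
{}"""

def format_example(example):
    # Assumption: each record holds a human/gpt turn pair under "conversations".
    human, gpt = example["conversations"][0], example["conversations"][1]
    return {"text": model_prompt.format(human["value"], gpt["value"]) + tokenizer.eos_token}

dataset = load_dataset("FreedomIntelligence/alpaca-gpt4-korean", split = "train")
dataset = dataset.map(format_example)

trainer = UnslothTrainer(
    model = model,  # the continued-pretrained model from the sketch above
    tokenizer = tokenizer,
    train_dataset = dataset,
    dataset_text_field = "text",
    max_seq_length = max_seq_length,
    args = UnslothTrainingArguments(
        per_device_train_batch_size = 2,  # assumption
        gradient_accumulation_steps = 8,  # assumption
        warmup_steps = 10,                # from this card
        learning_rate = 5e-5,             # from this card
        embedding_learning_rate = 1e-5,   # from this card
        bf16 = True,                      # from this card
        optim = "adamw_8bit",             # from this card
        weight_decay = 0.001,             # from this card
        lr_scheduler_type = "linear",     # from this card
        output_dir = "outputs_sft",
    ),
)
trainer.train()
```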