---
license: apache-2.0
language:
- ko
tags:
- construction
- interior
- defective
- finished materials
---

This model was trained on the tokenized, publicly released domain dataset from Hansol Deco Co., Ltd. (주식회사 한솔데코).

base model: mistralai/Mistral-7B-v0.1

Dataset: Hansol Deco domain dataset

DPO dataset: ko_Ultrafeedback_binarized, uploaded by maywell.

## Training parameters

```
num_train_epochs=3
per_device_train_batch_size=1
gradient_accumulation_steps=4
gradient_checkpointing=True
learning_rate=5e-5
lr_scheduler_type="linear"
max_steps=200
save_strategy="no"
logging_steps=1
output_dir=new_model
optim="paged_adamw_32bit"
warmup_steps=100
fp16=True
```

## Usage example

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from transformers import TextStreamer, GenerationConfig

model_name = 'sosoai/hansoldeco-mistral-dpo-v1'
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)
# Prints tokens to stdout as they are generated.
streamer = TextStreamer(tokenizer)

def gen(x):
    generation_config = GenerationConfig(
        temperature=0.1,
        top_p=0.8,
        top_k=100,
        max_new_tokens=256,
        early_stopping=True,
        do_sample=True,
        repetition_penalty=1.2,
    )
    # Wrap the question in [INST] ... [/INST] tags.
    q = f"[INST]{x} [/INST]"
    gened = model.generate(
        **tokenizer(
            q,
            return_tensors='pt',
            return_token_type_ids=False
        ).to(model.device),  # move inputs to the device the model was loaded on
        generation_config=generation_config,
        pad_token_id=tokenizer.eos_token_id,
        eos_token_id=tokenizer.eos_token_id,
        streamer=streamer,
    )
    result_str = tokenizer.decode(gened[0])

    # If the output contains a response tag, keep only the text after it;
    # otherwise the full decoded string (prompt included) is returned.
    start_tag = "\n\n### Response: "
    start_index = result_str.find(start_tag)

    if start_index != -1:
        result_str = result_str[start_index + len(start_tag):].strip()
    return result_str

# Question: "What types of finishing defects are there?"
print(gen('마감하자는 어떤 종류가 있나요?'))
```
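
Because the prompt above is wrapped in `[INST] ... [/INST]`, the decoded output may not contain a `### Response:` tag at all, in which case `gen` returns the prompt together with the answer. The following is a minimal sketch of one way to keep only the answer. It assumes the decoded string still contains the closing `[/INST]` tag and that the end-of-sequence token is rendered as `</s>`. The helper name `extract_answer` and its parameters are hypothetical and not part of the model or of `transformers`.

```python
def extract_answer(decoded: str, prompt_close: str = "[/INST]", eos: str = "</s>") -> str:
    """Return the text after the last prompt-closing tag, without the EOS marker.

    Hypothetical helper: the default tag and EOS strings are assumptions
    about how the tokenizer renders the decoded output.
    """
    # rpartition splits on the LAST occurrence of the closing tag.
    _, sep, tail = decoded.rpartition(prompt_close)
    # If the tag is missing, fall back to the whole decoded string.
    answer = tail if sep else decoded
    return answer.replace(eos, "").strip()


# Example with a hand-written decoded string (not real model output).
print(extract_answer("<s> [INST]질문 [/INST] 답변입니다.</s>"))  # prints: 답변입니다.
```

It could be applied to the return value of `gen`, for example `extract_answer(gen('마감하자는 어떤 종류가 있나요?'))`.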