Riyuechang/Breeze-7B-PTT-Chat-v2

Text Generation

Riyuechang committed on Sep 16, 2024

Commit b71cdd2 • 1 Parent(s): 06329dc

Update README.md

Files changed (1)

README.md +42 -0

@@ -8,3 +8,45 @@ tags:
 - PTT
 - PTT_Chat
 ---
+# Introduction
+This model is a fine-tune of [MediaTek-Research/Breeze-7B-Instruct-v1_0](https://huggingface.co/MediaTek-Research/Breeze-7B-Instruct-v1_0).
+It was trained on data from the [Gossiping](https://www.ptt.cc/bbs/Gossiping/index.html) board of [PTT](https://www.ptt.cc/bbs/index.html).
+Several filtering methods were applied to the large raw corpus to select the portion that is (in theory) less noisy, and that portion was used as the training data.
+Training data: [Riyuechang/PTT-Corpus-100K_Gossiping-1400-39400](https://huggingface.co/datasets/Riyuechang/PTT-Corpus-100K_Gossiping-1400-39400)
+# Hardware
+- Ubuntu 22.04.4 LTS
+- NVIDIA GeForce RTX 3060 12G
+# LoRA parameters
+```python
+r=8,
+lora_alpha=32,
+lora_dropout=0.1,
+task_type="CAUSAL_LM",
+target_modules="all-linear",
+bias="none",
+use_dora=True,
+use_rslora=True
+```
+# Training parameters
+```python
+per_device_train_batch_size=28,
+gradient_accumulation_steps=1,
+num_train_epochs=3,
+warmup_ratio=0.1,
+learning_rate=2e-5,
+bf16=True,
+save_strategy="steps",
+save_steps=500,
+save_total_limit=10,
+logging_steps=10,
+output_dir=log_output,
+optim="paged_adamw_8bit",
+gradient_checkpointing=True
+```
+# Results
+- loss: 1.1035
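
The LoRA and training parameters listed in the README imply a few derived quantities that are easy to miss. The sketch below is not part of the commit; it only uses the standard library to work them out. With `use_rslora=True`, PEFT scales the adapter update by `lora_alpha / sqrt(r)` rather than `lora_alpha / r`. The step counts are a rough approximation of how a trainer turns batch size, epochs, and `warmup_ratio` into optimizer steps. Two values are assumptions and do not come from the README: a single GPU (`num_devices = 1`, suggested by the one RTX 3060 listed) and a hypothetical example count (`num_examples`), used purely for illustration.

```python
import math

# Values copied from the LoRA and training parameter listings in the README.
r = 8
lora_alpha = 32
per_device_train_batch_size = 28
gradient_accumulation_steps = 1
num_train_epochs = 3
warmup_ratio = 0.1

# Assumptions (NOT stated in the README): one GPU, and a hypothetical
# number of training examples chosen only for illustration.
num_devices = 1
num_examples = 100_000


def lora_scaling(alpha: float, rank: int, use_rslora: bool) -> float:
    """Scale applied to the low-rank update.

    Standard LoRA uses alpha / r; rank-stabilized LoRA uses alpha / sqrt(r).
    """
    return alpha / math.sqrt(rank) if use_rslora else alpha / rank


def schedule(n_examples: int, batch: int, accum: int, devices: int,
             epochs: int, warmup: float) -> tuple[int, int, int]:
    """Return (effective batch size, total optimizer steps, warmup steps)."""
    effective_batch = batch * accum * devices
    steps_per_epoch = math.ceil(n_examples / effective_batch)
    total_steps = steps_per_epoch * epochs
    warmup_steps = math.ceil(total_steps * warmup)
    return effective_batch, total_steps, warmup_steps


if __name__ == "__main__":
    print("standard LoRA scaling:", lora_scaling(lora_alpha, r, use_rslora=False))
    print("rsLoRA scaling:", round(lora_scaling(lora_alpha, r, use_rslora=True), 4))
    print("(effective batch, total steps, warmup steps):",
          schedule(num_examples, per_device_train_batch_size,
                   gradient_accumulation_steps, num_devices,
                   num_train_epochs, warmup_ratio))
```

Under these assumptions the rsLoRA scale is noticeably larger than the standard one for the same `r` and `lora_alpha`, which is worth keeping in mind when comparing this configuration against runs that leave `use_rslora` off. The step counts change with the real dataset size, so substitute the actual number of training examples before relying on them.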