Syed-Hasan-8503 committed
Commit c7248ad • 1 Parent(s): c919494
Update README.md
README.md CHANGED
@@ -34,8 +34,6 @@ dataset using QloRA. The model has been trained for 1 epoch on 1x A40 GPU. The e
 
 This experiment was performed using **[Transformer-heads library](https://github.com/center-for-humans-and-machines/transformer-heads/tree/main)**
 
-</details><br>
-
 ### Training Script
 
 The training script for attaching a new transformer head for classification task using QLoRA is following:
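The training script itself falls outside this hunk. As a rough sketch of the kind of setup the README describes, the snippet below wires a classification head onto a 4-bit quantized base model and adds LoRA adapters, using the generic transformers + peft + bitsandbytes stack rather than the transformer-heads library API; the base model name, dataset, and LoRA settings are illustrative assumptions.

```python
# Minimal, illustrative QLoRA setup for a sequence-classification head.
# NOTE: this is NOT the transformer-heads library API; it approximates the same
# idea with the generic transformers + peft + bitsandbytes stack. The model
# name, dataset, and LoRA settings below are assumptions for illustration only.
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    BitsAndBytesConfig,
)

base_model = "microsoft/phi-2"             # assumption: any decoder-only base model
raw = load_dataset("dair-ai/emotion")      # assumption: 6-label emotion dataset

tokenizer = AutoTokenizer.from_pretrained(base_model)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=256)

dataset = raw.map(tokenize, batched=True)

# 4-bit NF4 quantization of the frozen base model (the "Q" in QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForSequenceClassification.from_pretrained(
    base_model,
    num_labels=6,
    quantization_config=bnb_config,
)
model.config.pad_token_id = tokenizer.pad_token_id
model = prepare_model_for_kbit_training(model)

# LoRA adapters on the attention projections; the freshly initialized
# classification head is trained in full precision alongside the adapters.
lora_config = LoraConfig(
    task_type="SEQ_CLS",
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```

With the transformer-heads library, attaching the new head is handled by the library itself; the sketch above only approximates the same idea with a standard `AutoModelForSequenceClassification` head.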
@@ -53,6 +51,12 @@ For evaluating the transformer head that has been attached to the base model, yo
 
 The following hyperparameters were used during training:
 
+train_epochs = 1
+eval_epochs = 1
+logging_steps = 1
+train_batch_size = 4
+eval_batch_size = 4
+
 * output_dir="emotion_linear_probe",
 * learning_rate=0.00002,
 * num_train_epochs=train_epochs,
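Read as a plain configuration, the hyperparameters listed in the hunk above map onto `transformers.TrainingArguments` roughly as sketched below. This assumes the standard `Trainer` API and reuses the `model`, `tokenizer`, and tokenized `dataset` from the earlier sketch; the transformer-heads library may wrap training and evaluation differently.

```python
# Hedged sketch: wiring the README's listed hyperparameters into the standard
# transformers Trainer. Assumes `model`, `tokenizer`, and a tokenized `dataset`
# already exist (e.g. from the setup sketch further up).
from transformers import DataCollatorWithPadding, Trainer, TrainingArguments

# Values taken verbatim from the README diff above.
train_epochs = 1
eval_epochs = 1        # listed in the README; its exact role is not shown in this diff
logging_steps = 1
train_batch_size = 4
eval_batch_size = 4

training_args = TrainingArguments(
    output_dir="emotion_linear_probe",
    learning_rate=0.00002,
    num_train_epochs=train_epochs,
    per_device_train_batch_size=train_batch_size,
    per_device_eval_batch_size=eval_batch_size,
    logging_steps=logging_steps,
    eval_strategy="epoch",  # assumption; called evaluation_strategy in older transformers
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
    data_collator=DataCollatorWithPadding(tokenizer),
)
trainer.train()
metrics = trainer.evaluate()  # evaluates the attached classification head
print(metrics)
```

The single training epoch and small batch sizes are consistent with the README's note that the run fits on one A40 GPU.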