afrideva committed on
Commit
4179d7f
1 Parent(s): db4ad4d

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +97 -0
README.md ADDED
@@ -0,0 +1,97 @@
---
base_model: BEE-spoke-data/zephyr-220m-sft-full
datasets:
- HuggingFaceH4/ultrachat_200k
inference: false
license: apache-2.0
model-index:
- name: zephyr-220m-sft-full
  results: []
model_creator: BEE-spoke-data
model_name: zephyr-220m-sft-full
pipeline_tag: text-generation
quantized_by: afrideva
tags:
- generated_from_trainer
- gguf
- ggml
- quantized
- q2_k
- q3_k_m
- q4_k_m
- q5_k_m
- q6_k
- q8_0
---
# BEE-spoke-data/zephyr-220m-sft-full-GGUF

Quantized GGUF model files for [zephyr-220m-sft-full](https://huggingface.co/BEE-spoke-data/zephyr-220m-sft-full) from [BEE-spoke-data](https://huggingface.co/BEE-spoke-data)

| Name | Quant method | Size |
| ---- | ---- | ---- |
| [zephyr-220m-sft-full.fp16.gguf](https://huggingface.co/afrideva/zephyr-220m-sft-full-GGUF/resolve/main/zephyr-220m-sft-full.fp16.gguf) | fp16 | 436.50 MB |
| [zephyr-220m-sft-full.q2_k.gguf](https://huggingface.co/afrideva/zephyr-220m-sft-full-GGUF/resolve/main/zephyr-220m-sft-full.q2_k.gguf) | q2_k | 94.43 MB |
| [zephyr-220m-sft-full.q3_k_m.gguf](https://huggingface.co/afrideva/zephyr-220m-sft-full-GGUF/resolve/main/zephyr-220m-sft-full.q3_k_m.gguf) | q3_k_m | 114.65 MB |
| [zephyr-220m-sft-full.q4_k_m.gguf](https://huggingface.co/afrideva/zephyr-220m-sft-full-GGUF/resolve/main/zephyr-220m-sft-full.q4_k_m.gguf) | q4_k_m | 137.58 MB |
| [zephyr-220m-sft-full.q5_k_m.gguf](https://huggingface.co/afrideva/zephyr-220m-sft-full-GGUF/resolve/main/zephyr-220m-sft-full.q5_k_m.gguf) | q5_k_m | 157.91 MB |
| [zephyr-220m-sft-full.q6_k.gguf](https://huggingface.co/afrideva/zephyr-220m-sft-full-GGUF/resolve/main/zephyr-220m-sft-full.q6_k.gguf) | q6_k | 179.52 MB |
| [zephyr-220m-sft-full.q8_0.gguf](https://huggingface.co/afrideva/zephyr-220m-sft-full-GGUF/resolve/main/zephyr-220m-sft-full.q8_0.gguf) | q8_0 | 232.28 MB |
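
The following is a minimal usage sketch, not part of the original card: it downloads one of the files above and runs it with `llama-cpp-python` (install via `pip install huggingface_hub llama-cpp-python`). The repo id and filename come from the table; the Zephyr-style `<|user|>` / `<|assistant|>` prompt format is assumed from the base model, so verify it against the original model's chat template.

```python
# Sketch: fetch the q4_k_m quant from this repo and run it locally with llama-cpp-python.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# Download (and cache) one of the GGUF files listed in the table above.
model_path = hf_hub_download(
    repo_id="afrideva/zephyr-220m-sft-full-GGUF",
    filename="zephyr-220m-sft-full.q4_k_m.gguf",
)

# Load the model; n_ctx is an illustrative default, not a tuned value.
llm = Llama(model_path=model_path, n_ctx=2048)

# Assumed Zephyr-style prompt format; check the base model's tokenizer/chat template.
prompt = "<|user|>\nWhat is a GGUF file?</s>\n<|assistant|>\n"
output = llm(prompt, max_tokens=128, stop=["</s>"])
print(output["choices"][0]["text"])
```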

## Original Model Card:
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# zephyr-220m-sft-full

This model is a fine-tuned version of [BEE-spoke-data/smol_llama-220M-openhermes](https://huggingface.co/BEE-spoke-data/smol_llama-220M-openhermes) on the Ultrachat_200k dataset.
It achieves the following results on the evaluation set:
- Loss: 1.6579

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- distributed_type: multi-GPU
- num_devices: 2
- gradient_accumulation_steps: 4
- total_train_batch_size: 128
- total_eval_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- num_epochs: 1
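
Not part of the original card: as a rough reference, the sketch below maps the listed hyperparameters onto `transformers.TrainingArguments`. With 2 GPUs and 4 accumulation steps, the per-device batch size of 16 gives the listed totals (16 × 2 × 4 = 128 for training, 16 × 2 = 32 for evaluation). The `output_dir` and anything not listed above are assumptions.

```python
# Rough mapping of the card's hyperparameters onto transformers.TrainingArguments.
# Launch across 2 GPUs (e.g. `torchrun --nproc_per_node 2 train.py`) to reproduce the
# effective batch sizes: 16 * 2 * 4 = 128 (train), 16 * 2 = 32 (eval).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="zephyr-220m-sft-full",  # assumption: output path is not stated in the card
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=4,
    num_train_epochs=1,
    lr_scheduler_type="cosine",
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```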

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 1.6447        | 1.0   | 1624 | 1.6579          |


### Framework versions

- Transformers 4.37.0.dev0
- Pytorch 2.1.2+cu121
- Datasets 2.15.0
- Tokenizers 0.15.0

https://wandb.ai/amazingvince/huggingface/runs/5rffzk3x/workspace?workspace=user-amazingvince