artificialguybr committed on
Commit f991def
1 Parent(s): 8ec999c

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +77 -0
README.md ADDED

---
language:
- en
- de
- fr
- it
- pt
- hi
- es
- th
library_name: transformers
pipeline_tag: text-generation
license: llama3.2
base_model: NousResearch/Llama-3.2-1B
tags:
- generated_from_trainer
- facebook
- meta
- pytorch
- llama
- llama-3
model-index:
- name: llama3.2-1b-synthia-II
  results: []
---

# Llama 3.2 1B - Synthia-v1.5-II - Redmond - Fine-tuned Model

This model is a fine-tuned version of [NousResearch/Llama-3.2-1B](https://huggingface.co/NousResearch/Llama-3.2-1B) on the [Synthia-v1.5-II](https://huggingface.co/datasets/migtissera/Synthia-v1.5-II) dataset.

Thanks to [RedmondAI](https://redmond.ai) for all the GPU support!

## Model Description

The base model is Llama 3.2 1B, a multilingual large language model developed by Meta. This version has been fine-tuned on the Synthia-v1.5-II instruction dataset to improve its instruction-following capabilities.
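
A minimal loading and generation sketch with the `transformers` library (the `library_name` declared in the metadata) is shown below. The repository ID `artificialguybr/llama3.2-1b-synthia-II` is an assumption inferred from the model name above; adjust it if the actual repository path differs.

```python
# Minimal loading/generation sketch.
# NOTE: the repo ID is assumed from the model-index name; verify before use.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "artificialguybr/llama3.2-1b-synthia-II"  # assumed repository ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # fall back to float32 on hardware without bf16
    device_map="auto",
)

prompt = "Explain the difference between supervised and unsupervised learning."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```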

### Training Data

The model was fine-tuned on the [Synthia-v1.5-II](https://huggingface.co/datasets/migtissera/Synthia-v1.5-II) instruction dataset.

### Training Procedure

The model was trained with the following hyperparameters (a rough `transformers.TrainingArguments` equivalent is sketched after the list):
- Learning rate: 2e-05
- Train batch size: 1
- Eval batch size: 1
- Seed: 42
- Gradient accumulation steps: 8
- Total train batch size: 8
- Optimizer: Paged AdamW 8bit (betas=(0.9, 0.999), epsilon=1e-08)
- LR scheduler: Cosine with 100 warmup steps
- Number of epochs: 3
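
Training itself was run with Axolotl (see Training Infrastructure below), so the exact configuration lives in an Axolotl YAML file that is not reproduced here. As a rough reference only, the listed values translate to approximately the following `transformers.TrainingArguments`; this is an illustrative sketch, not the actual training setup.

```python
# Approximate TrainingArguments mirroring the hyperparameters listed above.
# This is NOT the Axolotl configuration actually used; it is a reference
# translation only. The output_dir is a placeholder.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama3.2-1b-synthia-II",  # placeholder
    learning_rate=2e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=8,  # effective train batch size of 8
    num_train_epochs=3,
    seed=42,
    optim="paged_adamw_8bit",       # requires the bitsandbytes package
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=100,
)
```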

### Framework Versions
- Transformers 4.46.1
- PyTorch 2.3.1+cu121
- Datasets 3.0.1
- Tokenizers 0.20.3

## Intended Use

This model is intended for the following use cases (a short generation sketch follows the list):
- Instruction-following tasks
- Conversational AI applications
- Research and development in natural language processing
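
As a quick illustration of the instruction-following use case, the sketch below uses the `text-generation` pipeline (matching the `pipeline_tag` in the metadata). The repository ID is assumed as above, and the prompt is a plain instruction string, since the prompt format used during fine-tuning is not documented in this card.

```python
# Text-generation pipeline sketch for instruction-style prompts.
# The repo ID is assumed; the plain-text prompt format is illustrative only.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="artificialguybr/llama3.2-1b-synthia-II",  # assumed repository ID
    device_map="auto",
)

result = generator(
    "Write a short summary of why gradient accumulation is useful.",
    max_new_tokens=128,
    do_sample=True,
    temperature=0.7,
)
print(result[0]["generated_text"])
```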

## Training Infrastructure

The model was trained using the Axolotl framework version 0.5.0.

## License

This model is subject to the Llama 3.2 Community License Agreement. Users must comply with all terms and conditions specified in the license.

[<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)