Tags: Text Generation · Transformers · Safetensors · English · doge · conversational · custom_code
JingzeShi committed (verified)
Commit 5f7d5d3 · Parent: 5ad110a

Update README.md

Files changed (1): README.md (+8 −1)
README.md CHANGED
```diff
@@ -34,11 +34,13 @@ In addition, Doge uses Dynamic Mask Attention as sequence transformation and can
 
 ## Model Details
 
+We build Doge by pre-training on [Smollm-Corpus](https://huggingface.co/datasets/HuggingFaceTB/smollm-corpus).
+
 > NOTE: These models have not been fine-tuned for instruction; the instruction model is [here](https://huggingface.co/JingzeShi/Doge-20M-Instruct).
 
 > TODO: The larger model is under training and will be uploaded soon.
 
-**Training**:
+**Pre-Training**:
 
 | Model | Training Data | Steps | Content Length | Tokens | LR | Batch Size | Precision |
 |---|---|---|---|---|---|---|---|
@@ -55,6 +57,11 @@ In addition, Doge uses Dynamic Mask Attention as sequence transformation and can
 > All evaluations are done using five-shot settings, without additional training on the benchmarks.
 
 
+**Procedure**:
+
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/loser_cheems/huggingface/runs/p8x93v5l)
+
+
 **Environment**:
 
 - Image: nvcr.io/nvidia/pytorch:24.12-py3
```
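Since the card tags the repo `custom_code` and points to a separate instruct checkpoint, a minimal sketch of loading the pretrained base model with `transformers` might look like the following. The repo id `JingzeShi/Doge-20M` is an assumption inferred from the instruct model's name, and the snippet requires network access to the Hugging Face Hub; it is not the author's documented usage.

```python
# Hypothetical usage sketch for the pretrained (non-instruct) checkpoint.
# Assumptions: repo id "JingzeShi/Doge-20M" (inferred from the instruct
# model's name) and that trust_remote_code=True pulls in the repo's
# custom model class, as the "custom_code" tag suggests.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("JingzeShi/Doge-20M", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("JingzeShi/Doge-20M", trust_remote_code=True)

# Plain continuation, not chat: the base model is not instruction-tuned.
inputs = tokenizer("The meaning of life is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

For chat-style prompts, the card directs users to the fine-tuned `JingzeShi/Doge-20M-Instruct` checkpoint instead.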