LLM360/Crystal

@@ -89,9 +89,15 @@ Our training has 3 stages:
 For details of the training dataset for each stage, please refer to the Dataset section and our CrystalCoder Data Card.
 For hyperparameters used in each stage, please refer to the following table:
-<|TABLE_NEEDED|>
-For more details of training, please refer to our future paper and blog.
+| **Hyperparameter** | **Phase 1** | **Phase 2** | **Phase 3** |
+| --- | --- | --- | --- |
+| LR Warmup Steps | 86 | 86 | 176 |
+| LR Start Value | 0.012 | 0.0087825 | 0.002 |
+| LR Final Value | 0.00012408 | 0.00013679 | 0.0002 |
+| LR Decay | Linear | Linear | Linear |
+For more details of training, please refer to [our paper](https://arxiv.org/pdf/2312.06550.pdf).
 # Dataset
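
To make the hyperparameter table added in this hunk concrete, here is a minimal sketch of how a linear-warmup, linear-decay schedule could be computed from its values. It rests on assumptions the table does not confirm: that "LR Start Value" is the rate reached at the end of warmup, that warmup ramps linearly from zero, and that decay runs to a `total_steps` value that is purely hypothetical here (the table gives no step counts per phase). The function name is illustrative and is not taken from the Crystal training code.

```python
def linear_warmup_linear_decay(step, warmup_steps, start_lr, final_lr, total_steps):
    """Return the learning rate at a 0-indexed training step.

    Assumed behavior (not confirmed by the source):
      * steps [0, warmup_steps): ramp linearly from 0 up to start_lr
      * steps [warmup_steps, total_steps]: decay linearly from start_lr to final_lr
    """
    if total_steps <= warmup_steps:
        raise ValueError("total_steps must exceed warmup_steps")
    if step < warmup_steps:
        # Linear warmup; the last warmup step reaches start_lr exactly.
        return start_lr * (step + 1) / warmup_steps
    # Fraction of the decay window completed, clamped to 1.0 past the end.
    progress = min((step - warmup_steps) / (total_steps - warmup_steps), 1.0)
    return start_lr + (final_lr - start_lr) * progress


# Phase 1 values from the table; total_steps=1000 is a placeholder, not a real figure.
PHASE_1 = dict(warmup_steps=86, start_lr=0.012, final_lr=0.00012408, total_steps=1000)

if __name__ == "__main__":
    for s in (0, 85, 86, 500, 1000):
        print(s, linear_warmup_linear_decay(s, **PHASE_1))
```

If the actual trainer defines warmup or the decay endpoint differently, only the two branches of the function need to change; the table values plug in unchanged.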