IvanHU committed
Commit 4cd27bd · verified · 1 Parent(s): 0bbde8b

Update README.md

Files changed (1): README.md +3 -3
README.md CHANGED
@@ -35,14 +35,14 @@ This version includes the optimizer, allowing you to resume training using the H
 
 ## Continual Training Tutorial
 
-### Step 1: Modify the `config.json`
+### Step 1: Modify the `trainer_state.json`
 
-Due to the implementation of Hugging Face Trainer, certain parameters are stored in the `config.json` file and cannot be modified through the Trainer's command-line arguments. Therefore, you need to update these parameters in the `config.json` file first, particularly:
+Due to the implementation of Hugging Face Trainer, certain parameters are stored in the `trainer_state.json` file and cannot be modified through the Trainer's command-line arguments. Therefore, you need to update these parameters in the `trainer_state.json` file first, particularly:
 
 - **`save_steps`**: The frequency of saving intermediate checkpoints.
 - **`train_batch_size`**: The batch size per GPU (equivalent to `per_device_train_batch_size` in the Trainer). We used a batch size of 1008 (approximately 4M tokens) during the stable training stage. Maintaining this same batch size is equally important for training effectiveness.
 
-Below is an example of a properly configured `config.json` file:
+Below is an example of a properly configured `trainer_state.json` file:
 
 ```json
 {
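
For orientation, here is a minimal sketch of how the relevant fields in a checkpoint's `trainer_state.json` might look after this edit. The `global_step`, `max_steps`, `save_steps`, and `logging_steps` values are illustrative placeholders, and the other fields the Trainer writes to this file (such as `log_history`) are omitted; only `train_batch_size: 1008` reflects the batch size discussed in the diff above.

```json
{
  "global_step": 20000,
  "max_steps": 100000,
  "save_steps": 1000,
  "train_batch_size": 1008,
  "logging_steps": 10
}
```

When training is resumed via `trainer.train(resume_from_checkpoint=...)`, the Trainer restores its state from this file in the checkpoint directory, which is why these values must be edited here rather than passed on the command line.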