derek-thomas committed
Commit 5b52ebb · 1 Parent(s): dcfeee1

Updating for falcon

Files changed (2):
  1. 01-poe-dataset-creation.ipynb +1 -3
  2. 02-autotrain.ipynb +171 -35
01-poe-dataset-creation.ipynb CHANGED
@@ -110,10 +110,8 @@
  "\n",
  "In each of these scenarios I will build prompts with structured generation to fine-tune with. I noticed some difficulty in a first pass with getting consistent response formats, but that's out of scope, so structured generation can help a lot here.\n",
  "\n",
- "Datasets won't store complex structures like lists of dicts of different types (needed for structured generation), so it's easiest if I tokenize. I'll be using Mistral, so I'll skip the system prompt. It's simple enough to come back and change this for a different model in this notebook.\n",
- "\n",
  "## Implementation\n",
- "To explore this goal, we will start with [layoric/labeled-multiple-choice-explained](https://huggingface.co/datasets/layoric/labeled-multiple-choice-explained) as our dataset. It has explanations already provided by GPT-3.5-turbo. Given that these explanations are a bit different from what Mistral would produce, it might be useful to generate some from Mistral as well. Based on [this notebook](./poe-generate-mistral-reasoning.ipynb) we have been able to generate Mistral reasoning in the refined dataset [derek-thomas/labeled-multiple-choice-explained-mistral-reasoning](https://huggingface.co/datasets/derek-thomas/labeled-multiple-choice-explained-mistral-reasoning).\n",
+ "To explore this goal, we will start with [layoric/labeled-multiple-choice-explained](https://huggingface.co/datasets/layoric/labeled-multiple-choice-explained) as our dataset. It has explanations already provided by GPT-3.5-turbo. Given that these explanations are a bit different from what Falcon would produce, it might be useful to generate some from Falcon as well. Based on [this notebook](./poe-generate-falcon-reasoning.ipynb) we have been able to generate Falcon reasoning in the refined dataset [derek-thomas/labeled-multiple-choice-explained-falcon-reasoning](https://huggingface.co/datasets/derek-thomas/labeled-multiple-choice-explained-falcon-reasoning).\n",
  "\n",
  "In this notebook we will format our data such that we can try each experiment and then we will push it to my repo: [derek-thomas/labeled-multiple-choice-explained](https://huggingface.co/datasets/derek-thomas/labeled-multiple-choice-explained)."
  ]
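
The note removed in this hunk explained why the training data is pre-rendered to plain text: Datasets columns won't store lists of dicts of mixed types, so each conversation is flattened with the model's chat template before being pushed. A minimal sketch of that flattening step, assuming a hypothetical `conversation` column of chat messages (the commit's actual helper is not shown in this diff):

```python
# Sketch only: flatten a chat-message column to a plain-text column with the
# model's chat template before pushing to the Hub. Column names are assumed.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("tiiuae/Falcon3-7B-Instruct")

def to_text(example):
    # example["conversation"] is assumed to be a list of chat messages:
    # [{"role": "user", "content": "..."}, {"role": "assistant", "content": "..."}]
    example["text"] = tokenizer.apply_chat_template(
        example["conversation"], tokenize=False, add_generation_prompt=False
    )
    return example

# dataset = dataset.map(to_text)  # then dataset.push_to_hub(...)
```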
02-autotrain.ipynb CHANGED
@@ -50,7 +50,7 @@
  {
   "data": {
    "application/vnd.jupyter.widget-view+json": {
-    "model_id": "b5441f4018234a25a299775d77f880b3",
+    "model_id": "928f44f483504b438e0fdbd4df3d7dd5",
     "version_major": 2,
     "version_minor": 0
    },
@@ -63,7 +63,7 @@
   }
  ],
  "source": [
- "from huggingface_hub import login, get_token\n",
+ "from huggingface_hub import login, get_token, whoami\n",
  "login()"
  ]
 },
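
The added `whoami` import is used later in the run loop to build Space and model repo ids from the logged-in user. For reference, only the `name` field of its return value is relied on:

```python
from huggingface_hub import whoami

user = whoami()        # requires a prior login() or an HF_TOKEN in the environment
print(user["name"])    # e.g. "derek-thomas"; prefixed onto each project_name below
```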
@@ -96,15 +96,15 @@
  "# Base config\n",
  "config_template = {\n",
  "    \"task\": \"llm-sft\",\n",
- "    \"base_model\": \"mistralai/Mistral-7B-Instruct-v0.3\",\n",
+ "    \"base_model\": \"tiiuae/Falcon3-7B-Instruct\",\n",
  "    \"project_name\": \"\",\n",
  "    \"log\": \"tensorboard\",\n",
  "    \"backend\": \"spaces-l4x1\",\n",
  "    \"data\": {\n",
- "        \"path\": \"derek-thomas/labeled-multiple-choice-explained-mistral-tokenized\",\n",
+ "        \"path\": \"derek-thomas/labeled-multiple-choice-explained-falcon-tokenized\",\n",
  "        \"train_split\": \"train\",\n",
  "        \"valid_split\": None,\n",
- "        \"chat_template\": \"none\",\n",
+ "        \"chat_template\": \"tokenizer\",\n",
  "        \"column_mapping\": {\n",
  "            \"text_column\": \"\"\n",
  "        },\n",
@@ -112,9 +112,9 @@
  "    \"params\": {\n",
  "        \"block_size\": 512,\n",
  "        \"model_max_length\": 1500,\n",
- "        \"epochs\": 2,\n",
+ "        \"epochs\": 4,\n",
  "        \"batch_size\": 1,\n",
- "        \"lr\": 3e-5,\n",
+ "        \"lr\": 3e-7,\n",
  "        \"peft\": True,\n",
  "        \"quantization\": \"int4\",\n",
  "        \"target_modules\": \"all-linear\",\n",
@@ -191,40 +191,67 @@
  "output_type": "stream",
  "text": [
  "Running autotrain with config: ./autotrain_configs/conversation_RFA_gpt3_5.yml\n",
- "INFO | 2025-01-08 10:20:38 | autotrain.cli.autotrain:main:58 - Using AutoTrain configuration: ./autotrain_configs/conversation_RFA_gpt3_5.yml\n",
- "INFO | 2025-01-08 10:20:38 | autotrain.parser:__post_init__:165 - Running task: lm_training\n",
- "INFO | 2025-01-08 10:20:38 | autotrain.parser:__post_init__:166 - Using backend: spaces-l4x1\n",
- "INFO | 2025-01-08 10:20:38 | autotrain.parser:run:224 - {'model': 'mistralai/Mistral-7B-Instruct-v0.3', 'project_name': 'falcon-v03-poe-RFA-gpt3-5', 'data_path': 'derek-thomas/labeled-multiple-choice-explained-mistral-tokenized', 'train_split': 'train', 'valid_split': None, 'add_eos_token': True, 'block_size': 512, 'model_max_length': 1500, 'padding': 'right', 'trainer': 'sft', 'use_flash_attention_2': False, 'log': 'tensorboard', 'disable_gradient_checkpointing': False, 'logging_steps': -1, 'eval_strategy': 'epoch', 'save_total_limit': 1, 'auto_find_batch_size': False, 'mixed_precision': 'bf16', 'lr': 3e-05, 'epochs': 2, 'batch_size': 1, 'warmup_ratio': 0.1, 'gradient_accumulation': 8, 'optimizer': 'adamw_torch', 'scheduler': 'linear', 'weight_decay': 0.0, 'max_grad_norm': 1.0, 'seed': 42, 'chat_template': 'none', 'quantization': 'int4', 'target_modules': 'all-linear', 'merge_adapter': False, 'peft': True, 'lora_r': 16, 'lora_alpha': 32, 'lora_dropout': 0.05, 'model_ref': None, 'dpo_beta': 0.1, 'max_prompt_length': 128, 'max_completion_length': None, 'prompt_text_column': None, 'text_column': 'conversation_RFA_gpt3_5', 'rejected_text_column': None, 'push_to_hub': True, 'username': 'derek-thomas', 'token': '*****', 'unsloth': False, 'distributed_backend': None}\n",
- "INFO | 2025-01-08 10:20:43 | autotrain.parser:run:229 - Job ID: derek-thomas/autotrain-falcon-v03-poe-RFA-gpt3-5\n",
+ "INFO | 2025-01-08 14:33:16 | autotrain.cli.autotrain:main:58 - Using AutoTrain configuration: ./autotrain_configs/conversation_RFA_gpt3_5.yml\n",
+ "INFO | 2025-01-08 14:33:16 | autotrain.parser:__post_init__:165 - Running task: lm_training\n",
+ "INFO | 2025-01-08 14:33:16 | autotrain.parser:__post_init__:166 - Using backend: spaces-l4x1\n",
+ "INFO | 2025-01-08 14:33:16 | autotrain.parser:run:224 - {'model': 'tiiuae/Falcon3-7B-Instruct', 'project_name': 'falcon-v03-poe-RFA-gpt3-5', 'data_path': 'derek-thomas/labeled-multiple-choice-explained-falcon-tokenized', 'train_split': 'train', 'valid_split': None, 'add_eos_token': True, 'block_size': 512, 'model_max_length': 1500, 'padding': 'right', 'trainer': 'sft', 'use_flash_attention_2': False, 'log': 'tensorboard', 'disable_gradient_checkpointing': False, 'logging_steps': -1, 'eval_strategy': 'epoch', 'save_total_limit': 1, 'auto_find_batch_size': False, 'mixed_precision': 'bf16', 'lr': 3e-07, 'epochs': 4, 'batch_size': 1, 'warmup_ratio': 0.1, 'gradient_accumulation': 8, 'optimizer': 'adamw_torch', 'scheduler': 'linear', 'weight_decay': 0.0, 'max_grad_norm': 1.0, 'seed': 42, 'chat_template': 'tokenizer', 'quantization': 'int4', 'target_modules': 'all-linear', 'merge_adapter': False, 'peft': True, 'lora_r': 16, 'lora_alpha': 32, 'lora_dropout': 0.05, 'model_ref': None, 'dpo_beta': 0.1, 'max_prompt_length': 128, 'max_completion_length': None, 'prompt_text_column': None, 'text_column': 'conversation_RFA_gpt3_5', 'rejected_text_column': None, 'push_to_hub': True, 'username': 'derek-thomas', 'token': '*****', 'unsloth': False, 'distributed_backend': None}\n",
+ "INFO | 2025-01-08 14:33:23 | autotrain.parser:run:229 - Job ID: derek-thomas/autotrain-falcon-v03-poe-RFA-gpt3-5\n",
+ "\n",
+ "---\n",
+ "https://huggingface.co/spaces/derek-thomas/autotrain-falcon-v03-poe-RFA-gpt3-5\n",
+ "---\n",
+ "\n",
  "Running autotrain with config: ./autotrain_configs/conversation_RFA_falcon.yml\n",
- "INFO | 2025-01-08 10:20:46 | autotrain.cli.autotrain:main:58 - Using AutoTrain configuration: ./autotrain_configs/conversation_RFA_falcon.yml\n",
- "INFO | 2025-01-08 10:20:46 | autotrain.parser:__post_init__:165 - Running task: lm_training\n",
- "INFO | 2025-01-08 10:20:46 | autotrain.parser:__post_init__:166 - Using backend: spaces-l4x1\n",
- "INFO | 2025-01-08 10:20:46 | autotrain.parser:run:224 - {'model': 'mistralai/Mistral-7B-Instruct-v0.3', 'project_name': 'falcon-v03-poe-RFA-falcon', 'data_path': 'derek-thomas/labeled-multiple-choice-explained-mistral-tokenized', 'train_split': 'train', 'valid_split': None, 'add_eos_token': True, 'block_size': 512, 'model_max_length': 1500, 'padding': 'right', 'trainer': 'sft', 'use_flash_attention_2': False, 'log': 'tensorboard', 'disable_gradient_checkpointing': False, 'logging_steps': -1, 'eval_strategy': 'epoch', 'save_total_limit': 1, 'auto_find_batch_size': False, 'mixed_precision': 'bf16', 'lr': 3e-05, 'epochs': 2, 'batch_size': 1, 'warmup_ratio': 0.1, 'gradient_accumulation': 8, 'optimizer': 'adamw_torch', 'scheduler': 'linear', 'weight_decay': 0.0, 'max_grad_norm': 1.0, 'seed': 42, 'chat_template': 'none', 'quantization': 'int4', 'target_modules': 'all-linear', 'merge_adapter': False, 'peft': True, 'lora_r': 16, 'lora_alpha': 32, 'lora_dropout': 0.05, 'model_ref': None, 'dpo_beta': 0.1, 'max_prompt_length': 128, 'max_completion_length': None, 'prompt_text_column': None, 'text_column': 'conversation_RFA_falcon', 'rejected_text_column': None, 'push_to_hub': True, 'username': 'derek-thomas', 'token': '*****', 'unsloth': False, 'distributed_backend': None}\n",
- "INFO | 2025-01-08 10:20:53 | autotrain.parser:run:229 - Job ID: derek-thomas/autotrain-falcon-v03-poe-RFA-falcon\n",
+ "INFO | 2025-01-08 14:33:26 | autotrain.cli.autotrain:main:58 - Using AutoTrain configuration: ./autotrain_configs/conversation_RFA_falcon.yml\n",
+ "INFO | 2025-01-08 14:33:26 | autotrain.parser:__post_init__:165 - Running task: lm_training\n",
+ "INFO | 2025-01-08 14:33:26 | autotrain.parser:__post_init__:166 - Using backend: spaces-l4x1\n",
+ "INFO | 2025-01-08 14:33:26 | autotrain.parser:run:224 - {'model': 'tiiuae/Falcon3-7B-Instruct', 'project_name': 'falcon-v03-poe-RFA-falcon', 'data_path': 'derek-thomas/labeled-multiple-choice-explained-falcon-tokenized', 'train_split': 'train', 'valid_split': None, 'add_eos_token': True, 'block_size': 512, 'model_max_length': 1500, 'padding': 'right', 'trainer': 'sft', 'use_flash_attention_2': False, 'log': 'tensorboard', 'disable_gradient_checkpointing': False, 'logging_steps': -1, 'eval_strategy': 'epoch', 'save_total_limit': 1, 'auto_find_batch_size': False, 'mixed_precision': 'bf16', 'lr': 3e-07, 'epochs': 4, 'batch_size': 1, 'warmup_ratio': 0.1, 'gradient_accumulation': 8, 'optimizer': 'adamw_torch', 'scheduler': 'linear', 'weight_decay': 0.0, 'max_grad_norm': 1.0, 'seed': 42, 'chat_template': 'tokenizer', 'quantization': 'int4', 'target_modules': 'all-linear', 'merge_adapter': False, 'peft': True, 'lora_r': 16, 'lora_alpha': 32, 'lora_dropout': 0.05, 'model_ref': None, 'dpo_beta': 0.1, 'max_prompt_length': 128, 'max_completion_length': None, 'prompt_text_column': None, 'text_column': 'conversation_RFA_falcon', 'rejected_text_column': None, 'push_to_hub': True, 'username': 'derek-thomas', 'token': '*****', 'unsloth': False, 'distributed_backend': None}\n",
+ "INFO | 2025-01-08 14:33:32 | autotrain.parser:run:229 - Job ID: derek-thomas/autotrain-falcon-v03-poe-RFA-falcon\n",
+ "\n",
+ "---\n",
+ "https://huggingface.co/spaces/derek-thomas/autotrain-falcon-v03-poe-RFA-falcon\n",
+ "---\n",
+ "\n",
  "Running autotrain with config: ./autotrain_configs/conversation_FAR_gpt3_5.yml\n",
- "INFO | 2025-01-08 10:20:56 | autotrain.cli.autotrain:main:58 - Using AutoTrain configuration: ./autotrain_configs/conversation_FAR_gpt3_5.yml\n",
- "INFO | 2025-01-08 10:20:56 | autotrain.parser:__post_init__:165 - Running task: lm_training\n",
- "INFO | 2025-01-08 10:20:56 | autotrain.parser:__post_init__:166 - Using backend: spaces-l4x1\n",
- "INFO | 2025-01-08 10:20:56 | autotrain.parser:run:224 - {'model': 'mistralai/Mistral-7B-Instruct-v0.3', 'project_name': 'falcon-v03-poe-FAR-gpt3-5', 'data_path': 'derek-thomas/labeled-multiple-choice-explained-mistral-tokenized', 'train_split': 'train', 'valid_split': None, 'add_eos_token': True, 'block_size': 512, 'model_max_length': 1500, 'padding': 'right', 'trainer': 'sft', 'use_flash_attention_2': False, 'log': 'tensorboard', 'disable_gradient_checkpointing': False, 'logging_steps': -1, 'eval_strategy': 'epoch', 'save_total_limit': 1, 'auto_find_batch_size': False, 'mixed_precision': 'bf16', 'lr': 3e-05, 'epochs': 2, 'batch_size': 1, 'warmup_ratio': 0.1, 'gradient_accumulation': 8, 'optimizer': 'adamw_torch', 'scheduler': 'linear', 'weight_decay': 0.0, 'max_grad_norm': 1.0, 'seed': 42, 'chat_template': 'none', 'quantization': 'int4', 'target_modules': 'all-linear', 'merge_adapter': False, 'peft': True, 'lora_r': 16, 'lora_alpha': 32, 'lora_dropout': 0.05, 'model_ref': None, 'dpo_beta': 0.1, 'max_prompt_length': 128, 'max_completion_length': None, 'prompt_text_column': None, 'text_column': 'conversation_FAR_gpt3_5', 'rejected_text_column': None, 'push_to_hub': True, 'username': 'derek-thomas', 'token': '*****', 'unsloth': False, 'distributed_backend': None}\n",
- "INFO | 2025-01-08 10:21:02 | autotrain.parser:run:229 - Job ID: derek-thomas/autotrain-falcon-v03-poe-FAR-gpt3-5\n",
+ "INFO | 2025-01-08 14:33:36 | autotrain.cli.autotrain:main:58 - Using AutoTrain configuration: ./autotrain_configs/conversation_FAR_gpt3_5.yml\n",
+ "INFO | 2025-01-08 14:33:36 | autotrain.parser:__post_init__:165 - Running task: lm_training\n",
+ "INFO | 2025-01-08 14:33:36 | autotrain.parser:__post_init__:166 - Using backend: spaces-l4x1\n",
+ "INFO | 2025-01-08 14:33:36 | autotrain.parser:run:224 - {'model': 'tiiuae/Falcon3-7B-Instruct', 'project_name': 'falcon-v03-poe-FAR-gpt3-5', 'data_path': 'derek-thomas/labeled-multiple-choice-explained-falcon-tokenized', 'train_split': 'train', 'valid_split': None, 'add_eos_token': True, 'block_size': 512, 'model_max_length': 1500, 'padding': 'right', 'trainer': 'sft', 'use_flash_attention_2': False, 'log': 'tensorboard', 'disable_gradient_checkpointing': False, 'logging_steps': -1, 'eval_strategy': 'epoch', 'save_total_limit': 1, 'auto_find_batch_size': False, 'mixed_precision': 'bf16', 'lr': 3e-07, 'epochs': 4, 'batch_size': 1, 'warmup_ratio': 0.1, 'gradient_accumulation': 8, 'optimizer': 'adamw_torch', 'scheduler': 'linear', 'weight_decay': 0.0, 'max_grad_norm': 1.0, 'seed': 42, 'chat_template': 'tokenizer', 'quantization': 'int4', 'target_modules': 'all-linear', 'merge_adapter': False, 'peft': True, 'lora_r': 16, 'lora_alpha': 32, 'lora_dropout': 0.05, 'model_ref': None, 'dpo_beta': 0.1, 'max_prompt_length': 128, 'max_completion_length': None, 'prompt_text_column': None, 'text_column': 'conversation_FAR_gpt3_5', 'rejected_text_column': None, 'push_to_hub': True, 'username': 'derek-thomas', 'token': '*****', 'unsloth': False, 'distributed_backend': None}\n",
+ "INFO | 2025-01-08 14:33:41 | autotrain.parser:run:229 - Job ID: derek-thomas/autotrain-falcon-v03-poe-FAR-gpt3-5\n",
+ "\n",
+ "---\n",
+ "https://huggingface.co/spaces/derek-thomas/autotrain-falcon-v03-poe-FAR-gpt3-5\n",
+ "---\n",
+ "\n",
  "Running autotrain with config: ./autotrain_configs/conversation_FAR_falcon.yml\n",
- "INFO | 2025-01-08 10:21:05 | autotrain.cli.autotrain:main:58 - Using AutoTrain configuration: ./autotrain_configs/conversation_FAR_falcon.yml\n",
- "INFO | 2025-01-08 10:21:05 | autotrain.parser:__post_init__:165 - Running task: lm_training\n",
- "INFO | 2025-01-08 10:21:05 | autotrain.parser:__post_init__:166 - Using backend: spaces-l4x1\n",
- "INFO | 2025-01-08 10:21:05 | autotrain.parser:run:224 - {'model': 'mistralai/Mistral-7B-Instruct-v0.3', 'project_name': 'falcon-v03-poe-FAR-falcon', 'data_path': 'derek-thomas/labeled-multiple-choice-explained-mistral-tokenized', 'train_split': 'train', 'valid_split': None, 'add_eos_token': True, 'block_size': 512, 'model_max_length': 1500, 'padding': 'right', 'trainer': 'sft', 'use_flash_attention_2': False, 'log': 'tensorboard', 'disable_gradient_checkpointing': False, 'logging_steps': -1, 'eval_strategy': 'epoch', 'save_total_limit': 1, 'auto_find_batch_size': False, 'mixed_precision': 'bf16', 'lr': 3e-05, 'epochs': 2, 'batch_size': 1, 'warmup_ratio': 0.1, 'gradient_accumulation': 8, 'optimizer': 'adamw_torch', 'scheduler': 'linear', 'weight_decay': 0.0, 'max_grad_norm': 1.0, 'seed': 42, 'chat_template': 'none', 'quantization': 'int4', 'target_modules': 'all-linear', 'merge_adapter': False, 'peft': True, 'lora_r': 16, 'lora_alpha': 32, 'lora_dropout': 0.05, 'model_ref': None, 'dpo_beta': 0.1, 'max_prompt_length': 128, 'max_completion_length': None, 'prompt_text_column': None, 'text_column': 'conversation_FAR_falcon', 'rejected_text_column': None, 'push_to_hub': True, 'username': 'derek-thomas', 'token': '*****', 'unsloth': False, 'distributed_backend': None}\n",
- "INFO | 2025-01-08 10:21:12 | autotrain.parser:run:229 - Job ID: derek-thomas/autotrain-falcon-v03-poe-FAR-falcon\n",
+ "INFO | 2025-01-08 14:33:45 | autotrain.cli.autotrain:main:58 - Using AutoTrain configuration: ./autotrain_configs/conversation_FAR_falcon.yml\n",
+ "INFO | 2025-01-08 14:33:45 | autotrain.parser:__post_init__:165 - Running task: lm_training\n",
+ "INFO | 2025-01-08 14:33:45 | autotrain.parser:__post_init__:166 - Using backend: spaces-l4x1\n",
+ "INFO | 2025-01-08 14:33:45 | autotrain.parser:run:224 - {'model': 'tiiuae/Falcon3-7B-Instruct', 'project_name': 'falcon-v03-poe-FAR-falcon', 'data_path': 'derek-thomas/labeled-multiple-choice-explained-falcon-tokenized', 'train_split': 'train', 'valid_split': None, 'add_eos_token': True, 'block_size': 512, 'model_max_length': 1500, 'padding': 'right', 'trainer': 'sft', 'use_flash_attention_2': False, 'log': 'tensorboard', 'disable_gradient_checkpointing': False, 'logging_steps': -1, 'eval_strategy': 'epoch', 'save_total_limit': 1, 'auto_find_batch_size': False, 'mixed_precision': 'bf16', 'lr': 3e-07, 'epochs': 4, 'batch_size': 1, 'warmup_ratio': 0.1, 'gradient_accumulation': 8, 'optimizer': 'adamw_torch', 'scheduler': 'linear', 'weight_decay': 0.0, 'max_grad_norm': 1.0, 'seed': 42, 'chat_template': 'tokenizer', 'quantization': 'int4', 'target_modules': 'all-linear', 'merge_adapter': False, 'peft': True, 'lora_r': 16, 'lora_alpha': 32, 'lora_dropout': 0.05, 'model_ref': None, 'dpo_beta': 0.1, 'max_prompt_length': 128, 'max_completion_length': None, 'prompt_text_column': None, 'text_column': 'conversation_FAR_falcon', 'rejected_text_column': None, 'push_to_hub': True, 'username': 'derek-thomas', 'token': '*****', 'unsloth': False, 'distributed_backend': None}\n",
+ "INFO | 2025-01-08 14:33:51 | autotrain.parser:run:229 - Job ID: derek-thomas/autotrain-falcon-v03-poe-FAR-falcon\n",
+ "\n",
+ "---\n",
+ "https://huggingface.co/spaces/derek-thomas/autotrain-falcon-v03-poe-FAR-falcon\n",
+ "---\n",
+ "\n",
  "Running autotrain with config: ./autotrain_configs/conversation_FA.yml\n",
- "INFO | 2025-01-08 10:21:15 | autotrain.cli.autotrain:main:58 - Using AutoTrain configuration: ./autotrain_configs/conversation_FA.yml\n",
- "INFO | 2025-01-08 10:21:15 | autotrain.parser:__post_init__:165 - Running task: lm_training\n",
- "INFO | 2025-01-08 10:21:15 | autotrain.parser:__post_init__:166 - Using backend: spaces-l4x1\n",
- "INFO | 2025-01-08 10:21:15 | autotrain.parser:run:224 - {'model': 'mistralai/Mistral-7B-Instruct-v0.3', 'project_name': 'falcon-v03-poe-FA', 'data_path': 'derek-thomas/labeled-multiple-choice-explained-mistral-tokenized', 'train_split': 'train', 'valid_split': None, 'add_eos_token': True, 'block_size': 512, 'model_max_length': 1500, 'padding': 'right', 'trainer': 'sft', 'use_flash_attention_2': False, 'log': 'tensorboard', 'disable_gradient_checkpointing': False, 'logging_steps': -1, 'eval_strategy': 'epoch', 'save_total_limit': 1, 'auto_find_batch_size': False, 'mixed_precision': 'bf16', 'lr': 3e-05, 'epochs': 2, 'batch_size': 1, 'warmup_ratio': 0.1, 'gradient_accumulation': 8, 'optimizer': 'adamw_torch', 'scheduler': 'linear', 'weight_decay': 0.0, 'max_grad_norm': 1.0, 'seed': 42, 'chat_template': 'none', 'quantization': 'int4', 'target_modules': 'all-linear', 'merge_adapter': False, 'peft': True, 'lora_r': 16, 'lora_alpha': 32, 'lora_dropout': 0.05, 'model_ref': None, 'dpo_beta': 0.1, 'max_prompt_length': 128, 'max_completion_length': None, 'prompt_text_column': None, 'text_column': 'conversation_FA', 'rejected_text_column': None, 'push_to_hub': True, 'username': 'derek-thomas', 'token': '*****', 'unsloth': False, 'distributed_backend': None}\n",
- "INFO | 2025-01-08 10:21:22 | autotrain.parser:run:229 - Job ID: derek-thomas/autotrain-falcon-v03-poe-FA\n"
+ "INFO | 2025-01-08 14:33:54 | autotrain.cli.autotrain:main:58 - Using AutoTrain configuration: ./autotrain_configs/conversation_FA.yml\n",
+ "INFO | 2025-01-08 14:33:54 | autotrain.parser:__post_init__:165 - Running task: lm_training\n",
+ "INFO | 2025-01-08 14:33:54 | autotrain.parser:__post_init__:166 - Using backend: spaces-l4x1\n",
+ "INFO | 2025-01-08 14:33:54 | autotrain.parser:run:224 - {'model': 'tiiuae/Falcon3-7B-Instruct', 'project_name': 'falcon-v03-poe-FA', 'data_path': 'derek-thomas/labeled-multiple-choice-explained-falcon-tokenized', 'train_split': 'train', 'valid_split': None, 'add_eos_token': True, 'block_size': 512, 'model_max_length': 1500, 'padding': 'right', 'trainer': 'sft', 'use_flash_attention_2': False, 'log': 'tensorboard', 'disable_gradient_checkpointing': False, 'logging_steps': -1, 'eval_strategy': 'epoch', 'save_total_limit': 1, 'auto_find_batch_size': False, 'mixed_precision': 'bf16', 'lr': 3e-07, 'epochs': 4, 'batch_size': 1, 'warmup_ratio': 0.1, 'gradient_accumulation': 8, 'optimizer': 'adamw_torch', 'scheduler': 'linear', 'weight_decay': 0.0, 'max_grad_norm': 1.0, 'seed': 42, 'chat_template': 'tokenizer', 'quantization': 'int4', 'target_modules': 'all-linear', 'merge_adapter': False, 'peft': True, 'lora_r': 16, 'lora_alpha': 32, 'lora_dropout': 0.05, 'model_ref': None, 'dpo_beta': 0.1, 'max_prompt_length': 128, 'max_completion_length': None, 'prompt_text_column': None, 'text_column': 'conversation_FA', 'rejected_text_column': None, 'push_to_hub': True, 'username': 'derek-thomas', 'token': '*****', 'unsloth': False, 'distributed_backend': None}\n",
+ "INFO | 2025-01-08 14:34:00 | autotrain.parser:run:229 - Job ID: derek-thomas/autotrain-falcon-v03-poe-FA\n",
+ "\n",
+ "---\n",
+ "https://huggingface.co/spaces/derek-thomas/autotrain-falcon-v03-poe-FA\n",
+ "---\n",
+ "\n"
  ]
  }
 ],
 "source": [
 "# Generate configs and run commands\n",
+ "autotrain_spaces = []\n",
+ "autotrain_models = []\n",
 "for project_suffix, text_column in zip(project_suffixes, text_columns):\n",
 "    # Modify the config\n",
 "    config = config_template.copy()\n",
@@ -236,15 +263,124 @@
  "        with open(config_path, \"w\") as f:\n",
  "            yaml.dump(config, f)\n",
  "\n",
- "    # Run the command\n",
+ "    # # Run the command\n",
  "    print(f\"Running autotrain with config: {config_path}\")\n",
- "    subprocess.run([\"autotrain\", \"--config\", config_path])"
+ "    subprocess.run([\"autotrain\", \"--config\", config_path])\n",
+ "\n",
+ "    space_name = f\"{whoami()['name']}/autotrain-{config['project_name']}\"\n",
+ "    model_name = f\"{whoami()['name']}/{config['project_name']}\"\n",
+ "    autotrain_spaces.append(space_name)\n",
+ "    autotrain_models.append(model_name)\n",
+ "    print(f'\\n---\\nhttps://huggingface.co/spaces/{space_name}\\n---\\n')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "bf3d3324-202b-45df-8897-58cf89931c45",
+ "metadata": {},
+ "source": [
+ "# Cleanup"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 7,
+ "id": "adf09687-ab1e-4f1e-8bf9-317cc928467a",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from huggingface_hub import HfApi\n",
+ "api = HfApi()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 8,
+ "id": "19d80d26-cda4-41fb-a125-06060c3f90ce",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "['derek-thomas/autotrain-falcon-v03-poe-RFA-gpt3-5',\n",
+ " 'derek-thomas/autotrain-falcon-v03-poe-RFA-falcon',\n",
+ " 'derek-thomas/autotrain-falcon-v03-poe-FAR-gpt3-5',\n",
+ " 'derek-thomas/autotrain-falcon-v03-poe-FAR-falcon',\n",
+ " 'derek-thomas/autotrain-falcon-v03-poe-FA']\n",
+ "\n",
+ "['derek-thomas/falcon-v03-poe-RFA-gpt3-5',\n",
+ " 'derek-thomas/falcon-v03-poe-RFA-falcon',\n",
+ " 'derek-thomas/falcon-v03-poe-FAR-gpt3-5',\n",
+ " 'derek-thomas/falcon-v03-poe-FAR-falcon',\n",
+ " 'derek-thomas/falcon-v03-poe-FA']\n"
+ ]
+ }
+ ],
+ "source": [
+ "from pprint import pprint\n",
+ "pprint(autotrain_spaces)\n",
+ "print()\n",
+ "pprint(autotrain_models)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "0040e05b-39c2-4aff-a40e-01577a388eff",
+ "metadata": {},
+ "source": [
+ "<span style=\"color:red; font-size:20px; font-weight:bold;\">\n",
+ "WAIT TO RUN THIS UNTIL YOUR SPACES ARE FINISHED TRAINING!\n",
+ "</span>"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "f86ed8ad-4e38-454a-a2c1-b1f075399c37",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "for space in autotrain_spaces:\n",
+ "    confirm = input(f\"Are you sure you want to delete the space '{space}'? (y/n): \")\n",
+ "    if confirm.lower() == 'y':\n",
+ "        api.delete_repo(space, repo_type='space')\n",
+ "        print(f\"Deleted {space}\")\n",
+ "    else:\n",
+ "        print(f\"Skipped {space}\")\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "2182f8fe-8504-4cb9-a0a6-4b143541158d",
+ "metadata": {},
+ "source": [
+ "<span style=\"color:red; font-size:20px; font-weight:bold;\">\n",
+ "ONLY RUN THIS IF YOU NEED TO RESTART FROM SCRATCH\n",
+ "THIS WILL DELETE YOUR MODELS\n",
+ "</span>"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "12939405-a731-4a7c-ab4a-e1a4f1850bb6",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# for model in autotrain_models:\n",
+ "#     confirm = input(f\"Are you sure you want to delete the model '{model}'? (y/n): \")\n",
+ "#     if confirm.lower() == 'y':\n",
+ "#         api.delete_repo(model, repo_type='model')\n",
+ "#         print(f\"Deleted {model}\")\n",
+ "#     else:\n",
+ "#         print(f\"Skipped {model}\")\n"
  ]
  },
  {
  "cell_type": "code",
  "execution_count": null,
- "id": "67675837-2a38-4427-9186-32a25a970ff3",
+ "id": "c2a2c864-2082-4be9-8e28-92fd01833e38",
  "metadata": {},
  "outputs": [],
  "source": []
 
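The run loop above iterates over `project_suffixes` and `text_columns`, which are defined in a cell outside these hunks. Reconstructed from the config file names and `text_column` values in the logs, they plausibly look like this (a sketch, not the exact cell):

```python
# Reconstructed from the AutoTrain logs above; treat as an approximation.
project_suffixes = ["RFA-gpt3-5", "RFA-falcon", "FAR-gpt3-5", "FAR-falcon", "FA"]
text_columns = [
    "conversation_RFA_gpt3_5",
    "conversation_RFA_falcon",
    "conversation_FAR_gpt3_5",
    "conversation_FAR_falcon",
    "conversation_FA",
]
```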
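Since the config caps `model_max_length` at 1500 tokens, a quick sanity check on the new dataset can confirm the rendered conversations fit. This sketch is not part of the commit; it assumes the `conversation_*` columns hold chat-template-rendered strings:

```python
# Sanity-check sketch: measure tokenized lengths of one text column against
# the model_max_length (1500) configured in the AutoTrain template above.
from datasets import load_dataset
from transformers import AutoTokenizer

ds = load_dataset(
    "derek-thomas/labeled-multiple-choice-explained-falcon-tokenized",
    split="train",
)
tok = AutoTokenizer.from_pretrained("tiiuae/Falcon3-7B-Instruct")

lengths = [len(tok(text).input_ids) for text in ds["conversation_FA"][:100]]
print(f"max length in sample: {max(lengths)} (model_max_length is 1500)")
```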
 
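The new Cleanup section warns not to delete the Spaces until training has finished. One way to check on them first, sketched with `HfApi.get_space_runtime` from `huggingface_hub` (which stages count as "finished" is left as an assumption):

```python
# Sketch: report each AutoTrain Space's runtime stage before any cleanup.
from huggingface_hub import HfApi

api = HfApi()
for space in autotrain_spaces:
    stage = api.get_space_runtime(space).stage  # e.g. BUILDING, RUNNING, PAUSED, STOPPED
    print(f"{space}: {stage}")
```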