derek-thomas committed
Commit 5b52ebb · 1 Parent(s): dcfeee1

Updating for falcon

Files changed (2):
  1. 01-poe-dataset-creation.ipynb +1 -3
  2. 02-autotrain.ipynb +171 -35
01-poe-dataset-creation.ipynb CHANGED
@@ -110,10 +110,8 @@
  "\n",
  "In each of these scenarios I will build prompts with structured generation to fine-tune with. I noticed some difficulty in a first pass with getting consistent response formats, but that's out of scope, so structured generation can help a lot here.\n",
  "\n",
- "Datasets won't store complex structures like lists of dicts of different types (needed for structured generation), so it's easiest if I tokenize. I'll be using Mistral, so I'll skip the system prompt. It's simple enough to come back and change this for a different model in this notebook.\n",
- "\n",
  "## Implementation\n",
- "To explore this goal, we will start with [layoric/labeled-multiple-choice-explained](https://huggingface.co/datasets/layoric/labeled-multiple-choice-explained) as our dataset. It has explanations already provided by GPT-3.5-turbo. Given that these explanations are a bit different from what Mistral would produce, it might be useful to generate some from Mistral as well. Based on [this notebook](./poe-generate-mistral-reasoning.ipynb) we have been able to generate Mistral reasoning in the refined dataset [derek-thomas/labeled-multiple-choice-explained-mistral-reasoning](https://huggingface.co/datasets/derek-thomas/labeled-multiple-choice-explained-mistral-reasoning).\n",
+ "To explore this goal, we will start with [layoric/labeled-multiple-choice-explained](https://huggingface.co/datasets/layoric/labeled-multiple-choice-explained) as our dataset. It has explanations already provided by GPT-3.5-turbo. Given that these explanations are a bit different from what Falcon would produce, it might be useful to generate some from Falcon as well. Based on [this notebook](./poe-generate-falcon-reasoning.ipynb) we have been able to generate Falcon reasoning in the refined dataset [derek-thomas/labeled-multiple-choice-explained-falcon-reasoning](https://huggingface.co/datasets/derek-thomas/labeled-multiple-choice-explained-falcon-reasoning).\n",
  "\n",
  "In this notebook we will format our data such that we can try each experiment and then we will push it to my repo: [derek-thomas/labeled-multiple-choice-explained](https://huggingface.co/datasets/derek-thomas/labeled-multiple-choice-explained)."
  ]
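
The note removed in this hunk explained why the training data is pre-rendered to plain text: Datasets columns won't store lists of dicts of mixed types, so each conversation is flattened with the model's chat template before being pushed. A minimal sketch of that flattening step, assuming a hypothetical `conversation` column of chat messages (the commit's actual helper is not shown in this diff):

```python
# Sketch only: flatten a chat-message column to a plain-text column with the
# model's chat template before pushing to the Hub. Column names are assumed.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("tiiuae/Falcon3-7B-Instruct")

def to_text(example):
    # example["conversation"] is assumed to be a list of chat messages:
    # [{"role": "user", "content": "..."}, {"role": "assistant", "content": "..."}]
    example["text"] = tokenizer.apply_chat_template(
        example["conversation"], tokenize=False, add_generation_prompt=False
    )
    return example

# dataset = dataset.map(to_text)  # then dataset.push_to_hub(...)
```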
02-autotrain.ipynb CHANGED
@@ -50,7 +50,7 @@
  {
   "data": {
    "application/vnd.jupyter.widget-view+json": {
-    "model_id": "b5441f4018234a25a299775d77f880b3",
+    "model_id": "928f44f483504b438e0fdbd4df3d7dd5",
     "version_major": 2,
     "version_minor": 0
    },
@@ -63,7 +63,7 @@
   }
  ],
  "source": [
- "from huggingface_hub import login, get_token\n",
+ "from huggingface_hub import login, get_token, whoami\n",
  "login()"
  ]
 },
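
The added `whoami` import is used later in the run loop to build Space and model repo ids from the logged-in user. For reference, only the `name` field of its return value is relied on:

```python
from huggingface_hub import whoami

user = whoami()        # requires a prior login() or an HF_TOKEN in the environment
print(user["name"])    # e.g. "derek-thomas"; prefixed onto each project_name below
```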
@@ -96,15 +96,15 @@
  "# Base config\n",
  "config_template = {\n",
  "    \"task\": \"llm-sft\",\n",
- "    \"base_model\": \"mistralai/Mistral-7B-Instruct-v0.3\",\n",
+ "    \"base_model\": \"tiiuae/Falcon3-7B-Instruct\",\n",
  "    \"project_name\": \"\",\n",
  "    \"log\": \"tensorboard\",\n",
  "    \"backend\": \"spaces-l4x1\",\n",
  "    \"data\": {\n",
- "        \"path\": \"derek-thomas/labeled-multiple-choice-explained-mistral-tokenized\",\n",
+ "        \"path\": \"derek-thomas/labeled-multiple-choice-explained-falcon-tokenized\",\n",
  "        \"train_split\": \"train\",\n",
  "        \"valid_split\": None,\n",
- "        \"chat_template\": \"none\",\n",
+ "        \"chat_template\": \"tokenizer\",\n",
  "        \"column_mapping\": {\n",
  "            \"text_column\": \"\"\n",
  "        },\n",
@@ -112,9 +112,9 @@
  "    \"params\": {\n",
  "        \"block_size\": 512,\n",
  "        \"model_max_length\": 1500,\n",
- "        \"epochs\": 2,\n",
+ "        \"epochs\": 4,\n",
  "        \"batch_size\": 1,\n",
- "        \"lr\": 3e-5,\n",
+ "        \"lr\": 3e-7,\n",
  "        \"peft\": True,\n",
  "        \"quantization\": \"int4\",\n",
  "        \"target_modules\": \"all-linear\",\n",
@@ -191,40 +191,67 @@
  "output_type": "stream",
  "text": [
  "Running autotrain with config: ./autotrain_configs/conversation_RFA_gpt3_5.yml\n",
- "INFO | 2025-01-08 10:20:38 | autotrain.cli.autotrain:main:58 - Using AutoTrain configuration: ./autotrain_configs/conversation_RFA_gpt3_5.yml\n",
- "INFO | 2025-01-08 10:20:38 | autotrain.parser:__post_init__:165 - Running task: lm_training\n",
- "INFO | 2025-01-08 10:20:38 | autotrain.parser:__post_init__:166 - Using backend: spaces-l4x1\n",
- "INFO | 2025-01-08 10:20:38 | autotrain.parser:run:224 - {'model': 'mistralai/Mistral-7B-Instruct-v0.3', 'project_name': 'falcon-v03-poe-RFA-gpt3-5', 'data_path': 'derek-thomas/labeled-multiple-choice-explained-mistral-tokenized', 'train_split': 'train', 'valid_split': None, 'add_eos_token': True, 'block_size': 512, 'model_max_length': 1500, 'padding': 'right', 'trainer': 'sft', 'use_flash_attention_2': False, 'log': 'tensorboard', 'disable_gradient_checkpointing': False, 'logging_steps': -1, 'eval_strategy': 'epoch', 'save_total_limit': 1, 'auto_find_batch_size': False, 'mixed_precision': 'bf16', 'lr': 3e-05, 'epochs': 2, 'batch_size': 1, 'warmup_ratio': 0.1, 'gradient_accumulation': 8, 'optimizer': 'adamw_torch', 'scheduler': 'linear', 'weight_decay': 0.0, 'max_grad_norm': 1.0, 'seed': 42, 'chat_template': 'none', 'quantization': 'int4', 'target_modules': 'all-linear', 'merge_adapter': False, 'peft': True, 'lora_r': 16, 'lora_alpha': 32, 'lora_dropout': 0.05, 'model_ref': None, 'dpo_beta': 0.1, 'max_prompt_length': 128, 'max_completion_length': None, 'prompt_text_column': None, 'text_column': 'conversation_RFA_gpt3_5', 'rejected_text_column': None, 'push_to_hub': True, 'username': 'derek-thomas', 'token': '*****', 'unsloth': False, 'distributed_backend': None}\n",
- "INFO | 2025-01-08 10:20:43 | autotrain.parser:run:229 - Job ID: derek-thomas/autotrain-falcon-v03-poe-RFA-gpt3-5\n",
+ "INFO | 2025-01-08 14:33:16 | autotrain.cli.autotrain:main:58 - Using AutoTrain configuration: ./autotrain_configs/conversation_RFA_gpt3_5.yml\n",
+ "INFO | 2025-01-08 14:33:16 | autotrain.parser:__post_init__:165 - Running task: lm_training\n",
+ "INFO | 2025-01-08 14:33:16 | autotrain.parser:__post_init__:166 - Using backend: spaces-l4x1\n",
+ "INFO | 2025-01-08 14:33:16 | autotrain.parser:run:224 - {'model': 'tiiuae/Falcon3-7B-Instruct', 'project_name': 'falcon-v03-poe-RFA-gpt3-5', 'data_path': 'derek-thomas/labeled-multiple-choice-explained-falcon-tokenized', 'train_split': 'train', 'valid_split': None, 'add_eos_token': True, 'block_size': 512, 'model_max_length': 1500, 'padding': 'right', 'trainer': 'sft', 'use_flash_attention_2': False, 'log': 'tensorboard', 'disable_gradient_checkpointing': False, 'logging_steps': -1, 'eval_strategy': 'epoch', 'save_total_limit': 1, 'auto_find_batch_size': False, 'mixed_precision': 'bf16', 'lr': 3e-07, 'epochs': 4, 'batch_size': 1, 'warmup_ratio': 0.1, 'gradient_accumulation': 8, 'optimizer': 'adamw_torch', 'scheduler': 'linear', 'weight_decay': 0.0, 'max_grad_norm': 1.0, 'seed': 42, 'chat_template': 'tokenizer', 'quantization': 'int4', 'target_modules': 'all-linear', 'merge_adapter': False, 'peft': True, 'lora_r': 16, 'lora_alpha': 32, 'lora_dropout': 0.05, 'model_ref': None, 'dpo_beta': 0.1, 'max_prompt_length': 128, 'max_completion_length': None, 'prompt_text_column': None, 'text_column': 'conversation_RFA_gpt3_5', 'rejected_text_column': None, 'push_to_hub': True, 'username': 'derek-thomas', 'token': '*****', 'unsloth': False, 'distributed_backend': None}\n",
+ "INFO | 2025-01-08 14:33:23 | autotrain.parser:run:229 - Job ID: derek-thomas/autotrain-falcon-v03-poe-RFA-gpt3-5\n",
+ "\n",
+ "---\n",
+ "https://huggingface.co/spaces/derek-thomas/autotrain-falcon-v03-poe-RFA-gpt3-5\n",
+ "---\n",
+ "\n",
  "Running autotrain with config: ./autotrain_configs/conversation_RFA_falcon.yml\n",
- "INFO | 2025-01-08 10:20:46 | autotrain.cli.autotrain:main:58 - Using AutoTrain configuration: ./autotrain_configs/conversation_RFA_falcon.yml\n",
- "INFO | 2025-01-08 10:20:46 | autotrain.parser:__post_init__:165 - Running task: lm_training\n",
- "INFO | 2025-01-08 10:20:46 | autotrain.parser:__post_init__:166 - Using backend: spaces-l4x1\n",
- "INFO | 2025-01-08 10:20:46 | autotrain.parser:run:224 - {'model': 'mistralai/Mistral-7B-Instruct-v0.3', 'project_name': 'falcon-v03-poe-RFA-falcon', 'data_path': 'derek-thomas/labeled-multiple-choice-explained-mistral-tokenized', 'train_split': 'train', 'valid_split': None, 'add_eos_token': True, 'block_size': 512, 'model_max_length': 1500, 'padding': 'right', 'trainer': 'sft', 'use_flash_attention_2': False, 'log': 'tensorboard', 'disable_gradient_checkpointing': False, 'logging_steps': -1, 'eval_strategy': 'epoch', 'save_total_limit': 1, 'auto_find_batch_size': False, 'mixed_precision': 'bf16', 'lr': 3e-05, 'epochs': 2, 'batch_size': 1, 'warmup_ratio': 0.1, 'gradient_accumulation': 8, 'optimizer': 'adamw_torch', 'scheduler': 'linear', 'weight_decay': 0.0, 'max_grad_norm': 1.0, 'seed': 42, 'chat_template': 'none', 'quantization': 'int4', 'target_modules': 'all-linear', 'merge_adapter': False, 'peft': True, 'lora_r': 16, 'lora_alpha': 32, 'lora_dropout': 0.05, 'model_ref': None, 'dpo_beta': 0.1, 'max_prompt_length': 128, 'max_completion_length': None, 'prompt_text_column': None, 'text_column': 'conversation_RFA_falcon', 'rejected_text_column': None, 'push_to_hub': True, 'username': 'derek-thomas', 'token': '*****', 'unsloth': False, 'distributed_backend': None}\n",
- "INFO | 2025-01-08 10:20:53 | autotrain.parser:run:229 - Job ID: derek-thomas/autotrain-falcon-v03-poe-RFA-falcon\n",
+ "INFO | 2025-01-08 14:33:26 | autotrain.cli.autotrain:main:58 - Using AutoTrain configuration: ./autotrain_configs/conversation_RFA_falcon.yml\n",
+ "INFO | 2025-01-08 14:33:26 | autotrain.parser:__post_init__:165 - Running task: lm_training\n",
+ "INFO | 2025-01-08 14:33:26 | autotrain.parser:__post_init__:166 - Using backend: spaces-l4x1\n",
+ "INFO | 2025-01-08 14:33:26 | autotrain.parser:run:224 - {'model': 'tiiuae/Falcon3-7B-Instruct', 'project_name': 'falcon-v03-poe-RFA-falcon', 'data_path': 'derek-thomas/labeled-multiple-choice-explained-falcon-tokenized', 'train_split': 'train', 'valid_split': None, 'add_eos_token': True, 'block_size': 512, 'model_max_length': 1500, 'padding': 'right', 'trainer': 'sft', 'use_flash_attention_2': False, 'log': 'tensorboard', 'disable_gradient_checkpointing': False, 'logging_steps': -1, 'eval_strategy': 'epoch', 'save_total_limit': 1, 'auto_find_batch_size': False, 'mixed_precision': 'bf16', 'lr': 3e-07, 'epochs': 4, 'batch_size': 1, 'warmup_ratio': 0.1, 'gradient_accumulation': 8, 'optimizer': 'adamw_torch', 'scheduler': 'linear', 'weight_decay': 0.0, 'max_grad_norm': 1.0, 'seed': 42, 'chat_template': 'tokenizer', 'quantization': 'int4', 'target_modules': 'all-linear', 'merge_adapter': False, 'peft': True, 'lora_r': 16, 'lora_alpha': 32, 'lora_dropout': 0.05, 'model_ref': None, 'dpo_beta': 0.1, 'max_prompt_length': 128, 'max_completion_length': None, 'prompt_text_column': None, 'text_column': 'conversation_RFA_falcon', 'rejected_text_column': None, 'push_to_hub': True, 'username': 'derek-thomas', 'token': '*****', 'unsloth': False, 'distributed_backend': None}\n",
+ "INFO | 2025-01-08 14:33:32 | autotrain.parser:run:229 - Job ID: derek-thomas/autotrain-falcon-v03-poe-RFA-falcon\n",
+ "\n",
+ "---\n",
+ "https://huggingface.co/spaces/derek-thomas/autotrain-falcon-v03-poe-RFA-falcon\n",
+ "---\n",
+ "\n",
  "Running autotrain with config: ./autotrain_configs/conversation_FAR_gpt3_5.yml\n",
- "INFO | 2025-01-08 10:20:56 | autotrain.cli.autotrain:main:58 - Using AutoTrain configuration: ./autotrain_configs/conversation_FAR_gpt3_5.yml\n",
- "INFO | 2025-01-08 10:20:56 | autotrain.parser:__post_init__:165 - Running task: lm_training\n",
- "INFO | 2025-01-08 10:20:56 | autotrain.parser:__post_init__:166 - Using backend: spaces-l4x1\n",
- "INFO | 2025-01-08 10:20:56 | autotrain.parser:run:224 - {'model': 'mistralai/Mistral-7B-Instruct-v0.3', 'project_name': 'falcon-v03-poe-FAR-gpt3-5', 'data_path': 'derek-thomas/labeled-multiple-choice-explained-mistral-tokenized', 'train_split': 'train', 'valid_split': None, 'add_eos_token': True, 'block_size': 512, 'model_max_length': 1500, 'padding': 'right', 'trainer': 'sft', 'use_flash_attention_2': False, 'log': 'tensorboard', 'disable_gradient_checkpointing': False, 'logging_steps': -1, 'eval_strategy': 'epoch', 'save_total_limit': 1, 'auto_find_batch_size': False, 'mixed_precision': 'bf16', 'lr': 3e-05, 'epochs': 2, 'batch_size': 1, 'warmup_ratio': 0.1, 'gradient_accumulation': 8, 'optimizer': 'adamw_torch', 'scheduler': 'linear', 'weight_decay': 0.0, 'max_grad_norm': 1.0, 'seed': 42, 'chat_template': 'none', 'quantization': 'int4', 'target_modules': 'all-linear', 'merge_adapter': False, 'peft': True, 'lora_r': 16, 'lora_alpha': 32, 'lora_dropout': 0.05, 'model_ref': None, 'dpo_beta': 0.1, 'max_prompt_length': 128, 'max_completion_length': None, 'prompt_text_column': None, 'text_column': 'conversation_FAR_gpt3_5', 'rejected_text_column': None, 'push_to_hub': True, 'username': 'derek-thomas', 'token': '*****', 'unsloth': False, 'distributed_backend': None}\n",
- "INFO | 2025-01-08 10:21:02 | autotrain.parser:run:229 - Job ID: derek-thomas/autotrain-falcon-v03-poe-FAR-gpt3-5\n",
+ "INFO | 2025-01-08 14:33:36 | autotrain.cli.autotrain:main:58 - Using AutoTrain configuration: ./autotrain_configs/conversation_FAR_gpt3_5.yml\n",
+ "INFO | 2025-01-08 14:33:36 | autotrain.parser:__post_init__:165 - Running task: lm_training\n",
+ "INFO | 2025-01-08 14:33:36 | autotrain.parser:__post_init__:166 - Using backend: spaces-l4x1\n",
+ "INFO | 2025-01-08 14:33:36 | autotrain.parser:run:224 - {'model': 'tiiuae/Falcon3-7B-Instruct', 'project_name': 'falcon-v03-poe-FAR-gpt3-5', 'data_path': 'derek-thomas/labeled-multiple-choice-explained-falcon-tokenized', 'train_split': 'train', 'valid_split': None, 'add_eos_token': True, 'block_size': 512, 'model_max_length': 1500, 'padding': 'right', 'trainer': 'sft', 'use_flash_attention_2': False, 'log': 'tensorboard', 'disable_gradient_checkpointing': False, 'logging_steps': -1, 'eval_strategy': 'epoch', 'save_total_limit': 1, 'auto_find_batch_size': False, 'mixed_precision': 'bf16', 'lr': 3e-07, 'epochs': 4, 'batch_size': 1, 'warmup_ratio': 0.1, 'gradient_accumulation': 8, 'optimizer': 'adamw_torch', 'scheduler': 'linear', 'weight_decay': 0.0, 'max_grad_norm': 1.0, 'seed': 42, 'chat_template': 'tokenizer', 'quantization': 'int4', 'target_modules': 'all-linear', 'merge_adapter': False, 'peft': True, 'lora_r': 16, 'lora_alpha': 32, 'lora_dropout': 0.05, 'model_ref': None, 'dpo_beta': 0.1, 'max_prompt_length': 128, 'max_completion_length': None, 'prompt_text_column': None, 'text_column': 'conversation_FAR_gpt3_5', 'rejected_text_column': None, 'push_to_hub': True, 'username': 'derek-thomas', 'token': '*****', 'unsloth': False, 'distributed_backend': None}\n",
+ "INFO | 2025-01-08 14:33:41 | autotrain.parser:run:229 - Job ID: derek-thomas/autotrain-falcon-v03-poe-FAR-gpt3-5\n",
+ "\n",
+ "---\n",
+ "https://huggingface.co/spaces/derek-thomas/autotrain-falcon-v03-poe-FAR-gpt3-5\n",
+ "---\n",
+ "\n",
  "Running autotrain with config: ./autotrain_configs/conversation_FAR_falcon.yml\n",
- "INFO | 2025-01-08 10:21:05 | autotrain.cli.autotrain:main:58 - Using AutoTrain configuration: ./autotrain_configs/conversation_FAR_falcon.yml\n",
- "INFO | 2025-01-08 10:21:05 | autotrain.parser:__post_init__:165 - Running task: lm_training\n",
- "INFO | 2025-01-08 10:21:05 | autotrain.parser:__post_init__:166 - Using backend: spaces-l4x1\n",
- "INFO | 2025-01-08 10:21:05 | autotrain.parser:run:224 - {'model': 'mistralai/Mistral-7B-Instruct-v0.3', 'project_name': 'falcon-v03-poe-FAR-falcon', 'data_path': 'derek-thomas/labeled-multiple-choice-explained-mistral-tokenized', 'train_split': 'train', 'valid_split': None, 'add_eos_token': True, 'block_size': 512, 'model_max_length': 1500, 'padding': 'right', 'trainer': 'sft', 'use_flash_attention_2': False, 'log': 'tensorboard', 'disable_gradient_checkpointing': False, 'logging_steps': -1, 'eval_strategy': 'epoch', 'save_total_limit': 1, 'auto_find_batch_size': False, 'mixed_precision': 'bf16', 'lr': 3e-05, 'epochs': 2, 'batch_size': 1, 'warmup_ratio': 0.1, 'gradient_accumulation': 8, 'optimizer': 'adamw_torch', 'scheduler': 'linear', 'weight_decay': 0.0, 'max_grad_norm': 1.0, 'seed': 42, 'chat_template': 'none', 'quantization': 'int4', 'target_modules': 'all-linear', 'merge_adapter': False, 'peft': True, 'lora_r': 16, 'lora_alpha': 32, 'lora_dropout': 0.05, 'model_ref': None, 'dpo_beta': 0.1, 'max_prompt_length': 128, 'max_completion_length': None, 'prompt_text_column': None, 'text_column': 'conversation_FAR_falcon', 'rejected_text_column': None, 'push_to_hub': True, 'username': 'derek-thomas', 'token': '*****', 'unsloth': False, 'distributed_backend': None}\n",
- "INFO | 2025-01-08 10:21:12 | autotrain.parser:run:229 - Job ID: derek-thomas/autotrain-falcon-v03-poe-FAR-falcon\n",
+ "INFO | 2025-01-08 14:33:45 | autotrain.cli.autotrain:main:58 - Using AutoTrain configuration: ./autotrain_configs/conversation_FAR_falcon.yml\n",
+ "INFO | 2025-01-08 14:33:45 | autotrain.parser:__post_init__:165 - Running task: lm_training\n",
+ "INFO | 2025-01-08 14:33:45 | autotrain.parser:__post_init__:166 - Using backend: spaces-l4x1\n",
+ "INFO | 2025-01-08 14:33:45 | autotrain.parser:run:224 - {'model': 'tiiuae/Falcon3-7B-Instruct', 'project_name': 'falcon-v03-poe-FAR-falcon', 'data_path': 'derek-thomas/labeled-multiple-choice-explained-falcon-tokenized', 'train_split': 'train', 'valid_split': None, 'add_eos_token': True, 'block_size': 512, 'model_max_length': 1500, 'padding': 'right', 'trainer': 'sft', 'use_flash_attention_2': False, 'log': 'tensorboard', 'disable_gradient_checkpointing': False, 'logging_steps': -1, 'eval_strategy': 'epoch', 'save_total_limit': 1, 'auto_find_batch_size': False, 'mixed_precision': 'bf16', 'lr': 3e-07, 'epochs': 4, 'batch_size': 1, 'warmup_ratio': 0.1, 'gradient_accumulation': 8, 'optimizer': 'adamw_torch', 'scheduler': 'linear', 'weight_decay': 0.0, 'max_grad_norm': 1.0, 'seed': 42, 'chat_template': 'tokenizer', 'quantization': 'int4', 'target_modules': 'all-linear', 'merge_adapter': False, 'peft': True, 'lora_r': 16, 'lora_alpha': 32, 'lora_dropout': 0.05, 'model_ref': None, 'dpo_beta': 0.1, 'max_prompt_length': 128, 'max_completion_length': None, 'prompt_text_column': None, 'text_column': 'conversation_FAR_falcon', 'rejected_text_column': None, 'push_to_hub': True, 'username': 'derek-thomas', 'token': '*****', 'unsloth': False, 'distributed_backend': None}\n",
+ "INFO | 2025-01-08 14:33:51 | autotrain.parser:run:229 - Job ID: derek-thomas/autotrain-falcon-v03-poe-FAR-falcon\n",
+ "\n",
+ "---\n",
+ "https://huggingface.co/spaces/derek-thomas/autotrain-falcon-v03-poe-FAR-falcon\n",
+ "---\n",
+ "\n",
  "Running autotrain with config: ./autotrain_configs/conversation_FA.yml\n",
- "INFO | 2025-01-08 10:21:15 | autotrain.cli.autotrain:main:58 - Using AutoTrain configuration: ./autotrain_configs/conversation_FA.yml\n",
- "INFO | 2025-01-08 10:21:15 | autotrain.parser:__post_init__:165 - Running task: lm_training\n",
- "INFO | 2025-01-08 10:21:15 | autotrain.parser:__post_init__:166 - Using backend: spaces-l4x1\n",
- "INFO | 2025-01-08 10:21:15 | autotrain.parser:run:224 - {'model': 'mistralai/Mistral-7B-Instruct-v0.3', 'project_name': 'falcon-v03-poe-FA', 'data_path': 'derek-thomas/labeled-multiple-choice-explained-mistral-tokenized', 'train_split': 'train', 'valid_split': None, 'add_eos_token': True, 'block_size': 512, 'model_max_length': 1500, 'padding': 'right', 'trainer': 'sft', 'use_flash_attention_2': False, 'log': 'tensorboard', 'disable_gradient_checkpointing': False, 'logging_steps': -1, 'eval_strategy': 'epoch', 'save_total_limit': 1, 'auto_find_batch_size': False, 'mixed_precision': 'bf16', 'lr': 3e-05, 'epochs': 2, 'batch_size': 1, 'warmup_ratio': 0.1, 'gradient_accumulation': 8, 'optimizer': 'adamw_torch', 'scheduler': 'linear', 'weight_decay': 0.0, 'max_grad_norm': 1.0, 'seed': 42, 'chat_template': 'none', 'quantization': 'int4', 'target_modules': 'all-linear', 'merge_adapter': False, 'peft': True, 'lora_r': 16, 'lora_alpha': 32, 'lora_dropout': 0.05, 'model_ref': None, 'dpo_beta': 0.1, 'max_prompt_length': 128, 'max_completion_length': None, 'prompt_text_column': None, 'text_column': 'conversation_FA', 'rejected_text_column': None, 'push_to_hub': True, 'username': 'derek-thomas', 'token': '*****', 'unsloth': False, 'distributed_backend': None}\n",
- "INFO | 2025-01-08 10:21:22 | autotrain.parser:run:229 - Job ID: derek-thomas/autotrain-falcon-v03-poe-FA\n"
+ "INFO | 2025-01-08 14:33:54 | autotrain.cli.autotrain:main:58 - Using AutoTrain configuration: ./autotrain_configs/conversation_FA.yml\n",
+ "INFO | 2025-01-08 14:33:54 | autotrain.parser:__post_init__:165 - Running task: lm_training\n",
+ "INFO | 2025-01-08 14:33:54 | autotrain.parser:__post_init__:166 - Using backend: spaces-l4x1\n",
+ "INFO | 2025-01-08 14:33:54 | autotrain.parser:run:224 - {'model': 'tiiuae/Falcon3-7B-Instruct', 'project_name': 'falcon-v03-poe-FA', 'data_path': 'derek-thomas/labeled-multiple-choice-explained-falcon-tokenized', 'train_split': 'train', 'valid_split': None, 'add_eos_token': True, 'block_size': 512, 'model_max_length': 1500, 'padding': 'right', 'trainer': 'sft', 'use_flash_attention_2': False, 'log': 'tensorboard', 'disable_gradient_checkpointing': False, 'logging_steps': -1, 'eval_strategy': 'epoch', 'save_total_limit': 1, 'auto_find_batch_size': False, 'mixed_precision': 'bf16', 'lr': 3e-07, 'epochs': 4, 'batch_size': 1, 'warmup_ratio': 0.1, 'gradient_accumulation': 8, 'optimizer': 'adamw_torch', 'scheduler': 'linear', 'weight_decay': 0.0, 'max_grad_norm': 1.0, 'seed': 42, 'chat_template': 'tokenizer', 'quantization': 'int4', 'target_modules': 'all-linear', 'merge_adapter': False, 'peft': True, 'lora_r': 16, 'lora_alpha': 32, 'lora_dropout': 0.05, 'model_ref': None, 'dpo_beta': 0.1, 'max_prompt_length': 128, 'max_completion_length': None, 'prompt_text_column': None, 'text_column': 'conversation_FA', 'rejected_text_column': None, 'push_to_hub': True, 'username': 'derek-thomas', 'token': '*****', 'unsloth': False, 'distributed_backend': None}\n",
+ "INFO | 2025-01-08 14:34:00 | autotrain.parser:run:229 - Job ID: derek-thomas/autotrain-falcon-v03-poe-FA\n",
+ "\n",
+ "---\n",
+ "https://huggingface.co/spaces/derek-thomas/autotrain-falcon-v03-poe-FA\n",
+ "---\n",
+ "\n"
  ]
  }
 ],
 "source": [
 "# Generate configs and run commands\n",
+ "autotrain_spaces = []\n",
+ "autotrain_models = []\n",
 "for project_suffix, text_column in zip(project_suffixes, text_columns):\n",
 "    # Modify the config\n",
 "    config = config_template.copy()\n",
@@ -236,15 +263,124 @@
  "        with open(config_path, \"w\") as f:\n",
  "            yaml.dump(config, f)\n",
  "\n",
- "    # Run the command\n",
+ "    # # Run the command\n",
  "    print(f\"Running autotrain with config: {config_path}\")\n",
- "    subprocess.run([\"autotrain\", \"--config\", config_path])"
+ "    subprocess.run([\"autotrain\", \"--config\", config_path])\n",
+ "\n",
+ "    space_name = f\"{whoami()['name']}/autotrain-{config['project_name']}\"\n",
+ "    model_name = f\"{whoami()['name']}/{config['project_name']}\"\n",
+ "    autotrain_spaces.append(space_name)\n",
+ "    autotrain_models.append(model_name)\n",
+ "    print(f'\\n---\\nhttps://huggingface.co/spaces/{space_name}\\n---\\n')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "bf3d3324-202b-45df-8897-58cf89931c45",
+ "metadata": {},
+ "source": [
+ "# Cleanup"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 7,
+ "id": "adf09687-ab1e-4f1e-8bf9-317cc928467a",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from huggingface_hub import HfApi\n",
+ "api = HfApi()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 8,
+ "id": "19d80d26-cda4-41fb-a125-06060c3f90ce",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "['derek-thomas/autotrain-falcon-v03-poe-RFA-gpt3-5',\n",
+ " 'derek-thomas/autotrain-falcon-v03-poe-RFA-falcon',\n",
+ " 'derek-thomas/autotrain-falcon-v03-poe-FAR-gpt3-5',\n",
+ " 'derek-thomas/autotrain-falcon-v03-poe-FAR-falcon',\n",
+ " 'derek-thomas/autotrain-falcon-v03-poe-FA']\n",
+ "\n",
+ "['derek-thomas/falcon-v03-poe-RFA-gpt3-5',\n",
+ " 'derek-thomas/falcon-v03-poe-RFA-falcon',\n",
+ " 'derek-thomas/falcon-v03-poe-FAR-gpt3-5',\n",
+ " 'derek-thomas/falcon-v03-poe-FAR-falcon',\n",
+ " 'derek-thomas/falcon-v03-poe-FA']\n"
+ ]
+ }
+ ],
+ "source": [
+ "from pprint import pprint\n",
+ "pprint(autotrain_spaces)\n",
+ "print()\n",
+ "pprint(autotrain_models)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "0040e05b-39c2-4aff-a40e-01577a388eff",
+ "metadata": {},
+ "source": [
+ "<span style=\"color:red; font-size:20px; font-weight:bold;\">\n",
+ "WAIT TO RUN THIS UNTIL YOUR SPACES ARE FINISHED TRAINING!\n",
+ "</span>"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "f86ed8ad-4e38-454a-a2c1-b1f075399c37",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "for space in autotrain_spaces:\n",
+ "    confirm = input(f\"Are you sure you want to delete the space '{space}'? (y/n): \")\n",
+ "    if confirm.lower() == 'y':\n",
+ "        api.delete_repo(space, repo_type='space')\n",
+ "        print(f\"Deleted {space}\")\n",
+ "    else:\n",
+ "        print(f\"Skipped {space}\")\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "2182f8fe-8504-4cb9-a0a6-4b143541158d",
+ "metadata": {},
+ "source": [
+ "<span style=\"color:red; font-size:20px; font-weight:bold;\">\n",
+ "ONLY RUN THIS IF YOU NEED TO RESTART FROM SCRATCH\n",
+ "THIS WILL DELETE YOUR MODELS\n",
+ "</span>"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "12939405-a731-4a7c-ab4a-e1a4f1850bb6",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# for model in autotrain_models:\n",
+ "#     confirm = input(f\"Are you sure you want to delete the model '{model}'? (y/n): \")\n",
+ "#     if confirm.lower() == 'y':\n",
+ "#         api.delete_repo(model, repo_type='model')\n",
+ "#         print(f\"Deleted {model}\")\n",
+ "#     else:\n",
+ "#         print(f\"Skipped {model}\")\n"
  ]
  },
  {
  "cell_type": "code",
  "execution_count": null,
- "id": "67675837-2a38-4427-9186-32a25a970ff3",
+ "id": "c2a2c864-2082-4be9-8e28-92fd01833e38",
  "metadata": {},
  "outputs": [],
  "source": []
 
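The run loop above iterates over `project_suffixes` and `text_columns`, which are defined in a cell outside these hunks. Reconstructed from the config file names and `text_column` values in the logs, they plausibly look like this (a sketch, not the exact cell):

```python
# Reconstructed from the AutoTrain logs above; treat as an approximation.
project_suffixes = ["RFA-gpt3-5", "RFA-falcon", "FAR-gpt3-5", "FAR-falcon", "FA"]
text_columns = [
    "conversation_RFA_gpt3_5",
    "conversation_RFA_falcon",
    "conversation_FAR_gpt3_5",
    "conversation_FAR_falcon",
    "conversation_FA",
]
```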
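Since the config caps `model_max_length` at 1500 tokens, a quick sanity check on the new dataset can confirm the rendered conversations fit. This sketch is not part of the commit; it assumes the `conversation_*` columns hold chat-template-rendered strings:

```python
# Sanity-check sketch: measure tokenized lengths of one text column against
# the model_max_length (1500) configured in the AutoTrain template above.
from datasets import load_dataset
from transformers import AutoTokenizer

ds = load_dataset(
    "derek-thomas/labeled-multiple-choice-explained-falcon-tokenized",
    split="train",
)
tok = AutoTokenizer.from_pretrained("tiiuae/Falcon3-7B-Instruct")

lengths = [len(tok(text).input_ids) for text in ds["conversation_FA"][:100]]
print(f"max length in sample: {max(lengths)} (model_max_length is 1500)")
```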
 
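The new Cleanup section warns not to delete the Spaces until training has finished. One way to check on them first, sketched with `HfApi.get_space_runtime` from `huggingface_hub` (which stages count as "finished" is left as an assumption):

```python
# Sketch: report each AutoTrain Space's runtime stage before any cleanup.
from huggingface_hub import HfApi

api = HfApi()
for space in autotrain_spaces:
    stage = api.get_space_runtime(space).stage  # e.g. BUILDING, RUNNING, PAUSED, STOPPED
    print(f"{space}: {stage}")
```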