gardner committed c475ecf (verified) · 1 parent: 746b337

Upload folder using huggingface_hub

README.md ADDED
@@ -0,0 +1,165 @@
---
license: apache-2.0
library_name: peft
tags:
- generated_from_trainer
base_model: TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T
model-index:
- name: TinyLlama-1.1B-SlimOrca-Function-Calling-3T
  results: []
datasets:
- Open-Orca/SlimOrca-Dedup
- gardner/glaive-function-calling-v2-sharegpt
language: en
---

# TinyLlama-1.1B-SlimOrca-Function-Calling-3T

This model is a fine-tuned version of [TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T](https://huggingface.co/TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T) on the [SlimOrca-Dedup](https://huggingface.co/datasets/Open-Orca/SlimOrca-Dedup) and [glaive-function-calling-v2](https://huggingface.co/datasets/gardner/glaive-function-calling-v2-sharegpt) datasets.

## Evaluation

It achieves the following results on the evaluation set:
- Loss: 0.7403

See `scripts/llm-eval.py` to recreate the evaluation results from the test split, published here: [gardner/tinyllama-function-calling-eval](https://huggingface.co/datasets/gardner/tinyllama-function-calling-eval). The model responds with a function call when one is expected and refuses when it does not have access to tools. In the linked dataset, `result1` is generated by this model and `result2` comes from the test dataset.
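For quick manual testing outside the eval script, a minimal inference sketch follows. The Hub repo id, the example function definition, and the exact system-prompt wording are illustrative assumptions (tool definitions in the training data follow the general glaive-function-calling-v2 style of listing functions in the system prompt); the ChatML chat template shipped in `tokenizer_config.json` is used to build the prompt.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "gardner/TinyLlama-1.1B-SlimOrca-Function-Calling-3T"  # assumed Hub repo id for this model

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id).to("cuda")  # use "cpu" if no GPU is available

# Illustrative, hypothetical tool definition in the glaive-function-calling-v2 style:
# the available functions are described to the model inside the system prompt.
messages = [
    {
        "role": "system",
        "content": (
            "You are a helpful assistant with access to the following functions. Use them if required: "
            '{"name": "get_weather", "description": "Get the current weather", '
            '"parameters": {"type": "object", "properties": {"location": {"type": "string"}}}}'
        ),
    },
    {"role": "user", "content": "What's the weather like in Wellington right now?"},
]

# The bundled tokenizer ships a ChatML chat template, so apply_chat_template builds the prompt.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)

# Print only the newly generated tokens (the assistant turn).
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```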

[<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
<details><summary>See axolotl config</summary>

axolotl version: `0.3.0`
```yaml
base_model: TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T
model_type: LlamaForCausalLM
tokenizer_type: LlamaTokenizer
is_llama_derived_model: true

load_in_8bit: true
load_in_4bit: false
strict: false

datasets:
  - path: Open-Orca/SlimOrca-Dedup
    type: sharegpt
    conversation: chatml

  - path: gardner/glaive-function-calling-v2-sharegpt
    type: sharegpt
    conversation: chatml

dataset_prepared_path: ./.prepared-datasets/glaive-function-calling-v2-sharegpt
val_set_size: 0.05
output_dir: ./tinyllama/function-calling/chatml

sequence_len: 4096
sample_packing: true
pad_to_sequence_len: true

adapter: lora
lora_model_dir:
lora_r: 32
lora_alpha: 16
lora_dropout: 0.05
lora_target_linear: true
lora_fan_in_fan_out:

wandb_project:
wandb_entity:
wandb_watch:
wandb_name:
wandb_log_model:

gradient_accumulation_steps: 4
micro_batch_size: 2
num_epochs: 4
optimizer: adamw_bnb_8bit
lr_scheduler: cosine
learning_rate: 0.0002

train_on_inputs: false
group_by_length: false
bf16: true
fp16: false
tf32: false

gradient_checkpointing: true
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true

warmup_steps: 10
evals_per_epoch: 4
saves_per_epoch: 1
debug:
deepspeed:
weight_decay: 0.0
fsdp:
fsdp_config:
special_tokens:

```

</details><br>
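To reproduce the fine-tune, a config like the one above is passed straight to the Axolotl CLI; per the Axolotl documentation for this version, training is typically launched with something along the lines of `accelerate launch -m axolotl.cli.train config.yml` (the config file name here is assumed), after the datasets listed in the config have been prepared.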

## Training procedure

The following `bitsandbytes` quantization config was used during training:
- quant_method: bitsandbytes
- load_in_8bit: True
- load_in_4bit: False
- llm_int8_threshold: 6.0
- llm_int8_skip_modules: None
- llm_int8_enable_fp32_cpu_offload: False
- llm_int8_has_fp16_weight: False
- bnb_4bit_quant_type: fp4
- bnb_4bit_use_double_quant: False
- bnb_4bit_compute_dtype: float32

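For reference, these settings map onto a `transformers.BitsAndBytesConfig`. The sketch below is an assumed reconstruction (the actual object is built internally by the training stack), showing how the base model would be loaded in 8-bit with the same thresholds:

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Mirrors the quantization settings listed above; the 4-bit fields are left at
# their defaults because load_in_4bit was False during training.
bnb_config = BitsAndBytesConfig(
    load_in_8bit=True,
    load_in_4bit=False,
    llm_int8_threshold=6.0,
    llm_int8_skip_modules=None,
    llm_int8_enable_fp32_cpu_offload=False,
    llm_int8_has_fp16_weight=False,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T",
    quantization_config=bnb_config,
    device_map="auto",  # 8-bit loading requires bitsandbytes and a CUDA device
)
```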
### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 8
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 10
- num_epochs: 4

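The `total_train_batch_size` follows from the other settings: a per-device batch of 2 × 4 gradient accumulation steps = 8 sequences per optimizer step, which is consistent with training on a single GPU.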
### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| 1.2492 | 0.0 | 1 | 1.2363 |
| 0.7621 | 0.25 | 1896 | 0.8096 |
| 0.757 | 0.5 | 3792 | 0.7852 |
| 0.6424 | 0.75 | 5688 | 0.7717 |
| 0.5944 | 1.04 | 7584 | 0.7625 |
| 0.73 | 1.29 | 9480 | 0.7585 |
| 0.6781 | 1.54 | 11376 | 0.7521 |
| 0.829 | 1.79 | 13272 | 0.7471 |
| 0.6964 | 2.08 | 15168 | 0.7467 |
| 0.6652 | 2.33 | 17064 | 0.7453 |
| 0.7645 | 2.58 | 18960 | 0.7420 |
| 0.5702 | 2.83 | 20856 | 0.7392 |
| 0.7049 | 3.12 | 22752 | 0.7418 |
| 0.6087 | 3.37 | 24648 | 0.7412 |
| 0.6064 | 3.62 | 26544 | 0.7405 |
| 0.7125 | 3.87 | 28440 | 0.7403 |


### Framework versions

- PEFT 0.7.0
- Transformers 4.37.0.dev0
- Pytorch 2.0.1+cu118
- Datasets 2.16.1
- Tokenizers 0.15.0
config.json ADDED
@@ -0,0 +1,28 @@
{
  "_name_or_path": "TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T",
  "architectures": [
    "LlamaForCausalLM"
  ],
  "attention_bias": false,
  "attention_dropout": 0.0,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "hidden_act": "silu",
  "hidden_size": 2048,
  "initializer_range": 0.02,
  "intermediate_size": 5632,
  "max_position_embeddings": 4096,
  "model_type": "llama",
  "num_attention_heads": 32,
  "num_hidden_layers": 22,
  "num_key_value_heads": 4,
  "pretraining_tp": 1,
  "rms_norm_eps": 1e-05,
  "rope_scaling": null,
  "rope_theta": 10000.0,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.37.0.dev0",
  "use_cache": false,
  "vocab_size": 32000
}
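The architecture matches the TinyLlama-1.1B base: 22 layers, hidden size 2048, and grouped-query attention with 32 attention heads sharing 4 key/value heads (8 query heads per KV head). `use_cache` is saved as false, likely because gradient checkpointing was enabled during training; it can be re-enabled for inference.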
generation_config.json ADDED
@@ -0,0 +1,7 @@
{
  "bos_token_id": 1,
  "eos_token_id": 2,
  "max_length": 2048,
  "pad_token_id": 0,
  "transformers_version": "4.37.0.dev0"
}
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c00c53d49251b24c2b789c84e862a73b4679297e88816923bd94f1d0bd62111b
size 2200164273
scripts/llm-eval.py ADDED
@@ -0,0 +1,80 @@
from transformers import AutoModelForCausalLM, AutoTokenizer
from datasets import load_dataset
import argilla as rg
import os
from tqdm import tqdm

dir_path = os.path.dirname(os.path.realpath(__file__))


# Connect to the Argilla instance where the generated/reference pairs are pushed.
rg.init(
    api_url="<argilla-api-url>",
    api_key="<argilla-api-key>"
)

# A two-response preference dataset: response1 (model output) vs response2 (reference answer).
ds = rg.FeedbackDataset.for_preference_modeling(
    number_of_responses=2,
    context=False,
    use_markdown=False,
    guidelines=None,
    metadata_properties=None,
    vectors_settings=None,
)

# Path to the merged LoRA checkpoint produced by the training run.
model_dir = dir_path + '/lora-out/merged'

tokenizer = AutoTokenizer.from_pretrained(model_dir)

dataset = load_dataset("gardner/glaive-function-calling-v2-sharegpt", split="test")


def preprocess_function_calling(examples):
    texts = []
    answers = []
    for chat in examples["conversations"]:
        for turn in chat:
            # Rename ShareGPT keys (from/value) to chat-template keys (role/content).
            turn['role'] = turn.pop('from')
            turn['content'] = turn.pop('value')

            if turn['role'] == 'human':
                turn['role'] = 'user'
            if turn['role'] == 'gpt':
                turn['role'] = 'assistant'

        if chat[-1]['role'] == 'assistant':
            answers.append(chat[-1]['content'])
            del chat[-1]  # remove the last assistant turn; it becomes the reference answer

        texts.append(tokenizer.apply_chat_template(chat, tokenize=False))

    return {"texts": texts, "answer": answers}


texts = dataset.map(preprocess_function_calling, batched=True)

model = AutoModelForCausalLM.from_pretrained(model_dir).to("cuda")

records = []

for text in tqdm(texts):
    prompt = text['texts'] + "<|im_start|>assistant\n"
    inputs = tokenizer(prompt, return_tensors="pt").to("cuda")

    outputs = model.generate(**inputs, max_new_tokens=512)  # greedy decoding by default
    response = tokenizer.decode(outputs[0], skip_special_tokens=True)
    prompt_len = len(prompt)

    # Strip the prompt and the ChatML end-of-turn marker to keep only the model's answer.
    response1 = response[prompt_len:].replace("<|im_end|>\n", "").strip()
    response2 = text['answer']
    print(response1)

    records.append(rg.FeedbackRecord(fields={
        "prompt": prompt,
        "response1": response1,
        "response2": response2,
    }))

    # text['response'] = response

ds.add_records(records)
ds.push_to_argilla(name="function-calling", workspace="argilla")

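Note that running this script as published assumes a merged checkpoint at `scripts/lora-out/merged` and valid Argilla credentials in place of the placeholders; the `response1`/`response2` fields it pushes appear to correspond to `result1`/`result2` in the linked evaluation dataset.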
special_tokens_map.json ADDED
@@ -0,0 +1,24 @@
{
  "bos_token": {
    "content": "<s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "eos_token": {
    "content": "</s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": "</s>",
  "unk_token": {
    "content": "<unk>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
tokenizer.model ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347
size 499723
tokenizer_config.json ADDED
@@ -0,0 +1,45 @@
{
  "add_bos_token": true,
  "add_eos_token": false,
  "added_tokens_decoder": {
    "0": {
      "content": "<unk>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "1": {
      "content": "<s>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "2": {
      "content": "</s>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "bos_token": "<s>",
  "clean_up_tokenization_spaces": false,
  "eos_token": "</s>",
  "legacy": false,
  "model_max_length": 1000000000000000019884624838656,
  "pad_token": "</s>",
  "padding_side": "right",
  "sp_model_kwargs": {},
  "spaces_between_special_tokens": false,
  "tokenizer_class": "LlamaTokenizer",
  "trust_remote_code": false,
  "unk_token": "<unk>",
  "use_default_system_prompt": false,
  "use_fast": true,
  "chat_template": "{% if not add_generation_prompt is defined %}{% set add_generation_prompt = false %}{% endif %}{% for message in messages %}{{'<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n'}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant\n' }}{% endif %}"
}
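The `chat_template` entry is a ChatML-style template. As a quick illustration (repo id assumed, conversation hypothetical), applying it with `add_generation_prompt=True` renders a prompt like this:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gardner/TinyLlama-1.1B-SlimOrca-Function-Calling-3T")  # assumed repo id

msgs = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]
print(tok.apply_chat_template(msgs, tokenize=False, add_generation_prompt=True))
# Expected output:
# <|im_start|>system
# You are a helpful assistant.<|im_end|>
# <|im_start|>user
# Hello!<|im_end|>
# <|im_start|>assistant
```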