---
license: apache-2.0
base_model: distilgpt2
tags:
- generated_from_trainer
- gpt
model-index:
- name: gpt2_dolly_lite
  results: []
datasets:
- tatsu-lab/alpaca
language:
- en
metrics:
- accuracy
pipeline_tag: text2text-generation
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You should
probably proofread and complete it, then remove this comment. -->

# gpt2_dolly_lite

This model is a fine-tuned version of [distilgpt2](https://huggingface.co/distilgpt2). The Trainer recorded the dataset as unknown; this card's metadata lists `tatsu-lab/alpaca`.
It achieves the following results on the evaluation set:
- Loss: 2.4067

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.001
- train_batch_size: 8
- eval_batch_size: 32
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 2.708         | 1.0   | 1300 | 2.5611          |
| 2.1768        | 2.0   | 2600 | 2.4149          |
| 1.7189        | 3.0   | 3900 | 2.4067          |

### Usage

```
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

# The fine-tuned model reuses the base distilgpt2 tokenizer.
MODEL = 'distilgpt2'

tokenizer = AutoTokenizer.from_pretrained(MODEL)
# GPT-2 has no pad token, so reuse the end-of-sequence token for padding.
tokenizer.pad_token = tokenizer.eos_token


def respond(instruction, generator, _input=None, verbose=False, **options):
    """Build an instruction prompt, sample three completions, and print each response."""
    if not _input:
        prompt = (
            'Below is an instruction that describes a task. '
            'Write a response that appropriately completes the request.\n\n'
            f'### Instruction:\n{instruction}\n\n### Response:\n'
        )
    else:
        prompt = (
            'Below is an instruction that describes a task, paired with an input '
            'that provides further context. '
            'Write a response that appropriately completes the request.\n\n'
            f'### Instruction:\n{instruction}\n\n### Input: {_input}\n\n### Response:\n'
        )
    if verbose:
        print(prompt)

    generated_texts = generator(
        prompt,
        do_sample=True,  # sampling is required for temperature and multiple return sequences
        num_return_sequences=3,
        temperature=options.get('temperature', 0.7),
        max_new_tokens=options.get('max_new_tokens', 128)
    )
    for generated_text in generated_texts:
        # The pipeline returns prompt + completion; keep only the text after the response marker.
        print(generated_text['generated_text'].split('### Response:\n')[1])
        print('----')


loaded_model = AutoModelForCausalLM.from_pretrained('Andyrasika/gpt2_dolly_lite')
dolly_lite = pipeline('text-generation', model=loaded_model, tokenizer=tokenizer)

respond(
    'Write me an email to my boss, telling her I quit because I made a cool LLM.',
    dolly_lite
)
```

### Framework versions

- Transformers 4.32.1
- Pytorch 2.0.1+cu118
- Datasets 2.14.4
- Tokenizers 0.13.3
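
### Worked example: reading the training numbers

The hyperparameters and results table above can be cross-checked with a little arithmetic. The sketch below is not part of the original training script. It assumes a single training device, that each step in the table is one optimizer update, and that the reported loss is the mean per-token cross-entropy in nats (the usual Trainer behavior for causal language models). Under those assumptions it derives the effective batch size, a rough upper bound on the examples seen per epoch, and the perplexity implied by the final validation loss. The variable names are illustrative only.

```python
import math

# Values copied from the hyperparameter list and results table in this card.
train_batch_size = 8
gradient_accumulation_steps = 4
steps_per_epoch = 1300          # step count at epoch 1.0
final_validation_loss = 2.4067  # validation loss at epoch 3.0

# Assumption: one device, so the effective batch is per-device batch x accumulation steps.
effective_batch_size = train_batch_size * gradient_accumulation_steps

# Assumption: each step is one optimizer update; the last batch of an epoch may be partial,
# so this is an upper bound on examples per epoch, not an exact dataset size.
max_examples_per_epoch = steps_per_epoch * effective_batch_size

# Assumption: the loss is mean per-token cross-entropy in nats, so perplexity = exp(loss).
perplexity = math.exp(final_validation_loss)

print(f'effective batch size:      {effective_batch_size}')
print(f'examples per epoch (max):  {max_examples_per_epoch}')
print(f'validation perplexity:     {perplexity:.2f}')
```

The effective batch size matches the `total_train_batch_size: 32` entry in the hyperparameter list, which is consistent with the single-device assumption.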