---
license: bigcode-openrail-m
library_name: peft
tags:
- trl
- sft
- generated_from_trainer
base_model: bigcode/starcoder2-3b
model-index:
- name: finetunedPHP_starcoder2
  results: []
datasets:
- bigcode/the-stack-smol
language:
- en
---

# finetunedPHP_starcoder2

This model is a fine-tuned version of [bigcode/starcoder2-3b](https://huggingface.co/bigcode/starcoder2-3b) on [bigcode/the-stack-smol](https://huggingface.co/datasets/bigcode/the-stack-smol).

## Model description

`finetunedPHP_starcoder2` is based on the `starcoder2-3b` architecture and is fine-tuned on the PHP subset of the `the-stack-smol` dataset. It is intended for code generation tasks related to PHP programming.

## Intended uses & limitations

The model is suitable for generating PHP code snippets, including code completion, syntax suggestions, and general code generation. It may struggle with complex or domain-specific code, and users should verify generated code for correctness and security before using it.

## Training and evaluation data

The model was trained on PHP source files from the `data/php` subset of `the-stack-smol`, a small sample of permissively licensed code from public repositories.

## Training procedure

The fine-tuning pipeline follows the steps below; a hedged code sketch is given after the list.

**1. Data and model preparation:**
- Load the `bigcode/the-stack-smol` dataset and extract the PHP samples (`data/php`) for training.
- Use the `starcoder2-3b` base model from the Hugging Face Hub, pre-trained on a wide range of programming languages including PHP.
- Load the model with 4-bit quantization for memory-efficient computation.

**2. Data processing:**
- Tokenize the PHP code snippets with the model's tokenizer.
- Clean the code by removing comments and normalizing indentation.
- Prepare input examples that match the model's architecture and training objective.

**3. Training configuration:**
- Initialize a Trainer for fine-tuning with the Transformers library.
- Define training parameters, including:
  - learning rate, optimizer, and scheduler settings;
  - gradient accumulation steps to balance memory usage;
  - the loss function (cross-entropy for causal language modeling);
  - metrics for evaluating model performance.
- Specify GPU utilization for accelerated training and handle distributed training when multiple processes are used.

**4. Model training:**
- Train for the specified number of steps, iterating over batches of preprocessed PHP examples.
- Feed examples to the model, compute predictions, and calculate the loss against the target tokens.
- Update the model weights by backpropagating the gradients.

**5. Evaluation (optional):**
- Periodically assess the model on a validation set.
- Measure metrics such as code-completion accuracy or perplexity.
- Monitor training live with wandb and adjust hyperparameters if necessary.

**6. Save the fine-tuned model:**
- Store the optimized adapter weights and configuration in the designated `output_dir`.

**7. Model sharing (optional):**
- Create a model card documenting the fine-tuning process and model specifications.
- Share `finetunedPHP_starcoder2` on the Hugging Face Hub for broader accessibility and collaboration.
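Below is a minimal sketch of this pipeline using `transformers`, `peft`, `bitsandbytes`, and `trl`. It is illustrative rather than the exact training script: the LoRA rank/alpha, target modules, sequence length, and output paths are assumptions, the hyperparameters mirror the table in the next section, and the exact `SFTTrainer` keyword arguments depend on the installed `trl` version.

```python
# Minimal fine-tuning sketch (not the exact script used for this model).
# LoRA settings, max_seq_length, and output paths below are illustrative assumptions.
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
)
from trl import SFTTrainer

BASE_MODEL = "bigcode/starcoder2-3b"

# 1. Data: the PHP subset of the-stack-smol.
dataset = load_dataset("bigcode/the-stack-smol", data_dir="data/php", split="train")

# 1. Model: load the base model with 4-bit quantization (bitsandbytes).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL, quantization_config=bnb_config, device_map="auto"
)

# PEFT LoRA adapter; rank, alpha, and target modules are illustrative choices.
peft_config = LoraConfig(
    r=8,
    lora_alpha=32,
    lora_dropout=0.1,
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

# 3. Training configuration mirroring the hyperparameters listed in the next section.
args = TrainingArguments(
    output_dir="finetunedPHP_starcoder2",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_steps=100,
    max_steps=1000,
    seed=0,
    fp16=True,            # native AMP mixed precision
    logging_steps=25,
    report_to="wandb",    # optional live metric monitoring
)

# 2 & 4. SFTTrainer tokenizes the "content" column and runs the training loop.
trainer = SFTTrainer(
    model=model,
    args=args,
    train_dataset=dataset,
    peft_config=peft_config,
    dataset_text_field="content",  # the-stack-smol stores source code in "content"
    max_seq_length=1024,
    tokenizer=tokenizer,
)
trainer.train()

# 6. Save the adapter weights and configuration.
trainer.save_model("finetunedPHP_starcoder2")
```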
### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 1
- eval_batch_size: 8
- seed: 0
- gradient_accumulation_steps: 4
- total_train_batch_size: 4
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 100
- training_steps: 1000
- mixed_precision_training: Native AMP

### Training results

Training results and performance metrics are available in this repository.

### Framework versions

- PEFT 0.8.2
- Transformers 4.40.0.dev0
- Pytorch 2.2.1+cu121
- Datasets 2.18.0
- Tokenizers 0.15.2
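### Example usage

The following is a hedged sketch of how a PEFT adapter such as this one can be loaded on top of the base model for PHP completion; the adapter repository id and the prompt are placeholders, not values taken from this card.

```python
# Hedged usage sketch; ADAPTER_ID and the prompt are placeholders.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE_MODEL = "bigcode/starcoder2-3b"
ADAPTER_ID = "finetunedPHP_starcoder2"  # replace with the adapter's actual Hub repo id

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(model, ADAPTER_ID)

prompt = "<?php\nfunction slugify(string $title): string {\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```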