Pranabit/finetune_starcoder2_3b

	---
	license: bigcode-openrail-m
	library_name: peft
	tags:
	- trl
	- sft
	- generated_from_trainer
	base_model: bigcode/starcoder2-3b
	model-index:
	- name: finetune_starcoder2_3b
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# finetune_starcoder2_3b

	This model is a fine-tuned version of [bigcode/starcoder2-3b](https://huggingface.co/bigcode/starcoder2-3b) on an unknown dataset.

	## Model description

	The base model, StarCoder2-3B, is a 3B-parameter model trained on 17 programming languages from The Stack v2, with opt-out requests excluded. It uses Grouped Query Attention and a 16,384-token context window with 4,096-token sliding window attention, and it was trained with the Fill-in-the-Middle objective on more than 3 trillion tokens.
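To make the two window sizes concrete, here is a minimal sketch of which positions a token can attend to under causal sliding window attention. It assumes the 4,096-token window counts the current token; the boundary convention in the actual StarCoder2 implementation may differ by one. The helper name `visible_positions` is hypothetical and only for illustration.

```python
CONTEXT_WINDOW = 16_384  # maximum sequence length stated above
SLIDING_WINDOW = 4_096   # attention window stated above


def visible_positions(query_pos: int, window: int = SLIDING_WINDOW) -> tuple[int, int]:
    """Return the inclusive (first, last) key positions visible to query_pos.

    Assumption: the window includes the query token itself, so each token
    sees at most `window` positions, and never any later position (causal).
    """
    if not 0 <= query_pos < CONTEXT_WINDOW:
        raise ValueError("query_pos must lie inside the context window")
    first = max(0, query_pos - window + 1)
    return first, query_pos


if __name__ == "__main__":
    for pos in (0, 4_095, 4_096, 16_383):
        print(pos, visible_positions(pos))
```

Under this convention, the last token of a full 16,384-token input attends directly only to the final 4,096 positions. Earlier context can still influence it indirectly through the stacked layers.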

	## Intended uses & limitations

	finetune_starcoder2_3b is a fine-tuned model that generates JavaScript code snippets for tasks such as code completion, syntax suggestion, and code generation. It is limited in its ability to generate complex code, so users should review the generated code and use it at their own risk.
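A minimal usage sketch follows. It assumes the repository holds a PEFT adapter (as `library_name: peft` in the metadata suggests) published under the Hub id `Pranabit/finetune_starcoder2_3b`, and that the base tokenizer is unchanged. The function name `generate_javascript` and the default generation settings are illustrative, not part of the original card.

```python
BASE_MODEL_ID = "bigcode/starcoder2-3b"
ADAPTER_ID = "Pranabit/finetune_starcoder2_3b"  # assumed Hub repo id for the adapter


def generate_javascript(prompt: str, max_new_tokens: int = 64) -> str:
    """Complete a JavaScript prompt with the base model plus the adapter."""
    # Imported lazily so that defining this function does not require the
    # heavy dependencies to be installed.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID)
    base_model = AutoModelForCausalLM.from_pretrained(BASE_MODEL_ID)
    # Attach the fine-tuned adapter weights on top of the base model.
    model = PeftModel.from_pretrained(base_model, ADAPTER_ID)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_javascript("function sum(values) {")` downloads both the base model and the adapter on first use. Review the returned code before running it.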

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 0.0002
	- train_batch_size: 1
	- eval_batch_size: 8
	- seed: 0
	- gradient_accumulation_steps: 4
	- total_train_batch_size: 4
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: cosine
	- lr_scheduler_warmup_steps: 100
	- training_steps: 500
	- mixed_precision_training: Native AMP
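
The values above determine the effective batch size and the learning-rate curve. The sketch below recomputes both in plain Python, assuming a single training device and a linear warmup followed by a half-cosine decay to zero, which is the usual shape of a `cosine` schedule with warmup. It approximates the schedule and is not the Trainer's own code.

```python
import math

LEARNING_RATE = 2e-4
WARMUP_STEPS = 100
TRAINING_STEPS = 500
TRAIN_BATCH_SIZE = 1
GRAD_ACCUM_STEPS = 4
NUM_DEVICES = 1  # assumption: the card does not state the device count


def effective_batch_size() -> int:
    """Sequences contributing to each optimizer step."""
    return TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_DEVICES


def lr_at(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then cosine decay."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (TRAINING_STEPS - WARMUP_STEPS)
    return LEARNING_RATE * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size())
    for step in (0, 50, 100, 300, 500):
        print(step, lr_at(step))
```

With these settings, 500 optimizer steps at an effective batch size of 4 cover 2,000 training sequences in total.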

	### Training results



	### Framework versions

	- PEFT 0.10.0
	- Transformers 4.40.0.dev0
	- PyTorch 2.2.1+cu121
	- Datasets 2.18.0
	- Tokenizers 0.15.2