---
license: openrail
datasets:
- codeparrot/github-jupyter-code-to-text
library_name: transformers
tags:
- code
---

# Santacoder code-to-text

This model is a fine-tuned version of [bigcode/santacoder](https://huggingface.co/bigcode/santacoder) on [codeparrot/github-jupyter-code-to-text](https://huggingface.co/datasets/codeparrot/github-jupyter-code-to-text).

## Training procedure

The model was trained for 3 hours on 4 A100 GPUs with the following hyperparameters:

- learning_rate: 5e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 4
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 100
- training_steps: 800
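
For reference, a minimal sketch of how these hyperparameters map onto `transformers.TrainingArguments`; the `output_dir` and anything not listed above are placeholders, not the values actually used:

```python
from transformers import TrainingArguments

# Sketch reconstructing the listed hyperparameters; output_dir is a
# hypothetical placeholder, and unlisted options keep library defaults.
training_args = TrainingArguments(
    output_dir="santacoder-code-to-text",  # hypothetical path
    learning_rate=5e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    gradient_accumulation_steps=4,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=100,
    max_steps=800,
)
```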
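
## How to use

A minimal generation sketch using the standard `transformers` causal-LM API (SantaCoder checkpoints require `trust_remote_code=True`); the prompt template below is an illustration, since the exact code-to-text format depends on how the training data was serialized:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

checkpoint = "<this-repo-id>"  # replace with this model's Hub id
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint, trust_remote_code=True)

# Illustrative prompt: code followed by a cue asking for an explanation.
prompt = "def add(a, b):\n    return a + b\n\n# Explanation of the above function:\n"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```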