kanishka/smolm-autoreg-bpe-seed_555

	---
	base_model: models/smolm-autoreg-bpe-seed_555/config.json
	tags:
	- generated_from_trainer
	metrics:
	- accuracy
	model-index:
	- name: smolm-autoreg-bpe-seed_555
	results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# smolm-autoreg-bpe-seed_555

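	The Trainer lists `models/smolm-autoreg-bpe-seed_555/config.json` as the base model for this checkpoint. That is a local configuration path rather than a published checkpoint, so the model appears to have been initialized from that configuration. It was trained on an unknown dataset.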
	It achieves the following results on the evaluation set:
	- Loss: 2.2878
	- Accuracy: 0.5417
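
The evaluation loss above can be converted to perplexity, assuming it is the mean per-token cross-entropy in nats, as is usual for causal language models trained with the Trainer. The card itself does not report perplexity, so the following is a sketch under that assumption:

```python
import math

# Final evaluation loss as reported in this card.
eval_loss = 2.2878

# Assumption: the loss is mean per-token cross-entropy in nats,
# so perplexity is exp(loss).
perplexity = math.exp(eval_loss)
print(f"perplexity ~ {perplexity:.2f}")
```

Under that assumption, the perplexity comes out at roughly 9.85.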

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed
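
As a starting point, here is a minimal loading sketch. It assumes the checkpoint is published on the Hub as `kanishka/smolm-autoreg-bpe-seed_555` and loads as a causal language model through the `text-generation` pipeline. The helper name `generate_text`, the prompt, and the generation settings are illustrative and do not come from the original card.

```python
# Assumed Hub repository id for this checkpoint.
MODEL_ID = "kanishka/smolm-autoreg-bpe-seed_555"


def generate_text(prompt, max_new_tokens=20):
    """Greedy-decode a short continuation of `prompt` (illustrative helper)."""
    # Imported lazily so the module can be inspected without transformers installed.
    from transformers import pipeline

    generator = pipeline("text-generation", model=MODEL_ID)
    outputs = generator(prompt, max_new_tokens=max_new_tokens, do_sample=False)
    return outputs[0]["generated_text"]


# Example call (downloads the model on first use):
# print(generate_text("The cat sat on the"))
```

Greedy decoding (`do_sample=False`) is chosen here only so that repeated calls give the same output; it is not a recommendation from the model authors.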

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 0.003
	- train_batch_size: 64
	- eval_batch_size: 512
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- lr_scheduler_warmup_steps: 24000
	- num_epochs: 10.0
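
The warmup length listed above (24000 steps) exceeds the 8260 optimizer steps shown in the training results, so under a standard linear warmup the learning rate would still have been rising when training ended. The sketch below reproduces the usual linear-warmup multiplier in plain Python. Treating 8260 as the scheduler's total step count is an assumption, and `linear_warmup_lr` is an illustrative helper rather than a library function.

```python
def linear_warmup_lr(step, base_lr=0.003, warmup_steps=24000, total_steps=8260):
    """Learning rate after `step` scheduler updates: linear warmup, then linear decay."""
    if step < warmup_steps:
        # Warmup phase: scale linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: scale linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the end of epochs 1, 5, and 10 (steps taken from the results table).
for step in (826, 4130, 8260):
    print(step, f"{linear_warmup_lr(step):.6f}")
```

Under these assumptions, the learning rate at the final step is about 0.00103, roughly a third of the nominal 0.003.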

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Accuracy |
	|:-------------:|:-----:|:----:|:---------------:|:--------:|
	| 5.8796 | 1.0 | 826 | 3.1083 | 0.4611 |
	| 2.802 | 2.0 | 1652 | 2.7455 | 0.4965 |
	| 2.6268 | 3.0 | 2478 | 2.5734 | 0.5116 |
	| 2.4165 | 4.0 | 3304 | 2.4667 | 0.5211 |
	| 2.2892 | 5.0 | 4130 | 2.3949 | 0.5288 |
	| 2.2315 | 6.0 | 4956 | 2.3446 | 0.5338 |
	| 2.1587 | 7.0 | 5782 | 2.3208 | 0.5374 |
	| 2.1253 | 8.0 | 6608 | 2.3044 | 0.5394 |
	| 2.0858 | 9.0 | 7434 | 2.2940 | 0.5404 |
	| 2.0556 | 10.0 | 8260 | 2.2878 | 0.5417 |
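
The step counts in the results give a rough bound on the size of the training set. The arithmetic below assumes a single device and no gradient accumulation, so each optimizer step consumes one batch of 64 sequences. The card lists neither setting, so treat both as assumptions.

```python
steps_per_epoch = 826    # step count at epoch 1.0 in the results
train_batch_size = 64    # from the hyperparameter list
num_epochs = 10

# Assumption: one device and no gradient accumulation,
# so every optimizer step consumes exactly one batch.
max_sequences_per_epoch = steps_per_epoch * train_batch_size
# If the last batch of an epoch may be partial, it holds at least one sequence.
min_sequences_per_epoch = (steps_per_epoch - 1) * train_batch_size + 1
total_steps = steps_per_epoch * num_epochs

print(min_sequences_per_epoch, max_sequences_per_epoch, total_steps)
```

Under these assumptions, the training set holds between 52,801 and 52,864 sequences, and the run takes 8,260 steps in total, matching the final row of the results.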


	### Framework versions

	- Transformers 4.32.1
	- Pytorch 1.13.1+cu117
	- Datasets 2.12.0
	- Tokenizers 0.13.3
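
To check whether a local environment matches the versions listed above, a small standard-library sketch such as the following can help. The PyPI distribution names (`torch` for Pytorch, for example) and the helper `compare_versions` are assumptions introduced here for illustration.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions as listed in this card, keyed by assumed PyPI distribution name.
CARD_VERSIONS = {
    "transformers": "4.32.1",
    "torch": "1.13.1+cu117",
    "datasets": "2.12.0",
    "tokenizers": "0.13.3",
}


def compare_versions(expected=CARD_VERSIONS):
    """Return {package: (wanted, installed_or_None, matches)} for each listed package."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (wanted, installed, installed == wanted)
    return report


for package, (wanted, installed, matches) in compare_versions().items():
    print(f"{package}: card={wanted} installed={installed} match={matches}")
```

A mismatch does not necessarily prevent the model from loading; the comparison only indicates how far the environment differs from the one used for training.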