SQCU/greener-13-2-base-8bitD

	## OVERVIEW
	PEFT model trained on a raw text dataset split into strided windows.
	Full training parameters are available in `training_parameters.json`.
	The most salient feature of the model is its training context window of 1536 tokens, the maximum possible while holding all previous training parameters constant.
	Each training window overlaps the last 512 tokens of the previous window, so consecutive windows start 1024 tokens apart.
	This overlap ratio (one third of the window) is less unusual than that of the 'C' type model and produces less objectionable results in eval.
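
The windowing described above can be sketched roughly as follows. This is a minimal illustration, not the training script used for this model: the function name `strided_windows` and the handling of the final, shorter window are assumptions. Only the window size (1536) and the overlap (512) come from this card.

```python
def strided_windows(token_ids, window=1536, overlap=512):
    """Split a token sequence into overlapping windows.

    Hypothetical helper: each window starts `window - overlap` tokens after
    the previous one, so it begins with the last `overlap` tokens of the
    previous window. The final window may be shorter than `window`
    (an assumption; the actual training code may pad or drop it).
    """
    if not 0 <= overlap < window:
        raise ValueError("overlap must satisfy 0 <= overlap < window")
    stride = window - overlap  # 1536 - 512 = 1024 with the defaults
    windows = []
    start = 0
    while start < len(token_ids):
        windows.append(token_ids[start:start + window])
        if start + window >= len(token_ids):
            break  # this window already reaches the end of the data
        start += stride
    return windows


if __name__ == "__main__":
    tokens = list(range(4000))  # stand-in for real token ids
    chunks = strided_windows(tokens)
    print(len(chunks), [len(c) for c in chunks])
```

With the default values, every window after the first repeats 512 tokens of context and contributes up to 1024 new tokens.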

	---
	library_name: peft
	---
	## Training procedure


	The following `bitsandbytes` quantization config was used during training:
	- load_in_8bit: True
	- load_in_4bit: False
	- llm_int8_threshold: 6.0
	- llm_int8_skip_modules: None
	- llm_int8_enable_fp32_cpu_offload: False
	- llm_int8_has_fp16_weight: False
	- bnb_4bit_quant_type: fp4
	- bnb_4bit_use_double_quant: False
	- bnb_4bit_compute_dtype: float32
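
The settings listed above map onto the keyword arguments of `transformers.BitsAndBytesConfig`. The sketch below shows one way to rebuild that config. The names `BNB_KWARGS` and `build_bnb_config` are illustrative, and it assumes `transformers`, `torch`, and `bitsandbytes` are installed. The base model is not named in this card, so loading is left out.

```python
# Quantization settings copied from the list above.
BNB_KWARGS = {
    "load_in_8bit": True,
    "load_in_4bit": False,
    "llm_int8_threshold": 6.0,
    "llm_int8_skip_modules": None,
    "llm_int8_enable_fp32_cpu_offload": False,
    "llm_int8_has_fp16_weight": False,
    "bnb_4bit_quant_type": "fp4",
    "bnb_4bit_use_double_quant": False,
    "bnb_4bit_compute_dtype": "float32",
}


def build_bnb_config():
    """Build a BitsAndBytesConfig from BNB_KWARGS (hypothetical helper).

    Imports are deferred so the settings can be inspected without
    transformers or torch being installed.
    """
    import torch
    from transformers import BitsAndBytesConfig

    kwargs = dict(BNB_KWARGS)
    # Pass the compute dtype as a torch dtype rather than a string.
    kwargs["bnb_4bit_compute_dtype"] = torch.float32
    return BitsAndBytesConfig(**kwargs)
```

Because `load_in_4bit` is `False`, the `bnb_4bit_*` entries are just recorded defaults and should not affect 8-bit loading.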

	### Framework versions

	- PEFT 0.5.0.dev0