SQCU / greener-13-2-base-8bitB

	## Overview
	PEFT model trained on a strided raw-text dataset.
	Full training parameters are available in `training_parameters.json`.
	The most salient feature of the model is its training context window of 256 tokens: each training window overlaps the last 128 tokens of the previous window, so consecutive windows advance by 128 tokens.
	This setup produces qualitatively objectionable results at inference time.
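
The windowing described above can be sketched roughly as follows. This is a minimal illustration, not the actual preprocessing code used for this model: the function name `strided_windows` and its parameters are hypothetical, and the handling of the final short window is an assumption (the real behaviour may differ, see `training_parameters.json`).

```python
from typing import List


def strided_windows(token_ids: List[int], window: int = 256, overlap: int = 128) -> List[List[int]]:
    """Split a token sequence into overlapping windows.

    Each window starts with the last `overlap` tokens of the previous one.
    Hypothetical helper for illustration only.
    """
    if not 0 <= overlap < window:
        raise ValueError("overlap must satisfy 0 <= overlap < window")

    stride = window - overlap  # 256 - 128 = 128 tokens of new text per window
    windows: List[List[int]] = []
    for start in range(0, len(token_ids), stride):
        windows.append(token_ids[start:start + window])
        # Stop once a window has reached the end of the sequence.
        # Assumption: the final, possibly shorter, window is kept as-is.
        if start + window >= len(token_ids):
            break
    return windows


if __name__ == "__main__":
    tokens = list(range(600))  # stand-in for real token ids
    for w in strided_windows(tokens):
        print(w[0], w[-1], len(w))
```

With these settings every token except those at the edges of the text appears in two training windows.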

	---
	library_name: peft
	---
	## Training procedure


	The following `bitsandbytes` quantization config was used during training:
	- load_in_8bit: True
	- load_in_4bit: False
	- llm_int8_threshold: 6.0
	- llm_int8_skip_modules: None
	- llm_int8_enable_fp32_cpu_offload: False
	- llm_int8_has_fp16_weight: False
	- bnb_4bit_quant_type: fp4
	- bnb_4bit_use_double_quant: False
	- bnb_4bit_compute_dtype: float32
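
For reference, the settings listed above could be expressed as a `transformers` `BitsAndBytesConfig` roughly like this. It is a sketch under assumptions: the names `BNB_KWARGS` and `build_bnb_config` are hypothetical, the values are copied from the list above, and it assumes `transformers` and `bitsandbytes` are installed. The base model is not named in this card, so loading it is left out.

```python
# Quantization settings copied from the list above.
BNB_KWARGS = {
    "load_in_8bit": True,
    "load_in_4bit": False,
    "llm_int8_threshold": 6.0,
    "llm_int8_skip_modules": None,
    "llm_int8_enable_fp32_cpu_offload": False,
    "llm_int8_has_fp16_weight": False,
    "bnb_4bit_quant_type": "fp4",
    "bnb_4bit_use_double_quant": False,
    "bnb_4bit_compute_dtype": "float32",
}


def build_bnb_config():
    """Build a BitsAndBytesConfig from the settings above.

    The import is done lazily so the settings dict can be inspected
    without `transformers` being installed.
    """
    from transformers import BitsAndBytesConfig

    return BitsAndBytesConfig(**BNB_KWARGS)
```

The resulting object would typically be passed as `quantization_config` when loading the base model, before attaching this adapter with PEFT.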

	### Framework versions

	- PEFT 0.5.0.dev0