---
title: quantized-LLM comparison
emoji: 💬
colorFrom: yellow
colorTo: purple
sdk: gradio
sdk_version: 5.0.1
app_file: app.py
pinned: false
short_description: Fine-tuned Llama-3.2-1B-Instruct with different quantizations
---

An example chatbot using [Gradio](https://gradio.app), [`huggingface_hub`](https://huggingface.co/docs/huggingface_hub/v0.22.2/en/index), and the [Hugging Face Inference API](https://huggingface.co/docs/api-inference/index).
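
A minimal sketch of how such a chatbot can be wired together. The model id and system prompt below are placeholders, not necessarily what this Space's `app.py` uses:

```python
import gradio as gr
from huggingface_hub import InferenceClient

# Placeholder model id; the Space serves fine-tuned Llama-3.2-1B-Instruct variants.
client = InferenceClient("meta-llama/Llama-3.2-1B-Instruct")

def respond(message, history):
    # Rebuild the conversation as role/content messages for the chat API.
    messages = [{"role": "system", "content": "You are a helpful assistant."}]
    for user_msg, bot_msg in history:
        messages.append({"role": "user", "content": user_msg})
        messages.append({"role": "assistant", "content": bot_msg})
    messages.append({"role": "user", "content": message})

    # Stream tokens back to the UI as the Inference API produces them.
    response = ""
    for chunk in client.chat_completion(messages, max_tokens=512, stream=True):
        response += chunk.choices[0].delta.content or ""
        yield response

demo = gr.ChatInterface(respond)

if __name__ == "__main__":
    demo.launch()
```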

### [HuggingFace Space with Quantized LLMs](https://huggingface.co/spaces/Robzy/llm)

**Baseline model**: Llama-3.2-1B-Instruct with 4-bit quantization
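
Loading the model in 4-bit with Unsloth looks roughly like this (a sketch; the exact repo id and `max_seq_length` are assumptions):

```python
from unsloth import FastLanguageModel

# Load the base model with 4-bit (bitsandbytes) quantization so it fits
# comfortably on a single Tesla T4.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Llama-3.2-1B-Instruct",  # assumed repo id
    max_seq_length=2048,                         # assumed context length
    load_in_4bit=True,
)
```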

**Training infrastructure**:

* Google Colab with an NVIDIA Tesla T4 GPU
* Parameter-efficient fine-tuning (PEFT) via low-rank adaptation (LoRA), using Unsloth and Hugging Face's supervised fine-tuning (SFT) libraries (see the sketch after this list).
* Weights & Biases for training monitoring and model checkpointing, with a checkpoint saved every 10 steps.

**Fine-tuning details**

**Datasets**:

* [Code instructions Alpaca 120k](https://huggingface.co/datasets/iamtarun/code_instructions_120k_alpaca)
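
The dataset can be pulled and rendered into prompt strings with `datasets`; the Alpaca-style template below is an assumption about the exact formatting used:

```python
from datasets import load_dataset

dataset = load_dataset("iamtarun/code_instructions_120k_alpaca", split="train")

def to_text(example):
    # Render each row into a single Alpaca-style prompt string.
    return {
        "text": (
            f"### Instruction:\n{example['instruction']}\n\n"
            f"### Input:\n{example['input']}\n\n"
            f"### Response:\n{example['output']}"
        )
    }

# Adds a "text" column consumed by the trainer sketched above.
dataset = dataset.map(to_text)
```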
|