kaitchup-toolboxes/Llama3


	---
	language:
	- en
	library_name: transformers
	extra_gated_prompt: >-
	  To gain access, [subscribe to The Kaitchup
	  Pro](https://newsletter.kaitchup.com/subscribe). You will receive an access
	  token for all the toolboxes in your welcome email. You can also purchase
	  access specifically for this repository on
	  [Gumroad](https://benjaminmarie.gumroad.com/l/llama-3-toolbox). Once you have
	  access, you can request help and suggest new notebooks through the
	  community tab.
	datasets:
	- mlabonne/orpo-dpo-mix-40k
	- HuggingFaceH4/ultrachat_200k
	---

	This toolbox includes 19 Jupyter notebooks specially optimized for [Llama 3.1](https://huggingface.co/collections/meta-llama/llama-31-669fc079a0c406a149a5738f) and [Llama 3.2](https://huggingface.co/collections/meta-llama/llama-32-66f448ffc8c32f949b04c8cf) LLMs. Logs of successful runs are also provided. More notebooks will be added regularly.

	Once you've subscribed to The Kaitchup Pro or purchased access, you can also request repository access here.

	To run the code in the toolbox, CUDA 12.4 and PyTorch 2.4 are recommended. PyTorch 2.5 may also work, but I haven't tested it yet.
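To check whether your environment matches the recommended versions, something like the following sketch can help. The helper names (`major_minor`, `matches_recommended`) are hypothetical and not part of the toolbox; the only assumption about PyTorch is that `torch.__version__` and `torch.version.cuda` report the installed build.

```python
import re

RECOMMENDED_TORCH = (2, 4)  # PyTorch 2.4, as recommended above
RECOMMENDED_CUDA = (12, 4)  # CUDA 12.4, as recommended above


def major_minor(version):
    """Extract (major, minor) from strings such as '2.4.1+cu124' or '12.4'."""
    match = re.match(r"(\d+)\.(\d+)", version or "")
    return (int(match.group(1)), int(match.group(2))) if match else None


def matches_recommended(torch_version, cuda_version):
    """Return True when both versions match the recommended major.minor pair."""
    return (major_minor(torch_version) == RECOMMENDED_TORCH
            and major_minor(cuda_version) == RECOMMENDED_CUDA)


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed")
    else:
        # torch.version.cuda is None on CPU-only builds
        print("PyTorch:", torch.__version__, "| CUDA:", torch.version.cuda)
        print("Matches recommended setup:",
              matches_recommended(torch.__version__, torch.version.cuda))
```

A mismatch does not necessarily mean the notebooks will fail; it only means the setup differs from the one recommended above.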

	# Toolbox content

	* Supervised Fine-Tuning with Chat Templates (6 notebooks)
	  * Full fine-tuning
	  * LoRA fine-tuning
	  * LoRA fine-tuning (with Llama 3.1/3.2 Instruct)
	  * Multi-GPU QLoRA/LoRA fine-tuning with FSDP (with Llama 3.1/3.2 Instruct)
	  * QLoRA fine-tuning with Bitsandbytes quantization
	  * QLoRA fine-tuning with AutoRound quantization
	  * LoRA and QLoRA fine-tuning with Unsloth

	* Preference Optimization (2 notebooks)
	  * DPO training with LoRA (TRL and Transformers)
	  * ORPO training with LoRA (TRL and Transformers)
	  * Multi-GPU QLoRA/LoRA DPO Training with FSDP

	* Quantization (3 notebooks)
	  * AWQ
	  * AutoRound
	  * GGUF for llama.cpp

	* Inference (4 notebooks)
	  * Transformers with and without a LoRA adapter
	  * vLLM offline and online inference
	  * Ollama
	  * llama.cpp

	* Merging (3 notebooks)
	  * Merge a LoRA adapter into the base model
	  * Merge a QLoRA adapter into the base model
	  * Merge several Llama 3.1/3.2 models into one with mergekit (not released yet)
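For readers new to the merging notebooks, here is a minimal sketch of what merging a LoRA adapter into a base weight matrix computes, using the standard LoRA formulation W' = W + (alpha / r) · B · A. It is written in plain Python on tiny matrices for illustration only. The function names are hypothetical, and this is not the code from the notebooks, which presumably rely on a library such as PEFT.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def merge_lora(weight, lora_a, lora_b, lora_alpha, r):
    """Fold a LoRA update into a base weight: W + (alpha / r) * B @ A.

    weight: out x in, lora_b: out x r, lora_a: r x in.
    """
    scaling = lora_alpha / r
    delta = matmul(lora_b, lora_a)  # low-rank update with the same shape as weight
    return [[w + scaling * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)]


# Toy example with rank r = 1 (values chosen only for illustration)
base = [[1.0, 0.0],
        [0.0, 1.0]]
A = [[1.0, 2.0]]    # r x in
B = [[0.5], [1.0]]  # out x r
merged = merge_lora(base, A, B, lora_alpha=2, r=1)
print(merged)
```

After merging, the adapter matrices are no longer needed at inference time: the merged weight produces the same output as the base weight plus the scaled adapter path.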