## What Works

| Loader         | Loading 1 LoRA | Loading 2 or more LoRAs | Training LoRAs | Multimodal extension | Perplexity evaluation |
|----------------|----------------|-------------------------|----------------|----------------------|-----------------------|
| Transformers   |       ✅       |           ✅\*\*        |       ✅\*     |          ✅          |           ✅          |
| llama.cpp      |       ❌       |           ❌            |       ❌       |          ❌          |    use llamacpp_HF    |
| llamacpp_HF    |       ❌       |           ❌            |       ❌       |          ❌          |           ✅          |
| ExLlamav2_HF   |       ✅       |           ✅            |       ❌       |          ❌          |           ✅          |
| ExLlamav2      |       ✅       |           ✅            |       ❌       |          ❌          |   use ExLlamav2_HF    |
| AutoGPTQ       |       ✅       |           ❌            |       ❌       |          ✅          |           ✅          |
| AutoAWQ        |       ?        |           ❌            |       ?        |          ?           |           ✅          |
| HQQ            |       ?        |           ?             |       ?        |          ?           |           ✅          |

❌ = not implemented

✅ = implemented

\* Training LoRAs with GPTQ models also works with the Transformers loader. Make sure to check "auto-devices" and "disable_exllama" before loading the model.

\*\* Multi-LoRA in PEFT is tricky and the current implementation does not work reliably in all cases.
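
For scripted checks, the table above can be mirrored as a small lookup. This is only an illustrative sketch: the names `FEATURES`, `SUPPORT`, and `loaders_supporting` are hypothetical and are not part of the project's code. Cells that redirect to another loader (for example "use llamacpp_HF") are treated here as "no" for the loader itself, and "?" cells as "unknown"; both choices are assumptions.

```python
# Hypothetical encoding of the "What Works" table; not part of the project's code.
YES, NO, UNKNOWN = "yes", "no", "unknown"

# Column order follows the table headers.
FEATURES = (
    "load_one_lora",
    "load_multiple_loras",
    "train_loras",
    "multimodal",
    "perplexity",
)

SUPPORT = {
    "Transformers": (YES, YES, YES, YES, YES),
    "llama.cpp": (NO, NO, NO, NO, NO),          # perplexity: use llamacpp_HF
    "llamacpp_HF": (NO, NO, NO, NO, YES),
    "ExLlamav2_HF": (YES, YES, NO, NO, YES),
    "ExLlamav2": (YES, YES, NO, NO, NO),        # perplexity: use ExLlamav2_HF
    "AutoGPTQ": (YES, NO, NO, YES, YES),
    "AutoAWQ": (UNKNOWN, NO, UNKNOWN, UNKNOWN, YES),
    "HQQ": (UNKNOWN, UNKNOWN, UNKNOWN, UNKNOWN, YES),
}


def loaders_supporting(feature):
    """Return the loaders marked as implemented for the given feature."""
    idx = FEATURES.index(feature)
    return [name for name, row in SUPPORT.items() if row[idx] == YES]
```

For example, `loaders_supporting("train_loras")` returns only the Transformers loader, matching the "Training LoRAs" column.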