---
license: mit
---
# 🔥 MoE-Mixtral-7B-8Expert
[mixtral-8x7b](https://huggingface.co/someone13574/mixtral-8x7b-32kseqlen) is a Mixture-of-Experts (MoE) model.
[LLaMA2-Accessory](https://github.com/Alpha-VLLM/LLaMA2-Accessory) now supports its inference and finetuning.
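
If you want to experiment locally, the raw checkpoint can be fetched from the Hub with `huggingface_hub` before following the tutorial below. This is only a minimal sketch; the target directory is an arbitrary choice, not a path required by LLaMA2-Accessory:

```python
# Minimal sketch: download the raw mixtral-8x7b checkpoint from the Hugging Face Hub.
# The local_dir value is an arbitrary example path.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="someone13574/mixtral-8x7b-32kseqlen",
    local_dir="./mixtral-8x7b-32kseqlen",
)
```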

## 🚀 Features
With LLaMA2-Accessory, mixtral-8x7b enjoys the following features:
1. Distributed MoE (i.e., instantiating experts across multiple processes/GPUs)
2. Load Balancing Loss (see the sketch after this list)
3. Tensor Parallelism and FSDP for efficient training
4. Distributed and/or quantized inference
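
For illustration, below is a minimal sketch of a Switch-Transformer-style auxiliary load-balancing loss as commonly used for MoE routers. The function name, shapes, and the top-2 routing default are assumptions for this example, not LLaMA2-Accessory's actual implementation:

```python
import torch
import torch.nn.functional as F

def load_balancing_loss(router_logits: torch.Tensor, num_experts: int, top_k: int = 2) -> torch.Tensor:
    """Auxiliary loss encouraging a uniform token-to-expert assignment.

    router_logits: (num_tokens, num_experts) raw router outputs.
    """
    probs = F.softmax(router_logits, dim=-1)                    # per-token routing probabilities
    _, selected = torch.topk(probs, top_k, dim=-1)              # top-k expert indices per token
    mask = F.one_hot(selected, num_experts).float().sum(dim=1)  # (num_tokens, num_experts) dispatch mask
    tokens_per_expert = mask.mean(dim=0)                        # fraction of tokens routed to each expert
    prob_per_expert = probs.mean(dim=0)                         # mean routing probability per expert
    return num_experts * torch.sum(tokens_per_expert * prob_per_expert)

# Example: random logits for 16 tokens over 8 experts
loss = load_balancing_loss(torch.randn(16, 8), num_experts=8)
```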

## 🔥 Online Demo
We host a web demo [💻here](http://106.14.127.192/), showcasing a mixtral-8x7b model finetuned on
[evol-codealpaca-v1](https://huggingface.co/datasets/theblackcat102/evol-codealpaca-v1) and
[ultrachat_200k](https://huggingface.co/datasets/HuggingFaceH4/ultrachat_200k) with LoRA and bias tuning.

## 💡 Tutorial
A detailed tutorial is available in our [documentation](https://llama2-accessory.readthedocs.io/en/latest/projects/mixtral-8x7b.html).