aria-dev committed on
Commit
021f586
1 Parent(s): 5c6db29

update readme

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -12,7 +12,7 @@ tags:
  <br>Aria</br>
  </p> -->
 
- This is a fork of the [rhymes-ai/Aria](https://huggingface.co/rhymes-ai/Aria) model. The main modification involves replacing [grouped GEMM](https://github.com/tgale96/grouped_gemm) with a sequential MLP. In this configuration, each expert is implemented as a `torch.nn.Linear` layer executed in sequence. This adjustment simplifies quantization with current open-source libraries, which are optimized for `nn.Linear` layers.
+ This is a fork of the [rhymes-ai/Aria](https://huggingface.co/rhymes-ai/Aria) model. The only modification is replacing [grouped GEMM](https://github.com/tgale96/grouped_gemm) with a sequential MLP. In this configuration, each expert is implemented as a `torch.nn.Linear` layer executed in sequence. This adjustment simplifies quantization with current open-source libraries, which are optimized for `nn.Linear` layers.
 
  While the sequential MLP approach aids in easier quantization, using grouped GEMM provides the advantage of faster inference speed.
 
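
For readers of this change, the idea in the README paragraph can be sketched roughly as follows. This is a minimal illustration, not the fork's actual code: the class name `SequentialExperts`, the helper `batched_reference`, the single-expert-per-token routing, and all shapes are assumptions made for the example. The batched helper uses plain `torch.bmm` as a stand-in for a fused kernel; it does not call the grouped_gemm library.

```python
import torch
import torch.nn as nn


class SequentialExperts(nn.Module):
    """Hypothetical sketch: one nn.Linear per expert, run one after another."""

    def __init__(self, num_experts: int, d_in: int, d_out: int):
        super().__init__()
        self.d_out = d_out
        # Each expert is an ordinary nn.Linear, so tools that look for
        # nn.Linear modules can find every expert weight.
        self.experts = nn.ModuleList(
            [nn.Linear(d_in, d_out, bias=False) for _ in range(num_experts)]
        )

    def forward(self, x: torch.Tensor, expert_ids: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_in); expert_ids: (tokens,) expert index per token
        # (assumed top-1 routing, chosen only to keep the sketch short).
        out = x.new_zeros(x.shape[0], self.d_out)
        for idx, expert in enumerate(self.experts):
            mask = expert_ids == idx
            if mask.any():
                out[mask] = expert(x[mask])  # experts run sequentially
        return out


def batched_reference(x, expert_ids, weights):
    """Same math in one batched matmul (stand-in for a fused kernel)."""
    # weights: (num_experts, d_out, d_in), gathered per token.
    per_token_w = weights[expert_ids]  # (tokens, d_out, d_in)
    return torch.bmm(per_token_w, x.unsqueeze(-1)).squeeze(-1)


torch.manual_seed(0)
moe = SequentialExperts(num_experts=4, d_in=8, d_out=16)
x = torch.randn(10, 8)
expert_ids = torch.randint(0, 4, (10,))

seq_out = moe(x, expert_ids)
stacked = torch.stack([e.weight for e in moe.experts])
ref_out = batched_reference(x, expert_ids, stacked)

# Every expert is visible as a plain nn.Linear module.
linear_layers = [m for m in moe.modules() if isinstance(m, nn.Linear)]
```

Under these assumptions the loop and the batched path compute the same values, which mirrors the trade-off stated in the README: the per-expert `nn.Linear` layout is easier for quantization tooling to handle, while a fused grouped kernel avoids the Python-level loop and is faster at inference.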