Update README.md
README.md
@@ -50,7 +50,7 @@ Press the [Open in Colab] button on the link to start Colab
 
 ### 利用可能なVersion(Available Versions)
 
-llama.cpp
+llama.cpp can be used to reduce the file size with various quantization methods, but this model handles only seven types. Smaller models can run faster with less memory, but model performance also drops. Around 4 bits (Q4_K_M) is said to be a good balance.
 Although llama.cpp can be used to reduce the size of the file with various quantization methods, this sample deals with only six types. Smaller models can run faster with less memory, but also reduce the performance of the models. 4 bits (Q4_K_M) is said to be a good balance.
 
 - C3TR-Adapter-IQ3_XXS.gguf 3.6GB
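The size/performance trade-off described in the changed paragraph can be sketched numerically: a quantized GGUF file is roughly parameter count × average bits per weight. A minimal sketch, assuming approximate bits-per-weight averages for the two quantization types named in the README and a roughly 9B-parameter base model (both figures are illustrative assumptions, not values stated in this file):

```python
# Rough GGUF size estimate: size_bytes ≈ n_params * bits_per_weight / 8.
# The bits-per-weight figures below are approximate averages for each
# llama.cpp quantization type (assumed for illustration, not exact).
BITS_PER_WEIGHT = {
    "IQ3_XXS": 3.06,
    "Q4_K_M": 4.85,
}

def estimate_size_gb(n_params: float, quant: str) -> float:
    """Estimated quantized file size in GB for a model with n_params weights."""
    return n_params * BITS_PER_WEIGHT[quant] / 8 / 1e9

# Assuming a ~9B-parameter base model (hypothetical figure):
for quant in BITS_PER_WEIGHT:
    print(f"{quant}: ~{estimate_size_gb(9e9, quant):.1f} GB")
```

With these assumptions, IQ3_XXS comes out near 3.4 GB, close to the 3.6 GB listed for C3TR-Adapter-IQ3_XXS.gguf, while Q4_K_M would be noticeably larger but is said to lose less quality.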