how is it used?

#48

by Codigoabierto - opened Mar 19, 2024

Discussion

Codigoabierto

Mar 19, 2024

When entering the test an error appears.

mahiatlinux

Mar 19, 2024

@Codigoabierto It isn't a test for inferencing.

Codigoabierto

Mar 19, 2024

Will they place one to test the operation?

mahiatlinux

Mar 19, 2024

Will they place one to test the operation?

The model has 314B parameters. I don't think so yet, as this is still a raw model. They need to quantise it, as the model is around 228 GB.

MrRaja

Mar 20, 2024

If someone creates a 4-bit quantized model of the 314B, it would be around 60 GB, which means it will still need around that much VRAM at minimum, right?

Considering that you need about 30-40 GB of VRAM to run inference with a 70B model.

mahiatlinux

Mar 20, 2024

If someone creates a 4-bit quantized model of the 314B, it would be around 60 GB, which means it will still need around that much VRAM at minimum, right?

Considering that you need about 30-40 GB of VRAM to run inference with a 70B model.

Yes, probably. Probably even more VRAM.
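
For anyone who wants to sanity-check the numbers in this thread, here is a rough back-of-the-envelope sketch. It assumes that memory is dominated by the weights alone (parameters × bits per parameter ÷ 8) and ignores the KV cache, activations, and any per-block quantization overhead, so real usage would be higher. The helper name `weight_memory_gb` is made up for illustration and does not come from any library.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate size of the weights alone, in decimal gigabytes.

    Assumption: every parameter is stored at the same bit width and nothing
    else (KV cache, activations, quantization scales) is counted.
    """
    return num_params * bits_per_param / 8 / 1e9


if __name__ == "__main__":
    # Parameter counts taken from the discussion: a 70B model and the 314B model.
    for name, params in [("70B", 70e9), ("314B", 314e9)]:
        for bits in (16, 8, 4):
            size = weight_memory_gb(params, bits)
            print(f"{name} @ {bits}-bit: ~{size:.0f} GB")
```

Under these assumptions, a 70B model at 4-bit comes out at about 35 GB, which matches the 30-40 GB figure quoted above. The same arithmetic for 314B at 4-bit gives roughly 157 GB rather than 60 GB, so "probably even more VRAM" seems like the safer expectation.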
