Warning: target device or backend do not support efficient int8_float16 computation.

#1
by AByrth - opened

Hi!

I'm having the following warning:

[ctranslate2] [thread 15416] [warning] The compute type inferred from the saved model is int8_float16, 
but the target device or backend do not support efficient int8_float16 computation. 
The model weights have been automatically converted to use the int8_float32 compute type instead.

My PC runs Windows 11 with a Ryzen 5600 and an RTX 3070. It should handle float16, right?
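
If it helps to diagnose, CTranslate2 itself can report which compute types it considers efficient on each device (a minimal check; if "int8_float16" is missing from the CUDA set, the warning above fires and the weights fall back to int8_float32):

import ctranslate2

# Compute types CTranslate2 can run efficiently on each backend.
print(ctranslate2.get_supported_compute_types("cuda"))
print(ctranslate2.get_supported_compute_types("cpu"))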

Wikipedia says that the RTX 3000 series supports bfloat16:

Third-generation Tensor Cores with FP16, bfloat16, TensorFloat-32 (TF32) and sparsity acceleration

Might be a driver issue? I'll have to look into it.

Yes, the NVIDIA 30 series is CUDA compute capability 8.6, and it supports both bfloat16 and float16.
It is not a show-stopper issue, as the lib works anyway.
But it would be nice to have it optimized, with gains in speed and memory usage.
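
A quick way to double-check the compute capability from Python (a small sketch assuming PyTorch is installed; nvidia-smi reports it too):

import torch

# Ampere (RTX 30 series) should report (8, 6), which includes
# Tensor Core support for both float16 and bfloat16.
if torch.cuda.is_available():
    major, minor = torch.cuda.get_device_capability(0)
    print(f"CUDA compute capability: {major}.{minor}")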

I don't have a 30 series GPU to test the model on, but can you try setting the compute_type?

import ctranslate2
translator = ctranslate2.Translator("ct2fast_m2m100_418M", device="cuda", compute_type="float16")

I'm not sure if this will help, but it's worth a try.
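
For reference, here's a minimal end-to-end sketch of the test (assuming the directory is the CTranslate2 conversion of facebook/m2m100_418M, with the tokenizer loaded from the original checkpoint):

import ctranslate2
from transformers import AutoTokenizer

# The tokenizer comes from the original model; the CTranslate2
# directory only contains the converted weights.
tokenizer = AutoTokenizer.from_pretrained("facebook/m2m100_418M")
tokenizer.src_lang = "en"

# Request float16 explicitly; CTranslate2 falls back (with a warning)
# if the device cannot run it efficiently.
translator = ctranslate2.Translator("ct2fast_m2m100_418M", device="cuda", compute_type="float16")

source = tokenizer.convert_ids_to_tokens(tokenizer.encode("Hello, world!"))
target_prefix = [tokenizer.lang_code_to_token["fr"]]
results = translator.translate_batch([source], target_prefix=[target_prefix])

# Drop the leading target-language token before decoding.
target = results[0].hypotheses[0][1:]
print(tokenizer.decode(tokenizer.convert_tokens_to_ids(target)))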

Rohith04 changed discussion status to closed
