Warning: target device or backend do not support efficient int8_float16 computation.

#1
by AByrth - opened

Hi!

I'm having the following warning:

[ctranslate2] [thread 15416] [warning] The compute type inferred from the saved model is int8_float16, 
but the target device or backend do not support efficient int8_float16 computation. 
The model weights have been automatically converted to use the int8_float32 compute type instead.

My PC runs Windows 11 with a Ryzen 5600 and an RTX 3070. It should handle float16, right?
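
If it helps to diagnose, CTranslate2 itself can report which compute types it considers efficient on each device (a minimal check; if "int8_float16" is missing from the CUDA set, the warning above fires and the weights fall back to int8_float32):

import ctranslate2

# Compute types CTranslate2 can run efficiently on each backend.
print(ctranslate2.get_supported_compute_types("cuda"))
print(ctranslate2.get_supported_compute_types("cpu"))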

Wikipedia says that the RTX 3000 series supports bfloat16:

Third-generation Tensor Cores with FP16, bfloat16, TensorFloat-32 (TF32) and sparsity acceleration

Might be a driver issue? I'll have to look into it.

Yes, the NVIDIA 30 series is CUDA compute capability 8.6, and it supports both bfloat16 and float16.
It is not a show-stopper issue, as the lib works anyway.
But it would be nice to have it optimized, with gains in speed and memory usage.
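
A quick way to double-check the compute capability from Python (a small sketch assuming PyTorch is installed; nvidia-smi reports it too):

import torch

# Ampere (RTX 30 series) should report (8, 6), which includes
# Tensor Core support for both float16 and bfloat16.
if torch.cuda.is_available():
    major, minor = torch.cuda.get_device_capability(0)
    print(f"CUDA compute capability: {major}.{minor}")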

I don't have a 30 series GPU to test the model on, but can you try setting the compute_type?

import ctranslate2
translator = ctranslate2.Translator("ct2fast_m2m100_418M", device="cuda", compute_type="float16")

I'm not sure if this will help, but it's worth a try.
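
For reference, here's a minimal end-to-end sketch of the test (assuming the directory is the CTranslate2 conversion of facebook/m2m100_418M, with the tokenizer loaded from the original checkpoint):

import ctranslate2
from transformers import AutoTokenizer

# The tokenizer comes from the original model; the CTranslate2
# directory only contains the converted weights.
tokenizer = AutoTokenizer.from_pretrained("facebook/m2m100_418M")
tokenizer.src_lang = "en"

# Request float16 explicitly; CTranslate2 falls back (with a warning)
# if the device cannot run it efficiently.
translator = ctranslate2.Translator("ct2fast_m2m100_418M", device="cuda", compute_type="float16")

source = tokenizer.convert_ids_to_tokens(tokenizer.encode("Hello, world!"))
target_prefix = [tokenizer.lang_code_to_token["fr"]]
results = translator.translate_batch([source], target_prefix=[target_prefix])

# Drop the leading target-language token before decoding.
target = results[0].hypotheses[0][1:]
print(tokenizer.decode(tokenizer.convert_tokens_to_ids(target)))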

Rohith04 changed discussion status to closed
