Inference taking 2 or 3 minutes on A100

#15
by karthikeyanvijayan - opened

How long will it usually take to process one image? How can I speed up inference on an A100?

Any tricks with image resizing? If yes, how do I do that?
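One common trick (a suggestion, not confirmed in this thread): llava-1.6 tiles high-resolution images into 336x336 patches, so downscaling the longest side before handing the image to the processor reduces the patch count and therefore the token sequence length. A minimal sketch with Pillow, where `downscale` and the `max_side=672` cap are hypothetical choices:

```python
from PIL import Image


def downscale(image: Image.Image, max_side: int = 672) -> Image.Image:
    """Shrink the longest side to max_side, preserving aspect ratio.

    672 is a hypothetical cap (two 336px tiles); smaller inputs mean
    fewer llava-1.6 patches and fewer image tokens at inference time.
    """
    w, h = image.size
    scale = max_side / max(w, h)
    if scale >= 1:
        return image  # already small enough, leave untouched
    return image.resize((int(w * scale), int(h * scale)), Image.LANCZOS)
```

For example, a 2048x1024 input comes out as 672x336 before being passed to the llava processor.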

karthikeyanvijayan changed discussion title from Inference taking 2 or minutes on A100 to Inference taking 2 or 3 minutes on A100
Llava Hugging Face org

Hey!

You can try using FlashAttention-2 for long-context generation: llava-1.6 splits the input image into several patches, which increases the token sequence length.
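In `transformers`, FlashAttention-2 is enabled by passing `attn_implementation="flash_attention_2"` to `from_pretrained` (it requires the `flash-attn` package and half precision; in full fp32 it falls back to regular attention). A small sketch, where the helper name `llava_fa2_kwargs` is hypothetical:

```python
def llava_fa2_kwargs() -> dict:
    """Keyword arguments for LlavaNextForConditionalGeneration.from_pretrained.

    Hypothetical helper: bundles the settings that matter for speed on an
    A100 -- half precision plus the FlashAttention-2 backend.
    """
    return {
        "torch_dtype": "float16",                    # FA2 needs fp16/bf16
        "attn_implementation": "flash_attention_2",  # requires flash-attn installed
        "device_map": "auto",                        # place weights on the GPU
    }
```

Usage would look like `LlavaNextForConditionalGeneration.from_pretrained("llava-hf/llava-v1.6-mistral-7b-hf", **llava_fa2_kwargs())`.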

Tried Flash Attention too, but no luck!
