Inference taking 2 or 3 minutes on A100
#15 · opened by karthikeyanvijayan
How long does it usually take to process one image? How can I speed up inference on an A100?
Are there any tricks with image resizing? If so, how do I do that?
Hey!
You can try using Flash Attention 2 for long-context generation, since llava-1.6 splits the input into several image patches, which increases the token sequence length. A sketch of enabling it is below.
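For example, a minimal sketch of loading the model with Flash Attention 2 and fp16 (this assumes a recent transformers with LLaVA-NeXT support and the flash-attn package installed; the checkpoint id is just one of the llava-hf conversions, swap in whichever you are using):

```python
# Minimal sketch: load llava-1.6 with Flash Attention 2 and half precision.
import torch
from transformers import LlavaNextProcessor, LlavaNextForConditionalGeneration

model_id = "llava-hf/llava-v1.6-mistral-7b-hf"  # illustrative checkpoint

processor = LlavaNextProcessor.from_pretrained(model_id)
model = LlavaNextForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype=torch.float16,                # fp16: less memory, faster matmuls
    attn_implementation="flash_attention_2",  # FA2 kernel for the long patch sequence
    device_map="cuda:0",
).eval()
```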
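On the resizing question: since the number of patches (and hence tokens) grows with resolution, downscaling the image before it reaches the processor should shorten the sequence. A hedged sketch, reusing the `processor` and `model` from above; the 672px cap and the input path are illustrative choices, not official recommendations:

```python
# Sketch: downscale the image first so llava-1.6 extracts fewer patches,
# which shortens the token sequence and speeds up generation.
import torch
from PIL import Image

image = Image.open("example.jpg")  # hypothetical input path
image.thumbnail((672, 672))        # in-place downscale, keeps aspect ratio

prompt = "[INST] <image>\nDescribe this image. [/INST]"  # mistral-variant template
inputs = processor(images=image, text=prompt, return_tensors="pt").to("cuda:0")

with torch.inference_mode():
    out = model.generate(**inputs, max_new_tokens=128)
print(processor.decode(out[0], skip_special_tokens=True))
```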
I tried Flash Attention too, but no luck!