How to use this model and what structure is this model?

#1
by Labmem009 - opened

Could u pls offer a running script
And I wonder what tasks this model can do?
Thanks a lot!

This is a base Miru model, designed only for the image Q&A and captioning tasks. It is used by Yuna AI as an improvised multimodality. (Don’t not use it just like a VLLM). Check the implementation in the Yuna Server (vision.py):

https://github.com/yukiarimo/yuna-ai/blob/main/lib/vision.py

Note: Yuna AI V4 is coming soon with native multimodality, so it would be a great replacement for this

Sign up or log in to comment