How to use this model and what structure is this model?

by Labmem009 - opened Sep 10

Discussion

Labmem009

Sep 10

•

edited Sep 10

Could u pls offer a running script
And I wonder what tasks this model can do?
Thanks a lot!

yukiarimo

Owner Sep 10

•

edited Sep 10

This is a base Miru model, designed only for the image Q&A and captioning tasks. It is used by Yuna AI as an improvised multimodality. (Don’t not use it just like a VLLM). Check the implementation in the Yuna Server (vision.py):

https://github.com/yukiarimo/yuna-ai/blob/main/lib/vision.py

Note: Yuna AI V4 is coming soon with native multimodality, so it would be a great replacement for this

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment