Update README.md
README.md CHANGED
@@ -5,11 +5,15 @@ tags: []
 
 # [E5-V: Universal Embeddings with Multimodal Large Language Models](https://arxiv.org/abs/2407.12580)
 
+E5-V is fine-tuned based on lmms-lab/llama3-llava-next-8b.
+
 ## Overview
 We propose a framework, called E5-V, to adapt MLLMs for achieving multimodal embeddings. E5-V effectively bridges the modality gap between different types of inputs, demonstrating strong performance in multimodal embeddings even without fine-tuning. We also propose a single-modality training approach for E5-V, where the model is trained exclusively on text pairs, demonstrating better performance than multimodal training.
 
 More details can be found at https://github.com/kongds/E5-V
 
+
+
 ## Example
 ``` python
 import torch