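News

[2024-04-06] Released the puff model series, built specifically for retrieval and semantic matching, with more emphasis on generalization and on results on private general-purpose test sets. Variable vector dimensions; bilingual Chinese and English.

[2024-02-27] Released stella-mrl-large-zh-v3.5-1792d, which supports variable vector dimensions.

[2024-02-17] Released the stella v3 series, the dialogue encoding model, and the related training data.

[2023-10-19] Released stella-base-en-v2. It is simple to use and needs no prefix text.

[2023-10-12] Released stella-base-zh-v2 and stella-large-zh-v2, which perform better and are simple to use, with no prefix text needed.

[2023-09-11] Released stella-base-zh and stella-large-zh.

You are welcome to visit my homepage for the latest models and to share your feedback!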

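1 Open-source model

This release open-sources stella-mrl-large-zh-v3.5-1792d. It was trained with the MRL (Matryoshka Representation Learning) method on top of stella-large-zh-v3-1792d. Its main feature is a variable vector dimension.

2 Usage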

from sentence_transformers import SentenceTransformer
from sklearn.preprocessing import normalize

model = SentenceTransformer("infgrad/stella-mrl-large-zh-v3.5-1792d")
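# Do not normalize yet: take the first n_dims dimensions, then normalize.
vectors = model.encode(["text1", "text2"], normalize_embeddings=False)
print(vectors.shape)  # (2, 1792)
# Larger n_dims gives better quality but costs more time and memory.
# Prefer a multiple of 128, because the model was trained at those dimensions.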
n_dims = 768
cut_vecs = normalize(vectors[:, :n_dims])
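
The truncate-then-normalize step above can also be written with NumPy alone. The following is a minimal sketch, not part of the original model card: the helper name `truncate_and_normalize` is hypothetical, and random data stands in for the output of `model.encode` so that the example runs without downloading the model.

```python
import numpy as np


def truncate_and_normalize(vectors: np.ndarray, n_dims: int) -> np.ndarray:
    """Keep the first n_dims columns, then L2-normalize each row."""
    if not 0 < n_dims <= vectors.shape[1]:
        raise ValueError(f"n_dims must be in 1..{vectors.shape[1]}, got {n_dims}")
    cut = vectors[:, :n_dims]
    norms = np.linalg.norm(cut, axis=1, keepdims=True)
    # Guard against all-zero rows to avoid division by zero.
    return cut / np.clip(norms, 1e-12, None)


# Stand-in for model.encode(..., normalize_embeddings=False): random data,
# used only so the example runs offline. This is an assumption, not model output.
rng = np.random.default_rng(0)
vectors = rng.normal(size=(2, 1792)).astype(np.float32)

cut_vecs = truncate_and_normalize(vectors, n_dims=768)
print(cut_vecs.shape)  # (2, 768)

# Rows are unit length, so a plain dot product equals cosine similarity.
similarity = cut_vecs @ cut_vecs.T
print(similarity.shape)  # (2, 2)
```

Normalizing after truncation matters because the first `n_dims` components of a unit-length 1792-dimensional vector are generally not unit length on their own.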

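3 CMTEB scores at different vector dimensions

stella-mrl-large-zh-v3.5-1792d_1024 means that the first 1024 dimensions are used. The overall trend is that larger dimensions score better, although the CMTEB-Score gain beyond 768 dimensions is under 0.1 point.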

| Model | Retrieval | STS | PairClassification | Classification | Reranking | Clustering | CMTEB-Score |
|---|---|---|---|---|---|---|---|
| stella-mrl-large-zh-v3.5-1792d_128 | 70.01 | 62.17 | 87.99 | 70.67 | 66.77 | 53.55 | 67.16 |
| stella-mrl-large-zh-v3.5-1792d_256 | 72.19 | 62.41 | 88.09 | 71.22 | 68.32 | 53.38 | 68.02 |
| stella-mrl-large-zh-v3.5-1792d_384 | 72.77 | 62.43 | 88.26 | 71.34 | 68.31 | 53.87 | 68.25 |
| stella-mrl-large-zh-v3.5-1792d_512 | 73.11 | 62.45 | 88.16 | 71.46 | 68.32 | 53.28 | 68.29 |
| stella-mrl-large-zh-v3.5-1792d_640 | 73.27 | 62.49 | 88.21 | 71.46 | 68.69 | 53.63 | 68.42 |
| stella-mrl-large-zh-v3.5-1792d_768 | 73.38 | 62.5 | 88.19 | 71.49 | 68.64 | 53.77 | 68.47 |
| stella-mrl-large-zh-v3.5-1792d_896 | 73.37 | 62.5 | 88.14 | 71.51 | 68.44 | 54.13 | 68.49 |
| stella-mrl-large-zh-v3.5-1792d_1024 | 73.43 | 62.51 | 88.16 | 71.52 | 68.59 | 53.43 | 68.44 |
| stella-mrl-large-zh-v3.5-1792d_1152 | 73.46 | 62.49 | 88.16 | 71.57 | 68.55 | 53.67 | 68.49 |
| stella-mrl-large-zh-v3.5-1792d_1280 | 73.48 | 62.51 | 88.12 | 71.55 | 68.44 | 53.74 | 68.48 |
| stella-mrl-large-zh-v3.5-1792d_1408 | 73.48 | 62.51 | 88.14 | 71.58 | 68.46 | 53.69 | 68.48 |
| stella-mrl-large-zh-v3.5-1792d_1536 | 73.49 | 62.5 | 88.11 | 71.55 | 68.5 | 54.06 | 68.52 |
| stella-mrl-large-zh-v3.5-1792d_1664 | 73.56 | 62.49 | 88.06 | 71.56 | 68.47 | 54.28 | 68.56 |
| stella-mrl-large-zh-v3.5-1792d_1792 | 73.51 | 62.48 | 88.09 | 71.56 | 68.45 | 54.39 | 68.56 |

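In the table above, stella-mrl-large-zh-v3.5-1792d_1792 scores 68.56, which differs from the leaderboard score of 68.55. The difference is related to the weight type and is small enough to ignore.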
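
One way to use the table is to pick the smallest dimension whose CMTEB-Score stays within a chosen tolerance of the best score, then estimate the storage it needs. The sketch below is illustrative only: the helper names `smallest_dim_within` and `index_size_bytes` are hypothetical, the scores are copied from the CMTEB-Score column, and the storage estimate assumes raw float32 vectors with no index overhead.

```python
# CMTEB-Score per truncation dimension, copied from the table above.
CMTEB_SCORE = {
    128: 67.16, 256: 68.02, 384: 68.25, 512: 68.29, 640: 68.42,
    768: 68.47, 896: 68.49, 1024: 68.44, 1152: 68.49, 1280: 68.48,
    1408: 68.48, 1536: 68.52, 1664: 68.56, 1792: 68.56,
}


def smallest_dim_within(scores: dict, tolerance: float) -> int:
    """Return the smallest dimension scoring within `tolerance` of the best."""
    if tolerance < 0:
        raise ValueError("tolerance must be non-negative")
    best = max(scores.values())
    for dim in sorted(scores):
        if scores[dim] >= best - tolerance:
            return dim
    raise ValueError("scores must not be empty")


def index_size_bytes(n_vectors: int, n_dims: int, bytes_per_value: int = 4) -> int:
    """Raw storage for n_vectors embeddings (float32 by default)."""
    return n_vectors * n_dims * bytes_per_value


dim = smallest_dim_within(CMTEB_SCORE, tolerance=0.1)
print(dim)  # 768

# Storage for one million vectors at the chosen and at the full dimension.
print(index_size_bytes(1_000_000, dim))   # 3072000000
print(index_size_bytes(1_000_000, 1792))  # 7168000000
```

The aggregate score hides per-task differences (Clustering, for example, keeps improving up to 1792 dimensions), so a task-specific column may be a better basis for the choice than CMTEB-Score.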
