metadata

license: apache-2.0
datasets:
  - truro7/vn-law-questions-and-corpus
language:
  - vi
metrics:
  - accuracy
  - precision
  - recall
base_model: hiieu/halong_embedding
pipeline_tag: sentence-similarity
library_name: sentence-transformers
tags:
  - legal

VN Law Embedding

VN Law Embedding is a Vietnamese text embedding model designed for Retrieval-Augmented Generation (RAG), specifically for retrieving the legal documents that answer a given legal question.

The model is trained on a dataset of Vietnamese legal questions and their corresponding legal documents, and is evaluated using an Information Retrieval Evaluator.
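To make the reported metrics concrete, here is a minimal sketch of how accuracy, precision, and recall at a cutoff `k` are commonly computed for a single query in retrieval evaluation. The function name `retrieval_metrics_at_k` and the document IDs are hypothetical and chosen only for illustration; this is not the evaluator's actual code.

```python
def retrieval_metrics_at_k(ranked_doc_ids, relevant_doc_ids, k):
    """Compute accuracy@k, precision@k and recall@k for one query.

    ranked_doc_ids: document IDs ordered from most to least similar.
    relevant_doc_ids: set of IDs that are correct answers for the query.
    """
    top_k = ranked_doc_ids[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant_doc_ids)
    return {
        # 1.0 if at least one relevant document appears in the top k
        "accuracy": 1.0 if hits > 0 else 0.0,
        # fraction of the top k results that are relevant
        "precision": hits / k,
        # fraction of all relevant documents found in the top k
        "recall": hits / len(relevant_doc_ids),
    }


# Hypothetical ranking returned for one legal question
ranked = ["law_12", "law_07", "law_33", "law_02"]
relevant = {"law_07", "law_02"}
metrics = retrieval_metrics_at_k(ranked, relevant, k=3)
print(metrics)
```

In practice these per-query values are averaged over all evaluation queries.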

The model is trained with Matryoshka loss, so its embeddings can be truncated to smaller dimensions. This allows faster comparisons between queries and documents with little loss in retrieval performance.
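The truncation step can be sketched in plain Python: keep the first `dim` components of an embedding, re-normalize, and compare with cosine similarity. The vectors below are made-up 8-dimensional placeholders rather than real model outputs, and the helper names are hypothetical. Recent versions of sentence-transformers also accept a `truncate_dim` argument on `SentenceTransformer` that performs the truncation at encoding time.

```python
import math


def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` components and rescale to unit length."""
    head = list(embedding[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # avoid division by zero for an all-zero vector
    return [x / norm for x in head]


def cosine_similarity(a, b):
    """Dot product of two unit-length vectors, which equals their cosine."""
    return sum(x * y for x, y in zip(a, b))


# Placeholder embeddings (assumed values, not produced by the model)
query_full = [0.5, 0.5, 0.5, 0.5, 0.1, -0.2, 0.05, 0.0]
doc_full = [0.4, 0.6, 0.5, 0.5, -0.3, 0.1, 0.0, 0.2]

# Truncate both sides to the same smaller dimension before comparing
query_small = truncate_and_normalize(query_full, 4)
doc_small = truncate_and_normalize(doc_full, 4)
score = cosine_similarity(query_small, doc_small)
print(len(query_small), score)
```

Queries and documents must be truncated to the same dimension, and re-normalizing after truncation keeps cosine scores comparable across dimensions.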