---
library_name: transformers
license: apache-2.0
language:
- en
- fr
- de
- es
- zh
- it
- ru
- pl
- pt
- ja
- vi
- nl
- ar
- tr
- hi
pipeline_tag: fill-mask
tags:
- code
---

# EuroBERT-210m
## Table of Contents

1. [Overview](#overview)
2. [Usage](#usage)
3. [Evaluation](#evaluation)
4. [License](#license)
5. [Citation](#citation)

## Overview

EuroBERT is a family of multilingual encoder models designed for a variety of tasks such as retrieval, classification, and regression. It covers 15 languages, as well as mathematics and code, and supports sequences of up to 8,192 tokens. EuroBERT models exhibit the strongest multilingual performance across [domains and tasks](#evaluation) compared to similarly sized systems.

It is available in 3 sizes:

- [EuroBERT-210m](https://huggingface.co/EuroBERT/EuroBERT-210m) - 210 million parameters
- [EuroBERT-610m](https://huggingface.co/EuroBERT/EuroBERT-610m) - 610 million parameters
- [EuroBERT-2.1B](https://huggingface.co/EuroBERT/EuroBERT-2.1B) - 2.1 billion parameters

For more information about EuroBERT, please check our [blog](***) post and the [arXiv](https://arxiv.org/abs/2503.05500) preprint.

## Usage

```python
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_id = "EuroBERT/EuroBERT-210m"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id, trust_remote_code=True)

text = "The capital of France is <|mask|>."
inputs = tokenizer(text, return_tensors="pt")
outputs = model(**inputs)

# To get predictions for the mask:
masked_index = inputs["input_ids"][0].tolist().index(tokenizer.mask_token_id)
predicted_token_id = outputs.logits[0, masked_index].argmax(dim=-1)
predicted_token = tokenizer.decode(predicted_token_id)
print("Predicted token:", predicted_token)
# Predicted token: Paris
```

**💻 You can use these models directly with the transformers library starting from v4.48.0:**

```sh
pip install -U "transformers>=4.48.0"
```

**🏎️ If your GPU supports it, we recommend using EuroBERT with Flash Attention 2 to achieve the highest efficiency. To do so, install Flash Attention 2 as follows, then use the model as normal (a loading sketch is included after the Evaluation section below):**

```bash
pip install flash-attn
```

## Evaluation

We evaluate EuroBERT on a suite of tasks covering various real-world use cases for multilingual encoders, including retrieval, classification, sequence regression, quality estimation, summary evaluation, code-related tasks, and mathematical tasks.

**Key highlights:**

The EuroBERT family exhibits strong multilingual performance across domains and tasks.

- EuroBERT-2.1B, our largest model, achieves the highest performance among all evaluated systems, outperforming the largest baseline, XLM-RoBERTa-XL.
- EuroBERT-610m is competitive with XLM-RoBERTa-XL, a model 5 times its size, on most multilingual tasks and surpasses it on code and mathematics tasks.
- The smaller EuroBERT-210m generally outperforms all similarly sized systems.
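The classification and regression results reported above come from fine-tuning the encoder with a task-specific head. As a purely illustrative starting point (not the training setup used in the paper), a lightweight head can be placed on top of the base encoder; the `EuroBertClassifier` wrapper and its pooling strategy below are assumptions for the sketch.

```python
import torch
from torch import nn
from transformers import AutoTokenizer, AutoModel

model_id = "EuroBERT/EuroBERT-210m"
tokenizer = AutoTokenizer.from_pretrained(model_id)
encoder = AutoModel.from_pretrained(model_id, trust_remote_code=True)

class EuroBertClassifier(nn.Module):
    """Illustrative classifier: mean-pooled encoder output followed by a linear head."""
    def __init__(self, encoder, num_labels: int):
        super().__init__()
        self.encoder = encoder
        self.head = nn.Linear(encoder.config.hidden_size, num_labels)

    def forward(self, input_ids, attention_mask=None):
        hidden = self.encoder(input_ids=input_ids, attention_mask=attention_mask).last_hidden_state
        pooled = hidden.mean(dim=1)  # simple pooling; other pooling strategies also work
        return self.head(pooled)     # (batch, num_labels) logits

classifier = EuroBertClassifier(encoder, num_labels=3)
batch = tokenizer(["This library is great!"], return_tensors="pt")
logits = classifier(**batch)         # fine-tune with a standard cross-entropy (or MSE) loss
print(logits.shape)                  # torch.Size([1, 3])
```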
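Tying back to the Flash Attention 2 note in the Usage section, the snippet below is a minimal sketch of loading the model with Flash Attention 2 enabled. The `attn_implementation` and `torch_dtype` arguments are standard `transformers` loading options rather than EuroBERT-specific APIs, and the sketch assumes a CUDA GPU with `flash-attn` installed.

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_id = "EuroBERT/EuroBERT-210m"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Requires a supported CUDA GPU and `pip install flash-attn`.
model = AutoModelForMaskedLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    attn_implementation="flash_attention_2",  # standard transformers flag for FA2
    torch_dtype=torch.bfloat16,               # FA2 expects fp16/bf16 activations
).to("cuda")

inputs = tokenizer("The capital of France is <|mask|>.", return_tensors="pt").to("cuda")
with torch.no_grad():
    logits = model(**inputs).logits
```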
## License

We release the EuroBERT model architectures, model weights, and training codebase under the Apache 2.0 license.

## Citation

If you use EuroBERT in your work, please cite:

```
SOON
```