metadata
library_name: transformers
license: apache-2.0
base_model: Helsinki-NLP/opus-mt-en-fr
tags:
- translation
- generated_from_trainer
datasets:
- kde4
metrics:
- bleu
model-index:
- name: marian-finetuned-kde4-en-to-fr
  results:
  - task:
      name: Sequence-to-sequence Language Modeling
      type: text2text-generation
    dataset:
      name: kde4
      type: kde4
      config: en-fr
      split: train
      args: en-fr
    metrics:
    - name: Bleu
      type: bleu
      value: 50.54449537679619
Marian Fine-Tuned KDE4 (English-to-French)
This model is a fine-tuned version of Helsinki-NLP/opus-mt-en-fr, trained on the en-fr configuration of the KDE4 dataset. It achieves the following results on the evaluation set:
- Loss: 0.9620
- BLEU: 50.5445
Model Description
This English-to-French translation model has been fine-tuned specifically on the KDE4 dataset. The base model, Helsinki-NLP/opus-mt-en-fr, is a MarianMT model from the Helsinki-NLP OPUS-MT collection, pretrained for English-to-French neural machine translation.
Intended Uses & Limitations
Intended Uses
- Translating English text into French.
- Translating software localization text, such as user-interface strings and messages similar to those in KDE4.
Limitations
- Performance may decline on texts outside the KDE4 domain.
- The model may struggle with idiomatic expressions, specialized technical jargon, and ambiguous content.
Training & Evaluation Data
The model was fine-tuned on the en-fr configuration of the KDE4 dataset, a parallel corpus built from KDE software localization files. The reported loss and BLEU score therefore reflect performance on this domain-specific data rather than on general-purpose text.
Training Procedure
Hyperparameters
- Learning Rate: 2e-05
- Train Batch Size: 32
- Eval Batch Size: 64
- Seed: 42
- Optimizer: AdamW with betas=(0.9, 0.999) and epsilon=1e-08
- LR Scheduler: Linear
- Epochs: 1
- Mixed Precision Training: Native AMP
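To make the "Linear" scheduler entry concrete, here is a minimal sketch of how a linear schedule scales the learning rate over training. It assumes no warmup steps (none are listed above) and uses a hypothetical `TOTAL_STEPS` value, since the card does not state the exact number of optimizer steps. The function name `linear_lr` is illustrative and not part of any library.

```python
def linear_lr(step, total_steps, base_lr=2e-5, warmup_steps=0):
    """Learning rate at `step` for linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        # Ramp up from 0 to base_lr during warmup (unused when warmup_steps=0).
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr at the end of warmup to 0 at total_steps.
    remaining = (total_steps - step) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, remaining)


# Hypothetical step count, chosen only for illustration.
TOTAL_STEPS = 6000

for step in (0, 1500, 3000, 4500, 6000):
    print(step, linear_lr(step, TOTAL_STEPS))
```

Under these assumptions the rate starts at the listed 2e-05 and reaches zero at the final step, so later rows of a training log are produced with progressively smaller updates.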
Results
- Loss: 0.9620
- BLEU: 50.5445
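For readers unfamiliar with the metric, the sketch below shows roughly what a corpus-level BLEU score measures: clipped n-gram precision for n = 1 to 4, combined by a geometric mean and multiplied by a brevity penalty. The card does not say which BLEU implementation produced the reported value, so this is a simplified illustration (whitespace tokenization, one reference per sentence, no smoothing) and should not be expected to reproduce 50.5445. The function names and the toy sentences are made up for the example.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Simplified corpus-level BLEU on a 0-100 scale, one reference per hypothesis."""
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        h, r = hyp.split(), ref.split()
        hyp_len += len(h)
        ref_len += len(r)
        for n in range(1, max_n + 1):
            h_counts, r_counts = ngram_counts(h, n), ngram_counts(r, n)
            # Clipping: an n-gram is credited at most as often as it occurs in the reference.
            matches[n - 1] += sum(min(c, r_counts[g]) for g, c in h_counts.items())
            totals[n - 1] += sum(h_counts.values())
    if hyp_len == 0 or min(matches) == 0:
        # Without smoothing, any order with zero matches gives a score of 0.
        return 0.0
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    brevity_penalty = 1.0 if hyp_len >= ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity_penalty * math.exp(log_precision)


# Toy sentences for illustration only; they are not taken from KDE4.
hyps = ["ouvrir le fichier de configuration maintenant"]
refs = ["ouvrir le fichier de configuration"]
print(corpus_bleu(hyps, refs))
```

A score of 100 means every hypothesis matches its reference exactly; extra or missing words lower the n-gram precisions or trigger the brevity penalty.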
Training Loss Progression
Step | Training Loss |
---|---|
500 | 1.2253 |
1000 | 1.2165 |
1500 | 1.1913 |
2000 | 1.1404 |
2500 | 1.1178 |
3000 | 1.0900 |
3500 | 1.0594 |
4000 | 1.0512 |
4500 | 1.0633 |
5000 | 1.0405 |
5500 | 1.0316 |
Framework Versions
- Transformers: 4.47.1
- PyTorch: 2.5.1+cu121
- Datasets: 3.2.0
- Tokenizers: 0.21.0
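When reproducing the results, it can help to compare the local environment against the versions listed above. The following is a small sketch using only the standard library; the `EXPECTED` mapping and the `compare_versions` helper are illustrative names, and the package names are assumed to be the usual PyPI distribution names.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions as listed in this card (the PyTorch build suffix "+cu121" is kept as reported).
EXPECTED = {
    "transformers": "4.47.1",
    "torch": "2.5.1+cu121",
    "datasets": "3.2.0",
    "tokenizers": "0.21.0",
}


def compare_versions(expected=EXPECTED):
    """Return {package: (installed_version_or_None, expected_version)}."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (installed, wanted)
    return report


for name, (installed, wanted) in compare_versions().items():
    status = "ok" if installed == wanted else "differs"
    print(f"{name}: installed={installed} card={wanted} ({status})")
```

A mismatch does not necessarily prevent the model from loading; the check only flags differences from the training environment.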
Example Usage
from transformers import pipeline
# Load the model
model_checkpoint = "ParitKansal/marian-finetuned-kde4-en-to-fr"
translator = pipeline("translation", model=model_checkpoint)
# Translate text; the pipeline returns a list of dicts keyed by "translation_text"
translation = translator("Default to expanded threads")
print(translation)
This script loads the fine-tuned checkpoint through the Transformers pipeline API and translates an English interface string into French.
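For translating several strings at once, a thin wrapper around the pipeline can return just the translated text. This is a sketch rather than part of the model's API: `translate_batch` and `load_translator` are hypothetical helper names, and the wrapper relies only on the translation pipeline accepting a list of strings and returning one `{"translation_text": ...}` dict per input.

```python
def translate_batch(translator, texts):
    """Translate a list of strings and return only the translated text.

    `translator` is any callable following the translation-pipeline contract:
    a list of strings in, a list of {"translation_text": ...} dicts out.
    """
    outputs = translator(list(texts))
    return [item["translation_text"] for item in outputs]


def load_translator():
    """Build the translation pipeline for this checkpoint (downloads the model)."""
    # Imported lazily so translate_batch can be used or tested without transformers.
    from transformers import pipeline

    return pipeline("translation", model="ParitKansal/marian-finetuned-kde4-en-to-fr")
```

A typical call would be `translate_batch(load_translator(), ["Default to expanded threads", "Open file"])`, where the second string is only an illustrative input.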