DeBERTa-v3-base-Zyda-2

Model Description

This model is a fine-tuned version of microsoft/deberta-v3-base (185M parameters) trained on a subset of the Zyphra/Zyda-2 dataset. It was trained with the masked language modeling (MLM) objective to strengthen its general English language understanding.

Performance

The model achieves the following results on the evaluation set:

  • Loss: 2.1833
  • Accuracy: 0.6191
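
Since the MLM loss is a cross-entropy, this corresponds to a masked-token perplexity of exp(2.1833) ≈ 8.9.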

Intended Uses & Limitations

This model can be used directly or fine-tuned for the following tasks (a classification fine-tuning sketch follows the list):

  • Text embedding
  • Text classification
  • Fill-in-the-blank tasks
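
As a quick illustration of the classification use case, here is a minimal fine-tuning sketch using the Trainer API. The IMDB dataset and the two-label head are stand-ins for your own task, not part of this model card.

from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = "agentlans/deberta-v3-base-zyda-2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# IMDB stands in for your own labeled dataset ("text" and "label" columns).
dataset = load_dataset("imdb")
dataset = dataset.map(lambda b: tokenizer(b["text"], truncation=True), batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="deberta-v3-zyda-2-cls", num_train_epochs=1),
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    tokenizer=tokenizer,
)
trainer.train()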

Limitations:

  • English language only
  • May be inaccurate for specialized jargon, dialects, slang, code, and LaTeX

Training Data

The model was trained on the first 300,000 rows of the Zyphra/Zyda-2 dataset, with 5% of that data held out for validation.
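
The card does not state exactly how the rows were selected, but a split like this can be sketched with the datasets library; streaming, the default config, and reuse of the training seed are assumptions here:

from datasets import Dataset, load_dataset

# Stream Zyda-2 and keep the first 300,000 rows.
# The default config is assumed; Zyda-2 also exposes per-source configs.
stream = load_dataset("Zyphra/Zyda-2", split="train", streaming=True)
rows = Dataset.from_list(list(stream.take(300_000)))

# Hold out 5% for validation (seed 42 reused here for illustration).
split = rows.train_test_split(test_size=0.05, seed=42)
train_ds, eval_ds = split["train"], split["test"]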

Training Procedure

Hyperparameters

The following hyperparameters were used during training (mirrored in the TrainingArguments sketch after the list):

  • Learning rate: 5e-05
  • Train batch size: 8
  • Eval batch size: 8
  • Seed: 42
  • Optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • Learning rate scheduler: Linear
  • Number of epochs: 1.0
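
For reproduction, the listed values map onto TrainingArguments roughly as follows; output_dir is a placeholder, and the Adam settings are simply the Trainer defaults:

from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="deberta-v3-base-zyda-2",  # placeholder
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=1.0,
    adam_beta1=0.9,       # Trainer defaults, matching the card
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)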

Framework versions

  • Transformers: 4.46.3
  • PyTorch: 2.5.1+cu124
  • Datasets: 3.1.0
  • Tokenizers: 0.20.3

Usage Examples

Masked Language Modeling

from transformers import pipeline

# Fill-mask pipeline; DeBERTa-v3 uses [MASK] as its mask token.
unmasker = pipeline('fill-mask', model='agentlans/deberta-v3-base-zyda-2')
result = unmasker("[MASK] is the capital of France.")
print(result)
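
The pipeline returns a list of candidate fills, each a dict with sequence, token_str, and score fields, sorted from most to least likely.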

Text Embedding

from transformers import AutoTokenizer, AutoModel
import torch

model_name = "agentlans/deberta-v3-base-zyda-2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

text = "Example sentence for embedding."
inputs = tokenizer(text, return_tensors='pt')
with torch.no_grad():  # inference only; no gradients needed
    outputs = model(**inputs)

# Mean-pool the token representations into a single sentence vector.
embeddings = outputs.last_hidden_state.mean(dim=1)
print(embeddings)  # shape: (1, 768)
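
Note that for padded batches, the plain mean above also averages the padding positions. A common fix (an addition here, not from the original card) is to weight the mean by the attention mask, reusing the tokenizer and model from the example above:

# Attention-mask-weighted mean pooling for a padded batch.
batch = tokenizer(["first sentence", "a much longer second sentence"],
                  padding=True, return_tensors='pt')
with torch.no_grad():
    outputs = model(**batch)

mask = batch['attention_mask'].unsqueeze(-1)            # (batch, seq_len, 1)
summed = (outputs.last_hidden_state * mask).sum(dim=1)  # ignore pad tokens
embeddings = summed / mask.sum(dim=1)                   # divide by true lengths
print(embeddings.shape)  # torch.Size([2, 768])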

Ethical Considerations and Bias

As this model is trained on a subset of the Zyda-2 dataset, it may inherit biases present in that data. Users should be aware of potential biases and evaluate the model's output critically, especially for sensitive applications.

Additional Information

For more details about the base model, please refer to microsoft/deberta-v3-base.
