DeBERTa-v3-base-Zyda-2

Model Description

This model is a fine-tuned version of microsoft/deberta-v3-base (185M parameters) trained on a subset of the Zyphra/Zyda-2 dataset. It was trained with the masked language modeling (MLM) objective to strengthen its general English language understanding.

Performance

The model achieves the following results on the evaluation set:

  • Loss: 2.1833
  • Accuracy: 0.6191
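
Since the MLM loss is a cross-entropy, this corresponds to a masked-token perplexity of exp(2.1833) ≈ 8.9.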

Intended Uses & Limitations

This model can be used directly or fine-tuned for the following tasks (a classification fine-tuning sketch follows the list):

  • Text embedding
  • Text classification
  • Fill-in-the-blank tasks
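
As a quick illustration of the classification use case, here is a minimal fine-tuning sketch using the Trainer API. The IMDB dataset and the two-label head are stand-ins for your own task, not part of this model card.

from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = "agentlans/deberta-v3-base-zyda-2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# IMDB stands in for your own labeled dataset ("text" and "label" columns).
dataset = load_dataset("imdb")
dataset = dataset.map(lambda b: tokenizer(b["text"], truncation=True), batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="deberta-v3-zyda-2-cls", num_train_epochs=1),
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    tokenizer=tokenizer,
)
trainer.train()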

Limitations:

  • English language only
  • May be inaccurate for specialized jargon, dialects, slang, code, and LaTeX

Training Data

The model was trained on the first 300,000 rows of the Zyphra/Zyda-2 dataset, with 5% of that data held out for validation.
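
The card does not state exactly how the rows were selected, but a split like this can be sketched with the datasets library; streaming, the default config, and reuse of the training seed are assumptions here:

from datasets import Dataset, load_dataset

# Stream Zyda-2 and keep the first 300,000 rows.
# The default config is assumed; Zyda-2 also exposes per-source configs.
stream = load_dataset("Zyphra/Zyda-2", split="train", streaming=True)
rows = Dataset.from_list(list(stream.take(300_000)))

# Hold out 5% for validation (seed 42 reused here for illustration).
split = rows.train_test_split(test_size=0.05, seed=42)
train_ds, eval_ds = split["train"], split["test"]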

Training Procedure

Hyperparameters

The following hyperparameters were used during training (mirrored in the TrainingArguments sketch after the list):

  • Learning rate: 5e-05
  • Train batch size: 8
  • Eval batch size: 8
  • Seed: 42
  • Optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • Learning rate scheduler: Linear
  • Number of epochs: 1.0
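
For reproduction, the listed values map onto TrainingArguments roughly as follows; output_dir is a placeholder, and the Adam settings are simply the Trainer defaults:

from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="deberta-v3-base-zyda-2",  # placeholder
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=1.0,
    adam_beta1=0.9,       # Trainer defaults, matching the card
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)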

Framework versions

  • Transformers: 4.46.3
  • PyTorch: 2.5.1+cu124
  • Datasets: 3.1.0
  • Tokenizers: 0.20.3

Usage Examples

Masked Language Modeling

from transformers import pipeline

# Fill-mask pipeline; DeBERTa-v3 uses [MASK] as its mask token.
unmasker = pipeline('fill-mask', model='agentlans/deberta-v3-base-zyda-2')
result = unmasker("[MASK] is the capital of France.")
print(result)
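
The pipeline returns a list of candidate fills, each a dict with sequence, token_str, and score fields, sorted from most to least likely.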

Text Embedding

from transformers import AutoTokenizer, AutoModel
import torch

model_name = "agentlans/deberta-v3-base-zyda-2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

text = "Example sentence for embedding."
inputs = tokenizer(text, return_tensors='pt')
with torch.no_grad():  # inference only; no gradients needed
    outputs = model(**inputs)

# Mean-pool the token representations into a single sentence vector.
embeddings = outputs.last_hidden_state.mean(dim=1)
print(embeddings)  # shape: (1, 768)
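
Note that for padded batches, the plain mean above also averages the padding positions. A common fix (an addition here, not from the original card) is to weight the mean by the attention mask, reusing the tokenizer and model from the example above:

# Attention-mask-weighted mean pooling for a padded batch.
batch = tokenizer(["first sentence", "a much longer second sentence"],
                  padding=True, return_tensors='pt')
with torch.no_grad():
    outputs = model(**batch)

mask = batch['attention_mask'].unsqueeze(-1)            # (batch, seq_len, 1)
summed = (outputs.last_hidden_state * mask).sum(dim=1)  # ignore pad tokens
embeddings = summed / mask.sum(dim=1)                   # divide by true lengths
print(embeddings.shape)  # torch.Size([2, 768])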

Ethical Considerations and Bias

As this model is trained on a subset of the Zyda-2 dataset, it may inherit biases present in that data. Users should be aware of potential biases and evaluate the model's output critically, especially for sensitive applications.

Additional Information

For more details about the base model, please refer to microsoft/deberta-v3-base.
