---
language:
- en
inference: false
---

# XLM-RoBERTa Base ANCE-Warmup

This is an XLM-RoBERTa Base model trained with the ANCE warmup script. In the warmup script, `RobertaForSequenceClassification` is replaced with `XLMRobertaForSequenceClassification`, as sketched below.
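
A minimal sketch of that swap, assuming the warmup script instantiates its model from the Hugging Face classes directly (the code below is illustrative, not the actual ANCE source):

```python
from transformers import (
    XLMRobertaConfig,
    XLMRobertaForSequenceClassification,
    XLMRobertaTokenizer,
)

# Illustrative stand-in for the model setup in the warmup script: the original
# Roberta* classes are replaced with their XLMRoberta* counterparts, keeping
# the training args below (model_name_or_path: xlm-roberta-base, num_labels: 2).
config = XLMRobertaConfig.from_pretrained("xlm-roberta-base", num_labels=2)
tokenizer = XLMRobertaTokenizer.from_pretrained("xlm-roberta-base")
model = XLMRobertaForSequenceClassification.from_pretrained(
    "xlm-roberta-base", config=config
)
```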

The model was trained for 60k steps.

The training arguments are as follows:

```text
data_dir: ../data/raw_data/
train_model_type: rdot_nll
model_name_or_path: xlm-roberta-base
task_name: msmarco
output_dir:
config_name:
tokenizer_name:
cache_dir:
max_seq_length: 128
do_train: True
do_eval: False
evaluate_during_training: True
do_lower_case: False
log_dir: ../logs/
eval_type: full
optimizer: lamb
scheduler: linear
per_gpu_train_batch_size: 32
per_gpu_eval_batch_size: 32
gradient_accumulation_steps: 1
learning_rate: 0.0002
weight_decay: 0.0
adam_epsilon: 1e-08
max_grad_norm: 1.0
num_train_epochs: 2.0
max_steps: -1
warmup_steps: 1000
logging_steps: 1000
logging_steps_per_eval: 20
save_steps: 30000
eval_all_checkpoints: False
no_cuda: False
overwrite_output_dir: True
overwrite_cache: False
seed: 42
fp16: True
fp16_opt_level: O1
expected_train_size: 35000000
load_optimizer_scheduler: False
local_rank: 0
server_ip:
server_port:
n_gpu: 1
device: cuda:0
output_mode: classification
num_labels: 2
train_batch_size: 32
```

# Eval Result

```text
Reranking/Full ranking mrr: 0.27380855732933/0.24284821712830248
{"learning_rate": 0.00019460324719871943, "loss": 0.0895877162806064, "step": 60000}
```
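
As a hedged sanity check (my own arithmetic, not part of the original card), the logged learning rate at step 60000 is consistent with a `transformers`-style linear warmup/decay schedule over `num_train_epochs: 2.0` of the `expected_train_size: 35000000` examples at batch size 32:

```python
# Assumes the HF linear schedule: after warmup,
# lr(step) = peak_lr * (total_steps - step) / (total_steps - warmup_steps).
peak_lr, warmup_steps = 0.0002, 1000
total_steps = 2 * (35_000_000 // 32)  # ~2,187,500 optimizer steps for 2 epochs
lr_60k = peak_lr * (total_steps - 60_000) / (total_steps - warmup_steps)
print(f"{lr_60k:.10e}")  # ~1.9460e-04, matching the logged learning_rate above
```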

# Usage

```python
from transformers import XLMRobertaForSequenceClassification, XLMRobertaTokenizer

# Load the warmed-up checkpoint and its tokenizer from the Hugging Face Hub.
repo = "k-ush/xlm-roberta-base-ance-warmup"
model = XLMRobertaForSequenceClassification.from_pretrained(repo)
tokenizer = XLMRobertaTokenizer.from_pretrained(repo)
```
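
Since the checkpoint was trained as a two-label sequence classifier (`num_labels: 2` above), one plausible way to use it is to score query-passage pairs for relevance. The example below is a hypothetical sketch: the query and passage strings are invented, and treating label index 1 as "relevant" is an assumption.

```python
import torch

query = "what is the capital of france"
passage = "Paris is the capital and most populous city of France."

# Encode the pair, respecting the max_seq_length used during training.
inputs = tokenizer(query, passage, truncation=True, max_length=128, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Assumption: label index 1 corresponds to "relevant".
score = torch.softmax(logits, dim=-1)[0, 1].item()
print(f"relevance score: {score:.4f}")
```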