--- pipeline_tag: zero-shot-classification license: mit datasets: - xnli language: - fr tags: - camembert - text-classification - nli - xnli --- This is a copy of the original BaptisteDoyen/camembert-base-xnli model as it gives a 404 error right now.\ Here is the model card as it was on BaptisteDoyen/camembert-base-xnli page. # camembert-base-xnli ## Model description Camembert-base model fine-tuned on french part of XNLI dataset. One of the few Zero-Shot classification model working on French 🇫🇷 ## Intended uses & limitations #### How to use Two different usages : - As a Zero-Shot sequence classifier : ``` classifier = pipeline("zero-shot-classification", model="BaptisteDoyen/camembert-base-xnli") sequence = "L'équipe de France joue aujourd'hui au Parc des Princes" candidate_labels = ["sport","politique","science"] hypothesis_template = "Ce texte parle de {}." classifier(sequence, candidate_labels, hypothesis_template=hypothesis_template) # outputs : # {'sequence': "L'équipe de France joue aujourd'hui au Parc des Princes", # 'labels': ['sport', 'politique', 'science'], # 'scores': [0.8595073223114014, 0.10821866989135742, 0.0322740375995636]} ``` - As a premise/hypothesis checker : The idea is here to compute a probability of the form P(premise∣hypothesis) P(premise|hypothesis ) P(premise∣hypothesis) ``` # load model and tokenizer nli_model = AutoModelForSequenceClassification.from_pretrained("BaptisteDoyen/camembert-base-xnli") tokenizer = AutoTokenizer.from_pretrained("BaptisteDoyen/camembert-base-xnli") # sequences premise = "le score pour les bleus est élevé" hypothesis = "L'équipe de France a fait un bon match" # tokenize and run through model x = tokenizer.encode(premise, hypothesis, return_tensors='pt') logits = nli_model(x)[0] # we throw away "neutral" (dim 1) and take the probability of # "entailment" (0) as the probability of the label being true entail_contradiction_logits = logits[:,::2] probs = entail_contradiction_logits.softmax(dim=1) prob_label_is_true = probs[:,0] prob_label_is_true[0].tolist() * 100 # outputs # 86.40775084495544 ``` ## Training data Training data is the french fold of the [XNLI](https://research.fb.com/publications/xnli-evaluating-cross-lingual-sentence-representations/) dataset released in 2018 by Facebook. Available with great ease using the datasets library : ``` from datasets import load_dataset dataset = load_dataset('xnli', 'fr') ``` ## Training/Fine-Tuning procedure Training procedure is here pretty basic and was performed on the cloud using a single GPU. Main training parameters : - `lr = 2e-5 with lr_scheduler_type = "linear"` - `num_train_epochs = 4` - `batch_size = 12 (limited by GPU-memory)` - `weight_decay = 0.01` - `metric_for_best_model = "eval_accuracy"` ## Eval results We obtain the following results on validation and test sets: Set|Accuracy ---|--- validation|81.4 test|81.7