	---
	pipeline_tag: zero-shot-classification
	license: mit
	datasets:
	- xnli
	language:
	- fr
	tags:
	- camembert
	- text-classification
	- nli
	- xnli
	---
	This is a copy of the original BaptisteDoyen/camembert-base-xnli model, which currently returns a 404 error.\
	Below is the model card as it appeared on the BaptisteDoyen/camembert-base-xnli page.

	# camembert-base-xnli

	## Model description

	Camembert-base model fine-tuned on the French part of the XNLI dataset.
	One of the few zero-shot classification models that work on French 🇫🇷

	## Intended uses & limitations

	#### How to use

	The model can be used in two ways:

	- As a zero-shot sequence classifier:
	```
	from transformers import pipeline

	# build a zero-shot classifier backed by the NLI model
	# note: this original repo ID may return 404 (see the note at the top of this card)
	classifier = pipeline("zero-shot-classification",
	                      model="BaptisteDoyen/camembert-base-xnli")

	sequence = "L'équipe de France joue aujourd'hui au Parc des Princes"
	candidate_labels = ["sport", "politique", "science"]
	# each label is inserted into this template to form an NLI hypothesis
	hypothesis_template = "Ce texte parle de {}."

	classifier(sequence, candidate_labels, hypothesis_template=hypothesis_template)
	# outputs :
	# {'sequence': "L'équipe de France joue aujourd'hui au Parc des Princes",
	# 'labels': ['sport', 'politique', 'science'],
	# 'scores': [0.8595073223114014, 0.10821866989135742, 0.0322740375995636]}
	```
	- As a premise/hypothesis checker:
	The idea here is to compute a probability of the form $P(\text{premise} \mid \text{hypothesis})$

	```
	from transformers import AutoModelForSequenceClassification, AutoTokenizer

	# load model and tokenizer
	# note: this original repo ID may return 404 (see the note at the top of this card)
	nli_model = AutoModelForSequenceClassification.from_pretrained("BaptisteDoyen/camembert-base-xnli")
	tokenizer = AutoTokenizer.from_pretrained("BaptisteDoyen/camembert-base-xnli")
	# sequences
	premise = "le score pour les bleus est élevé"
	hypothesis = "L'équipe de France a fait un bon match"
	# tokenize the (premise, hypothesis) pair and run it through the model
	x = tokenizer.encode(premise, hypothesis, return_tensors='pt')
	logits = nli_model(x)[0]
	# we throw away "neutral" (dim 1) and take the probability of
	# "entailment" (0) as the probability of the label being true
	entail_contradiction_logits = logits[:,::2]
	probs = entail_contradiction_logits.softmax(dim=1)
	prob_label_is_true = probs[:,0]
	# convert to a Python float and express it as a percentage
	prob_label_is_true[0].tolist() * 100
	# outputs
	# 86.40775084495544
	```
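
The two usages above are closely related, and the arithmetic can be sketched without loading the model. In the zero-shot usage, each candidate label is inserted into the hypothesis template to form one NLI hypothesis; in the premise/hypothesis usage, the "neutral" logit is dropped and a softmax is taken over the entailment and contradiction logits. The helper names `build_hypotheses` and `entailment_probability` and the example logits below are hypothetical, chosen only for illustration; the `[entailment, neutral, contradiction]` ordering follows the comments in the snippet above.

```python
import math


def build_hypotheses(candidate_labels, hypothesis_template):
    """Fill the template once per label, giving one NLI hypothesis per label."""
    return [hypothesis_template.format(label) for label in candidate_labels]


def entailment_probability(logits):
    """Softmax over (entailment, contradiction), ignoring neutral.

    `logits` is ordered [entailment, neutral, contradiction], the same
    index convention as the snippet above.
    """
    entail, _neutral, contra = logits
    m = max(entail, contra)  # subtract the max for numerical stability
    e = math.exp(entail - m)
    c = math.exp(contra - m)
    return e / (e + c)


hypotheses = build_hypotheses(["sport", "politique", "science"], "Ce texte parle de {}.")
print(hypotheses)

# Hypothetical logits, not produced by the model
example_logits = [2.0, 0.5, -1.0]
print(entailment_probability(example_logits))
```

Dropping the neutral class means a pair the model considers mostly neutral can still get a high score if entailment merely outweighs contradiction, which is worth keeping in mind when reading the percentage.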

	## Training data

	The training data is the French portion of the [XNLI](https://research.fb.com/publications/xnli-evaluating-cross-lingual-sentence-representations/) dataset, released by Facebook in 2018.
	It can be loaded with the datasets library:

	```
	from datasets import load_dataset
	dataset = load_dataset('xnli', 'fr')
	```

	## Training/Fine-Tuning procedure

	The training procedure is straightforward and was run in the cloud on a single GPU.
	Main training parameters:

	- `lr = 2e-5` with `lr_scheduler_type = "linear"`
	- `num_train_epochs = 4`
	- `batch_size = 12` (limited by GPU memory)
	- `weight_decay = 0.01`
	- `metric_for_best_model = "eval_accuracy"`
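
To make the `lr_scheduler_type = "linear"` setting concrete, here is a minimal sketch of a linear schedule: the learning rate starts at `2e-5` and decays linearly to zero over the run. The function name `linear_lr`, the absence of warmup, and the total step count are assumptions made for illustration; the card states only the base learning rate, the scheduler type, the number of epochs, and the batch size.

```python
def linear_lr(step, total_steps, base_lr=2e-5, warmup_steps=0):
    """Learning rate under optional linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Hypothetical run length: the card does not give a step count
total_steps = 1000
for step in (0, 500, 1000):
    print(step, linear_lr(step, total_steps))
```

With no warmup, the rate is `2e-5` at the first step, half that at the midpoint, and zero at the end.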

	## Eval results

	We obtain the following results on the validation and test sets:

	| Set | Accuracy (%) |
	|---|---|
	| validation | 81.4 |
	| test | 81.7 |
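
For reference, accuracy here is the fraction of examples whose highest-scoring class matches the gold label. The sketch below assumes the Hugging Face `Trainer` was used (the parameter names above suggest it, but the card does not say so): a `compute_metrics` function returning an `"accuracy"` key is reported by `Trainer` as `eval_accuracy`, the name given in `metric_for_best_model`. The toy logits and labels are hypothetical, not taken from XNLI.

```python
def compute_metrics(eval_pred):
    """Return accuracy from a (logits, labels) pair as a dict of metrics."""
    logits, labels = eval_pred
    # predicted class = index of the largest logit in each row
    predictions = [max(range(len(row)), key=lambda i: row[i]) for row in logits]
    correct = sum(int(p == y) for p, y in zip(predictions, labels))
    return {"accuracy": correct / len(labels)}


# Hypothetical logits and labels, for illustration only
toy_logits = [[2.1, 0.3, -1.0], [0.2, 1.5, 0.1], [-0.5, 0.0, 1.2], [1.0, 0.9, 0.8]]
toy_labels = [0, 1, 2, 1]
print(compute_metrics((toy_logits, toy_labels)))  # {'accuracy': 0.75}
```

The function returns a fraction between 0 and 1, whereas the table reports percentages.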