Dani committed
Commit 255b995
Parent(s): d2f8e65

Update readme and config

- README.md +31 -0
- config.json +1 -1
README.md
ADDED
@@ -0,0 +1,31 @@
---
language: spanish
license: apache-2.0
datasets:
- wikipedia
widget:
- text: "El español es un idioma muy [MASK] en el mundo."
---

# DistilBERT base multilingual model Spanish subset (cased)

This model is the Spanish extract of `distilbert-base-multilingual-cased`, a distilled version of the [BERT base multilingual model](bert-base-multilingual-cased). It uses the extraction method proposed by Geotrend, which is described in https://github.com/Geotrend-research/smaller-transformers.

In particular, we ran the following script:
15 |
+
|
16 |
+
```sh
|
17 |
+
python reduce_model.py \
|
18 |
+
--source_model distilbert-base-multilingual-cased \
|
19 |
+
--vocab_file notebooks/selected_tokens/selected_es_tokens.txt \
|
20 |
+
--output_model distilbert-base-es-multilingual-cased \
|
21 |
+
--convert_to_tf False
|
22 |
+
```
|
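
Once published, the reduced model loads like any other DistilBERT checkpoint. As a quick usage sketch (not part of the original card; it assumes the model is available on the Hub under the `--output_model` name used above):

```python
from transformers import pipeline

# Hypothetical hub id, taken from the --output_model argument above.
fill_mask = pipeline("fill-mask", model="distilbert-base-es-multilingual-cased")

# The example sentence from the widget metadata in the front matter.
for prediction in fill_mask("El español es un idioma muy [MASK] en el mundo."):
    print(f"{prediction['token_str']}\t{prediction['score']:.3f}")
```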
23 |
+
|
24 |
+
The resulting model has the same architecture as DistilmBERT: 6 layers, 768 dimension and 12 heads, with a total of **65M parameters** (compared to 134M parameters for DistilmBERT).
|
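
Almost all of the savings come from the embedding matrix, since the transformer body is left untouched. As a back-of-the-envelope check (ours, not from the original card; the ~29K Spanish vocabulary size is an assumption chosen to reproduce the published totals):

```python
# Rough DistilBERT parameter count: embeddings + 6 identical transformer layers.
DIM, LAYERS, FFN, MAX_POS = 768, 6, 3072, 512

def distilbert_params(vocab_size: int) -> int:
    attention = 4 * (DIM * DIM + DIM)        # Q, K, V and output projections
    ffn = 2 * (DIM * FFN) + FFN + DIM        # two feed-forward projections
    layer = attention + ffn + 2 * 2 * DIM    # plus two layer norms
    embeddings = (vocab_size + MAX_POS) * DIM + 2 * DIM  # token + position + norm
    return embeddings + LAYERS * layer

print(f"DistilmBERT (vocab 119,547): {distilbert_params(119_547) / 1e6:.1f}M")
print(f"Spanish subset (assumed vocab ~29,000): {distilbert_params(29_000) / 1e6:.1f}M")
```

This prints roughly 134.7M vs 65.2M, in line with the figures above.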
25 |
+
|
26 |
+
The goal of this model is to reduce even further the size of the `distilbert-base-multilingual` multilingual model by selecting only most frequent tokens for Spanish, reducing the size of the embedding layer. For more details visit the paper from the Geotrend team: Load What You Need: Smaller Versions of Multilingual BERT.
|
27 |
+
|
28 |
+
|
29 |
+
|
30 |
+
|
31 |
+
|
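
The token selection itself is implemented in the Geotrend repository linked above. Purely as an illustration of the idea (an assumed sketch, not their actual code), frequency-based selection over a Spanish corpus could look like:

```python
from collections import Counter

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-multilingual-cased")

# Count how often each multilingual subword appears in Spanish text.
counts = Counter()
with open("es_corpus.txt", encoding="utf-8") as corpus:  # hypothetical corpus file
    for line in corpus:
        counts.update(tokenizer.tokenize(line))

# Keep the special tokens plus the most frequent Spanish subwords.
selected = list(tokenizer.all_special_tokens)
selected += [token for token, _ in counts.most_common(29_000)]

with open("selected_es_tokens.txt", "w", encoding="utf-8") as out:
    out.write("\n".join(selected))
```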
config.json
CHANGED
@@ -1,7 +1,7 @@
 {
   "activation": "gelu",
   "architectures": [
-    "
+    "DistilBertForMaskedLM"
   ],
   "attention_dropout": 0.1,
   "dim": 768,
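
The `architectures` field records which head the checkpoint was exported with; among other things, the Hub inference widget relies on it to infer the task, so listing `DistilBertForMaskedLM` is what makes the fill-mask widget above work. A quick check (assuming the hypothetical hub id from the README):

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("distilbert-base-es-multilingual-cased")
print(config.architectures)  # expected: ['DistilBertForMaskedLM']
```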