Divyasreepat committed
Commit 83f09c5
1 Parent(s): 4dcf4ef

Update README.md with new model card content

Files changed (1):
  1. README.md +42 -15
README.md CHANGED
@@ -1,18 +1,45 @@
  ---
  library_name: keras-hub
  ---
- This is a [`Electra` model](https://keras.io/api/keras_hub/models/electra) uploaded using the KerasHub library and can be used with JAX, TensorFlow, and PyTorch backends.
- Model config:
- * **name:** electra_backbone
- * **trainable:** True
- * **vocab_size:** 30522
- * **num_layers:** 24
- * **num_heads:** 16
- * **hidden_dim:** 1024
- * **embedding_dim:** 1024
- * **intermediate_dim:** 4096
- * **dropout:** 0.1
- * **max_sequence_length:** 512
- * **num_segments:** 2
-
- This model card has been generated automatically and should be completed by the model author. See [Model Cards documentation](https://huggingface.co/docs/hub/model-cards) for more information.
+ ### Model Overview
+ ELECTRA is a pretraining approach for language models published by Google. Two transformer models are trained: a generator and a discriminator. The generator replaces tokens in a sequence and is trained as a masked language model. The discriminator is trained to discern which tokens have been replaced. This method of pretraining is more efficient than comparable methods like masked language modeling, especially for small models.
+
+ Weights are released under the [MIT License](https://opensource.org/license/mit). Keras model code is released under the [Apache 2 License](https://github.com/keras-team/keras-hub/blob/master/LICENSE).
+
+ ## Links
+
+ * [ELECTRA API Documentation](https://keras.io/api/keras_hub/models/electra/)
+ * [ELECTRA Model Paper](https://openreview.net/pdf?id=r1xMH1BtvB)
+ * [KerasHub Beginner Guide](https://keras.io/guides/keras_hub/getting_started/)
+ * [KerasHub Model Publishing Guide](https://keras.io/guides/keras_hub/upload/)
+
+ ## Installation
+
+ Keras and KerasHub can be installed with:
+
+ ```
+ pip install -U -q keras-hub
+ pip install -U -q "keras>=3"
+ ```
+
+ JAX, TensorFlow, and PyTorch come preinstalled in Kaggle Notebooks. For instructions on installing them in another environment, see the [Keras Getting Started](https://keras.io/getting_started/) page.
+
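+ As a sketch of the standard Keras 3 mechanism, the backend can be selected with the `KERAS_BACKEND` environment variable, set before Keras is first imported:
+
+ ```
+ import os
+
+ # Choose the backend before the first `import keras`:
+ # one of "jax", "tensorflow", or "torch".
+ os.environ["KERAS_BACKEND"] = "jax"
+
+ import keras
+ import keras_hub
+ ```
+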
+ ## Presets
+
+ The following model checkpoints are provided by the Keras team. Full code examples for each are available below.
+
+ | Preset name | Parameters | Description |
+ |-------------|------------|-------------|
+ | `electra_small_discriminator_uncased_en` | 13.55M | 12-layer small ELECTRA discriminator model. All inputs are lowercased. Trained on English Wikipedia + BooksCorpus. |
+ | `electra_small_generator_uncased_en` | 13.55M | 12-layer small ELECTRA generator model. All inputs are lowercased. Trained on English Wikipedia + BooksCorpus. |
+ | `electra_base_discriminator_uncased_en` | 109.48M | 12-layer base ELECTRA discriminator model. All inputs are lowercased. Trained on English Wikipedia + BooksCorpus. |
+ | `electra_base_generator_uncased_en` | 33.58M | 12-layer base ELECTRA generator model. All inputs are lowercased. Trained on English Wikipedia + BooksCorpus. |
+ | `electra_large_discriminator_uncased_en` | 335.14M | 24-layer large ELECTRA discriminator model. All inputs are lowercased. Trained on English Wikipedia + BooksCorpus. |
+ | `electra_large_generator_uncased_en` | 51.07M | 24-layer large ELECTRA generator model. All inputs are lowercased. Trained on English Wikipedia + BooksCorpus. |
+
+ ### Example Usage
+
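+ A minimal sketch of loading one of the presets above and running the backbone on pre-tokenized input. The input and output dictionary keys below follow the KerasHub `ElectraBackbone` API; verify the exact names against the ELECTRA API documentation linked above.
+
+ ```
+ import keras_hub
+ import numpy as np
+
+ # Load a preset backbone; any preset name from the table above
+ # should work the same way.
+ backbone = keras_hub.models.ElectraBackbone.from_preset(
+     "electra_base_discriminator_uncased_en"
+ )
+
+ # The backbone consumes a dict of pre-tokenized int tensors.
+ batch_size, seq_len = 1, 12
+ outputs = backbone(
+     {
+         "token_ids": np.ones((batch_size, seq_len), dtype="int32"),
+         "segment_ids": np.zeros((batch_size, seq_len), dtype="int32"),
+         "padding_mask": np.ones((batch_size, seq_len), dtype="int32"),
+     }
+ )
+
+ # Per-token hidden states, shape (batch_size, seq_len, hidden_dim).
+ print(outputs["sequence_output"].shape)
+ ```
+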
+ ## Example Usage with Hugging Face URI
+
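+ KerasHub can also load a checkpoint directly from the Hugging Face Hub by passing an `hf://` handle to `from_preset`. The repository id below is an assumed example of the pattern, not a verified handle; substitute the id of the repository hosting this model.
+
+ ```
+ import keras_hub
+ import numpy as np
+
+ # "hf://keras/..." is illustrative; use this model's actual
+ # repository id on the Hugging Face Hub.
+ backbone = keras_hub.models.ElectraBackbone.from_preset(
+     "hf://keras/electra_base_discriminator_uncased_en"
+ )
+
+ outputs = backbone(
+     {
+         "token_ids": np.ones((1, 12), dtype="int32"),
+         "segment_ids": np.zeros((1, 12), dtype="int32"),
+         "padding_mask": np.ones((1, 12), dtype="int32"),
+     }
+ )
+ print(outputs["sequence_output"].shape)
+ ```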