Divyasreepat committed
Commit 8727c9c
1 Parent(s): c6b7661

Update README.md with new model card content

Files changed (1)
  1. README.md +200 -13
README.md CHANGED
@@ -1,16 +1,203 @@
  ---
  library_name: keras-hub
  ---
- This is a [`Bart` model](https://keras.io/api/keras_hub/models/bart) uploaded using the KerasHub library and can be used with JAX, TensorFlow, and PyTorch backends.
- Model config:
- * **name:** bart_backbone
- * **trainable:** True
- * **vocabulary_size:** 50265
- * **num_layers:** 12
- * **num_heads:** 16
- * **hidden_dim:** 1024
- * **intermediate_dim:** 4096
- * **dropout:** 0.1
- * **max_sequence_length:** 1024
-
- This model card has been generated automatically and should be completed by the model author. See [Model Cards documentation](https://huggingface.co/docs/hub/model-cards) for more information.
+ ### Model Overview
+ BART encoder-decoder network.
+
+ This class implements a Transformer-based encoder-decoder model as
+ described in
+ ["BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension"](https://arxiv.org/abs/1910.13461).
+
+ The default constructor gives a fully customizable, randomly initialized BART
+ model with any number of layers, heads, and embedding dimensions (see the
+ sketch after the argument list below). To load preset architectures and
+ weights, use the `from_preset` constructor.
+
+ Disclaimer: Pre-trained models are provided on an "as is" basis, without
+ warranties or conditions of any kind. The underlying model is provided by a
+ third party and subject to a separate license, available
+ [here](https://github.com/facebookresearch/fairseq/).
+
+ __Arguments__
+
+ - __vocabulary_size__: int. The size of the token vocabulary.
+ - __num_layers__: int. The number of transformer encoder layers and
+   transformer decoder layers.
+ - __num_heads__: int. The number of attention heads for each transformer.
+   The hidden size must be divisible by the number of attention heads.
+ - __hidden_dim__: int. The size of the transformer encoding and pooler layers.
+ - __intermediate_dim__: int. The output dimension of the first Dense layer in
+   a two-layer feedforward network for each transformer.
+ - __dropout__: float. Dropout probability for the Transformer encoder.
+ - __max_sequence_length__: int. The maximum sequence length that this encoder
+   can consume. If None, it defaults to the length of the input sequence, which
+   determines the variable shape for positional embeddings.
+
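+ The following is a minimal sketch, not part of the upstream docstring, showing
+ how the arguments above map onto a randomly initialized backbone; the small
+ layer sizes are illustrative only.
+ ```python
+ import numpy as np
+ import keras_hub
+
+ # Randomly initialized BART backbone with a custom (illustrative) config.
+ backbone = keras_hub.models.BartBackbone(
+     vocabulary_size=50265,
+     num_layers=2,
+     num_heads=4,
+     hidden_dim=128,
+     intermediate_dim=512,
+     dropout=0.1,
+     max_sequence_length=64,
+ )
+
+ # The backbone consumes tokenized encoder/decoder inputs plus padding masks.
+ input_data = {
+     "encoder_token_ids": np.ones((1, 12), dtype="int32"),
+     "encoder_padding_mask": np.ones((1, 12), dtype="int32"),
+     "decoder_token_ids": np.ones((1, 12), dtype="int32"),
+     "decoder_padding_mask": np.ones((1, 12), dtype="int32"),
+ }
+ # Returns a dict with "encoder_sequence_output" and "decoder_sequence_output".
+ outputs = backbone(input_data)
+ ```
+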
+ ### Example Usage
+ ```python
+ import keras
+ import keras_hub
+ import numpy as np
+ ```
+
+ Use `generate()` to do text generation, given an input context.
+ ```python
+ bart_lm = keras_hub.models.BartSeq2SeqLM.from_preset("bart_large_en")
+ bart_lm.generate("The quick brown fox", max_length=30)
+
+ # Generate with batched inputs.
+ bart_lm.generate(["The quick brown fox", "The whale"], max_length=30)
+ ```
+
+ Compile the `generate()` function with a custom sampler.
+ ```python
+ bart_lm = keras_hub.models.BartSeq2SeqLM.from_preset("bart_large_en")
+ bart_lm.compile(sampler="greedy")
+ bart_lm.generate("The quick brown fox", max_length=30)
+ ```
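+
+ As a hedged aside (not part of the upstream docstring), other built-in sampler
+ names should work the same way, for example:
+ ```python
+ bart_lm.compile(sampler="top_k")
+ bart_lm.generate("The quick brown fox", max_length=30)
+ ```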
+
+ Use `generate()` with encoder inputs and an incomplete decoder input (prompt).
+ ```python
+ bart_lm = keras_hub.models.BartSeq2SeqLM.from_preset("bart_large_en")
+ bart_lm.generate(
+     {
+         "encoder_text": "The quick brown fox",
+         "decoder_text": "The fast"
+     }
+ )
+ ```
+
+ Use `generate()` without preprocessing.
+ ```python
+ # Preprocessed inputs, with encoder inputs corresponding to
+ # "The quick brown fox", and the decoder inputs to "The fast". Use
+ # `"padding_mask"` to indicate values that should not be overridden.
+ prompt = {
+     "encoder_token_ids": np.array([[0, 133, 2119, 6219, 23602, 2, 1, 1]]),
+     "encoder_padding_mask": np.array(
+         [[True, True, True, True, True, True, False, False]]
+     ),
+     "decoder_token_ids": np.array([[2, 0, 133, 1769, 2, 1, 1]]),
+     "decoder_padding_mask": np.array(
+         [[True, True, True, True, False, False, False]]
+     ),
+ }
+
+ bart_lm = keras_hub.models.BartSeq2SeqLM.from_preset(
+     "bart_large_en",
+     preprocessor=None,
+ )
+ bart_lm.generate(prompt)
+ ```
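+
+ As a side note (not part of the upstream docstring), the raw ids above come
+ from the preset's BPE vocabulary; a minimal sketch of how one might reproduce
+ them, assuming the `keras_hub.models.BartTokenizer` preset API:
+ ```python
+ tokenizer = keras_hub.models.BartTokenizer.from_preset("bart_large_en")
+ # Should yield the inner ids used above (e.g. 133, 2119, 6219, 23602 for
+ # "The quick brown fox"); the start/end ids (0, 2) and padding id (1) are
+ # added by hand when building the prompt dict.
+ token_ids = tokenizer("The quick brown fox")
+ ```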
+
+ Call `fit()` on a single batch.
+ ```python
+ features = {
+     "encoder_text": ["The quick brown fox jumped.", "I forgot my homework."],
+     "decoder_text": ["The fast hazel fox leapt.", "I forgot my assignment."]
+ }
+ bart_lm = keras_hub.models.BartSeq2SeqLM.from_preset("bart_large_en")
+ bart_lm.fit(x=features, batch_size=2)
+ ```
+
+ Call `fit()` without preprocessing.
+ ```python
+ x = {
+     "encoder_token_ids": np.array([[0, 133, 2119, 2, 1]] * 2),
+     "encoder_padding_mask": np.array([[1, 1, 1, 1, 0]] * 2),
+     "decoder_token_ids": np.array([[2, 0, 133, 1769, 2]] * 2),
+     "decoder_padding_mask": np.array([[1, 1, 1, 1, 1]] * 2),
+ }
+ y = np.array([[0, 133, 1769, 2, 1]] * 2)
+ sw = np.array([[1, 1, 1, 1, 0]] * 2)
+
+ bart_lm = keras_hub.models.BartSeq2SeqLM.from_preset(
+     "bart_large_en",
+     preprocessor=None,
+ )
+ bart_lm.fit(x=x, y=y, sample_weight=sw, batch_size=2)
+ ```
+
+ ## Example Usage with Hugging Face URI
+
+ ```python
+ import keras
+ import keras_hub
+ import numpy as np
+ ```
+
+ Use `generate()` to do text generation, given an input context.
+ ```python
+ bart_lm = keras_hub.models.BartSeq2SeqLM.from_preset("hf://keras/bart_large_en")
+ bart_lm.generate("The quick brown fox", max_length=30)
+
+ # Generate with batched inputs.
+ bart_lm.generate(["The quick brown fox", "The whale"], max_length=30)
+ ```
+
+ Compile the `generate()` function with a custom sampler.
+ ```python
+ bart_lm = keras_hub.models.BartSeq2SeqLM.from_preset("hf://keras/bart_large_en")
+ bart_lm.compile(sampler="greedy")
+ bart_lm.generate("The quick brown fox", max_length=30)
+ ```
+
+ Use `generate()` with encoder inputs and an incomplete decoder input (prompt).
+ ```python
+ bart_lm = keras_hub.models.BartSeq2SeqLM.from_preset("hf://keras/bart_large_en")
+ bart_lm.generate(
+     {
+         "encoder_text": "The quick brown fox",
+         "decoder_text": "The fast"
+     }
+ )
+ ```
+
+ Use `generate()` without preprocessing.
+ ```python
+ # Preprocessed inputs, with encoder inputs corresponding to
+ # "The quick brown fox", and the decoder inputs to "The fast". Use
+ # `"padding_mask"` to indicate values that should not be overridden.
+ prompt = {
+     "encoder_token_ids": np.array([[0, 133, 2119, 6219, 23602, 2, 1, 1]]),
+     "encoder_padding_mask": np.array(
+         [[True, True, True, True, True, True, False, False]]
+     ),
+     "decoder_token_ids": np.array([[2, 0, 133, 1769, 2, 1, 1]]),
+     "decoder_padding_mask": np.array(
+         [[True, True, True, True, False, False, False]]
+     ),
+ }
+
+ bart_lm = keras_hub.models.BartSeq2SeqLM.from_preset(
+     "hf://keras/bart_large_en",
+     preprocessor=None,
+ )
+ bart_lm.generate(prompt)
+ ```
+
+ Call `fit()` on a single batch.
+ ```python
+ features = {
+     "encoder_text": ["The quick brown fox jumped.", "I forgot my homework."],
+     "decoder_text": ["The fast hazel fox leapt.", "I forgot my assignment."]
+ }
+ bart_lm = keras_hub.models.BartSeq2SeqLM.from_preset("hf://keras/bart_large_en")
+ bart_lm.fit(x=features, batch_size=2)
+ ```
+
+ Call `fit()` without preprocessing.
+ ```python
+ x = {
+     "encoder_token_ids": np.array([[0, 133, 2119, 2, 1]] * 2),
+     "encoder_padding_mask": np.array([[1, 1, 1, 1, 0]] * 2),
+     "decoder_token_ids": np.array([[2, 0, 133, 1769, 2]] * 2),
+     "decoder_padding_mask": np.array([[1, 1, 1, 1, 1]] * 2),
+ }
+ y = np.array([[0, 133, 1769, 2, 1]] * 2)
+ sw = np.array([[1, 1, 1, 1, 0]] * 2)
+
+ bart_lm = keras_hub.models.BartSeq2SeqLM.from_preset(
+     "hf://keras/bart_large_en",
+     preprocessor=None,
+ )
+ bart_lm.fit(x=x, y=y, sample_weight=sw, batch_size=2)
+ ```