patrickvonplaten committed · Commit a4228fd · 1 Parent(s): b43a91b
Update README.md
README.md
CHANGED
@@ -85,7 +85,7 @@ FNet-base was fine-tuned and evaluated on the validation data of the [GLUE bench
 For comparison, this model (ported to PyTorch) was fine-tuned and evaluated using the [official Hugging Face GLUE evaluation scripts](https://github.com/huggingface/transformers/tree/master/examples/pytorch/text-classification#glue-tasks) alongside [bert-base-cased](https://hf.co/models/bert-base-cased) for comparison.
 The training was done on a single 16GB NVIDIA Tesla V100 GPU. For MRPC/WNLI, the models were trained for 5 epochs, while for other tasks, the models were trained for 3 epochs. A sequence length of 512 was used with batch size 16 and learning rate 2e-5.
 
-The following table summarizes the results for [fnet-base](https://huggingface.co/google/fnet-base) (called *FNet (PyTorch) - Reproduced*) and [bert-base-cased](https://hf.co/models/bert-base-cased) (called *Bert (PyTorch) - Reproduced*) in terms of
+The following table summarizes the results for [fnet-base](https://huggingface.co/google/fnet-base) (called *FNet (PyTorch) - Reproduced*) and [bert-base-cased](https://hf.co/models/bert-base-cased) (called *Bert (PyTorch) - Reproduced*) in terms of **fine-tuning** speed. The format is *hour:min:seconds*. **Note** that the authors compared **pre-traning** speed in [the official paper](https://arxiv.org/abs/2105.03824) instead.
 
 | Task/Model | FNet-base (PyTorch) |Bert-base (PyTorch)|
 |:----:|:-----------:|:----:|