Update README.md
README.md CHANGED
@@ -76,6 +76,10 @@ widget:
 
 <h1 style="font-size: 42px">GPT-JT<h1/>
 
+
+***<p style="font-size: 24px">Feel free to try out our [Online Demo](https://huggingface.co/spaces/togethercomputer/GPT-JT)!</p>***
+
+
 # Model Summary
 
 > With a new decentralized training algorithm, we fine-tuned GPT-J (6B) on 3.53 billion tokens, resulting in GPT-JT (6B), a model that outperforms many 100B+ parameter models on classification benchmarks.
@@ -87,8 +91,6 @@ We incorporated a collection of open techniques and datasets to build GPT-JT:
 
 With the help of techniques mentioned above, GPT-JT significantly improves the performance of classification tasks over the original GPT-J, and even outperforms most 100B+ parameter models!
 
-***<p style="font-size: 24px">Feel free to try out our [Online Demo](https://huggingface.co/spaces/togethercomputer/GPT-JT)!</p>***
-
 # Quick Start
 
 ```python
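
The Quick Start snippet itself is truncated in this diff, so for reference here is a minimal sketch of loading GPT-JT with the Hugging Face transformers library; the repository id `togethercomputer/GPT-JT-6B-v1`, the prompt, and the generation settings are assumptions, not the README's actual code.

```python
# Minimal sketch of loading GPT-JT with the transformers library.
# The repository id and prompt below are assumptions; the actual
# Quick Start code is truncated in this diff.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "togethercomputer/GPT-JT-6B-v1"  # assumed model repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# A simple classification-style prompt, in the spirit of the benchmarks mentioned above.
prompt = "Is the following review positive or negative?\nReview: The movie was great.\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=5)

# Print only the newly generated tokens, not the echoed prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```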