Update README.md
README.md CHANGED
@@ -76,6 +76,10 @@ widget:
 
 <h1 style="font-size: 42px">GPT-JT<h1/>
 
+
+***<p style="font-size: 24px">Feel free to try out our [Online Demo](https://huggingface.co/spaces/togethercomputer/GPT-JT)!</p>***
+
+
 # Model Summary
 
 > With a new decentralized training algorithm, we fine-tuned GPT-J (6B) on 3.53 billion tokens, resulting in GPT-JT (6B), a model that outperforms many 100B+ parameter models on classification benchmarks.
@@ -87,8 +91,6 @@ We incorporated a collection of open techniques and datasets to build GPT-JT:
 
 With the help of techniques mentioned above, GPT-JT significantly improves the performance of classification tasks over the original GPT-J, and even outperforms most 100B+ parameter models!
 
-***<p style="font-size: 24px">Feel free to try out our [Online Demo](https://huggingface.co/spaces/togethercomputer/GPT-JT)!</p>***
-
 # Quick Start
 
 ```python
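
The Quick Start snippet itself is truncated in this diff, so for reference here is a minimal sketch of loading GPT-JT with the Hugging Face transformers library; the repository id `togethercomputer/GPT-JT-6B-v1`, the prompt, and the generation settings are assumptions, not the README's actual code.

```python
# Minimal sketch of loading GPT-JT with the transformers library.
# The repository id and prompt below are assumptions; the actual
# Quick Start code is truncated in this diff.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "togethercomputer/GPT-JT-6B-v1"  # assumed model repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# A simple classification-style prompt, in the spirit of the benchmarks mentioned above.
prompt = "Is the following review positive or negative?\nReview: The movie was great.\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=5)

# Print only the newly generated tokens, not the echoed prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```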