Update README.md
Add intro and reference, and one more link to NGC
README.md
CHANGED
@@ -18,6 +18,10 @@
 # ##############################################################################################
 -->
 
+[Megatron](https://arxiv.org/pdf/1909.08053.pdf) is a large, powerful transformer developed by the Applied Deep Learning Research team at NVIDIA. This particular Megatron model was trained from a generative, left-to-right transformer in the style of GPT-2. This model was trained on text sourced from Wikipedia, RealNews, OpenWebText, and CC-Stories. It contains 345 million parameters.
+
+Find more information at [https://github.com/NVIDIA/Megatron-LM](https://github.com/NVIDIA/Megatron-LM)
+
 # How to run Megatron GPT2 using Transformers
 
 ## Prerequisites
@@ -44,7 +48,7 @@ You must create a directory called `nvidia/megatron-gpt2-345m`:
 mkdir -p $MYDIR/nvidia/megatron-gpt2-345m
 ```
 
-You can download the checkpoints from the NVIDIA GPU Cloud (NGC). For that you
+You can download the checkpoints from the [NVIDIA GPU Cloud (NGC)](https://ngc.nvidia.com/catalog/models/nvidia:megatron_lm_345m). For that you
 have to [sign up](https://ngc.nvidia.com/signup) for and setup the NVIDIA GPU
 Cloud (NGC) Registry CLI. Further documentation for downloading models can be
 found in the [NGC
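For context, a minimal sketch of fetching the checkpoint once the NGC Registry CLI is configured. The model path `nvidia/megatron_lm_345m` and version tag `v0.0` are assumptions taken from the NGC catalog link added in this commit; check the catalog page for the current values.

```bash
# Assumes the NGC Registry CLI is installed and `ngc config set` has been run.
# Model path and version tag are assumptions; verify them on the NGC catalog page.
ngc registry model download-version \
    --dest $MYDIR/nvidia/megatron-gpt2-345m \
    "nvidia/megatron_lm_345m:v0.0"
```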