llm.c checkpoint: GPT-2 774M

This is a HF/safetensors conversion of the llm.c checkpoint from a 774M-parameter GPT-2 run trained on 150B tokens of FineWeb.

Training was conducted on a single 8xA100 80GB SXM node for ~6 days.

See discussion on GitHub for more information.
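Since the checkpoint is provided in safetensors format, it should load with the standard `transformers` API. A minimal sketch, assuming the repo id `mdouglas/llmc-gpt2-774M-150B` and that the conversion uses the stock GPT-2 architecture and tokenizer:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repo id taken from this model card; architecture assumed to be stock GPT-2.
model_id = "mdouglas/llmc-gpt2-774M-150B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# Weights are stored in BF16, so load them in that dtype.
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

prompt = "The history of computing began"
inputs = tokenizer(prompt, return_tensors="pt")
# Greedy decoding for a reproducible completion.
outputs = model.generate(**inputs, max_new_tokens=40, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

For GPU inference, add `device_map="auto"` (or move the model and inputs to `cuda` manually); in BF16 the 774M weights fit comfortably on a single consumer GPU.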

Weights: 774M parameters, BF16, safetensors format.
