diffugpt-s

This model is a fine-tuned version of gpt2 on the FineWeb dataset.

Model description

For details and model-loading instructions, see https://github.com/HKUNLP/DiffuLLaMA.

Framework versions

  • Transformers 4.44.2
  • PyTorch 2.1.1+cu121
  • Datasets 2.21.0
  • Tokenizers 0.19.1
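
To check whether a local environment matches the versions listed above, something like the following sketch may help. It is not part of the DiffuLLaMA repository; the helper names (`installed_version`, `compare_versions`) are hypothetical, and the pip distribution names in `EXPECTED` (for example `torch` for PyTorch) are assumptions.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the list above. The keys are assumed pip
# distribution names, not something stated by the model card.
EXPECTED = {
    "transformers": "4.44.2",
    "torch": "2.1.1+cu121",
    "datasets": "2.21.0",
    "tokenizers": "0.19.1",
}


def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def compare_versions(expected, lookup=installed_version):
    """Map each package name to 'match', 'missing', or 'differs (installed X)'."""
    report = {}
    for name, wanted in expected.items():
        found = lookup(name)
        if found is None:
            report[name] = "missing"
        elif found == wanted:
            report[name] = "match"
        else:
            report[name] = f"differs (installed {found})"
    return report


if __name__ == "__main__":
    for name, status in compare_versions(EXPECTED).items():
        print(f"{name}: {status}")
```

A differing version does not necessarily mean the model will fail to load; the list only records the environment used for training.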
@misc{gong2024scalingdiffusionlanguagemodels,
      title={Scaling Diffusion Language Models via Adaptation from Autoregressive Models}, 
      author={Shansan Gong and Shivam Agarwal and Yizhe Zhang and Jiacheng Ye and Lin Zheng and Mukai Li and Chenxin An and Peilin Zhao and Wei Bi and Jiawei Han and Hao Peng and Lingpeng Kong},
      year={2024},
      eprint={2410.17891},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2410.17891}, 
}
Model size: 124M params (Safetensors, tensor type FP16)