This model provides a GPT-2 language model trained with SimCTG on the Wikitext-103 benchmark [(Merity et al., 2016)](https://arxiv.org/abs/1609.07843) based on our paper [_A Contrastive Framework for Neural Text Generation_](https://arxiv.org/abs/2202.06417).

We provide a detailed tutorial on how to apply SimCTG and Contrastive Search in our [project repo](https://github.com/yxuansu/SimCTG#4-huggingface-style-tutorials-back-to-top). Below is a brief tutorial on how to use our approach to perform text generation.

## 1. Installation of SimCTG:
```bash
pip install simctg --upgrade
```

## 2. Initialize SimCTG Model:
```python
import torch
# load SimCTG language model
from simctg.simctggpt import SimCTGGPT
model_name = r'cambridgeltl/simctg_wikitext103'
model = SimCTGGPT(model_name)
model.eval()  # inference mode (disables dropout)
tokenizer = model.tokenizer
```

## 3. Prepare the Text Prefix:
```python
prefix_text = r"Butt criticized Donald 's controls in certain situations in the game , as well as the difficulty of some levels and puzzles . Buchanan also criticized the controls , calling"
print('Prefix is: {}'.format(prefix_text))
# convert the prefix into a (1, sequence_length) tensor of token ids
tokens = tokenizer.tokenize(prefix_text)
input_ids = tokenizer.convert_tokens_to_ids(tokens)
input_ids = torch.LongTensor(input_ids).view(1,-1)
```

## 4. Generate Text with Contrastive Search:
```python
beam_width, alpha, decoding_len = 8, 0.6, 128
output = model.fast_contrastive_search(input_ids=input_ids, beam_width=beam_width,
                                       alpha=alpha, decoding_len=decoding_len)
print("Output:\n" + 100 * '-')
print(tokenizer.decode(output))
'''
   Prefix is: Butt criticized Donald 's controls in certain situations in the game , as well as the difficulty of some levels and puzzles . Buchanan also criticized the controls , calling
   Output:
   ----------------------------------------------------------------------------------------------------
   Butt criticized Donald's controls in certain situations in the game, as well as the difficulty of some levels and puzzles. Buchanan also criticized the controls, calling them " unimpressive " and a " nightmare " of an experience to play with players unfamiliar with Tetris. On the other hand, his opinion was shared by other reviewers, and some were critical of the game's technical design for the Wii version of Tetris. In addition, Tintin's review included a quote from Roger Ebert, who said that Tetris was better than the original game due to its simplicity and ease of play. Ebert's comments were included in the game's DVD commentary, released on March 22, 2010. It is unclear if any of the video commentary was taken from the DVD
'''
```

For more details of our work, please refer to our main [project repo](https://github.com/yxuansu/SimCTG).

## 5. Citation:
If you find our paper and resources useful, please kindly leave a star and cite our paper. Thanks!

```bibtex
@article{su2022contrastive,
  title={A Contrastive Framework for Neural Text Generation},
  author={Su, Yixuan and Lan, Tian and Wang, Yan and Yogatama, Dani and Kong, Lingpeng and Collier, Nigel},
  journal={arXiv preprint arXiv:2202.06417},
  year={2022}
}
```
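
## 6. Appendix: What `beam_width` and `alpha` Control:
The contrastive search call in Section 4 takes `beam_width` and `alpha`. As a rough illustration of what they do, the sketch below implements one selection step of the scoring rule described in the paper: among the top-k candidates (k corresponds to `beam_width`), pick the one that maximizes `(1 - alpha) * model_probability - alpha * max_cosine_similarity_to_context`. This is a minimal toy sketch in plain Python, not the implementation inside the `simctg` package; the function names `cosine` and `contrastive_step` and the 2-dimensional hidden vectors are hypothetical and chosen only for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def contrastive_step(cand_probs, cand_hidden, context_hidden, alpha):
    """Return the index of the candidate with the best contrastive score.

    cand_probs:     model probability of each of the top-k candidates
    cand_hidden:    one hidden vector per candidate (toy values here)
    context_hidden: one hidden vector per token already in the context
    """
    best_idx, best_score = -1, float("-inf")
    for i, (p, h) in enumerate(zip(cand_probs, cand_hidden)):
        # Degeneration penalty: similarity to the closest context token.
        penalty = max(cosine(h, c) for c in context_hidden)
        score = (1 - alpha) * p - alpha * penalty
        if score > best_score:
            best_idx, best_score = i, score
    return best_idx

# Toy example with k = 2 candidates and 2-dimensional hidden states.
context = [[1.0, 0.0]]
probs = [0.6, 0.4]
hidden = [[1.0, 0.0], [0.0, 1.0]]  # candidate 0 points the same way as the context

greedy_choice = contrastive_step(probs, hidden, context, alpha=0.0)
contrastive_choice = contrastive_step(probs, hidden, context, alpha=0.6)
print(greedy_choice, contrastive_choice)
```

With `alpha=0.0` the penalty term vanishes and the step reduces to picking the most probable candidate; with `alpha=0.6` the candidate whose representation duplicates the context is penalized, so the less probable but less repetitive candidate is selected.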