TinyLlama-NoPE-1.1B
NoPE is a transformer model without positional encoding.
The model was trained following the TinyLlama code base (https://github.com/jzhang38/TinyLlama).
Usage
from transformers import AutoModelForCausalLM, AutoTokenizer
from transformers.models.llama import modeling_llama
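
# NoPE: replace the rotary position embedding with an identity function,
# so queries and keys reach attention without any positional rotation.
# position_ids defaults to None because newer transformers releases
# call this function without passing it.
def nope_monkey_patch(q, k, cos, sin, position_ids=None, unsqueeze_dim=1):
    return q, k

# The patch must be applied before the model runs a forward pass.
modeling_llama.apply_rotary_pos_emb = nope_monkey_patch
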
model_path = "AntNLP/TinyLlama-NoPE-1.1B"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path).cuda()
input_ids = tokenizer("Hello, TinyLlama-NoPE", return_tensors="pt").input_ids.cuda()
output = model.generate(input_ids, do_sample=True, max_length=50)
print(tokenizer.decode(output[0], skip_special_tokens=True))
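
To see why the patch removes positional information, the following toy sketch may help. It is not the transformers implementation: `rope_2d`, `nope_2d`, and `score` are hypothetical helpers written for illustration, and the rotary embedding is reduced to a single 2-D rotation with an assumed frequency `theta`. With the rotation, the query-key score depends on the distance between the two positions; with the identity function, it does not.

```python
import math

def rope_2d(vec, position, theta=1.0):
    """Toy stand-in for rotary embedding: rotate a 2-D vector by position * theta."""
    angle = position * theta
    x, y = vec
    return (x * math.cos(angle) - y * math.sin(angle),
            x * math.sin(angle) + y * math.cos(angle))

def nope_2d(vec, position, theta=1.0):
    """Identity: ignore the position, as nope_monkey_patch does for q and k."""
    return vec

def score(q, k):
    """Unscaled attention score: dot product of query and key."""
    return q[0] * k[0] + q[1] * k[1]

q, k = (1.0, 0.0), (1.0, 0.0)

# Query at position 5; key at position 4 (distance 1) or position 2 (distance 3).
rope_near = score(rope_2d(q, 5), rope_2d(k, 4))
rope_far = score(rope_2d(q, 5), rope_2d(k, 2))
nope_near = score(nope_2d(q, 5), nope_2d(k, 4))
nope_far = score(nope_2d(q, 5), nope_2d(k, 2))

print("rotary:", rope_near, rope_far)  # the two scores differ with distance
print("NoPE:  ", nope_near, nope_far)  # the two scores are identical
```

In this sketch the rotary scores equal the cosine of the position difference, while the NoPE scores are the plain dot product of `q` and `k` regardless of position.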
Citation
@misc{wang2024length,
title={Length Generalization of Causal Transformers without Position Encoding},
author={Jie Wang and Tao Ji and Yuanbin Wu and Hang Yan and Tao Gui and Qi Zhang and Xuanjing Huang and Xiaoling Wang},
year={2024},
eprint={2404.12224},
archivePrefix={arXiv},
primaryClass={cs.CL}
}