TinyLlama-NoPE-1.1B

TinyLlama-NoPE-1.1B is a 1.1B-parameter causal transformer trained without any positional encoding (NoPE).

The model was trained with the TinyLlama codebase (https://github.com/jzhang38/TinyLlama).

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
from transformers.models.llama import modeling_llama


def nope_monkey_patch(q, k, cos, sin, position_ids=None, unsqueeze_dim=1):
    # NoPE: skip rotary position embeddings by returning the query and key
    # states unchanged.
    return q, k


# Patch LLaMA's RoPE application before loading the model so that attention
# runs without positional encoding.
modeling_llama.apply_rotary_pos_emb = nope_monkey_patch

model_path = "AntNLP/TinyLlama-NoPE-1.1B"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path).cuda()

input_ids = tokenizer("Hello, TinyLlama-NoPE", return_tensors="pt").input_ids.cuda()
output = model.generate(input_ids, do_sample=True, max_length=50)
print(tokenizer.decode(output[0], skip_special_tokens=True))
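
Because the checkpoint uses no positional encoding, a common check is to score inputs longer than a typical training context. The snippet below is a minimal sketch, not part of the original card: it reuses the model and tokenizer loaded above and measures the per-token loss on a long, repeated placeholder text.

import torch

# Minimal sketch: probe length generalization by scoring a long input.
# The repeated sentence is placeholder data, not an evaluation set.
long_text = " ".join(["The quick brown fox jumps over the lazy dog."] * 200)
long_ids = tokenizer(long_text, return_tensors="pt").input_ids.cuda()
with torch.no_grad():
    loss = model(long_ids, labels=long_ids).loss
print(f"{long_ids.shape[1]} tokens, per-token loss {loss.item():.3f}")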

Citation

@misc{wang2024length,
      title={Length Generalization of Causal Transformers without Position Encoding}, 
      author={Jie Wang and Tao Ji and Yuanbin Wu and Hang Yan and Tao Gui and Qi Zhang and Xuanjing Huang and Xiaoling Wang},
      year={2024},
      eprint={2404.12224},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}