Issues about config.json and model.py

#4
by J22 - opened
  1. max_position_embeddings is not used in model.py (the value is hardcoded as 32768);

    It is confusing that for 14B-chat, max_position_embeddings equals 8k.

  2. rope_theta duplicates rotary_emb_base and is not used in model.py; it could be deleted.

  1. Yes, max_position_embeddings is supposed to be different (for now).
  2. They may be consumed by other libraries and are kept to maintain compatibility.
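The point about the duplicated keys can be sketched briefly: since rope_theta and rotary_emb_base both hold the RoPE base, any consumer reading either key computes the same rotary frequencies. The config values and helper below are illustrative assumptions, not copied from the actual checkpoint.

```python
import json

# Hypothetical config.json excerpt carrying both keys (illustrative values).
config = json.loads("""
{
  "max_position_embeddings": 8192,
  "rope_theta": 10000.0,
  "rotary_emb_base": 10000.0
}
""")

def rope_inv_freq(base: float, dim: int = 8):
    # RoPE inverse frequencies: base^(-2i/dim) for i in 0 .. dim/2 - 1.
    return [base ** (-2 * i / dim) for i in range(dim // 2)]

# Both keys hold the same base, so either one yields identical frequencies,
# which is why libraries reading either name stay compatible.
assert rope_inv_freq(config["rope_theta"]) == rope_inv_freq(config["rotary_emb_base"])
```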