Tony W
tonyaw
AI & ML interests
None yet
Organizations
None yet
tonyaw's activity
It looks like the model has an 8K context length. May I ask why, given that Llama 3.1's context length is 128K?
2
#153 opened 3 months ago by tonyaw
Incorrect vocab size?
8
#2 opened about 1 year ago by claudiuv
"vocab_size" is inconsistent with tokenizer.get_vocab()
1
#7 opened about 1 year ago by tonyaw
How to use PEFT+LoRA to fine-tune starchat-alpha
1
#17 opened over 1 year ago by tonyaw
Python library version recommendation
1
#3 opened over 1 year ago by tonyaw
KeyError: 'model.layers.0.self_attn.rotary_emb.cos_cached'
#9 opened over 1 year ago by tonyaw