The tokenizer config does not match the version shared by the original author
#17
by
GohioAC
- opened
Original config: https://huggingface.co/liuhaotian/llava-v1.6-34b-tokenizer/blob/main/tokenizer_config.json
Config provided here: https://huggingface.co/llava-hf/llava-v1.6-34b-hf/blob/main/tokenizer_config.json
The two configs differ substantially. The most concerning part is that the pad token is different.
Hey! You should be looking at https://huggingface.co/liuhaotian/llava-v1.6-34b/blob/main/tokenizer_config.json for the original config, as that is the one loaded when generating with LLaVA.
In that case there's only one difference between the two: the HF implementation has a special "<image>" token used internally to inject image embeddings, which does not affect generation in any way.
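If you want to check this yourself, here is a minimal sketch of a top-level diff between two tokenizer configs. The `diff_configs` helper is hypothetical (not part of any library), and the two dicts at the bottom are made-up placeholders rather than the real contents of either repo; in practice you would `json.load` the two `tokenizer_config.json` files after downloading them.

```python
import json

MISSING = "<missing>"  # placeholder shown when a key is absent from one config


def diff_configs(a: dict, b: dict) -> dict:
    """Return {key: (value_in_a, value_in_b)} for each top-level key that differs."""
    diff = {}
    for key in sorted(set(a) | set(b)):
        va = a.get(key, MISSING)
        vb = b.get(key, MISSING)
        if va != vb:
            diff[key] = (va, vb)
    return diff


if __name__ == "__main__":
    # Illustrative values only; these are NOT the actual configs from the thread.
    original = {"pad_token": "<unk>", "model_max_length": 4096}
    converted = {"pad_token": "<pad>", "model_max_length": 4096, "processor_class": "Example"}
    print(json.dumps(diff_configs(original, converted), indent=2))
```

This only compares top-level keys, so nested entries (for example an added-tokens table) show up as one differing value rather than a per-entry diff.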