403 Forbidden error when accessing the model

#3
by imhaggarwal - opened

from langchain_huggingface import HuggingFaceEndpoint  # in older versions: from langchain_community.llms import HuggingFaceEndpoint

model_id = "elyza/ELYZA-japanese-Llama-2-7b-instruct"
llm_hub = HuggingFaceEndpoint(repo_id=model_id, temperature=0.1, max_new_tokens=600, model_kwargs={"max_length": 600})

I am using the above code to load the model. Since the model is larger than my RAM, I guess it won't be possible to load it locally, so I want to use the Inference API instead.

I am setting the Hugging Face token via os.environ["HUGGINGFACEHUB_API_TOKEN"], but I get the following error:
requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://api-inference.huggingface.co/models/elyza/ELYZA-japanese-Llama-2-7b-instruct
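One way to narrow this down is to check the token itself against the Hub's whoami endpoint, separately from the model. This is a minimal sketch (the helper names `auth_headers` and `check_token` are my own, not part of any library): if whoami returns 200 but the model URL still returns 403, the token is valid and the problem is more likely model access (e.g. a gated license) or the inference plan than authentication.

```python
import os
import requests

def auth_headers(token: str) -> dict:
    # Build the Bearer authorization header the Hub API expects
    return {"Authorization": f"Bearer {token}"}

def check_token(token: str) -> str:
    # Query the whoami endpoint; a 200 response means the token
    # itself is valid, so a 403 on a specific model points at
    # gating/permissions rather than a bad token.
    resp = requests.get(
        "https://huggingface.co/api/whoami-v2",
        headers=auth_headers(token),
    )
    resp.raise_for_status()
    return resp.json()["name"]
```

Usage: `check_token(os.environ["HUGGINGFACEHUB_API_TOKEN"])` should print your account name if the token is good.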

The same code works for other large models. I even tried changing the access token's permission from Inference to Read & Write, but that did not work either.

Does this have something to do with my Hugging Face plan?
Can anyone please help me with this?
