Problem when testing the simple practice example?
Hello, I used the simple practice code you provided:
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
# I created a new User Access Token at https://huggingface.co/settings/tokens
my_token = "hf_XXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
# load model
model_name = "taide/TAIDE-LX-7B-Chat"
# The script is in the same directory as the taide folder; taide/TAIDE-LX-7B-Chat contains all files downloaded from taide/TAIDE-LX-7B-Chat/tree/main
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=False)
model = AutoModelForCausalLM.from_pretrained(model_name, load_in_4bit=True, device_map="auto", token=my_token)
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
# prepare prompt
question = "臺灣最高的建築物是?"
chat = [
{"role": "user", "content": f"{question}"},
]
# generate response
x = pipe(chat, max_new_tokens=1024)
print(f"TAIDE: {x}")
But I still get the error below. It looks like I am still being blocked, and I am not sure why, since I did create a token and I did fill in and submit the access-request form. Thanks.
Traceback (most recent call last):
File "c:\Users....................\run_t.py", line 8, in
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=False)
File "C:\Users\AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\models\auto\tokenization_auto.py", line 819, in from_pretrained
config = AutoConfig.from_pretrained(
File "C:\Users\AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\models\auto\configuration_auto.py", line 928, in from_pretrained
config_dict, unused_kwargs = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
File "C:\Users\AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\configuration_utils.py", line 631, in get_config_dict
config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
File "C:\Users\AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\configuration_utils.py", line 686, in _get_config_dict
resolved_config_file = cached_file(
File "C:\Users\AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\utils\hub.py", line 416, in cached_file
raise EnvironmentError(
OSError: You are trying to access a gated repo.
Make sure to have access to it at https://huggingface.co/taide/TAIDE-LX-7B-Chat.
401 Client Error. (Request ID: Root=1-663c1de2-0a1477535ad9168138f579be;daa556e3-531f-4d76-8e56-f88f4ea0c19d)
Cannot access gated repo for url https://huggingface.co/taide/TAIDE-LX-7B-Chat/resolve/main/config.json.
Access to model taide/TAIDE-LX-7B-Chat is restricted. You must be authenticated to access it.
Hello,
Thank you for the feedback. The example has been updated to address this issue; please refer to: https://huggingface.co/taide/TAIDE-LX-7B-Chat-4bit/discussions/3
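For reference, the 401 in the traceback usually means the request to the Hub is not authenticated: the original script passes token=my_token when loading the model, but not when loading the tokenizer, so the first download (config.json, fetched through AutoTokenizer) is rejected. A minimal sketch of authenticating both calls, assuming the token has already been granted access to the gated repo:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "taide/TAIDE-LX-7B-Chat"
my_token = "hf_XXXXXXXXXXXXXXXXXXXXXXXXXXXXX"  # your own access token

# Pass the token to every from_pretrained call that touches the gated repo,
# or log in once beforehand with huggingface_hub.login(my_token)
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=False, token=my_token)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", token=my_token)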
Best regards.
It still did not work afterwards; I get the errors below.
It seems to say that the load_in_4bit and load_in_8bit arguments are no longer available,
and it also tells me to install these packages, but even after running pip install, re-running the script still produces the same errors:
pip install accelerate
pip install -i https://pypi.org/simple/ bitsandbytes
The load_in_4bit and load_in_8bit arguments are deprecated and will be removed in the future versions. Please, pass a BitsAndBytesConfig object in quantization_config argument instead.
Traceback (most recent call last):
File "c:\Users\ \run_t.py", line 32, in
model = AutoModelForCausalLM.from_pretrained(model_name, load_in_4bit=True, device_map="auto", token=my_token)
File "C:\Users\ \AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\models\auto\auto_factory.py", line 563, in from_pretrained
return model_class.from_pretrained(
File "C:\Users\ \AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\modeling_utils.py", line 3165, in from_pretrained
hf_quantizer.validate_environment(
File "C:\Users\ \AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\quantizers\quantizer_bnb_4bit.py", line 62, in validate_environment
raise ImportError(
ImportError: Using bitsandbytes 8-bit quantization requires Accelerate: pip install accelerate and the latest version of bitsandbytes: pip install -i https://pypi.org/simple/ bitsandbytes
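(Side note: the deprecation warning above only asks for the flag to be moved into an explicit BitsAndBytesConfig, roughly as in the sketch below; the corrected script later in this thread does the same. The ImportError itself means transformers could not detect a working accelerate/bitsandbytes installation in the environment running the script.)

from transformers import AutoModelForCausalLM, BitsAndBytesConfig
import torch

my_token = "hf_XXXXXXXXXXXXXXXXXXXXXXXXXXXXX"  # your own access token

# Move the deprecated load_in_4bit flag into an explicit quantization config
bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)
model = AutoModelForCausalLM.from_pretrained(
    "taide/TAIDE-LX-7B-Chat",
    quantization_config=bnb_config,
    device_map="auto",
    token=my_token)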
Hello,
Please refer to the following discussion to resolve the environment issue, thank you.
https://huggingface.co/taide/Llama3-TAIDE-LX-8B-Chat-Alpha1-4bit/discussions/3
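For example, a quick check like the one below (a rough sketch, not taken from the linked discussion) shows whether accelerate and bitsandbytes are actually importable from the interpreter that runs run_t.py, and whether a CUDA GPU is visible; bitsandbytes quantization generally requires a CUDA-capable GPU, so on a machine without one this loading path will not work:

import importlib.util
import torch

# Confirm the packages are installed in the same environment that runs the script
for pkg in ("accelerate", "bitsandbytes"):
    spec = importlib.util.find_spec(pkg)
    print(pkg, "->", spec.origin if spec else "NOT FOUND")

# bitsandbytes 4-bit/8-bit loading needs a CUDA GPU
print("torch", torch.__version__, "| CUDA available:", torch.cuda.is_available())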
Best regards.
Understood, thank you. Do you mean changing it to the following?
However, the machine I am testing on has no discrete GPU and only 8 GB of RAM, so it definitely cannot run this, right? Thanks.
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline, BitsAndBytesConfig
from torch import bfloat16
# https://huggingface.co/docs/hub/security-tokens#user-access-tokens
my_token = "***********************************************************************"  # replace this line with your own access token
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=bfloat16)
# load model
model_name = "taide/Llama3-TAIDE-LX-8B-Chat-Alpha1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name,
    device_map="auto",
    trust_remote_code=True,
    quantization_config=bnb_config,  # 4-bit loading is specified here, so load_in_4bit=True is not passed again
    token=my_token)
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
# prepare prompt
question = "臺灣最高的建築物是?"
chat = [
{"role": "user", "content": f"{question}"},
]
prompt = tokenizer.apply_chat_template(chat, tokenize=False)
# generate response
x = pipe(f"{prompt}", max_new_tokens=1024)
print(f"TAIDE: {x}")
Hello,
Please refer to the following articles and switch to a GGUF quantized model:
https://www.reddit.com/r/LocalLLaMA/comments/19f9z64/running_a_local_model_with_8gb_vram_is_it_even/
https://stackoverflow.com/questions/77630013/how-to-run-any-gguf-model-using-transformers-or-any-other-library
You can convert a GGUF quantized model yourself, or use the following repo:
https://huggingface.co/ZoneTwelve/TAIDE-LX-7B-Chat-GGUF
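For example, a GGUF file can be run on CPU with llama-cpp-python; below is a minimal sketch (the .gguf filename is a placeholder, check the repo's file list for the actual names and pick a quantization level that fits in 8 GB of RAM, e.g. a Q4 variant):

from huggingface_hub import hf_hub_download
from llama_cpp import Llama  # pip install llama-cpp-python

# Download one quantized file from the GGUF repo; the filename below is a placeholder,
# see https://huggingface.co/ZoneTwelve/TAIDE-LX-7B-Chat-GGUF for the real file names
gguf_path = hf_hub_download(
    repo_id="ZoneTwelve/TAIDE-LX-7B-Chat-GGUF",
    filename="taide-lx-7b-chat.Q4_K_M.gguf")

llm = Llama(model_path=gguf_path, n_ctx=2048)  # runs on CPU by default
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "臺灣最高的建築物是?"}],
    max_tokens=512)
print("TAIDE:", out["choices"][0]["message"]["content"])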
Best regards.