Demo script hangs on Windows and never finishes
Dear All
Demo.py file
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, AutoConfig
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using {device} for inference")
config = AutoConfig.from_pretrained("google/gemma-7b")
config.hidden_activation = "gelu_pytorch_tanh"
print(config)
tokenizer = AutoTokenizer.from_pretrained("google/gemma-7b")
print(tokenizer)
print(tokenizer.decode(tokenizer.encode("Hello, world!")))
try:
    model = AutoModelForCausalLM.from_pretrained("google/gemma-7b", config=config).to(device)
except Exception as e:
    print(f"Error: {e}")
    # Fall back to loading with the default config.
    try:
        model = AutoModelForCausalLM.from_pretrained("google/gemma-7b").to(device)
    except Exception as e:
        print(f"Error: {e}")
        model = None
print(model)
if model is not None:
    input_text = "Write me a poem about Machine Learning."
    print(input_text)
    input_ids = tokenizer(input_text, return_tensors="pt").to(device)
    outputs = model.generate(**input_ids)
    print(tokenizer.decode(outputs[0]))
    print(outputs)
When I run it on Windows, it gives me no output. The program silently stops at
model = AutoModelForCausalLM.from_pretrained("google/gemma-7b", config=config).to(device)
without printing any error. Please help me. Thank you.
Hi @xxl4 ,
The issue most likely occurs while loading the model, due to insufficient system RAM or GPU memory. To avoid this problem, ensure that both system RAM and GPU memory exceed 20 GB, or more if possible. If the issue still persists, please let us know for further assistance.
Thank you.
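To put a rough number on the memory requirement: taking the name's nominal 7 billion parameters (the actual gemma-7b parameter count is somewhat higher), each parameter costs 4 bytes in float32, so the weights alone need well over 20 GB before any activations or KV cache. A back-of-envelope sketch:

```python
def model_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Estimate memory for the model weights alone (excludes activations/KV cache)."""
    return num_params * bytes_per_param / 1024**3

# Nominal 7B parameters; the true gemma-7b count is somewhat higher.
PARAMS = 7e9
print(f"float32: {model_memory_gb(PARAMS, 4):.1f} GB")  # ~26.1 GB, full precision
print(f"float16: {model_memory_gb(PARAMS, 2):.1f} GB")  # ~13.0 GB, half precision
```

This is why loading in half precision, or picking a smaller checkpoint, is usually the first thing to try on a memory-constrained machine.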
Thank you. I don't have a GPU right now; I'm using the CPU, and my computer has 32 GB of RAM. I want to switch to a smaller model for debugging.
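For CPU-only debugging with 32 GB of RAM, the smaller sibling checkpoint google/gemma-2b (assuming your Hugging Face account has accepted the Gemma license, as for gemma-7b) should fit comfortably, especially in bfloat16 with `low_cpu_mem_usage=True` to avoid a second full copy of the weights during loading. A minimal sketch, not tested on your machine:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

def main():
    # google/gemma-2b: smaller sibling of gemma-7b; still gated behind the license.
    model_id = "google/gemma-2b"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # bfloat16 halves weight memory vs float32 and is better supported than
    # float16 on CPU; low_cpu_mem_usage streams weights in instead of
    # materializing a second full copy in RAM.
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,
        low_cpu_mem_usage=True,
    )
    inputs = tokenizer("Write me a poem about Machine Learning.", return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=50)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))

if __name__ == "__main__":
    main()
```

If this still stalls, watching Task Manager's memory graph while it loads will show whether you are hitting the RAM ceiling.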