Demo script hangs on Windows and never finishes
Dear All
Demo.py file
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, AutoConfig
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using {device} for inference")
config = AutoConfig.from_pretrained("google/gemma-7b")
config.hidden_activation = "gelu_pytorch_tanh"
print(config)
tokenizer = AutoTokenizer.from_pretrained("google/gemma-7b")
print(tokenizer)
print(tokenizer.decode(tokenizer.encode("Hello, world!")))
try:
    model = AutoModelForCausalLM.from_pretrained("google/gemma-7b", config=config).to(device)
except Exception as e:
    print(f"Error: {e}")
    # Fall back to loading with the default config.
    try:
        model = AutoModelForCausalLM.from_pretrained("google/gemma-7b").to(device)
    except Exception as e:
        print(f"Error: {e}")
        model = None
print(model)
if model is not None:
    input_text = "Write me a poem about Machine Learning."
    print(input_text)
    input_ids = tokenizer(input_text, return_tensors="pt").to(device)
    outputs = model.generate(**input_ids)
    print(tokenizer.decode(outputs[0]))
    print(outputs)
When I run it on Windows, it gives me no output. The program silently stops at
model = AutoModelForCausalLM.from_pretrained("google/gemma-7b", config=config).to(device)
without printing any error. Please help me. Thank you.
Hi @xxl4 ,
The issue most likely occurs while loading the model, due to insufficient system RAM or GPU memory. To avoid this problem, ensure that both system RAM and GPU memory exceed 20 GB, or more if possible. If the issue still persists, please let us know for further assistance.
Thank you.
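To put a rough number on the memory requirement: taking the name's nominal 7 billion parameters (the actual gemma-7b parameter count is somewhat higher), each parameter costs 4 bytes in float32, so the weights alone need well over 20 GB before any activations or KV cache. A back-of-envelope sketch:

```python
def model_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Estimate memory for the model weights alone (excludes activations/KV cache)."""
    return num_params * bytes_per_param / 1024**3

# Nominal 7B parameters; the true gemma-7b count is somewhat higher.
PARAMS = 7e9
print(f"float32: {model_memory_gb(PARAMS, 4):.1f} GB")  # ~26.1 GB, full precision
print(f"float16: {model_memory_gb(PARAMS, 2):.1f} GB")  # ~13.0 GB, half precision
```

This is why loading in half precision, or picking a smaller checkpoint, is usually the first thing to try on a memory-constrained machine.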
Thank you. I don't have a GPU right now; I'm using the CPU, and my computer has 32 GB of RAM. I want to switch to a smaller model for debugging.
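For CPU-only debugging with 32 GB of RAM, the smaller sibling checkpoint google/gemma-2b (assuming your Hugging Face account has accepted the Gemma license, as for gemma-7b) should fit comfortably, especially in bfloat16 with `low_cpu_mem_usage=True` to avoid a second full copy of the weights during loading. A minimal sketch, not tested on your machine:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

def main():
    # google/gemma-2b: smaller sibling of gemma-7b; still gated behind the license.
    model_id = "google/gemma-2b"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # bfloat16 halves weight memory vs float32 and is better supported than
    # float16 on CPU; low_cpu_mem_usage streams weights in instead of
    # materializing a second full copy in RAM.
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,
        low_cpu_mem_usage=True,
    )
    inputs = tokenizer("Write me a poem about Machine Learning.", return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=50)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))

if __name__ == "__main__":
    main()
```

If this still stalls, watching Task Manager's memory graph while it loads will show whether you are hitting the RAM ceiling.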