yujiepan/hymba-tiny-random

	---
	library_name: transformers
	pipeline_tag: text-generation
	inference: true
	widget:
	  - text: Hello!
	    example_title: Hello world
	    group: Python
	---

	This model is intended for debugging. It is randomly initialized using the config from [nvidia/Hymba-1.5B-Instruct](https://huggingface.co/nvidia/Hymba-1.5B-Instruct), with the hidden size, layer count, and head counts reduced so that the model is much smaller.

	Code used to create this model:
	```python
	from huggingface_hub import create_repo, upload_folder
	import os

	import torch
	from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer, GenerationConfig, set_seed

	model_id = "nvidia/Hymba-1.5B-Instruct"
	repo_id = "yujiepan/hymba-tiny-random"
	save_path = f"/tmp/{repo_id}"

	config = AutoConfig.from_pretrained(model_id, trust_remote_code=True)
	config.conv_dim = {str(i): 32 for i in range(3)}
	config.hidden_size = 16
	config.intermediate_size = 32
	config.num_attention_heads = 2
	config.num_key_value_heads = 1
	config.v_head_dim = 8
	config.num_hidden_layers = 3

	tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
	tokenizer.save_pretrained(save_path)

	model = AutoModelForCausalLM.from_config(
	    config, torch_dtype=torch.bfloat16, trust_remote_code=True,
	)
	model.generation_config = GenerationConfig.from_pretrained(
	    model_id, trust_remote_code=True)

	set_seed(42)
	with torch.no_grad():
	    for _, p in sorted(model.named_parameters()):
	        torch.nn.init.uniform_(p, -0.2, 0.2)

	model.save_pretrained(save_path)

	prompt = 'Hello!'
	messages = [
	    {"role": "system", "content": "You are a helpful assistant."}
	]
	messages.append({"role": "user", "content": prompt})
	tokenized_chat = tokenizer.apply_chat_template(
	    messages, tokenize=True, add_generation_prompt=True, return_tensors="pt").to('cuda')
	outputs = model.cuda().generate(
	    tokenized_chat,
	    max_new_tokens=16,
	    do_sample=False,  # greedy decoding, so no sampling temperature is needed
	    use_cache=True,
	)
	input_length = tokenized_chat.shape[1]
	response = tokenizer.decode(
	    outputs[0][input_length:], skip_special_tokens=True)
	print(f"Model response: {response}")

	os.system(f"ls -alh {save_path}")
	create_repo(repo_id, exist_ok=True)
	upload_folder(repo_id=repo_id, folder_path=save_path)
	```
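
To use the uploaded checkpoint in a test, something like the following sketch may work. It mirrors the generation step of the script above. The helper names `build_messages` and `generate_reply` are hypothetical and not part of any library. The sketch assumes that the repo still ships the tokenizer's chat template and that loading requires `trust_remote_code=True`, as it did when the model was created.

```python
def build_messages(prompt, system_prompt="You are a helpful assistant."):
    """Build chat messages in the same shape the creation script uses."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": prompt},
    ]


def generate_reply(prompt, repo_id="yujiepan/hymba-tiny-random", max_new_tokens=16):
    """Load the tiny checkpoint from the Hub and greedily decode a short reply."""
    # Imported lazily so build_messages stays usable without these packages.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: the Hymba remote code may need a GPU; the creation script used CUDA.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        repo_id, torch_dtype=torch.bfloat16, trust_remote_code=True
    ).to(device)

    input_ids = tokenizer.apply_chat_template(
        build_messages(prompt),
        tokenize=True,
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(device)
    outputs = model.generate(input_ids, max_new_tokens=max_new_tokens, do_sample=False)
    # Drop the prompt tokens so only the newly generated text is decoded.
    return tokenizer.decode(outputs[0][input_ids.shape[1]:], skip_special_tokens=True)
```

Calling `generate_reply("Hello!")` downloads the checkpoint and returns a short string. Because the weights are random, the text is expected to be meaningless; the call is only useful for checking that loading and generation run end to end.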