Uploaded model

  • Developed by: beyoru
  • License: apache-2.0

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "beyoru/MCQ-o1-512"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

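# The system prompt is kept in Vietnamese. In English it reads:
# "You are an intelligent assistant capable of generating a
# multiple-choice question from any context".
messages = [
    {"role": "system", "content": "Bạn là một trợ lý thông minh có khả năng tạo ra một câu hỏi trắc nghiệm từ bất kỳ ngữ cảnh"},
    {"role": "user", "content": "<YOUR CONTEXT>"}
]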
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

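generated_ids = model.generate(
    **model_inputs,
    do_sample=True,
    max_new_tokens=512  # illustrative cap on output length; adjust as needed
)
# Drop the prompt tokens so that only the newly generated tokens are decoded.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)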

Notes:

  • When the dataset is small and narrow in content, the model already performs well in the target domain, and it should not forget its existing knowledge, it is enough to focus on o.
  • Fine-tuned with LoRA: rank = 1, alpha = 512, 1 epoch, linear (optim).
  • DoRA
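
The settings in these notes can be written down as a small sketch. The keyword names below follow `peft.LoraConfig` (`r`, `lora_alpha`, `use_dora`, `target_modules`) and the `transformers` training arguments (`num_train_epochs`, `lr_scheduler_type`). Two points are assumptions rather than facts from this card: that "o" refers to the attention output projection `o_proj`, and that "linear" refers to the learning-rate scheduler. The sketch also assumes the standard LoRA scaling of alpha divided by rank.

```python
# Hyperparameters from the notes, using peft.LoraConfig keyword names.
lora_settings = {
    "r": 1,                        # LoRA rank
    "lora_alpha": 512,             # LoRA alpha
    "use_dora": True,              # DoRA enabled
    "target_modules": ["o_proj"],  # ASSUMPTION: "o" means the o_proj layers
}

# ASSUMPTION: "linear" is read as the learning-rate scheduler type.
training_settings = {"num_train_epochs": 1, "lr_scheduler_type": "linear"}


def lora_scale(r: int, lora_alpha: int) -> float:
    """Standard LoRA scaling applied to the low-rank update: alpha / r."""
    return lora_alpha / r


scale = lora_scale(lora_settings["r"], lora_settings["lora_alpha"])
print(f"LoRA scaling factor: {scale}")
```

With rank 1 and alpha 512, the low-rank update is multiplied by a large factor, so a single rank-1 direction per adapted layer can still shift the output noticeably. If `peft` is installed, the dictionary could be passed as `LoraConfig(**lora_settings)`.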

Improvement

  • Increasing the rank may help the model produce a more robust output structure.
  • Try more efficient fine-tuning methods.
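
To get a feel for what a higher rank costs, the sketch below counts adapter parameters for a few ranks. It is only an estimate under stated assumptions: the adapted layers are square projections with a hidden size of 2048 across 36 layers (values to verify against the base model's `config.json`), one projection is adapted per layer, and DoRA adds one magnitude value per output feature. The function name `adapter_params` is hypothetical.

```python
def adapter_params(rank: int, d_in: int, d_out: int, n_layers: int, dora: bool = True) -> int:
    """Estimate trainable adapter parameters for one adapted projection per layer."""
    per_module = rank * (d_in + d_out)  # LoRA matrices A (rank x d_in) and B (d_out x rank)
    if dora:
        per_module += d_out             # DoRA magnitude vector (assumed one value per output feature)
    return per_module * n_layers


HIDDEN = 2048  # ASSUMPTION: hidden size of the base model
LAYERS = 36    # ASSUMPTION: number of transformer layers
ALPHA = 512    # alpha from the notes, held fixed here

for rank in (1, 4, 16):
    count = adapter_params(rank, HIDDEN, HIDDEN, LAYERS)
    print(f"rank={rank:>2}  params={count:>9,}  alpha/rank={ALPHA / rank}")
```

Note that with alpha held at 512, the scaling factor alpha/rank shrinks as the rank grows, so alpha may need to be revisited when the rank is increased.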

Model tree for beyoru/MCQ-o1-512

  • Base model: Qwen/Qwen2.5-3B
  • This model: fine-tuned from the base model
