ExtractQueNumberMini Model

This model has been fine-tuned to quickly extract question numbers from OCRed handwritten text. Its compact size lets it run efficiently on a CPU.

Model Usage

To use this model, set the system prompt to the following:

Extract the question number from the given text. Your response should be just an integer representing the question number. Do not provide any explanation or context. Just the number.

Inference Code Example

from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "rahulvk007/ExtractQueNumberMini"
device = "cpu"  # change to "cuda" for GPU

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint).to(device)

alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{}

### Input:
{}

### Response:
{}"""

inputs = tokenizer(
    [
        alpaca_prompt.format(
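            "Extract the question number from the given text. Your response should be just an integer which is the question number. Do not provide any explanation or context. Just the number.",  # instruction
            "<Give OCR Text here>",  # input: replace with the OCR text to process
            "",  # response: left empty so the model completes it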
        )
    ],
    return_tensors="pt"
).to(device)

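# Generate the completion; the expected answer is a single integer.
outputs = model.generate(**inputs, max_new_tokens=64, use_cache=True)
# The decoded string contains the full prompt followed by the model's answer after "### Response:".
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))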
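
Post-processing Sketch

The decoded output above includes the whole prompt as well as the model's answer, so the number still has to be separated out. The following is a minimal sketch of one way to do that, assuming the answer appears after the final "### Response:" marker of the prompt template. The helper name extract_question_number is hypothetical and is not part of the model or of transformers.

```python
import re
from typing import Optional

# Marker taken from the Alpaca-style prompt template used for inference.
RESPONSE_MARKER = "### Response:"


def extract_question_number(decoded: str) -> Optional[int]:
    """Return the first integer after the last response marker, or None."""
    # Split on the final marker so that digits in the OCR input are ignored.
    _, found, answer = decoded.rpartition(RESPONSE_MARKER)
    if not found:
        # The marker is missing, so there is no answer section to parse.
        return None
    match = re.search(r"\d+", answer)
    return int(match.group()) if match else None
```

It could be applied to each string returned by tokenizer.batch_decode(outputs, skip_special_tokens=True). Returning None rather than raising lets the caller decide how to handle text in which no number was produced.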

Datasets

The model was fine-tuned on rahulvk007/quenumber_extraction_v2, a dataset curated specifically for this task.

