--- base_model: unsloth/SmolLM2-135M language: - en license: apache-2.0 tags: - text-generation-inference - transformers - unsloth - llama - trl - sft datasets: - rahulvk007/quenumber_extraction_v2 --- # ExtractQueNumberMini Model - **Developed by:** [rahulvk007](https://github.com/rahulvk007) ([rahulvk.com](https://www.rahulvk.com)) - **License:** [Apache-2.0](https://opensource.org/licenses/Apache-2.0) - **Base Model:** [unsloth/SmolLM2-135M](https://huggingface.co/unsloth/SmolLM2-135M) - **Finetuning**: Optimized with [Unsloth](https://github.com/unslothai/unsloth) and [Hugging Face's TRL library](https://github.com/huggingface/trl) This model has been fine-tuned for quick extraction of question numbers from OCRed handwritten text. It is designed to run efficiently on CPU due to its compact size. ### Model Usage To use this model, set the system prompt to the following: > **Extract the question number from the given text. Your response should be just an integer representing the question number. Do not provide any explanation or context. Just the number.** ### Inference Code Example ```python from transformers import AutoModelForCausalLM, AutoTokenizer checkpoint = "rahulvk007/ExtractQueNumberMini" device = "cpu" # change to "cuda" for GPU tokenizer = AutoTokenizer.from_pretrained(checkpoint) model = AutoModelForCausalLM.from_pretrained(checkpoint).to(device) alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request. ### Instruction: {} ### Input: {} ### Response: {}""" inputs = tokenizer( [ alpaca_prompt.format( "Extract the question number from the given text. Your response should be just an integer which is the question number. Do not provide any explanation or context. Just the number.", "", "", ) ], return_tensors="pt" ).to(device) outputs = model.generate(**inputs, max_new_tokens=64, use_cache=True) print(tokenizer.batch_decode(outputs, skip_special_tokens=True)) ``` ### Datasets The model was fine-tuned on [rahulvk007/quenumber_extraction_v2](https://huggingface.co/datasets/rahulvk007/quenumber_extraction_v2), specifically curated for this task. --- [](https://github.com/unslothai/unsloth)