Machine Learning Finetune of Qwen2.5-0.5B-Instruct

Using the arXiv dataset (2025-02-21) from Kaggle, we filtered all cs.LG (Machine Learning) abstracts newer than 2023 and fully fine-tuned Qwen2.5-0.5B-Instruct to write abstracts given a title. Training ran for four epochs and took 80 minutes on an RTX 4090, using 6 GB of RAM with the Liger kernel.
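The filtering step above can be sketched as follows. This is a minimal illustration, not the exact script used: the field names (`categories`, `update_date`, `title`, `abstract`) match the Kaggle arXiv metadata snapshot, but the file name in the comment and the helper function are assumptions.

```python
import json  # used when streaming the real JSON Lines snapshot (see comment below)

def filter_cs_lg(records, cutoff="2024-01-01"):
    """Keep cs.LG records updated after 2023, reduced to title + abstract."""
    kept = []
    for rec in records:
        # categories is a space-separated string, e.g. "cs.LG stat.ML"
        if "cs.LG" in rec.get("categories", "").split():
            # ISO dates compare correctly as strings
            if rec.get("update_date", "") >= cutoff:
                kept.append({"title": rec["title"], "abstract": rec["abstract"]})
    return kept

# The Kaggle snapshot ships as JSON Lines; in practice you would stream it:
# with open("arxiv-metadata-oai-snapshot.json") as f:
#     records = (json.loads(line) for line in f)

# Tiny in-memory demo:
sample = [
    {"categories": "cs.LG stat.ML", "update_date": "2024-06-01",
     "title": "A", "abstract": "..."},
    {"categories": "cs.CV", "update_date": "2024-06-01",
     "title": "B", "abstract": "..."},
    {"categories": "cs.LG", "update_date": "2022-01-01",
     "title": "C", "abstract": "..."},
]
print([r["title"] for r in filter_cs_lg(sample)])  # → ['A']
```

Each surviving title/abstract pair then becomes one training example, with the title in the user turn and the abstract as the target completion.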

System prompt: You are an educated researcher and always answer in correct scientific terms. You are very deep into AI and its methodologies. You are very creative.

User prompt: Write an abstract with the title 'XXX'
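The two prompts above can be assembled into Qwen2.5's ChatML format as sketched below. Normally `tokenizer.apply_chat_template(...)` from the transformers library does this for you; the literal template string and the `build_prompt` helper here are illustrative assumptions, shown so the turn structure is explicit.

```python
SYSTEM = ("You are an educated researcher and always answer in correct "
          "scientific terms. You are very deep into AI and its methodologies. "
          "You are very creative.")

def build_prompt(title):
    """Assemble the ChatML-style prompt used by Qwen2.5 chat models.
    Sketch only: in practice, pass the messages list to
    tokenizer.apply_chat_template(..., add_generation_prompt=True)."""
    user = f"Write an abstract with the title '{title}'"
    return (f"<|im_start|>system\n{SYSTEM}<|im_end|>\n"
            f"<|im_start|>user\n{user}<|im_end|>\n"
            f"<|im_start|>assistant\n")

# The trailing assistant header cues the model to generate the abstract.
print(build_prompt("Sparse Mixture-of-Experts for Tabular Data"))
```

Feeding the resulting string to the fine-tuned model (e.g. via `model.generate`) yields the invented abstract as the assistant turn.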

Of course, abstracts generated this way are not scientifically correct, but they might give you some creative ideas for what to research.

Enjoy!

Model size: 494M params
Architecture: qwen2
Format: GGUF, with 4-bit quantization available