---
license: other
license_name: qwen-research
license_link: >-
  https://raw.githubusercontent.com/QwenLM/Qwen/refs/heads/main/Tongyi%20Qianwen%20RESEARCH%20LICENSE%20AGREEMENT
---

# Chirp-3b

## Overview

Chirp-3b is a high-performing 3B-parameter language model crafted by the Ozone Research team. Fine-tuned from Qwen2.5 3B Instruct on 50 million tokens of data distilled from GPT-4o, this compact model delivers strong results for its size, outperforming its base model on benchmarks like MMLU Pro and IFEval. Chirp-3b is an open-source effort to push the limits of what small-scale LLMs can achieve, making it a valuable tool for researchers and enthusiasts alike.

## Key Features

- **Parameters**: 3 billion
- **Training Data**: 50M tokens distilled from GPT-4o
- **Fine-Tuned From**: Qwen2.5 3B Instruct
- **License**: Tongyi Qianwen Research License (see the license link above)

## Benchmarks

Chirp-3b performs well on rigorous evaluation datasets, showcasing its strength for a 3B model.

### MMLU Pro

| Subject             | Average Accuracy |
|---------------------|------------------|
| Biology             | 0.6234           |
| Business            | 0.5032           |
| Chemistry           | 0.3701           |
| Computer Science    | 0.4268           |
| Economics           | 0.5284           |
| Engineering         | 0.3013           |
| Health              | 0.3900           |
| History             | 0.3885           |
| Law                 | 0.2252           |
| Math                | 0.5736           |
| Other               | 0.4145           |
| Philosophy          | 0.3687           |
| Physics             | 0.3995           |
| Psychology          | 0.5589           |
| **Overall Average** | **0.4320**       |

- **Improvement**: 9 points above the base model's overall average.

### IFEval

- **Score**: 72%
- **Improvement**: 14% better than the base model.

More benchmarks are in the works and will be shared soon!

## Download

Access Chirp-3b here: https://huggingface.co/ozone-research/Chirp-01

## Usage

### Requirements

- **GPU**: 8 GB VRAM minimum recommended

### Example

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "ozone-research/Chirp-01"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

input_text = "What's the future of AI?"
inputs = tokenizer(input_text, return_tensors="pt")
# max_new_tokens bounds the generated continuation; max_length would count
# the prompt tokens as well and could truncate the response.
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
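### Chat Example

Because Chirp-3b is fine-tuned from Qwen2.5 3B Instruct, it presumably inherits the base model's chat template. The sketch below is a minimal example under that assumption; `torch_dtype=torch.float16` and `device_map="auto"` (which requires the `accelerate` package) are optional choices to help fit the 8 GB VRAM target, not settings specified by the model authors.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "ozone-research/Chirp-01"
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Half precision plus automatic device placement helps stay within ~8 GB of VRAM.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",  # requires the `accelerate` package
)

# Format the conversation with the chat template (assumed to come from Qwen2.5 Instruct).
messages = [{"role": "user", "content": "What's the future of AI?"}]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```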
## Future Work

The Ozone Research team is exploring additional models, including 2B and larger variants. Keep an eye out for upcoming releases!

## Feedback

We're eager for your input! Try Chirp-3b and let us know your thoughts, use cases, or ideas for improvement. Open an issue here or contact us via [contact method—update as needed].

## Acknowledgments

A big thanks to the open-source community for driving projects like this forward. Chirp-3b is our contribution to making AI research more accessible.