---
metrics:
- accuracy
- code_eval
---

# Model Card for Evaluate360M

## Model Details

### Model Description

Evaluate360M is a lightweight large language model optimized for reasoning tasks. It is designed to run efficiently on low-end commercial hardware, such as mobile phones, while maintaining strong performance in logical reasoning and general-purpose applications.

- **Developed by:** [More Information Needed]
- **Funded by [optional]:** [More Information Needed]
- **Shared by [optional]:** [More Information Needed]
- **Model type:** Transformer-based decoder model
- **Language(s) (NLP):** English
- **License:** [More Information Needed]
- **Finetuned from model [optional]:** `HuggingFaceTB/SmolLM2-360M-Instruct`

### Model Sources

- **Repository:** [More Information Needed]
- **Paper [optional]:** [More Information Needed]
- **Demo [optional]:** [More Information Needed]

## Uses

### Direct Use

Evaluate360M is intended for general-purpose reasoning tasks and can be used in applications that require lightweight LLMs, such as:

- Mobile-based AI assistants
- Low-power embedded systems
- Edge computing applications

### Downstream Use

It can be further fine-tuned for specific domains, including code generation, summarization, or dialogue systems.

### Out-of-Scope Use

- Not optimized for handling very large context windows
- Not designed for generating high-fidelity creative text, such as poetry or fiction

## Bias, Risks, and Limitations

### Limitations

- Struggles with large context windows.
- Has not yet been evaluated for potential biases.

### Recommendations

Users should account for the model's limited context length and should evaluate its performance on their specific use case before deployment.

## How to Get Started with the Model

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "evaluate360m"  # model ID as given in this card; replace with the full Hub repository ID if published

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("What is the capital of France?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Training Details

### Training Data

- **Dataset:** `HuggingFaceH4/Bespoke-Stratos-17k`
- **Preprocessing:** Token packing enabled (`--packing`), sequence length up to 2048 tokens

### Training Procedure

- **Optimizer & Precision:**
  - `bf16` mixed precision
  - `gradient_accumulation_steps = 8`
  - Gradient checkpointing enabled
- **Hyperparameters:**
  - Learning rate: `2e-5`
  - Epochs: `3`
  - Batch size: `4` (per device, for both training and evaluation)
- **Evaluation & Saving:**
  - Evaluation every `500` steps
  - Model checkpoint saved every `1000` steps, keeping at most `2` checkpoints

A hedged sketch of this configuration is included in the appendix at the end of this card.

### Compute Infrastructure

- **Hardware Used:** A100 GPU
- **Training Time:** 6 hours

## Evaluation

- **Benchmarks:** No evaluation conducted yet.
- **Metrics:** Not available yet.

## Environmental Impact

- **Hardware Type:** A100 GPU
- **Hours Used:** 6 hours
- **Cloud Provider:** [More Information Needed]
- **Compute Region:** [More Information Needed]
- **Carbon Emitted:** [More Information Needed]

## Technical Specifications

### Model Architecture

- Similar to SmolLM2-360M
- Inspired by MobileLLM
- Uses **Grouped-Query Attention (GQA)**
- Prioritizes depth over width

The appendix shows how these choices can be read off the base model's configuration.

## Citation [optional]

**BibTeX:**

[More Information Needed]

**APA:**

[More Information Needed]

## More Information

[More Information Needed]

## Model Card Authors [optional]

[More Information Needed]

## Model Card Contact

[More Information Needed]
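
## Appendix: Fine-Tuning Sketch

The exact training script for Evaluate360M is not published. The snippet below is a minimal sketch of how the configuration listed under "Training Procedure" could be expressed with TRL's `SFTTrainer`. The output directory, the held-out evaluation split, and the dataset column handling are assumptions, and some argument names differ across TRL and `transformers` versions.

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Load the dataset named in this card and carve out a small eval split
# (an assumption; the split used for periodic evaluation is not documented).
dataset = load_dataset("HuggingFaceH4/Bespoke-Stratos-17k", split="train")
dataset = dataset.train_test_split(test_size=0.01, seed=42)

config = SFTConfig(
    output_dir="evaluate360m",            # hypothetical output path
    packing=True,                         # token packing (--packing)
    max_seq_length=2048,                  # renamed to `max_length` in newer TRL releases
    bf16=True,
    gradient_checkpointing=True,
    gradient_accumulation_steps=8,
    learning_rate=2e-5,
    num_train_epochs=3,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    eval_strategy="steps",                # `evaluation_strategy` in older transformers
    eval_steps=500,
    save_steps=1000,
    save_total_limit=2,
)

trainer = SFTTrainer(
    model="HuggingFaceTB/SmolLM2-360M-Instruct",  # base model listed in this card
    args=config,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
)
trainer.train()
```

With `per_device_train_batch_size=4` and `gradient_accumulation_steps=8`, each optimizer step covers 32 packed sequences of up to 2048 tokens per device.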
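
### Inspecting the Architecture

The architecture section states that the model follows SmolLM2-360M, uses grouped-query attention, and prioritizes depth over width. One way to see these choices, assuming the architecture matches the base model, is to read them from the base model's config (the Evaluate360M repository ID is not listed above, so `HuggingFaceTB/SmolLM2-360M-Instruct` is used here):

```python
from transformers import AutoConfig

# Inspect the base model's config; Evaluate360M is stated to share this architecture.
cfg = AutoConfig.from_pretrained("HuggingFaceTB/SmolLM2-360M-Instruct")
print("hidden layers   :", cfg.num_hidden_layers)    # depth
print("hidden size     :", cfg.hidden_size)          # width
print("query heads     :", cfg.num_attention_heads)
print("key/value heads :", cfg.num_key_value_heads)  # fewer than query heads => GQA
```

Grouped-query attention shows up as `num_key_value_heads` being smaller than `num_attention_heads`: several query heads share each key/value head, which shrinks the KV cache and helps on memory-constrained devices such as phones.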