--- base_model: - prithivMLmods/Llama-3.1-8B-Open-SFT tags: - text-generation-inference - transformers - unsloth - Llama3 - trl - COT - Reasoning license: apache-2.0 language: - en datasets: - Daemontatox/LongCOT-Reason metrics: - accuracy - character - competition_math - code_eval library_name: transformers pipeline_tag: text-generation --- ![image](./image.webp) # AetherDrake-SFT - **Developed by:** Daemontatox - **License:** Apache 2.0 - **Finetuned Using:** [Unsloth](https://github.com/unslothai/unsloth), Hugging Face Transformers, and TRL Library ## Model Overview The **AetherDrake-SFT Model** is an advanced AI system optimized for logical reasoning, multi-step problem-solving, and decision-making tasks. Designed with efficiency and accuracy in mind, it employs a structured system prompt to ensure high-quality answers through a transparent and iterative thought process. ### System Prompt and Workflow This model operates using an innovative reasoning framework structured around the following steps: 1. **Initial Thought:** The model uses `` tags to reason step-by-step and craft its best possible response. Example: 2. **Self-Critique:** It evaluates its initial response within `` tags, focusing on: - **Accuracy:** Is it factually correct and verifiable? - **Clarity:** Is it clear and free of ambiguity? - **Completeness:** Does it fully address the request? - **Improvement:** What can be enhanced? Example: 3. **Revision:** Based on the critique, the model refines its response within `` tags. Example: 4. **Final Response:** The revised response is presented clearly within `` tags. Example: 5. **Tag Innovation:** When needed, the model creates and defines new tags for better structuring or clarity, ensuring consistent usage. Example: ### Key Features - **Structured Reasoning:** Transparent, multi-step approach for generating and refining answers. - **Self-Improvement:** Built-in critique and revision ensure continuous response enhancement. - **Clarity and Adaptability:** Tagging system provides organized, adaptable responses tailored to user needs. - **Creative Flexibility:** Supports dynamic problem-solving with the ability to introduce new tags and concepts. --- ## Use Cases The model is designed for various domains, including: 1. **Research and Analysis:** Extracting insights and providing structured explanations. 2. **Education:** Assisting with tutoring by breaking down complex problems step-by-step. 3. **Problem-Solving:** Offering logical and actionable solutions for multi-step challenges. 4. **Content Generation:** Producing clear, well-organized creative or professional content. --- ## Training Details - **Frameworks:** - [Unsloth](https://github.com/unslothai/unsloth) for accelerated training. - Hugging Face Transformers and the TRL library for reinforcement learning with human feedback (RLHF). - **Dataset:** Finetuned on diverse reasoning-focused tasks, including logical puzzles, mathematical problems, and commonsense reasoning scenarios. - **Hardware Efficiency:** - Trained with bnb-4bit precision for reduced memory usage. - Optimized training pipeline achieving 2x faster development cycles. --- ## Performance Metrics The model excels in reasoning benchmarks: - **ARC (AI2 Reasoning Challenge):** High accuracy in logical and commonsense tasks. - **GSM8K (Math Reasoning):** Superior results in multi-step problem-solving. - **CommonsenseQA:** Strong comprehension of everyday reasoning tasks. --- ## Ethical Considerations - **Transparency:** Responses are structured for verifiability through tagging. - **Bias Mitigation:** Includes self-critique to minimize biases and ensure fairness. - **Safe Deployment:** Users are encouraged to evaluate outputs to prevent harm or misinformation. --- ## License This model is distributed under the Apache 2.0 license, allowing users to use, modify, and share it in compliance with the license terms. --- ## Acknowledgments Special thanks to: - [Unsloth](https://github.com/unslothai/unsloth) for accelerated training workflows. - Hugging Face for their powerful tools and libraries. --- Experience the **AetherDrake-SFT l**, leveraging its structured reasoning and self-improvement capabilities for any task requiring advanced AI reasoning.