---
license: mit
datasets:
- mengmeong/coding-skill-real-world-needs
language:
- en
base_model: nisten/Biggie-SmoLlm-0.15B-Base
pipeline_tag: text-generation
inference:
  parameters:
    model_file: meng-coding-skill.gguf
    temperature: 1
---

# Programming Skills Learning Path Model

This model is a fine-tuned version of the base model, designed to generate a learning path for a skill based on input text. It's particularly useful for identifying emerging trends and skill combinations in the rapidly evolving tech landscape.

## Usage & Limitations

![llama.cpp demo](meng-cli.gif)

The model is intended for:
- Deployment on limited CPU resources, averaging about 40 tokens per second (tps) on 1 CPU core

The model has limits:
- The dataset might not capture the very latest tool developments in the programming world
- The model is not suited to chatbot use cases
- The model only returns its response as a JSON list

Please note that this model was trained on a custom dataset and may reflect biases present in that data.

### Training Hyperparameters

- **Batch Size:** 4
- **Optimizer:** Experimental GrokAdamW

## Training Metrics

![Eval Loss](eval_loss.png)
![Eval Runtime](eval_runtime.png)
![Eval Samples per Second](eval_sample_per_secs.png)
![Eval Steps per Second](eval_sps.png)
![Loss on Train](train_loss.png)
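
### Parsing the JSON Output

Because the model returns its response only as a JSON list, downstream code will usually need to pull that list out of the raw generated text. The following is a minimal sketch using only the Python standard library. The helper name `parse_skill_path` and the sample string are hypothetical and for illustration only; the shape of the list items (plain strings here) is an assumption, not something this card specifies.

```python
import json


def parse_skill_path(raw_output: str) -> list:
    """Extract and parse the first JSON list found in raw generated text.

    Hypothetical helper: assumes the list starts at the first '[' character.
    """
    start = raw_output.find("[")
    if start == -1:
        raise ValueError("no JSON list found in model output")
    # raw_decode parses a single JSON value beginning at `start`
    # and ignores any trailing text after it (e.g. end-of-sequence tokens).
    value, _end = json.JSONDecoder().raw_decode(raw_output, start)
    if not isinstance(value, list):
        raise ValueError("expected a JSON list")
    return value


# Made-up sample string, not real model output.
sample = 'Output: ["Python basics", "pip and virtualenv", "Flask"] </s>'
print(parse_skill_path(sample))
```

Malformed JSON raises `json.JSONDecodeError`, which is a subclass of `ValueError`, so a single `except ValueError` covers both the missing-list and the invalid-list case.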