---
base_model: deepseek-ai/deepseek-coder-6.7b-instruct
library_name: peft
tags:
- text-generation
- causal-lm
- peft
- fine-tuned
---

# Model Card for Abhijith71/Abhijith

## Model Summary

This model is a fine-tuned version of `deepseek-ai/deepseek-coder-6.7b-instruct` using Parameter-Efficient Fine-Tuning (PEFT). It is optimized for **text generation tasks**, particularly **code generation and instruction-based responses**. The model is designed to assist in generating high-quality code snippets, explanations, and other natural language responses related to programming.

## Model Details

- **Developed by:** Abhijith71
- **Funded by:** Self-funded
- **Shared by:** Abhijith71
- **Model type:** Causal Language Model (CLM)
- **Language(s):** English
- **License:** Apache 2.0
- **Fine-tuned from:** deepseek-ai/deepseek-coder-6.7b-instruct

## Model Sources

- **Repository:** [Hugging Face Model Page](https://huggingface.co/Abhijith71/Abhijith)
- **Paper:** N/A
- **Demo:** N/A

## Uses

### Direct Use

- Code generation
- Instruction-based responses
- General text completion

### Downstream Use

- AI-assisted programming
- Documentation generation
- Code debugging assistance

### Out-of-Scope Use

- Not optimized for open-ended conversation beyond code-related tasks
- Not suitable for real-time chatbot applications without further tuning

## Bias, Risks, and Limitations

- The model may generate **biased** or **inaccurate** responses, reflecting its training data.
- Outputs should be **verified** before use in production.
- Knowledge is limited to the scope of the training data (no real-time updating capability).

### Recommendations

- Users should **review generated code** for accuracy and security.
- Additional **fine-tuning** may be required for specialized use cases.

## How to Use

### Loading the Model

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Abhijith71/Abhijith"
base_model = "deepseek-ai/deepseek-coder-6.7b-instruct"

# Load the base model and tokenizer, then apply the PEFT adapter on top
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model)
model = PeftModel.from_pretrained(model, model_name)

# Generate a response; max_new_tokens keeps the default length limit
# from truncating the output
inputs = tokenizer("Generate a Python function for factorial", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
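### Merging the Adapter (Optional)

For inference-only deployment, the adapter weights can be folded into the base model so that generation runs without the PEFT wrapper. This is a minimal sketch, assuming the adapter was trained with a mergeable method such as LoRA; the save path is illustrative:

```python
# Fold the adapter weights into the base model (works for LoRA-style adapters)
merged_model = model.merge_and_unload()

# Save a standalone checkpoint; it can then be loaded with
# AutoModelForCausalLM.from_pretrained alone, without peft
merged_model.save_pretrained("abhijith-coder-merged")  # illustrative path
tokenizer.save_pretrained("abhijith-coder-merged")
```

Merging produces a single deployable checkpoint; keep the unmerged adapter if you plan further fine-tuning.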
## Training Details

### Training Data

- Fine-tuned on a dataset of **programming-related instructions and code snippets**.

### Training Procedure

- **Preprocessing:** Tokenization and formatting of instruction-based prompts.
- **Training Regime:** Mixed-precision training (bf16) with PEFT for efficient fine-tuning.

## Evaluation

### Testing Data, Factors & Metrics

- **Testing Data:** Sampled from programming-related sources
- **Metrics:**
  - Perplexity (PPL)
  - Code quality assessment
  - Instruction-following accuracy

### Results

- The model generates **coherent** and **useful** code snippets for a variety of prompts.
- Some limitations remain in **edge cases** and complex multi-step reasoning.

## Environmental Impact

- **Hardware:** NVIDIA A100 GPUs
- **Training Duration:** Approximately 10-20 hours
- **Cloud Provider:** AWS
- **Carbon Emissions:** Estimated using the [ML Impact Calculator](https://mlco2.github.io/impact#compute)

## Technical Specifications

### Model Architecture

- Based on **deepseek-ai/deepseek-coder-6.7b-instruct**
- Fine-tuned with **PEFT**

### Compute Infrastructure

- **Hardware:** 4x NVIDIA A100 GPUs
- **Software:** PEFT 0.14.0, Transformers 4.34+

## Citation

```bibtex
@misc{abhijith2024,
  author    = {Abhijith71},
  title     = {Fine-tuned DeepSeek Coder Model},
  year      = {2024},
  publisher = {Hugging Face},
  journal   = {Hugging Face Model Hub},
  url       = {https://huggingface.co/Abhijith71/Abhijith}
}
```

## Contact

- **Author:** Abhijith71
- **Hugging Face Profile:** [https://huggingface.co/Abhijith71](https://huggingface.co/Abhijith71)

## Framework Versions

- **PEFT:** 0.14.0
- **Transformers:** 4.34+
- **Torch:** 2.0+

---

This model card provides an overview of the fine-tuned model, detailing its purpose, usage, and limitations. Users should review generated outputs and consider further fine-tuning for specific applications, as sketched below.
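As a starting point for such adaptation, here is a minimal sketch of continued PEFT fine-tuning in the bf16 regime described under Training Details. The LoRA hyperparameters, dataset file, and output path are illustrative assumptions, not the settings used to train this model:

```python
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base_model = "deepseek-ai/deepseek-coder-6.7b-instruct"
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model)

if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # ensure padding works during collation

# Illustrative LoRA settings; tune rank/alpha/target modules for your task
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Replace with your own instruction/code dataset (one "text" field per example)
dataset = load_dataset("json", data_files="train.jsonl")["train"]
dataset = dataset.map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=1024),
    remove_columns=dataset.column_names,
)

args = TrainingArguments(
    output_dir="abhijith-coder-ft",  # illustrative path
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    num_train_epochs=1,
    bf16=True,  # mixed-precision regime noted in Training Details
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("abhijith-coder-ft")  # saves only the adapter weights
```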