---
base_model: deepseek-ai/deepseek-coder-6.7b-instruct
library_name: peft
tags:
- text-generation
- causal-lm
- peft
- fine-tuned
---

# Model Card for Abhijith71/Abhijith

## Model Summary

This model is a fine-tuned version of `deepseek-ai/deepseek-coder-6.7b-instruct` using Parameter-Efficient Fine-Tuning (PEFT). It is optimized for **text generation tasks**, particularly **code generation and instruction-based responses**. The model is designed to assist in generating high-quality code snippets, explanations, and other natural language responses related to programming.

## Model Details

- **Developed by:** Abhijith71
- **Funded by:** Self-funded
- **Shared by:** Abhijith71
- **Model type:** Causal Language Model (CLM)
- **Language(s):** English
- **License:** Apache 2.0
- **Fine-tuned from:** deepseek-ai/deepseek-coder-6.7b-instruct

## Model Sources

- **Repository:** [Hugging Face Model Page](https://huggingface.co/Abhijith71/Abhijith)
- **Paper:** N/A
- **Demo:** N/A

## Uses

### Direct Use

- Code generation
- Instruction-based responses
- General text completion

### Downstream Use

- AI-assisted programming
- Documentation generation
- Code debugging assistance

### Out-of-Scope Use

- Not optimized for open-ended conversation beyond code-related tasks
- Not suitable for real-time chatbot applications without further tuning

## Bias, Risks, and Limitations

- The model may generate **biased** or **inaccurate** responses, reflecting its training data.
- Outputs should be **verified** before use in production.
- Knowledge is limited to the scope of the training data (no real-time updating capability).

### Recommendations

- Users should **review generated code** for accuracy and security.
- Additional **fine-tuning** may be required for specialized use cases.

## How to Use

### Loading the Model

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Abhijith71/Abhijith"
base_model = "deepseek-ai/deepseek-coder-6.7b-instruct"

# Load the base model and tokenizer, then apply the PEFT adapter on top
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model)
model = PeftModel.from_pretrained(model, model_name)

# Generate a response; max_new_tokens keeps the default length limit
# from truncating the output
inputs = tokenizer("Generate a Python function for factorial", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
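### Merging the Adapter (Optional)

For inference-only deployment, the adapter weights can be folded into the base model so that generation runs without the PEFT wrapper. This is a minimal sketch, assuming the adapter was trained with a mergeable method such as LoRA; the save path is illustrative:

```python
# Fold the adapter weights into the base model (works for LoRA-style adapters)
merged_model = model.merge_and_unload()

# Save a standalone checkpoint; it can then be loaded with
# AutoModelForCausalLM.from_pretrained alone, without peft
merged_model.save_pretrained("abhijith-coder-merged")  # illustrative path
tokenizer.save_pretrained("abhijith-coder-merged")
```

Merging produces a single deployable checkpoint; keep the unmerged adapter if you plan further fine-tuning.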
## Training Details

### Training Data

- Fine-tuned on a dataset of **programming-related instructions and code snippets**.

### Training Procedure

- **Preprocessing:** Tokenization and formatting of instruction-based prompts.
- **Training Regime:** Mixed-precision training (bf16) with PEFT for efficient fine-tuning.

## Evaluation

### Testing Data, Factors & Metrics

- **Testing Data:** Sampled from programming-related sources
- **Metrics:**
  - Perplexity (PPL)
  - Code quality assessment
  - Instruction-following accuracy

### Results

- The model generates **coherent** and **useful** code snippets for a variety of prompts.
- Some limitations remain in **edge cases** and complex multi-step reasoning.

## Environmental Impact

- **Hardware:** NVIDIA A100 GPUs
- **Training Duration:** Approximately 10-20 hours
- **Cloud Provider:** AWS
- **Carbon Emissions:** Estimated using the [ML Impact Calculator](https://mlco2.github.io/impact#compute)

## Technical Specifications

### Model Architecture

- Based on **deepseek-ai/deepseek-coder-6.7b-instruct**
- Fine-tuned with **PEFT**

### Compute Infrastructure

- **Hardware:** 4x NVIDIA A100 GPUs
- **Software:** PEFT 0.14.0, Transformers 4.34+

## Citation

```bibtex
@misc{abhijith2024,
  author    = {Abhijith71},
  title     = {Fine-tuned DeepSeek Coder Model},
  year      = {2024},
  publisher = {Hugging Face},
  journal   = {Hugging Face Model Hub},
  url       = {https://huggingface.co/Abhijith71/Abhijith}
}
```

## Contact

- **Author:** Abhijith71
- **Hugging Face Profile:** [https://huggingface.co/Abhijith71](https://huggingface.co/Abhijith71)

## Framework Versions

- **PEFT:** 0.14.0
- **Transformers:** 4.34+
- **Torch:** 2.0+

---

This model card provides an overview of the fine-tuned model, detailing its purpose, usage, and limitations. Users should review generated outputs and consider further fine-tuning for specific applications, as sketched below.
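As a starting point for such adaptation, here is a minimal sketch of continued PEFT fine-tuning in the bf16 regime described under Training Details. The LoRA hyperparameters, dataset file, and output path are illustrative assumptions, not the settings used to train this model:

```python
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base_model = "deepseek-ai/deepseek-coder-6.7b-instruct"
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model)

if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # ensure padding works during collation

# Illustrative LoRA settings; tune rank/alpha/target modules for your task
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Replace with your own instruction/code dataset (one "text" field per example)
dataset = load_dataset("json", data_files="train.jsonl")["train"]
dataset = dataset.map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=1024),
    remove_columns=dataset.column_names,
)

args = TrainingArguments(
    output_dir="abhijith-coder-ft",  # illustrative path
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    num_train_epochs=1,
    bf16=True,  # mixed-precision regime noted in Training Details
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("abhijith-coder-ft")  # saves only the adapter weights
```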