emredeveloper committed
Commit 0e5a729 · verified · 1 Parent(s): 97d04de

Update README.md

Files changed (1):
  1. README.md +88 -1
README.md CHANGED
@@ -6,4 +6,91 @@ base_model:
  - deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
  tags:
  - cot
- ---
+ ---
+ # Model Card for DeepSeek-R1-Distill-Qwen-1.5B-4bit
+
+ <!-- Provide a quick summary of what the model is/does. -->
+
+ This is a 4-bit quantized version of the `deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B` model, optimized for memory-efficient inference. The quantization was performed with the `bitsandbytes` library (NF4 with double quantization); a sketch of the export follows.
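+
+ How the export might have been produced: a minimal sketch, assuming the weights were quantized at load time and then serialized with `save_pretrained` (4-bit `bitsandbytes` checkpoints are serializable in recent `transformers`/`bitsandbytes` releases); the output path is illustrative.
+
+ ```python
+ import torch
+ from transformers import AutoModelForCausalLM, BitsAndBytesConfig
+
+ # NF4 with double quantization, matching the config in the getting-started section below.
+ bnb_config = BitsAndBytesConfig(
+     load_in_4bit=True,
+     bnb_4bit_quant_type="nf4",
+     bnb_4bit_compute_dtype=torch.bfloat16,
+     bnb_4bit_use_double_quant=True
+ )
+
+ # Quantize the base model on load, then write the 4-bit checkpoint to disk.
+ model = AutoModelForCausalLM.from_pretrained(
+     "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",
+     quantization_config=bnb_config,
+     device_map="auto"
+ )
+ model.save_pretrained("DeepSeek-R1-Distill-Qwen-1.5B-4bit")  # illustrative local path
+ ```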
+
+ ## Model Details
+
+ ### Model Description
+
+ - **Developed by:** [Your Name or Organization]
+ - **Funded by [optional]:** [Your Funding Source, if applicable]
+ - **Shared by:** [Your Name or Organization]
+ - **Model type:** Transformer-based language model
+ - **Language(s) (NLP):** English
+ - **License:** MIT
+ - **Quantized from model:** `deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B`
+
+ ### Model Sources [optional]
+
+ - **Repository:** [Link to your GitHub repository, if applicable]
+ - **Paper [optional]:** [Link to the paper, if applicable]
+ - **Demo [optional]:** [Link to a live demo, if applicable]
+
+ ## Uses
+
+ ### Direct Use
+
+ This model is intended for research and practical applications where memory efficiency is critical. It can be used for the following (a chat-style sketch follows the list):
+
+ - Text generation
+ - Language understanding tasks
+ - Chatbots and conversational AI
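+
+ For the conversational case, a minimal chat-style sketch, assuming a recent `transformers` whose text-generation pipeline accepts chat messages; the repo id is a placeholder:
+
+ ```python
+ from transformers import pipeline
+
+ # Placeholder repo id; substitute the actual Hub path of this checkpoint.
+ chat = pipeline(
+     "text-generation",
+     model="your-username/DeepSeek-R1-Distill-Qwen-1.5B-4bit",
+     device_map="auto"
+ )
+
+ messages = [{"role": "user", "content": "Explain 4-bit quantization in one sentence."}]
+ result = chat(messages, max_new_tokens=128)
+ # The pipeline returns the whole conversation; the last message is the model's reply.
+ print(result[0]["generated_text"][-1]["content"])
+ ```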
+
+ ### Downstream Use [optional]
+
+ This model can be fine-tuned for specific tasks such as the following (see the LoRA sketch after this list):
+
+ - Sentiment analysis
+ - Text classification
+ - Summarization
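+
+ Because the base weights are stored in 4-bit, parameter-efficient fine-tuning (QLoRA-style, via the `peft` library) is the usual route. A minimal sketch, assuming `peft` is installed; the repo id and LoRA hyperparameters are illustrative, not tuned:
+
+ ```python
+ import torch
+ from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
+ from transformers import AutoModelForCausalLM, BitsAndBytesConfig
+
+ # Placeholder repo id; substitute the actual Hub path of this checkpoint.
+ model = AutoModelForCausalLM.from_pretrained(
+     "your-username/DeepSeek-R1-Distill-Qwen-1.5B-4bit",
+     quantization_config=BitsAndBytesConfig(
+         load_in_4bit=True,
+         bnb_4bit_quant_type="nf4",
+         bnb_4bit_compute_dtype=torch.bfloat16
+     ),
+     device_map="auto"
+ )
+
+ model = prepare_model_for_kbit_training(model)  # stabilizes k-bit training (e.g. casts norms to fp32)
+ lora_config = LoraConfig(
+     r=16,
+     lora_alpha=32,
+     lora_dropout=0.05,
+     task_type="CAUSAL_LM",
+     target_modules=["q_proj", "k_proj", "v_proj", "o_proj"]  # attention projections in the Qwen2 architecture
+ )
+ model = get_peft_model(model, lora_config)
+ model.print_trainable_parameters()  # only the LoRA adapters train; the 4-bit base stays frozen
+ ```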
+
+ ### Out-of-Scope Use
+
+ This model is not suitable for:
+
+ - High-precision tasks requiring full 16-bit or 32-bit precision
+ - Applications requiring extremely low latency
+
+ ## Bias, Risks, and Limitations
+
+ The model may inherit biases present in the training data. Users should be cautious when deploying it in sensitive applications.
+
+ ### Recommendations
+
+ Users should evaluate the model's performance on their specific tasks and datasets before deployment, and consider fine-tuning for better alignment with their use case.
+
+ ## How to Get Started with the Model
+
+ Use the code below to get started with the model:
+
+ ```python
+ import torch
+ from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
+
+ # Quantization configuration: NF4 with double quantization, bfloat16 compute
+ quantization_config = BitsAndBytesConfig(
+     load_in_4bit=True,
+     bnb_4bit_quant_type="nf4",
+     bnb_4bit_compute_dtype=torch.bfloat16,
+     bnb_4bit_use_double_quant=True
+ )
+
+ # Load the model and tokenizer
+ tokenizer = AutoTokenizer.from_pretrained("your-username/DeepSeek-R1-Distill-Qwen-1.5B-4bit", trust_remote_code=True)
+ model = AutoModelForCausalLM.from_pretrained(
+     "your-username/DeepSeek-R1-Distill-Qwen-1.5B-4bit",
+     quantization_config=quantization_config,
+     device_map="auto",
+     trust_remote_code=True
+ )
+
+ # Generate text
+ input_text = "Hello, how are you?"
+ inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
+ outputs = model.generate(**inputs, max_new_tokens=50)
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+ ```
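+
+ To sanity-check the savings, `get_memory_footprint()` on the loaded model reports its in-memory size; for a ~1.5B-parameter model in NF4 this should land well under 2 GB, roughly a quarter of the 16-bit footprint plus some quantization overhead.
+
+ ```python
+ # Rough check of the quantized footprint, reusing `model` from the snippet above.
+ print(f"Memory footprint: {model.get_memory_footprint() / 1e9:.2f} GB")
+ ```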