zhuqihao
zqh11
Recent Activity
authored a paper about 1 month ago
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
liked a model about 1 month ago
deepseek-ai/DeepSeek-R1-Zero
authored a paper 6 months ago
DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search
zqh11's activity
Adding `safetensors` variant of this model
#24 opened 12 months ago by Calvinnncy97
inference_params
2
#12 opened about 1 year ago by DataSoul

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 196.00 MiB. GPU 0 has a total capacty of 79.11 GiB of which 29.56 MiB is free
2
#21 opened about 1 year ago by butujuzipi
Set global data for future chats
#17 opened about 1 year ago by Sasori7
[AUTOMATED] Model Memory Requirements
#18 opened about 1 year ago by model-sizer-bot
Fine tune the model with part of layers on GPU and rest on CPU
#11 opened over 1 year ago by vmirea
Update to deepseek-coder-7b-base-v1.5 in code
#1 opened about 1 year ago by bartowski

Context length
1
#13 opened about 1 year ago by Rohith1016
Do we need BOS token before each turn of chat during finetuning?
2
#9 opened about 1 year ago by Annorita
Wrong result when calling apply_chat_template with add_generation_prompt=False
1
#8 opened about 1 year ago by Annorita
[Community Submission] Model: deepseek-ai/deepseek-coder-6.7b-instruct, Username: zqh11
4
#43 opened over 1 year ago by zqh11

[Community Submission] Model: deepseek-ai/deepseek-coder-33b-instruct, Username: zqh11
1
#42 opened over 1 year ago by zqh11

[Community Submission] Model: deepseek-ai/deepseek-coder-1.3b-base, Username: zqh11
#33 opened over 1 year ago by zqh11

[Community Submission] Model: deepseek-ai/deepseek-coder-6.7b-base, Username: zqh11
#32 opened over 1 year ago by zqh11

[Community Submission] Model: deepseek-ai/deepseek-coder-33b-base, Username: zqh11
#31 opened over 1 year ago by zqh11

Confirming the EOS token? 32021 or 32014? Or both?
4
#1 opened over 1 year ago by TheBloke

Cannot submit through the button
1
#29 opened over 1 year ago by zqh11
