Csaba Kecskemeti PRO
csabakecskemeti
AI & ML interests
None yet
Recent Activity
Updated a model 6 minutes ago: DevQuasar/deepseek-ai.DeepSeek-R1-Zero-GGUF
Posted an update about 3 hours ago:
I've run the Open LLM Leaderboard evaluations plus HellaSwag on https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B and compared the results to https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct. At first glance, R1 does not beat Llama overall.
If anyone wants to double-check, the results are posted here:
https://github.com/csabakecskemeti/lm_eval_results
Did I make a mistake somewhere, or is this model (at least this distilled version) simply not as good as, let alone better than, the competition?
I'll run the same evaluations on the Qwen 7B distilled version too.
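For anyone double-checking, a comparison like this comes down to per-task score differences between the two models. Here is a minimal sketch, assuming each model's scores have already been extracted into a plain task-to-score dict. The helper names (`compare_scores`, `count_wins`), the task names, and the numbers are all hypothetical placeholders, not values from the results repository.

```python
def compare_scores(model_a, model_b):
    """Return {task: score_a - score_b} for tasks present in both dicts."""
    shared_tasks = sorted(set(model_a) & set(model_b))
    return {task: model_a[task] - model_b[task] for task in shared_tasks}


def count_wins(deltas):
    """Count the tasks where the first model scored strictly higher."""
    return sum(1 for delta in deltas.values() if delta > 0)


# Placeholder scores for illustration only (NOT real evaluation results).
r1_distill = {"task_a": 0.50, "task_b": 0.25}
llama_instruct = {"task_a": 0.25, "task_b": 0.75, "task_c": 0.50}

deltas = compare_scores(r1_distill, llama_instruct)
for task, delta in deltas.items():
    print(f"{task}: {delta:+.2f}")
print(f"first model wins {count_wins(deltas)} of {len(deltas)} shared tasks")
```

Only tasks that appear in both dicts are compared, so a task that was run for one model but not the other (here `task_c`) does not skew the tally.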
Updated a model about 6 hours ago: DevQuasar/bespokelabs.Bespoke-Stratos-32B-GGUF
csabakecskemeti's activity
CUDA out of memory error during fp8 to bf16 model conversion + fix
#17 opened 26 days ago by sszymczyk (1 comment)
Is this tested?
#1 opened about 2 months ago by csabakecskemeti (5 comments)
Generate on V100 questions
#10 opened about 1 month ago by csabakecskemeti (5 comments)
Is this a LORA adapter?
#1 opened about 1 month ago by csabakecskemeti (2 comments)
New activity in DevQuasar/huihui-ai.Llama-3.3-70B-Instruct-abliterated-finetuned-GGUF, about 1 month ago
Checksum fails on /huihui-ai.Llama-3.3-70B-Instruct-abliterated-finetuned.Q4_K_M-00004-of-00004.gguf
#1 opened about 1 month ago by JoshGreifer (2 comments)
How the scores are calculated
#1028 opened about 2 months ago by csabakecskemeti (3 comments)
Phi3 or Mistral?
#3 opened 2 months ago by csabakecskemeti (2 comments)
[bot] Conversion to Parquet
#1 opened 3 months ago by parquet-converter
I think the Q8_0 is corrupted.
#1 opened 2 months ago by remghoost (2 comments)
Having issues running this
#1 opened 3 months ago by csabakecskemeti (2 comments)
Weight size VS VRAM requirements
#8 opened 3 months ago by mindkrypted (7 comments)
Update README.md
#1 opened 4 months ago by bbqddt
update pad_token?
#1 opened 3 months ago by gatorand (3 comments)
Possible non-GGUF release?
#1 opened 8 months ago by Azazelle (1 comment)