Zoran's picture

29

Zoran

zokica

·

AI & ML interests

None yet

Recent Activity

new activity 13 days ago

google/gemma-3-4b-it:Does not works with bitsandbytes 4bit and 8bit

new activity 13 days ago

unsloth/gemma-3-4b-it-unsloth-bnb-4bit:Does not work at all

new activity 8 months ago

google/gemma-2-9b:Gemma 2's Flash attention 2 implementation is strange...

View all activity

Organizations

None yet

zokica's activity

New activity in google/gemma-3-4b-it 13 days ago

Does not works with bitsandbytes 4bit and 8bit

#27 opened 13 days ago by

New activity in unsloth/gemma-3-4b-it-unsloth-bnb-4bit 13 days ago

Does not work at all

#1 opened 13 days ago by

New activity in google/gemma-2-9b 8 months ago

Gemma 2's Flash attention 2 implementation is strange...

#23 opened 9 months ago by

New activity in google/gemma-2-2b 8 months ago

Problem with Lora finetuning, Out of memory

#13 opened 8 months ago by

New activity in unsloth/gemma-2-2b-bnb-4bit 8 months ago

OOM when finetuning with lora.

#1 opened 8 months ago by

New activity in unsloth/gemma-2-9b-bnb-4bit 8 months ago

Peft out of memory

#2 opened 8 months ago by

New activity in google/gemma-2-9b 9 months ago

Model repeating information and "spitting out" random characters

#14 opened 9 months ago by

Gemma2FlashAttention2 missing sliding_window variable

#8 opened 9 months ago by

New activity in upstage/SOLAR-10.7B-Instruct-v1.0 9 months ago

Why batch size>1 does not increase model speed

#41 opened 9 months ago by

New activity in EleutherAI/pile-t5-large 9 months ago

why UMT5

#1 opened about 1 year ago by

New activity in microsoft/phi-1_5 12 months ago

Something broken on last update

#85 opened 12 months ago by

New activity in rhysjones/phi-2-orange about 1 year ago

Can't get it to generate the EOS token and beam search is not supported

#3 opened about 1 year ago by

New activity in microsoft/phi-2 about 1 year ago

How to fine-tune this? + Training code

#19 opened over 1 year ago by

New activity in rhysjones/phi-2-orange-v2 about 1 year ago

Added token

#5 opened about 1 year ago by

New activity in microsoft/phi-2 about 1 year ago

Generation after finetuning does not ends at EOS token

#123 opened about 1 year ago by

New activity in microsoft/phi-1_5 over 1 year ago

Attention mask for generation function in the future?

#7 opened over 1 year ago by

New activity in mosaicml/mpt-30b-chat almost 2 years ago

The model is extremelly slow in 4bit, is my code for loading ok?

#7 opened almost 2 years ago by

New activity in TheBloke/guanaco-33B-GPTQ almost 2 years ago

guanaco-65b

#1 opened almost 2 years ago by

New activity in mosaicml/mpt-7b almost 2 years ago

Speed on CPU

#8 opened almost 2 years ago by

Will you make a 3B model as well?

#7 opened almost 2 years ago by