Oleg Dmitriev
qilowoq
AI & ML interests
NLP (mainly in Russian)
Recent Activity
new activity
about 24 hours ago
google/gemma-2-9b-it:Sliding window vs. Global Attention
liked a model
8 days ago
infly/INF-ORM-Llama3.1-70B
upvoted a paper
about 1 month ago
Gemma 2: Improving Open Language Models at a Practical Size
Organizations
qilowoq's activity
Sliding window vs. Global Attention
6
#41 opened 4 months ago by tanliboy
Adding `safetensors` variant of this model
#4 opened about 2 months ago by SFconvertbot
Adding `safetensors` variant of this model
#1 opened about 2 months ago by SFconvertbot
How can we access the logits from this model output?
5
#3 opened about 1 year ago by vishwasprabhub
Methodology questions
2
#2 opened over 1 year ago by justinbarton
Different size between tokenizer vocab and embedding
2
#1 opened over 1 year ago by demharters