11.0 TFLOPS
Erik Kaunismäki
erikkaum
107 followers · 153 following
https://www.erikkaum.com/
ErikKaum
ErikKaum
erik-kaunismaki
erikkaum.bsky.social
AI & ML interests
None yet
Recent Activity
Posted an update 2 days ago
Releasing my first kernel 🔥 MaxSim

Late-interaction retrieval (ColBERT / PyLate) bottlenecks on materializing the full similarity matrix. This kernel avoids it by using tiled scoring with simdgroup_matrix (Metal) and WMMA (CUDA). The result is a 3–5× speedup compared to a naive PyTorch baseline 🔥

Benchmarks:
- SmallRerank (B=32, C=10): up to 3.2× (M3 Pro) / 2.8× (A100)
- HeavyRerank (B=32, C=100): up to 3.8× (M3 Pro) / 5.3× (A100)
- LongDocStress (Ld=1024): up to 6.2× (L4)

Try it out 👇 https://huggingface.co/kernels/erikkaum/maxsim
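To make the idea in the post concrete, here is a minimal pure-Python sketch of MaxSim scoring, assuming the standard ColBERT definition (for each query token, take the maximum dot product over document tokens, then sum over query tokens). It is not the kernel's code: the function names `maxsim_naive` and `maxsim_tiled` and the `tile` parameter are hypothetical, and both functions assume a non-empty document. The naive version builds the whole query-by-document similarity matrix, while the tiled version keeps only a running maximum per query token and scores one block of document tokens at a time.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_naive(query: List[Vector], doc: List[Vector]) -> float:
    """Materialize the full len(query) x len(doc) matrix, then reduce."""
    sim = [[dot(q, d) for d in doc] for q in query]
    return sum(max(row) for row in sim)


def maxsim_tiled(query: List[Vector], doc: List[Vector], tile: int = 2) -> float:
    """Same score, but only one tile of similarities is live at a time."""
    # Running maximum for each query token, updated tile by tile.
    best = [float("-inf")] * len(query)
    for start in range(0, len(doc), tile):
        block = doc[start:start + tile]
        for i, q in enumerate(query):
            tile_best = max(dot(q, d) for d in block)
            if tile_best > best[i]:
                best[i] = tile_best
    return sum(best)


# Toy example: 2 query tokens, 3 document tokens, embedding dimension 2.
query = [[1.0, 0.0], [0.0, 1.0]]
doc = [[0.5, 0.5], [1.0, 0.0], [0.0, 2.0]]

naive_score = maxsim_naive(query, doc)
tiled_score = maxsim_tiled(query, doc)
print(naive_score, tiled_score)
```

Both functions return the same score. The difference is peak memory: the tiled variant never holds more than one block of similarities, which is the property a GPU kernel can exploit by keeping each tile in registers or shared memory.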
Updated a bucket 3 days ago: erikkaum/training-cache
Published a bucket 7 days ago: erikkaum/training-cache
Organizations
erikkaum's models (3)
erikkaum/vllm-caches • Updated Mar 10
erikkaum/vllm-torch-compile-cache • Updated Feb 10
erikkaum/moonline • Image-to-Text • Updated May 7, 2024 • 8