Turi Abu

turiabu

AI & ML interests

NLP, Speech Processing

turiabu's activity

New activity in facebook/mms-300m 19 days ago

model._get_adapters() for 300m model not available

#3 opened about 1 year ago by Jungwonchang

reacted to akhaliq's post with ❤️ about 1 month ago

Post

5786

QwQ-32B-Preview is now available in anychat

A reasoning model that is competitive with OpenAI o1-mini and o1-preview

try it out: https://huggingface.co/spaces/akhaliq/anychat

1 reply

liked a Space about 1 month ago

🏢 Anychat • Running on CPU Upgrade • 1.31k

New activity in meta-llama/Llama-3.2-1B 3 months ago

Request: DOI

#24 opened 3 months ago by romanbot

liked a model 3 months ago

Qwen/Qwen2.5-0.5B-Instruct

Text Generation • Updated Sep 25, 2024 • 456k • 179

reacted to their post with 🤗 7 months ago

Post

2128

Can anyone see my post on 🤗?
Reply with 🤗

4 replies

posted an update 7 months ago

Post

2128

Can anyone see my post on 🤗?
Reply with 🤗

4 replies

reacted to Sentdex's post with 🔥 9 months ago

Post

8493

Okay, first pass over KAN: Kolmogorov–Arnold Networks. It looks very interesting!

Interpretability of the KAN model:
Interpretability is mostly treated as a safety issue these days, but it can also serve as a form of interaction between the user and a model, as this paper argues, and I think the authors make a valid point here. With an MLP, we only interact with the outputs, but KAN is an entirely different paradigm, and I find it compelling.

Scalability:
KAN shows better parameter efficiency than MLP. This likely also translates to needing less data. We're already at the point with frontier LLMs where all the data available on the internet is used, plus more is generated synthetically, so we kind of need something better.

Continual learning:
KAN can handle new input information without catastrophic forgetting, which helps keep a model up to date without relying on some database or retraining.

Sequential data:
This is probably what most people are curious about right now. KANs have not yet been shown to work with sequential data, and it's unclear what the best approach might be to make them work well, both in training and with respect to interpretability. That said, there's a rich, long history of handling sequential data in a variety of ways, so I don't think getting the ball rolling here would be too challenging.

Mostly, I just love a new paradigm and I want to see more!

KAN: Kolmogorov-Arnold Networks (2404.19756)

5 replies

upvoted a paper about 1 year ago

GAIA: a benchmark for General AI Assistants

Paper • 2311.12983 • Published Nov 21, 2023 • 187