15

Michael Peres

makiisthebes

·

makiisthenes

AI & ML interests

TinyML | Transformers | Reinforcement Learning | Learning through practice, MSc AI Student

Recent Activity

liked a model 10 days ago

makiisthebes/gemma-2-2b-Instruct-NL2SQL

updated a model 10 days ago

makiisthebes/gemma-2-9b-Instruct-NL2SQL

updated a model 10 days ago

makiisthebes/gemma-2-2b-Instruct-NL2SQL

Organizations

None yet

makiisthebes's activity

liked a model 10 days ago

makiisthebes/gemma-2-2b-Instruct-NL2SQL

Updated 10 days ago • 73 • 1

updated 2 models 10 days ago

makiisthebes/gemma-2-9b-Instruct-NL2SQL

Updated 10 days ago • 67

makiisthebes/gemma-2-2b-Instruct-NL2SQL

Updated 10 days ago • 73 • 1

liked a dataset 10 days ago

Shritama/nl2sql

Viewer • Updated Feb 17 • 10.7k • 47 • 2

liked a dataset 11 days ago

Rajpreet2206/nl2sql-dataset

Viewer • Updated Aug 19 • 9.69k • 38 • 1

liked a model 23 days ago

bastienp/Gemma-2-2B-Instruct-structured-output

Updated Aug 19 • 294 • 2

liked a dataset 26 days ago

Clinton/texttosqlv2_25000_v2

Viewer • Updated Jul 28, 2023 • 25k • 81 • 5

liked a model 26 days ago

DiTy/gemma-2-9b-it-function-calling-GGUF

Text Generation • Updated 20 days ago • 1.3k • 10

liked a dataset 26 days ago

glaiveai/glaive-function-calling-v2

Viewer • Updated Sep 27, 2023 • 113k • 601 • 401

updated a dataset 3 months ago

makiisthebes/swedish_police_license_plates

Viewer • Updated Oct 5 • 8.1k • 6

replied to Sentdex's post 7 months ago

I am sure more work on inference will be done. It looks pretty exciting, and could possibly reduce model sizes quite a bit!

reacted to Sentdex's post with 🔥 7 months ago

Post

8414

Okay, first pass over KAN: Kolmogorov–Arnold Networks. It looks very interesting!

Interpretability of KAN model:
Interpretability may be considered mostly a safety issue these days, but it can also serve as a form of interaction between the user and a model, as this paper argues, and I think they make a valid point here. With an MLP, we only interact with the outputs, but KAN is an entirely different paradigm, and I find it compelling.

Scalability:
KAN shows better parameter efficiency than MLP. This likely also translates to needing less data. We're already at the point with frontier LLMs where all the data available on the internet is used, and more is generated synthetically, so we kind of need something better.

Continual learning:
KAN can handle new input information without catastrophic forgetting, which helps keep a model up to date without relying on an external database or retraining.

Sequential data:
This is probably what most people are curious about right now. KANs have not yet been shown to work with sequential data, and it's unclear what the best approach might be to make them work well, both in training and with respect to interpretability. That said, there's a long, rich history of handling sequential data in a variety of ways, so I don't think getting the ball rolling here would be too challenging.

Mostly, I just love a new paradigm and I want to see more!

KAN: Kolmogorov-Arnold Networks (2404.19756)
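
To make the MLP versus KAN contrast in the post a bit more concrete, here is a minimal sketch of a single KAN-style layer in plain Python. It is an illustration under stated assumptions, not the paper's implementation: the names `ToyKANLayer` and `gaussian_basis` are hypothetical, and Gaussian bumps stand in for the B-splines the paper uses. The idea it tries to show is that every edge carries its own learnable one-dimensional function, and each output is just the sum of its incoming edge functions.

```python
import math
import random


def gaussian_basis(x, centers, width):
    """Evaluate one Gaussian bump per center at the scalar x.

    Assumption: Gaussians are swapped in here for the paper's B-splines.
    """
    return [math.exp(-((x - c) / width) ** 2) for c in centers]


class ToyKANLayer:
    """Hypothetical toy layer: a learnable 1-D function on every edge.

    Edge (i, j) computes phi_ij(x_i) = sum_k coef[i][j][k] * basis_k(x_i).
    Output j is the plain sum of phi_ij(x_i) over all inputs i.
    """

    def __init__(self, n_in, n_out, n_basis=8, lo=-1.0, hi=1.0, seed=0):
        rng = random.Random(seed)
        step = (hi - lo) / (n_basis - 1)
        self.centers = [lo + k * step for k in range(n_basis)]
        self.width = step
        # coef[i][j][k]: weight of basis k on the edge from input i to output j.
        self.coef = [
            [[rng.gauss(0.0, 0.1) for _ in range(n_basis)] for _ in range(n_out)]
            for _ in range(n_in)
        ]

    def edge(self, i, j, x_i):
        """Value of the single edge function phi_ij at x_i."""
        basis = gaussian_basis(x_i, self.centers, self.width)
        return sum(c * b for c, b in zip(self.coef[i][j], basis))

    def __call__(self, x):
        """Map a list of n_in inputs to a list of n_out outputs."""
        n_out = len(self.coef[0])
        return [
            sum(self.edge(i, j, x_i) for i, x_i in enumerate(x))
            for j in range(n_out)
        ]

    def num_params(self):
        """Count the learnable coefficients (n_in * n_out * n_basis)."""
        return sum(len(edge) for row in self.coef for edge in row)


layer = ToyKANLayer(n_in=2, n_out=3)
y = layer([0.25, -0.5])
print(len(y))              # 3
print(layer.num_params())  # 48
```

Because each `edge(i, j, x)` is an ordinary one-dimensional function, it can be evaluated over a range and inspected on its own, which is roughly the kind of interaction with the model's internals that the interpretability point refers to. Training, grid refinement, and the rest of the paper's method are left out of this sketch.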

5 replies

updated a model 7 months ago

makiisthebes/RNN_LSTM

updated a model 9 months ago

makiisthebes/transformers_scratch

Updated Apr 14 • 1

liked a model 9 months ago

makiisthebes/diffusion_model_scratch

Updated Apr 4 • 1

updated a model 9 months ago

makiisthebes/diffusion_model_scratch

Updated Apr 4 • 1

liked a model 9 months ago

makiisthebes/AlexNet_CNN_Visualisation

Updated Apr 4 • 1

updated a model 9 months ago

makiisthebes/AlexNet_CNN_Visualisation

Updated Apr 4 • 1

liked a model 10 months ago

makiisthebes/autoencoders

Updated Feb 24 • 1

updated a model 10 months ago

makiisthebes/autoencoders

Updated Feb 24 • 1