Rakshit Aralimatti PRO

RakshitAralimatti

AI & ML interests

Poor GPU Guy

Recent Activity

Organizations

Hugging Face Discord Community, AI Starter Pack

RakshitAralimatti's activity

New activity in mllmTeam/PhoneLM-0.5B 21 days ago

GGUF Models

#1 opened 21 days ago by
RakshitAralimatti
New activity in rasa/command-generation-calm-demo-v1 about 1 month ago

Model Selection dout

3
#3 opened about 1 month ago by
RakshitAralimatti

Getting Error

3
#1 opened 4 months ago by
RakshitAralimatti
New activity in LLM360/TxT360 3 months ago
New activity in mlfoundations/dclm-baseline-1.0 3 months ago
New activity in varma007ut/Indian_Legal_Assitant 4 months ago
reacted to bartowski's post with ❤️ 5 months ago
So turns out I've been spreading a bit of misinformation when it comes to imatrix in llama.cpp

It starts out true: imatrix runs the model against a corpus of text and tracks activations to determine which weights are most important.

However, what the quantization then does with that information is where I was wrong.

I think I accidentally conflated imatrix with ExLlamaV2's measurement pass, where ExLlamaV2 decides how many bits to assign to which weights depending on the target BPW (bits per weight).

Instead, what llama.cpp with imatrix does is attempt to select a scale for each quantization block that most accurately returns the important weights to their original values, i.e. it minimizes the dequantization error weighted by the importance of the activations.

The mildly surprising part is that it actually just does a relatively brute-force search: it picks a bunch of candidate scales, tries each one, and sees which results in the minimum error for the weights deemed important in the group.

But yeah, turns out the quantization scheme is always the same; it's just that the scaling has a bit more logic to it when you use imatrix.

Huge shoutout to @compilade for helping me wrap my head around it - feel free to add/correct as well if I've messed something up
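
To make the idea in the post concrete, here is a minimal Python sketch of an importance-weighted brute-force scale search for a single quantization block. It is not llama.cpp's actual code: the function names (`weighted_error`, `quantize_block`), the symmetric 4-bit integer grid, the 0.7x to 1.3x candidate range, and the importance values are all assumptions chosen for illustration. What it does show is the point made above: the rounding grid is identical with or without importance data, and only the choice of scale changes.

```python
def weighted_error(weights, importance, scale, codes):
    """Importance-weighted squared error between weights and their dequantized values."""
    return sum(
        imp * (w - scale * q) ** 2
        for w, imp, q in zip(weights, importance, codes)
    )


def quantize_block(weights, importance, bits=4, n_candidates=33):
    """Brute-force search for the block scale with the lowest weighted error.

    Returns (scale, integer_codes). The rounding grid is the same whatever the
    importance values are; only the choice of scale depends on them.
    """
    qmax = 2 ** (bits - 1) - 1      # e.g. 7 for 4 bits
    qmin = -qmax - 1                # e.g. -8 for 4 bits
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return 0.0, [0] * len(weights)

    base = max_abs / qmax           # the "naive" scale
    best = None
    for i in range(n_candidates):
        # Hypothetical search range: 0.7x to 1.3x the naive scale.
        scale = base * (0.7 + 0.6 * i / (n_candidates - 1))
        # Round every weight to the nearest code and clamp to the grid.
        codes = [min(qmax, max(qmin, round(w / scale))) for w in weights]
        err = weighted_error(weights, importance, scale, codes)
        if best is None or err < best[0]:
            best = (err, scale, codes)
    return best[1], best[2]


block = [0.12, -0.53, 0.91, -0.07, 0.33, -0.88, 0.02, 0.45]
uniform = [1.0] * len(block)                              # no imatrix
imatrix_like = [0.1, 8.0, 0.1, 0.1, 6.0, 0.1, 0.1, 0.1]   # made-up importances

scale_plain, codes_plain = quantize_block(block, uniform)
scale_imp, codes_imp = quantize_block(block, imatrix_like)
```

Because both calls search the same candidate scales, the scale chosen with the importance values can never have a higher importance-weighted error than the scale chosen without them; it may reconstruct the unimportant weights slightly worse in exchange.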
reacted to as-cle-bert's post with 🚀 6 months ago
Hi HF Community! 🤗

A few days ago, OpenAI announced their search engine, SearchGPT. Today, I'm glad to introduce SearchPhi, an AI-powered, open-source web search tool that aims to reproduce features similar to SearchGPT's, built upon microsoft/Phi-3-mini-4k-instruct, llama.cpp 🦙 and Streamlit.
Although not as capable as SearchGPT, SearchPhi v0.0-beta.0 is a first step toward a fully functional and multimodal search engine :)
If you want to know more, head over to the GitHub repository (https://github.com/AstraBert/SearchPhi) and, to test it out, use this HF space: as-cle-bert/SearchPhi
Have fun!