
Pulkit Mehta

pulkitmehtawork

AI & ML interests

None yet

Recent Activity

Organizations

Hugging Face Discord Community · FOR-GPU-POOR

pulkitmehtawork's activity

New activity in TinyStories-Regional/README 2 days ago
upvoted an article 25 days ago

Training and Finetuning Reranker Models with Sentence Transformers v4

reacted to ritvik77's post with ❤️ about 1 month ago
Big companies are now training huge AI models with tons of data and billions of parameters, and the future seems to be about quantization: making those models smaller by turning big numbers into simpler ones, like going from 32-bit to 8-bit, while losing no more than ±0.01% accuracy. There should be some standard unit of measurement for the ratio of model size reduction to accuracy lost.

What do you all think about this?
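
Not from the post, but a toy sketch of what such a metric could look like, using naive symmetric per-tensor int8 quantization; the "ratio" at the end is a hypothetical stand-in for the standard unit the post asks for:

```python
# Toy sketch: symmetric per-tensor int8 quantization, plus a hypothetical
# "size reduction per unit of error" ratio in the spirit of the post.
import numpy as np

def quantize_int8(w: np.ndarray):
    """Map float32 weights to int8 with a single per-tensor scale."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.randn(1024, 1024).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

size_reduction = w.nbytes / q.nbytes    # 4x for fp32 -> int8
quant_error = np.abs(w - w_hat).mean()  # crude proxy for accuracy lost

print(f"{size_reduction:.1f}x smaller, mean abs error {quant_error:.5f}")
# Hypothetical metric: compression gained per unit of error introduced.
print(f"ratio: {size_reduction / quant_error:.1f}")
```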
replied to abhishek's post about 1 month ago

Nice! I tried various queries, ranging from simple greetings to maths and reasoning questions, and it routed them well. I also liked that it gave the domain, task type, complexity, and its rationale for selecting the model. Looking forward to the team integrating VLMs as well.

reacted to abhishek's post with 👍 about 1 month ago
🚀 I'm thrilled to announce the launch of Arcee Conductor, a game-changing platform that's about to revolutionize the way you interact with AI models! 🤖 As the pioneers of small language models (SLMs), we've been working tirelessly to bring you the most exciting innovation in the AI space.
Here's a quick TL;DR of what Arcee Conductor is all about:

🌟 Choice and flexibility: Get access to multiple models, including our powerful SLMs and third-party LLMs, to choose the best one for your specific use case
🤖 Intelligent routing: Our platform evaluates which model is best-suited for each of your queries, ensuring you get the most accurate results
📈 Cost savings: Reduce your AI costs with our affordable SLMs, while still having access to leading LLMs when needed
🚀 Easy to get started: Sign up now and try Arcee Conductor today, with 400 million tokens (a $200 value) on us! 🎁
📊 Proven track record: Our SLMs have already racked up 222K+ downloads on Hugging Face, with customers seeing significant cost savings and improved accuracy

For a limited time, you can get $200 credits to use with Conductor for FREE. Check it out here: https://conductor.arcee.ai
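
Purely as an illustration of what calling a router like this could look like (the endpoint, router model name, and client usage below are my assumptions, not documented Conductor details), here is a sketch in the familiar OpenAI-compatible client style:

```python
# Hypothetical sketch: assumes an OpenAI-compatible endpoint; the base URL
# and model alias below are illustrative guesses, not documented values.
from openai import OpenAI

client = OpenAI(
    base_url="https://conductor.arcee.ai/v1",  # assumed endpoint
    api_key="YOUR_CONDUCTOR_API_KEY",
)

# Per the post, Conductor picks the best-suited SLM or LLM per query,
# so the "model" here is a router alias rather than a fixed checkpoint.
response = client.chat.completions.create(
    model="auto",  # hypothetical router alias
    messages=[{"role": "user", "content": "Summarize RLHF in two sentences."}],
)
print(response.choices[0].message.content)
```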
New activity in ds4sd/SmolDocling-256M-Demo about 1 month ago
upvoted an article about 1 month ago

Illustrating Reinforcement Learning from Human Feedback (RLHF)

replied to natolambert's post about 1 month ago

Great work! I checked some of the details on the Substack post as well. The paper link is not working; please check.

reacted to natolambert's post with ❤️ about 1 month ago
Today, we’re releasing our first pretrained Open Language Models (OLMo) at the Allen Institute for AI (AI2), a set of 7 billion parameter models and one 1 billion parameter variant. This line of work was probably the main reason I joined AI2 and is the biggest lever I see possible to enact meaningful change in how AI is used, studied, and discussed in the short term.

Links at the top because that's what you want:
* Core 7B model: allenai/OLMo-7B
* 7B model twin (different GPU hardware): allenai/OLMo-7B-Twin-2T
* 1B model: allenai/OLMo-1B
* Dataset: allenai/dolma
* Paper (arxiv soon): https://allenai.org/olmo/olmo-paper.pdf
* My personal blog post: https://www.interconnects.ai/p/olmo
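
The checkpoints above load through the standard transformers API; a minimal sketch, assuming a transformers version with OLMo support (at release, the original checkpoints needed `pip install ai2-olmo` and custom code):

```python
# Minimal generation sketch for the OLMo checkpoints linked above.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/OLMo-7B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# trust_remote_code may be required for the original (non -hf) checkpoint,
# depending on your transformers version.
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

inputs = tokenizer("Language models are", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```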


OLMo will represent a new type of LLM enabling new approaches to ML research and deployment, because on a key axis of openness, OLMo represents something entirely different. OLMo is built for scientists to be able to develop research directions at every point in the development process and execute on them, which was previously not available due to incomplete information and tools.

Depending on the evaluation methods, OLMo 1 is either the best 7 billion parameter base model available for download or one of the best. This relies on a new way of thinking where models are judged on parameter plus token budget, similar to how scaling laws are measured for LLMs.

We're just getting started, so please help us learn how to be more scientific with LLMs!
reacted to mitkox's post with 👍 about 1 month ago
llama.cpp is 26.8% faster than ollama.
I upgraded both and, using the same settings, ran the same DeepSeek R1 Distill 1.5B on the same hardware. It's an apples-to-apples comparison.

Total duration:
llama.cpp 6.85 sec <- 26.8% faster
ollama 8.69 sec

Breakdown by phase:
Model loading
llama.cpp 241 ms <- 2x faster
ollama 553 ms

Prompt processing
llama.cpp 416.04 tokens/s with an eval time of 45.67 ms <- 10x faster
ollama 42.17 tokens/s with an eval time of 498 ms

Token generation
llama.cpp 137.79 tokens/s with an eval time of 6.62 sec <- 13% faster
ollama 122.07 tokens/s with an eval time of 7.64 sec
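
As a sanity check (mine, not the poster's), the headline percentages fall straight out of the reported timings:

```python
# Recomputing the post's speedup figures from its own numbers.
timings = {
    "total duration (s)":     (6.85, 8.69),
    "model loading (ms)":     (241, 553),
    "prompt eval (tokens/s)": (416.04, 42.17),
    "generation (tokens/s)":  (137.79, 122.07),
}

for phase, (llama_cpp, ollama) in timings.items():
    if "tokens/s" in phase:          # throughput: higher is better
        speedup = llama_cpp / ollama
    else:                            # latency: lower is better
        speedup = ollama / llama_cpp
    print(f"{phase}: {speedup:.2f}x ({(speedup - 1) * 100:.0f}% faster)")
# total: 1.27x (~27%), loading: 2.29x, prompt: 9.87x, generation: 1.13x
```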

llama.cpp is LLM inference in C/C++; ollama adds abstraction layers and marketing.

Make sure you own your AI. AI in the cloud is not aligned with you; it's aligned with the company that owns it.
upvoted an article about 1 month ago

Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM

New activity in reasoning-course/certificates about 1 month ago
upvoted an article about 2 months ago

1 Billion Classifications
