AI & ML interests

Efficient machine learning for any model and hardware: pruning, quantization, compilation, and more.

PrunaAI's activity

davidberenstein1957
posted an update 15 days ago
🚨 New Bonus Unit: Tracing & Evaluating Your Agent! 🚨

Learn how to transform your agent from a simple demo into a robust, reliable product ready for real users.

UNIT: https://huggingface.co/learn/agents-course/bonus-unit2/introduction

In this unit, you'll learn:
- Offline Evaluation – Benchmark and iterate on your agent using datasets.
- Online Evaluation – Continuously track key metrics such as latency, costs, and user feedback.
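
To make the two modes above concrete, here is a minimal sketch in plain Python. It is not the course's code and does not use Langfuse: `toy_agent`, `evaluate_offline`, and `OnlineMetrics` are hypothetical names invented for illustration, and exact string matching is only one possible scoring rule.

```python
import time
from statistics import mean


# Hypothetical stand-in for a real agent; swap in your own callable.
def toy_agent(question: str) -> str:
    answers = {"2 + 2": "4", "capital of France": "Paris"}
    return answers.get(question, "I don't know")


def evaluate_offline(agent, dataset):
    """Offline evaluation: run the agent over a fixed dataset, report accuracy and latency."""
    correct = 0
    latencies = []
    for example in dataset:
        start = time.perf_counter()
        output = agent(example["input"])
        latencies.append(time.perf_counter() - start)
        # Exact match is an assumption; real benchmarks often need fuzzier scoring.
        correct += int(output.strip() == example["expected"])
    return {"accuracy": correct / len(dataset), "mean_latency_s": mean(latencies)}


class OnlineMetrics:
    """Online evaluation: accumulate per-request latency, cost, and user feedback."""

    def __init__(self):
        self.records = []

    def log(self, latency_s, cost_usd, thumbs_up):
        # thumbs_up is True/False, or None when the user gave no feedback.
        self.records.append(
            {"latency_s": latency_s, "cost_usd": cost_usd, "thumbs_up": thumbs_up}
        )

    def summary(self):
        rated = [r["thumbs_up"] for r in self.records if r["thumbs_up"] is not None]
        return {
            "requests": len(self.records),
            "mean_latency_s": mean(r["latency_s"] for r in self.records),
            "total_cost_usd": sum(r["cost_usd"] for r in self.records),
            "positive_feedback_rate": (sum(rated) / len(rated)) if rated else None,
        }


dataset = [
    {"input": "2 + 2", "expected": "4"},
    {"input": "capital of France", "expected": "Paris"},
    {"input": "capital of Spain", "expected": "Madrid"},
]
offline_report = evaluate_offline(toy_agent, dataset)

metrics = OnlineMetrics()
metrics.log(latency_s=0.5, cost_usd=0.25, thumbs_up=True)
metrics.log(latency_s=1.5, cost_usd=0.75, thumbs_up=None)
online_report = metrics.summary()
```

In practice a tracing tool would replace the in-memory `OnlineMetrics` class; the sketch only shows which quantities each mode tracks.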

Happy testing and improving!

Thanks Langfuse team!
sharpenb
posted an update 20 days ago
We open-sourced the pruna package, which can be installed with pip install pruna :) It makes it easy to compress and evaluate AI models, including transformers and diffusers.

- Github repo: https://github.com/PrunaAI/pruna
- Documentation: https://docs.pruna.ai/en/stable/index.html

With open-sourcing, people can now inspect and contribute to the code. Beyond the code, we provide a detailed README, tutorials, benchmarks, and documentation to make compression, evaluation, and saving/loading/serving of AI models transparent.

Happy to share it with you and always interested in collecting your feedback :)
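
For readers new to model compression, the toy sketch below shows two of the techniques named in the organization description, pruning and quantization, on a plain list of weights. It does not use the pruna package or its API; `magnitude_prune`, `quantize_int8`, and `dequantize` are hypothetical helpers written only to illustrate the ideas.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest absolute value."""
    n_prune = int(len(weights) * sparsity)
    # Indices ordered by absolute weight, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def quantize_int8(weights):
    """Symmetric int8 quantization: map floats to integer codes in [-127, 127]."""
    # Fall back to 1.0 so an all-zero input does not divide by zero.
    max_abs = max(abs(w) for w in weights) or 1.0
    scale = max_abs / 127
    codes = [round(w / scale) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float weights from integer codes."""
    return [c * scale for c in codes]


weights = [0.02, -1.27, 0.5, -0.01, 0.9, 0.003]
pruned = magnitude_prune(weights, sparsity=0.5)
codes, scale = quantize_int8(pruned)
restored = dequantize(codes, scale)

# "Evaluate" the compression: worst-case reconstruction error after quantization.
max_error = max(abs(a - b) for a, b in zip(pruned, restored))
```

Evaluating a compressed model in practice means comparing task quality, latency, and memory before and after; the reconstruction error here is only the simplest stand-in for that comparison.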
davidberenstein1957
posted an update about 1 month ago
🥊 Epic Agent Framework Showdown! Available today!

🔵 In the blue corner, the versatile challenger with a proven track record of knowledge retrieval: LlamaIndex!

🛑 In the red corner, the defender, weighing in with lightweight efficiency: Hugging Face smolagents!

🔗 URL: agents-course

We just published the LlamaIndex unit for the agents course. It offers a great contrast with the smolagents unit by looking at:

- What makes llama-index stand out
- How the LlamaHub is used for integrations
- Creating QueryEngine components
- Using agents and tools
- Agentic and multi-agent workflows

The team has been working flat out on this for a few weeks, supported by Logan Markewich and Laurie Voss over at LlamaIndex.

Who won? You decide!
davidberenstein1957
posted an update about 1 month ago
🫸 New release to push vector search to the Hub with vicinity and work with any serialisable objects.

πŸ§‘β€πŸ« KNN, HNSW, USEARCH, ANNOY, PYNNDESCENT, FAISS, and VOYAGER.

🔗 Example Repo: minishlab/my-vicinity-repo
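
As a rough illustration of the simplest idea behind the backends listed above, here is exact k-nearest-neighbour search over a handful of vectors in plain Python. It does not use vicinity or any of the listed libraries; `cosine_similarity` and `knn` are hypothetical helpers, and the dictionary items are just an example of serialisable objects attached to vectors. Most of the other backends trade some exactness for speed with approximate indexes.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def knn(query, vectors, items, k):
    """Exact search: score every vector against the query and keep the top k."""
    scored = [(cosine_similarity(query, v), item) for v, item in zip(vectors, items)]
    # Sort by score only, so the items themselves never need to be comparable.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Items can be any objects; small dictionaries are used here.
items = [
    {"id": 1, "label": "cat"},
    {"id": 2, "label": "dog"},
    {"id": 3, "label": "car"},
]
vectors = [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0]]

results = knn([1.0, 0.1], vectors, items, k=2)
labels = [item["label"] for _, item in results]
```

Brute-force search like this scans every vector per query, which is why larger collections typically move to an approximate index.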
davidberenstein1957
posted an update about 2 months ago
🚀 Find banger tools for your smolagents!

I created the Tools gallery, which makes tools specifically developed by/for smolagents searchable and visible. This will help with:
- inspiration
- best practices
- finding cool tools

Space: davidberenstein1957/smolagents-and-tools
davidberenstein1957
posted an update about 2 months ago
sharpenb
posted an update about 2 months ago
How to deploy compressed ML models in your pipeline?

We wrote a series of blog posts on this topic. We hope they are helpful:
- Standard Model Compression in ML Pipeline: https://www.pruna.ai/blog/standard-model-compression-ml-pipeline
- Boost Your Replicate Models with Pruna AI: A Step-by-Step Guide: https://www.pruna.ai/blog/guide-replicate-pruna-ai
- Pruna + Triton: A Winning Combination for High-Performance AI Deployments: https://www.pruna.ai/blog/pruna-triton-combination

Feel free to join our Discord (https://discord.com/invite/rskEr4BZJx) if you have questions ;)
davidberenstein1957
posted an update 2 months ago