Jesse

jessepisel

AI & ML interests

computer vision, generative ai, agentic

jessepisel's activity

upvoted a collection about 9 hours ago

Cogito v1 Preview

Collection

5 items • Updated 1 day ago • 47

upvoted a paper about 16 hours ago

SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published 1 day ago • 106

reacted to clem's post with 🔥 5 days ago

Post

1866

Llama models (arguably the most successful open AI models of all time) represented just 3% of total model downloads on Hugging Face in March.

People and media like stories of winner takes all & one model/company to rule them all but the reality is much more nuanced than this!

Kudos to all the small AI builders out there!

2 replies

liked a Space 6 days ago

Try YourBench!

🪄

Generate a custom benchmark from any document

reacted to fdaudens's post with 🔥 6 days ago

Post

2154

Did we just drop personalized AI evaluation?! This tool auto-generates custom benchmarks on your docs to test which models are the best.

Most benchmarks test general capabilities, but what matters is how models handle your data and tasks. YourBench helps answer critical questions like:
- Do you really need a hundreds-of-billions-parameter model sledgehammer to crack a nut?
- Could a smaller, fine-tuned model work better?
- How well do different models understand your domain?

Some cool features:
📚 Generates custom benchmarks from your own documents (PDFs, Word, HTML)
🎯 Tests models on real tasks, not just general capabilities
🔄 Supports multiple models for different pipeline stages
🧠 Generates both single-hop and multi-hop questions
🔍 Evaluate top models and deploy leaderboards instantly
💰 Full cost analysis to optimize for your budget
🛠️ Fully configurable via a single YAML file

26 SOTA models tested for question generation. Interesting finding: Qwen2.5 32B leads in question diversity, while smaller Qwen models and Gemini 2.0 Flash offer great value for cost.

You can also run it locally on any models you want.

I'm impressed. Try it out: yourbench/demo

reacted to nyuuzyou's post with 👍 8 days ago

Post

1479

✈️ FlightAware Photos Dataset - nyuuzyou/flightaware

Collection of approximately 197,718 aviation photographs featuring:
- High-quality aircraft images across multiple sizes and formats
- Comprehensive metadata including aircraft registrations, types, and photographer information
- View counts, ratings, and submission timestamps for each photo
- Rich classification data preserving original titles, descriptions, and photographer badges

This dataset offers a unique visual archive of aircraft spanning commercial, military, and private aviation, captured by FlightAware's community of photographers and released under the CC BY-NC-SA 3.0 license.

reacted to clem's post with 🔥 8 days ago

Post

3903

Before 2020, most of the AI field was open and collaborative. For me, that was the key factor that accelerated scientific progress and made the impossible possible—just look at the “T” in ChatGPT, which comes from the Transformer architecture openly shared by Google.

Then came the myth that AI was too dangerous to share, and companies started optimizing for short-term revenue. That led many major AI labs and researchers to stop sharing and collaborating.

With OAI and sama now saying they're willing to share open weights again, we have a real chance to return to a golden age of AI progress and democratization—powered by openness and collaboration, in the US and around the world.

This is incredibly exciting. Let’s go, open science and open-source AI!

5 replies

upvoted a paper 13 days ago

Open Deep Search: Democratizing Search with Open-source Reasoning Agents

Paper • 2503.20201 • Published 14 days ago • 42

updated 2 models 21 days ago

thinkonward/denoizer

Updated 21 days ago • 2

thinkonward/geophysical-foundation-model

Updated 21 days ago • 49 • 5

New activity in thinkonward/geophysical-foundation-model 21 days ago

Curious why snapshot_download is needed

#2 opened 22 days ago by nielsr

upvoted a paper 27 days ago

Beyond RAG: Task-Aware KV Cache Compression for Comprehensive Knowledge Reasoning

Paper • 2503.04973 • Published Mar 6 • 21

reacted to as-cle-bert's post with 👍 about 1 month ago

Post

2728

I just released a fully automated evaluation framework for your RAG applications!📈

GitHub 👉 https://github.com/AstraBert/diRAGnosis
PyPI 👉 https://pypi.org/project/diragnosis/

It's called diRAGnosis and is a lightweight framework that helps you diagnose the performance of LLMs and retrieval models in RAG applications.

You can launch it as an application locally (it's Docker-ready!🐋) or, if you want more flexibility, you can integrate it in your code as a python package📦

The workflow is simple:
🧠 You choose your favorite LLM provider and model (supported, for now, are Mistral AI, Groq, Anthropic, OpenAI and Cohere)
🧠 You pick the embedding model provider and the embedding model you prefer (supported, for now, are Mistral AI, Hugging Face, Cohere and OpenAI)
📄 You prepare and provide your documents
⚙️ Documents are ingested into a Qdrant vector database and transformed into a synthetic question dataset with the help of LlamaIndex
📊 The LLM is evaluated for the faithfulness and relevancy of its retrieval-augmented answer to the questions
📊 The embedding model is evaluated for hit rate and mean reciprocal rank (MRR) of the retrieved documents
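
For readers unfamiliar with the two retrieval metrics named in the last step, here is a minimal sketch of how hit rate and MRR are commonly computed. It is not diRAGnosis's own implementation: the function names, the `k` cutoff, and the toy document IDs are all assumptions made for illustration.

```python
from typing import List


def hit_rate(retrieved: List[List[str]], expected: List[str], k: int = 5) -> float:
    """Fraction of queries whose expected document appears in the top-k results."""
    if not expected:
        raise ValueError("need at least one query")
    hits = sum(1 for docs, gold in zip(retrieved, expected) if gold in docs[:k])
    return hits / len(expected)


def mean_reciprocal_rank(retrieved: List[List[str]], expected: List[str]) -> float:
    """Average of 1/rank of the expected document (0 when it is not retrieved)."""
    if not expected:
        raise ValueError("need at least one query")
    total = 0.0
    for docs, gold in zip(retrieved, expected):
        if gold in docs:
            total += 1.0 / (docs.index(gold) + 1)  # ranks are 1-based
    return total / len(expected)


# Hypothetical toy data: one ranked result list per query, plus the ID of the
# document each synthetic question was generated from.
retrieved = [["d1", "d2", "d3"], ["d4", "d5", "d6"], ["d7", "d8", "d9"]]
expected = ["d1", "d5", "dX"]

print(hit_rate(retrieved, expected, k=3))
print(mean_reciprocal_rank(retrieved, expected))
```

In this toy case two of the three queries retrieve their source document, at ranks 1 and 2, so the hit rate is 2/3 and the MRR is (1 + 1/2 + 0) / 3 = 0.5.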

And the cool thing is that all of this is intuitive and completely automated: you plug it in, and it works!🔌⚡

Even cooler? This is all built on top of LlamaIndex and its integrations: no need for tons of dependencies or fancy workarounds🦙
And if you're a UI lover, Gradio and FastAPI are there to provide you with a seamless backend-to-frontend experience🕶️

So now it's your turn: you can either get diRAGnosis from GitHub 👉 https://github.com/AstraBert/diRAGnosis
or just run a quick and painless:

uv pip install diragnosis

to get the package installed (lightning-fast) in your environment🏃‍♀️

Have fun, and feel free to leave feedback and feature/integration requests on GitHub issues✨

upvoted a paper about 1 month ago

LLM as a Broken Telephone: Iterative Generation Distorts Information

Paper • 2502.20258 • Published Feb 27 • 26

updated a model about 1 month ago

thinkonward/challenges

Updated Mar 6

liked a model about 1 month ago

amd/Instella-3B-Instruct

Text Generation • Updated 11 days ago • 1.73k • 45

updated 2 models about 1 month ago

thinkonward/section-seeker-base-16

Updated Mar 4

thinkonward/section-seeker-large-16

Updated Mar 4