John Locke

johnlockejrr

AI & ML interests

NLP, OCR, AI

Recent Activity

updated a model 5 days ago
johnlockejrr/yolo_samaritan
published a model 5 days ago
johnlockejrr/yolo_samaritan
updated a dataset 7 days ago
johnlockejrr/samaritan_v1

Organizations

Samaritan AI

johnlockejrr's activity

updated a Space 10 days ago
published a Space 10 days ago
reacted to clem's post with 🔥 11 days ago
Post
1918
Llama models (arguably the most successful open AI models of all time) represented just 3% of total model downloads on Hugging Face in March.

People and the media like stories of winner-takes-all and one model or company to rule them all, but the reality is much more nuanced than this!

Kudos to all the small AI builders out there!
  • 2 replies
reacted to fdaudens's post with 🔥 12 days ago
Post
2225
Did we just drop personalized AI evaluation?! This tool auto-generates custom benchmarks on your docs to test which models are the best.

Most benchmarks test general capabilities, but what matters is how models handle your data and tasks. YourBench helps answer critical questions like:
- Do you really need a hundreds-of-billions-parameter model sledgehammer to crack a nut?
- Could a smaller, fine-tuned model work better?
- How well do different models understand your domain?

Some cool features:
📚 Generates custom benchmarks from your own documents (PDFs, Word, HTML)
🎯 Tests models on real tasks, not just general capabilities
🔄 Supports multiple models for different pipeline stages
🧠 Generates both single-hop and multi-hop questions
🔍 Evaluate top models and deploy leaderboards instantly
💰 Full cost analysis to optimize for your budget
🛠️ Fully configurable via a single YAML file

26 SOTA models tested for question generation. Interesting finding: Qwen2.5 32B leads in question diversity, while smaller Qwen models and Gemini 2.0 Flash offer great value for cost.

You can also run it locally on any models you want.

I'm impressed. Try it out: yourbench/demo