Tom-Neverwinter

AI & ML interests

Making improvements to help the world.

Organizations

None yet

Tom-Neverwinter's activity

reacted to clem's post with 🔥 1 day ago
Llama models (arguably the most successful open AI models of all times) just represented 3% of total model downloads on Hugging Face in March.

People and media like stories of winner takes all & one model/company to rule them all but the reality is much more nuanced than this!

Kudos to all the small AI builders out there!
replied to philschmid's post 11 days ago

It has amnesia at times. It can't handle .lua in the web browser (Cursor, Cline, etc. are fine), which is very annoying. It also fails to handle its own tooling at times. [This happens on Google's own Gemini page, not just AI Studio; Cline and Cursor have this issue too.]

It fails to follow instructions when it gets stuck on one thought, then just proceeds to ram that thought through.

It gets an A+ for commenting its code.

New activity in TheDrummer/Gemmasutra-Small-4B-v1-GGUF 21 days ago

Review so far

#1 opened 24 days ago by GlobalMeltdown
New activity in BeaverAI/Fallen-Gemma3-12B-v1a-GGUF 21 days ago

versions

#1 opened 21 days ago by Tom-Neverwinter
New activity in perplexity-ai/r1-1776 about 2 months ago

Was this Model Needed?

#12 opened about 2 months ago by fahdmirzac
reacted to csabakecskemeti's post with 🔥 3 months ago
I've built a small utility to split safetensors model files.
The need came up when I tried to convert the new DeepSeek V3 model from FP8 to BF16.
The only Ada-architecture GPU I have is an RTX 4080, and its 16GB of VRAM just wasn't enough for the conversion.

BTW: I'll upload the BF16 version here:
DevQuasar/deepseek-ai.DeepSeek-V3-Base-bf16
(it will take a while, days with my upload speed)
If anyone has the resources to test it, I'd appreciate feedback on whether it works.

The tool is available here:
https://github.com/csabakecskemeti/ai_utils/blob/main/safetensor_splitter.py
It splits every file into n pieces by layer where possible, and creates a new "model.safetensors.index.json" file.
I've tested it with Llama 3.1 8B at multiple split sizes, and validated the output with an inference pipeline.
Use --help for usage.
Please note that the current version expects the model to already span multiple files, with a "model.safetensors.index.json" layer-to-safetensor mapping file.
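The core of what the splitter does (assigning tensors to shards in layer order and emitting a weight_map index) can be sketched in plain Python. This is not the tool's actual code: the tensor names and byte sizes below are made up for illustration, and a real splitter would read and write the shards with the safetensors library rather than just planning the layout.

```python
import json

def plan_split(tensor_sizes, n_pieces):
    """Assign tensors to up to n_pieces shards by cumulative size,
    keeping insertion (layer) order, and build a weight_map in the
    style of model.safetensors.index.json."""
    total = sum(tensor_sizes.values())
    target = total / n_pieces
    shards, current, current_size = [], {}, 0
    for name, size in tensor_sizes.items():
        # Start a new shard once the target size would be exceeded,
        # but never create more than n_pieces shards.
        if current and current_size + size > target and len(shards) < n_pieces - 1:
            shards.append(current)
            current, current_size = {}, 0
        current[name] = size
        current_size += size
    shards.append(current)

    weight_map = {}
    for i, shard in enumerate(shards):
        fname = f"model-{i + 1:05d}-of-{len(shards):05d}.safetensors"
        for name in shard:
            weight_map[name] = fname
    return {"metadata": {"total_size": total}, "weight_map": weight_map}

# Hypothetical tensor-name -> byte-size mapping for illustration.
sizes = {
    "model.embed_tokens.weight": 400,
    "model.layers.0.self_attn.q_proj.weight": 100,
    "model.layers.0.mlp.up_proj.weight": 300,
    "model.layers.1.self_attn.q_proj.weight": 100,
    "model.layers.1.mlp.up_proj.weight": 300,
    "lm_head.weight": 400,
}
index = plan_split(sizes, n_pieces=3)
print(json.dumps(index, indent=2))
```

Loaders that understand sharded checkpoints only consult the weight_map to find which file holds each tensor, which is why rewriting the index is enough to make the split transparent to an inference pipeline.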
New activity in Apollo-LMMs/README 4 months ago

model pulled

#1 opened 4 months ago by Tom-Neverwinter
reacted to tomaarsen's post with ❤️🚀🔥 6 months ago
📣 Sentence Transformers v3.2.0 is out, marking the biggest release for inference in 2 years! It brings 2 new backends for embedding models, ONNX (+ optimization & quantization) and OpenVINO, allowing speedups of up to 2x-3x, plus Static Embeddings for 500x speedups at a 10-20% accuracy cost.

1️⃣ ONNX Backend: This backend uses the ONNX Runtime to accelerate model inference on both CPU and GPU, reaching up to 1.4x-3x speedup depending on the precision. We also introduce 2 helper methods for optimizing and quantizing models for (much) faster inference.
2️⃣ OpenVINO Backend: This backend uses Intel's OpenVINO instead, outperforming ONNX in some situations on CPU.

Usage is as simple as SentenceTransformer("all-MiniLM-L6-v2", backend="onnx"). Does your model not have an ONNX or OpenVINO file yet? No worries - it'll be autoexported for you. Thank me later 😉

🔒 Another major new feature is Static Embeddings: think word embeddings like GloVe and word2vec, but modernized. Static Embeddings are bags of token embeddings that are summed together to create text embeddings, allowing for lightning-fast embeddings that don't require any neural networks. They're initialized in one of 2 ways:

1️⃣ via Model2Vec, a new technique for distilling any Sentence Transformer model into static embeddings. Either use a pre-distilled model with from_model2vec, or do the distillation yourself with from_distillation. It'll take only 5 seconds on GPU & 2 minutes on CPU, no dataset needed.
2️⃣ Random initialization. This requires finetuning, but finetuning is extremely quick (e.g. I trained with 3 million pairs in 7 minutes). My final model was 6.6% worse than bge-base-en-v1.5, but 500x faster on CPU.
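The "bag of token embeddings summed together, no neural network" idea can be sketched without the library. The tiny vocabulary and 4-dimensional vectors below are invented for illustration; a real StaticEmbedding module uses a learned or distilled embedding table over a full tokenizer vocabulary, and may pool or normalize differently.

```python
# Toy static embedding: each token maps to a fixed vector, and a text
# embedding is just the sum of its token vectors.
TOKEN_VECTORS = {
    "fast":  [1.0, 0.0, 0.0, 0.0],
    "model": [0.0, 1.0, 0.0, 0.0],
    "slow":  [0.0, 0.0, 1.0, 0.0],
}
UNK = [0.0, 0.0, 0.0, 1.0]  # fallback vector for out-of-vocabulary tokens

def embed(text):
    """Sum the token vectors: no matrix multiplies, so cost is O(tokens)."""
    vec = [0.0, 0.0, 0.0, 0.0]
    for token in text.lower().split():
        tv = TOKEN_VECTORS.get(token, UNK)
        vec = [a + b for a, b in zip(vec, tv)]
    return vec

print(embed("fast model"))  # [1.0, 1.0, 0.0, 0.0]
```

Because inference is just a table lookup and a sum, it runs at full speed on CPU, which is where the 500x figure comes from; the accuracy cost comes from losing all context-dependence between tokens.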

Full release notes: https://github.com/UKPLab/sentence-transformers/releases/tag/v3.2.0
Documentation on Speeding up Inference: https://sbert.net/docs/sentence_transformer/usage/efficiency.html
reacted to merve's post with 🔥 6 months ago
Meta AI vision has been cooking @facebook
They shipped multiple models and demos for their papers at @ECCV 🤗

Here's a compilation of my top picks:
- Sapiens is a family of foundation models for human-centric depth estimation, segmentation, and more; all models have open weights and demos 👏

All models have their demos and even torchscript checkpoints!
A collection of models and demos: facebook/sapiens-66d22047daa6402d565cb2fc
- VFusion3D is a state-of-the-art consistent 3D generation model from images

Model: facebook/vfusion3d
Demo: facebook/VFusion3D

- CoTracker is the state-of-the-art point (pixel) tracking model

Demo: facebook/cotracker
Model: facebook/cotracker
reacted to louisbrulenaudet's post with 👍 7 months ago
The Romulus model series has been released on Hugging Face, continually pre-trained on 34,864,949 tokens of French laws and intended to serve as a foundation for fine-tuning on labeled data 🤗

The training code, dataset, and model weights are open and freely available on HF, and training ran on H100s provided by Microsoft for Startups, using Unsloth AI by @danielhanchen and @shimmyshimmer 🦥

Link to the base model: louisbrulenaudet/Romulus-cpt-Llama-3.1-8B-v0.1

Link to the instruct model: louisbrulenaudet/Romulus-cpt-Llama-3.1-8B-v0.1-Instruct

Link to the dataset: louisbrulenaudet/Romulus-cpt-fr

Please note that these models have not been aligned to produce usable text as they stand, and will certainly need to be fine-tuned for the desired tasks in order to produce satisfactory results.
New activity in multimodalart/flux-lora-the-explorer 8 months ago

how to make a lora

#2 opened 8 months ago by guardiancc
reacted to vikhyatk's post with 🔥 9 months ago
🚀 Exciting news! We've just launched "Thundermoon" - the latest version of Moondream, our open-source vision language model! 🌙

Key improvements in this release:
1. Massive leap in OCR capabilities
2. Enhanced document understanding
3. Significant boosts across key metrics:
* DocVQA: 61.9 (↑103%)
* TextVQA: 60.2 (↑5.2%)
* GQA: 64.9 (↑2.9%)

What does this mean? Moondream can now tackle complex document analysis tasks with unprecedented accuracy for a model of its size. From deciphering handwritten notes to interpreting data tables, the applications are vast.

Check out the image for a glimpse of Moondream in action, effortlessly extracting insights from a 1944 sugar industry document!

Why it matters:
* Democratizing AI: As an open-source project, we're making advanced vision AI accessible to all developers.
* Efficiency: Proving that smaller models can deliver big results.
* Real-world impact: From historical document analysis to modern business intelligence, the potential use cases are exciting.

Curious to try it out? Check out the live demo here: https://moondream.ai/playground