codito

https://codito.in

AI & ML interests

None yet

Organizations

None yet

codito's activity

liked a model 3 days ago

stduhpf/google-gemma-3-12b-it-qat-q4_0-gguf-small

Updated 1 day ago • 1.15k • 24

liked a dataset 6 months ago

nvidia/HelpSteer2

Viewer • Updated Dec 18, 2024 • 21.4k • 3.77k • 411

reacted to tomaarsen's post with 🔥 6 months ago

Post

7127

📣 Sentence Transformers v3.2.0 is out, marking the biggest release for inference in 2 years! 2 new backends for embedding models: ONNX (+ optimization & quantization) and OpenVINO, allowing for speedups up to 2x-3x AND Static Embeddings for 500x speedups at 10-20% accuracy cost.

1️⃣ ONNX Backend: This backend uses the ONNX Runtime to accelerate model inference on both CPU and GPU, reaching up to 1.4x-3x speedup depending on the precision. We also introduce 2 helper methods for optimizing and quantizing models for (much) faster inference.
2️⃣ OpenVINO Backend: This backend uses Intel's OpenVINO instead, outperforming ONNX in some situations on CPU.

Usage is as simple as `SentenceTransformer("all-MiniLM-L6-v2", backend="onnx")`. Does your model not have an ONNX or OpenVINO file yet? No worries - it'll be autoexported for you. Thank me later 😉
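A minimal sketch of the usage described above, assuming sentence-transformers v3.2.0 or later is installed with its ONNX extra (`pip install "sentence-transformers[onnx]"`) and that the model can be downloaded from the Hub. The helper name `encode_with_backend` is made up for illustration; only the `SentenceTransformer(..., backend=...)` call comes from the post.

```python
def encode_with_backend(sentences, model_name="all-MiniLM-L6-v2", backend="onnx"):
    """Encode sentences with the chosen backend ("torch", "onnx" or "openvino").

    Hypothetical helper for illustration; it only wraps the call shown in the post.
    """
    # Imported lazily so this snippet can be loaded without the library installed.
    from sentence_transformers import SentenceTransformer

    # If the model repository has no ONNX/OpenVINO file yet, the post says
    # one is exported automatically on first load.
    model = SentenceTransformer(model_name, backend=backend)
    return model.encode(sentences)
```

Calling `encode_with_backend(["The weather is lovely today."])` should return one embedding per input sentence; swapping in `backend="openvino"` would try the other backend the post mentions.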

🔒 Another major new feature is Static Embeddings: think word embeddings like GloVe and word2vec, but modernized. Static Embeddings are bags of token embeddings that are summed together to create text embeddings, allowing for lightning-fast embeddings that don't require any neural networks. They're initialized in one of 2 ways:

1️⃣ via Model2Vec, a new technique for distilling any Sentence Transformer model into static embeddings. Use either a pre-distilled model with `from_model2vec`, or `from_distillation` to do the distillation yourself. It'll only take 5 seconds on GPU & 2 minutes on CPU, no dataset needed.
2️⃣ Random initialization. This requires finetuning, but finetuning is extremely quick (e.g. I trained with 3 million pairs in 7 minutes). My final model was 6.6% worse than bge-base-en-v1.5, but 500x faster on CPU.
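To make the "bag of token embeddings summed together" idea concrete, here is a toy sketch in plain Python. It is not the library's implementation: the vocabulary, the vector values, whitespace tokenization, and the `embed` helper are all invented for illustration, whereas a real static embedding model would use a proper tokenizer and a learned or distilled table.

```python
# Toy static-embedding table: every value below is invented for illustration.
TOKEN_VECTORS = {
    "fast": [1.0, 0.0, 0.5],
    "text": [0.0, 1.0, 0.5],
    "embeddings": [0.5, 0.5, 1.0],
}
DIM = 3


def embed(text):
    """Sum the vectors of known tokens; unknown tokens are skipped."""
    total = [0.0] * DIM
    for token in text.lower().split():
        vector = TOKEN_VECTORS.get(token)
        if vector is None:
            continue  # out-of-vocabulary token contributes nothing
        total = [a + b for a, b in zip(total, vector)]
    return total
```

Because embedding a text is only a table lookup plus additions, no neural network forward pass is involved, which is presumably where the large CPU speedups in the post come from.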

Full release notes: https://github.com/UKPLab/sentence-transformers/releases/tag/v3.2.0
Documentation on Speeding up Inference: https://sbert.net/docs/sentence_transformer/usage/efficiency.html

1 reply

liked a Space 6 months ago

195

MMLU-Pro Leaderboard

🥇

More advanced and challenging multi-task evaluation

updated 3 models 7 months ago

upvoted a collection 7 months ago

Awesome SFT datasets

Collection

A curated list of interesting datasets to fine-tune language models with. • 43 items • Updated Apr 12, 2024 • 131

reacted to m-ric's post with 🔥 7 months ago

Post

3398

🔥 𝐐𝐰𝐞𝐧 𝐫𝐞𝐥𝐞𝐚𝐬𝐞𝐬 𝐭𝐡𝐞𝐢𝐫 𝟐.𝟓 𝐟𝐚𝐦𝐢𝐥𝐲 𝐨𝐟 𝐦𝐨𝐝𝐞𝐥𝐬: 𝐍𝐞𝐰 𝐒𝐎𝐓𝐀 𝐟𝐨𝐫 𝐚𝐥𝐥 𝐬𝐢𝐳𝐞𝐬 𝐮𝐩 𝐭𝐨 𝟕𝟐𝐁!

The Chinese LLM maker just dropped a flurry of different models, ensuring there will be a Qwen SOTA model for every application out there:
Qwen2.5: 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B
Qwen2.5-Coder: 1.5B, 7B, and 32B on the way
Qwen2.5-Math: 1.5B, 7B, and 72B.

And they didn't sleep: the performance is top of the game for each weight category!

𝐊𝐞𝐲 𝐢𝐧𝐬𝐢𝐠𝐡𝐭𝐬:

🌐 All models have 𝟭𝟮𝟴𝗸 𝘁𝗼𝗸𝗲𝗻 𝗰𝗼𝗻𝘁𝗲𝘅𝘁 𝗹𝗲𝗻𝗴𝘁𝗵

📚 Models pre-trained on 18T tokens, even more than the 15T of Llama-3

💪 The flagship 𝗤𝘄𝗲𝗻𝟮.𝟱-𝟳𝟮𝗕 𝗶𝘀 ~𝗰𝗼𝗺𝗽𝗲𝘁𝗶𝘁𝗶𝘃𝗲 𝘄𝗶𝘁𝗵 𝗟𝗹𝗮𝗺𝗮-𝟯.𝟭-𝟰𝟬𝟱𝗕, 𝗮𝗻𝗱 𝗵𝗮𝘀 𝗮 𝟯-𝟱% 𝗺𝗮𝗿𝗴𝗶𝗻 𝗼𝗻 𝗟𝗹𝗮𝗺𝗮-𝟯.𝟭-𝟳𝟬𝗕 𝗼𝗻 𝗺𝗼𝘀𝘁 𝗯𝗲𝗻𝗰𝗵𝗺𝗮𝗿𝗸𝘀.

🇫🇷 On top of this, it 𝘁𝗮𝗸𝗲𝘀 𝘁𝗵𝗲 #𝟭 𝘀𝗽𝗼𝘁 𝗼𝗻 𝗺𝘂𝗹𝘁𝗶𝗹𝗶𝗻𝗴𝘂𝗮𝗹 𝘁𝗮𝘀𝗸𝘀 so it might become my standard for French

💻 Qwen2.5-Coder is only 7B but beats competing models up to 33B (DeepSeek-Coder 33B-Instruct). Let's wait for their 32B to come out!

🧮 Qwen2.5-Math sets a new high in the ratio of MATH benchmark score to # of parameters. They trained it by "aggregating more high-quality mathematical data, particularly in Chinese, from web sources, books, and codes across multiple recall cycles."

📄 Technical report to be released "very soon"

🔓 All models have the most permissive license, Apache 2.0, except the 72B models, which have a custom license mentioning "you can use it for free EXCEPT if your product has over 100M users"

🤗 All models are available on the HF Hub! ➡️ Qwen/qwen25-66e81a666513e518adb90d9e

2 replies

liked a model 7 months ago

lemon07r/Gemma-2-Ataraxy-9B

Text Generation • Updated Oct 6, 2024 • 98 • 75

updated a model 7 months ago

codito/gemma-2-2b-it-reflection-test1

Text Generation • Updated Sep 7, 2024 • 8

liked a model 7 months ago

bartowski/Phi-3.5-mini-instruct-GGUF

Text Generation • Updated Sep 15, 2024 • 91.9k • 66

liked a model 8 months ago

bartowski/Gemma-2-9B-It-SPPO-Iter3-GGUF

Text Generation • Updated Jul 15, 2024 • 4.97k • 55

updated 4 models 8 months ago

codito/gemma-2-2b-it-func-test2

Text Generation • Updated Aug 22, 2024 • 4

codito/gemma-2-2b-it-func-test4

Text Generation • Updated Aug 21, 2024 • 2

codito/gemma-2-2b-it-func-test3

Text Generation • Updated Aug 17, 2024 • 2

codito/gemma-2-2b-it-func-test1

Text Generation • Updated Aug 16, 2024 • 3

liked a Space 8 months ago

1.37k

GGUF My Repo

🦙

Create and quantize models on Hugging Face

upvoted a paper 8 months ago

Transformer Explainer: Interactive Learning of Text-Generative Models

Paper • 2408.04619 • Published Aug 8, 2024 • 163

reacted to gabrielmbmb's post with 🔥 8 months ago

Post

3590

Just dropped magpie-ultra-v0.1! The first open synthetic dataset generated with Llama 3.1 405B. Created with distilabel, it's our most advanced and compute-intensive pipeline to date. We made the GPUs of the cluster go brrrrr 🚀

argilla/magpie-ultra-v0.1

Take a look at it and tell us what you think! The models that get the most out of it will probably be smol models 🤗 We will be improving the dataset in upcoming iterations!