gg-tt

company
Activity Feed

AI & ML interests

None defined yet.

Recent Activity

gg-tt's activity

clefourrier
posted an update 1 day ago
Gemma3 family is out! I've been reading the tech report, and one section was really interesting to me from a methods/scientific fairness point of view.

Instead of doing over-hyped comparisons, they clearly state that **results are reported in a setup which is advantageous to their models**.
(Which everybody does, but people usually don't say)

For a tech report, it makes a lot of sense to report model performance when used optimally!
On leaderboards, on the other hand, comparisons will be apples to apples, but in a potentially suboptimal setup for a given model family (just as some users interact suboptimally with models).

The report also contains a cool section (6) on training data memorization rate! It's important to see if your model will output the training data it has seen as such: always an issue for privacy/copyright/... but also very much for evaluation!

Because if your model knows its evals by heart, you're not testing for generalization.
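
To make the memorization point concrete, here is a minimal sketch of one common way to estimate an exact-memorization rate: prompt the model with the start of a training sample and check whether it reproduces the true continuation verbatim. This is an assumption for illustration, not necessarily the protocol used in the tech report; `generate`, `toyGenerate`, and the prefix/suffix lengths are hypothetical stand-ins for a real model call and real settings.

```javascript
// Hypothetical helper: `generate(prefixTokens, n)` stands in for a real model
// call and must return an array of n generated tokens.
function memorizationRate(samples, generate, prefixLen, suffixLen) {
  let memorized = 0;
  let evaluated = 0;
  for (const tokens of samples) {
    if (tokens.length < prefixLen + suffixLen) continue; // too short to test
    const prefix = tokens.slice(0, prefixLen);
    const trueSuffix = tokens.slice(prefixLen, prefixLen + suffixLen);
    const generated = generate(prefix, suffixLen);
    // Count the sample as memorized only if the continuation matches exactly.
    const exact =
      generated.length === trueSuffix.length &&
      generated.every((tok, i) => tok === trueSuffix[i]);
    if (exact) memorized += 1;
    evaluated += 1;
  }
  return evaluated === 0 ? 0 : memorized / evaluated;
}

// Toy "model" that has memorized exactly one training sequence.
const memorizedDoc = ["the", "quick", "brown", "fox", "jumps", "over"];
const toyGenerate = (prefix, n) =>
  prefix.join(" ") === "the quick brown"
    ? memorizedDoc.slice(3, 3 + n)
    : Array(n).fill("<unk>");

const memRate = memorizationRate(
  [memorizedDoc, ["a", "b", "c", "d", "e", "f"]],
  toyGenerate,
  3,
  3
);
console.log(memRate); // fraction of samples reproduced verbatim
```

The same check applied to benchmark samples instead of training samples gives a rough signal of whether a model knows its evals by heart.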

tomaarsen
posted an update 4 days ago
An assembly of 18 European companies, labs, and universities has banded together to launch 🇪🇺 EuroBERT! It's a state-of-the-art multilingual encoder for 15 European and widely spoken global languages, designed to be finetuned for retrieval, classification, etc.

🇪🇺 15 Languages: English, French, German, Spanish, Chinese, Italian, Russian, Polish, Portuguese, Japanese, Vietnamese, Dutch, Arabic, Turkish, Hindi
3️⃣ 3 model sizes: 210M, 610M, and 2.1B parameters - very very useful sizes in my opinion
➡️ Sequence length of 8192 tokens! Nice to see these higher sequence lengths for encoders becoming more common.
⚙️ Architecture based on Llama, but with bi-directional (non-causal) attention to turn it into an encoder. Flash Attention 2 is supported.
🔥 A new Pareto frontier (stronger *and* smaller) for multilingual encoder models
📊 Evaluated against mDeBERTa, mGTE, XLM-RoBERTa for Retrieval, Classification, and Regression (after finetuning for each task separately): EuroBERT punches way above its weight.
📝 Detailed paper covering all aspects, incl. data: FineWeb for English and CulturaX for multilingual data, The Stack v2 and Proof-Pile-2 for code.

Check out the release blogpost here: https://huggingface.co/blog/EuroBERT/release
* EuroBERT/EuroBERT-210m
* EuroBERT/EuroBERT-610m
* EuroBERT/EuroBERT-2.1B

The next step is for researchers to build upon the 3 EuroBERT base models and publish strong retrieval, zero-shot classification, etc. models for all to use. I'm very much looking forward to it!

lysandre
posted an update 20 days ago
SmolVLM-2 and SigLIP-2 are now part of transformers in dedicated releases!

They're added on top of the v4.49.0 release, and can be installed from the following tags: v4.49.0-SmolVLM-2 and v4.49.0-SigLIP-2.

This marks a new beginning for the release process of transformers. For the past five years, we've been doing monthly releases featuring many models (v4.49.0, the latest release, features 9 new architectures).

Starting with SmolVLM-2 & SigLIP-2, we'll now additionally release tags supporting new models on a stable branch. These models are therefore directly available for use by installing from the tag itself. These tags will continue to be updated with fixes applied to these models.

Going forward, continue to expect software releases following semantic versioning: v4.50.0 will have ~10 new architectures compared to v4.49.0, as well as a myriad of new features, improvements and bug fixes. Accompanying these software releases, we'll release tags offering brand new models as fast as possible, to make them accessible to all immediately.

Xenova
posted an update about 1 month ago
We did it. Kokoro TTS (v1.0) can now run 100% locally in your browser w/ WebGPU acceleration. Real-time text-to-speech without a server. ⚡️

Generate 10 seconds of speech in ~1 second for $0.

What will you build? 🔥
webml-community/kokoro-webgpu

The most difficult part was getting the model running in the first place, but the next steps are simple:
✂️ Implement sentence splitting, allowing for streamed responses
🌐 Multilingual support (only phonemization left)

Who wants to help?
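
As a starting point for the sentence-splitting step, here is a minimal sketch of a generator that yields one sentence at a time, so that audio for the first sentence could be produced while later ones are still waiting. The function name `splitSentences` and the regex are assumptions for illustration, not the project's actual approach; the pattern is deliberately naive and will also split on abbreviations and decimals such as "e.g." or "3.5".

```javascript
// Yield sentences one at a time: a run of non-terminator characters,
// followed by any sentence terminators and trailing whitespace.
function* splitSentences(text) {
  const pattern = /[^.!?]+[.!?]*\s*/g;
  for (const match of text.matchAll(pattern)) {
    const sentence = match[0].trim();
    if (sentence) yield sentence;
  }
}

const input =
  "Life is like a box of chocolates. You never know what you're gonna get.";
const sentences = [];
for (const sentence of splitSentences(input)) {
  // In a streaming TTS setup, each sentence would be synthesized and played
  // here before moving on to the next one.
  sentences.push(sentence);
}
console.log(sentences);
```

Because this is a generator, the caller decides how fast to pull sentences, which fits naturally with playing audio chunks as they become ready.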

tomaarsen
posted an update about 2 months ago
I just released Sentence Transformers v3.4.0, featuring a memory leak fix, compatibility between the powerful Cached... losses and the Matryoshka loss modifier, and a bunch of fixes & small features.

🪆 Matryoshka & Cached loss compatibility
It is now possible to combine the powerful Cached... losses with the Matryoshka loss modifier. The Cached... losses use in-batch negatives & a caching mechanism to allow for arbitrarily large batch sizes & numbers of negatives. The Matryoshka loss modifier changes a base loss so that the model is trained not only on the maximum dimensionality (e.g. 1024 dimensions), but also on many lower dimensions (e.g. 768, 512, 256, 128, 64, 32).
After training, these models' embeddings can be truncated for faster retrieval, etc.
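
To show what truncation looks like at inference time, here is a minimal sketch in plain JavaScript. The helper names and the toy vectors are made up for illustration (they are not real model output or a library API); the only idea taken from the text is that the first N dimensions of a Matryoshka-trained embedding remain useful on their own.

```javascript
// Keep only the first `dim` values, then rescale to unit length so that
// dot products on the shortened vectors are still cosine similarities.
function truncateAndNormalize(embedding, dim) {
  const truncated = embedding.slice(0, dim);
  const norm = Math.hypot(...truncated);
  return norm === 0 ? truncated : truncated.map((x) => x / norm);
}

function dot(a, b) {
  return a.reduce((sum, x, i) => sum + x * b[i], 0);
}

// Toy 8-dimensional "embeddings" (made-up numbers).
const queryEmbedding = [0.9, 0.1, 0.3, 0.2, 0.05, 0.02, 0.01, 0.03];
const docEmbedding = [0.8, 0.2, 0.4, 0.1, 0.01, 0.04, 0.02, 0.01];

const shortQuery = truncateAndNormalize(queryEmbedding, 4);
const shortDoc = truncateAndNormalize(docEmbedding, 4);
console.log(shortQuery.length, dot(shortQuery, shortDoc));
```

Halving the dimensionality halves both the index size and the cost of each similarity computation, which is where the faster retrieval comes from.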

๐ŸŽž๏ธ Resolve memory leak when Model and Trainer are reinitialized
Due to a circular dependency between Trainer -> Model -> ModelCardData -> Trainer, deleting both the trainer & model still didn't free up the memory.
This led to a memory leak in scripts where you repeatedly do so.

➕ New Features
Many new small features, e.g. multi-GPU support for 'mine_hard_negatives', a 'margin' parameter to TripletEvaluator, and Matthews Correlation Coefficient in the BinaryClassificationEvaluator.
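
For readers unfamiliar with the Matthews Correlation Coefficient, here is a minimal sketch of the standard formula for binary labels. It is a standalone illustration, not the code used by BinaryClassificationEvaluator; the function name and the example labels are made up, and returning 0 when the denominator is zero is a common convention assumed here.

```javascript
// MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
// Ranges from -1 (total disagreement) through 0 (chance) to 1 (perfect).
function matthewsCorrcoef(labels, predictions) {
  let tp = 0, tn = 0, fp = 0, fn = 0;
  for (let i = 0; i < labels.length; i++) {
    if (labels[i] === 1 && predictions[i] === 1) tp++;
    else if (labels[i] === 0 && predictions[i] === 0) tn++;
    else if (labels[i] === 0 && predictions[i] === 1) fp++;
    else fn++;
  }
  const denom = Math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn));
  // Assumed convention: report 0 when a row or column of the confusion
  // matrix is empty and the coefficient is undefined.
  return denom === 0 ? 0 : (tp * tn - fp * fn) / denom;
}

const goldLabels = [1, 1, 1, 0, 0, 0, 0, 0];
const predicted = [1, 1, 0, 0, 0, 0, 0, 1];
const mcc = matthewsCorrcoef(goldLabels, predicted);
console.log(mcc.toFixed(4)); // 0.4667
```

Unlike accuracy, the coefficient stays informative on imbalanced data because it uses all four cells of the confusion matrix.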

๐Ÿ› Bug Fixes
Also a bunch of fixes, for example that subsequent batches were not sorted when using the "no_duplicates" batch sampler. See the release notes for more details.

Full release notes: https://github.com/UKPLab/sentence-transformers/releases/tag/v3.4.0

Big thanks to all community members who assisted in this release. 10 folks with their first contribution this time around!

Xenova
posted an update about 2 months ago
Introducing Kokoro.js, a new JavaScript library for running Kokoro TTS, an 82 million parameter text-to-speech model, 100% locally in the browser w/ WASM. Powered by 🤗 Transformers.js. WebGPU support coming soon!
👉 npm i kokoro-js 👈

Try it out yourself: webml-community/kokoro-web
Link to models/samples: onnx-community/Kokoro-82M-ONNX

You can get started in just a few lines of code!
```javascript
import { KokoroTTS } from "kokoro-js";

// Load the ONNX model; `dtype` selects the precision/quantization level
const tts = await KokoroTTS.from_pretrained(
  "onnx-community/Kokoro-82M-ONNX",
  { dtype: "q8" }, // fp32, fp16, q8, q4, q4f16
);

// Synthesize speech for the text with the chosen voice
const text = "Life is like a box of chocolates. You never know what you're gonna get.";
const audio = await tts.generate(text,
  { voice: "af_sky" }, // See `tts.list_voices()`
);
audio.save("audio.wav"); // Write the result to a WAV file
```

Huge kudos to the Kokoro TTS community, especially taylorchu for the ONNX exports and Hexgrad for the amazing project! None of this would be possible without you all! 🤗

The model is also extremely resilient to quantization. The smallest variant is only 86 MB in size (down from the original 326 MB), with no noticeable difference in audio quality! 🤯

mlabonne
posted an update about 2 months ago
🆕 LLM Course 2025 edition!

I updated the LLM Scientist roadmap and added a ton of new information and references. It covers training, datasets, evaluation, quantization, and new trends like test-time compute scaling.

The LLM Course has been incredibly popular (41.3k stars!) and I've been touched to receive many, many messages about how it helped people in their careers.

I know how difficult this stuff can be, so I'm super proud of the impact it had. I want to keep updating it in 2025, especially with the LLM Engineer roadmap.

Thanks everyone, hope you'll enjoy it!

💻 LLM Course: https://huggingface.co/blog/mlabonne/llm-course

tomaarsen
posted an update about 2 months ago
๐ŸŽ๏ธ Today I'm introducing a method to train static embedding models that run 100x to 400x faster on CPU than common embedding models, while retaining 85%+ of the quality! Including 2 fully open models: training scripts, datasets, metrics.

We apply our recipe to train 2 Static Embedding models that we release today! We release:
2️⃣ an English Retrieval model and a general-purpose Multilingual similarity model (e.g. classification, clustering, etc.), both Apache 2.0
🧠 my modern training strategy: ideation -> dataset choice -> implementation -> evaluation
📜 my training scripts, using the Sentence Transformers library
📊 my Weights & Biases reports with losses & metrics
📕 my list of 30 training and 13 evaluation datasets

The 2 Static Embedding models have the following properties:
🏎️ Extremely fast, e.g. 107500 sentences per second on a consumer CPU, compared to 270 for 'all-mpnet-base-v2' and 56 for 'gte-large-en-v1.5'
0️⃣ Zero active parameters: No Transformer blocks, no attention, not even a matrix multiplication. Super speed!
📏 No maximum sequence length! Embed texts at any length (note: longer texts may embed worse)
📐 Linear instead of superlinear complexity: 2x longer text takes 2x longer, instead of 2.5x or more.
🪆 Matryoshka support: lets you truncate embeddings with minimal performance loss (e.g. 4x smaller with a 0.56% perf. decrease for English Similarity tasks)
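
To give an intuition for why these properties hold, here is a minimal sketch of the general idea behind a static embedding: look up one fixed vector per token and average them. The tiny vocabulary, the whitespace tokenizer, and the function name `embedStatic` are all made up for illustration; the released models use a real tokenizer and learned vectors.

```javascript
// Toy token -> vector table (a real model learns one vector per tokenizer token).
const tokenVectors = new Map([
  ["fast", [1, 0, 0, 0]],
  ["cpu", [0, 1, 0, 0]],
  ["model", [0, 0, 1, 1]],
]);

// Embed a text by averaging the vectors of its known tokens.
// The work is one lookup and one vector addition per token, so the cost
// grows linearly with text length and there is no maximum length.
function embedStatic(text, table, dim) {
  const tokens = text
    .toLowerCase()
    .split(/\s+/)
    .filter((token) => table.has(token));
  const sum = new Array(dim).fill(0);
  for (const token of tokens) {
    const vector = table.get(token);
    for (let k = 0; k < dim; k++) sum[k] += vector[k];
  }
  return tokens.length === 0 ? sum : sum.map((x) => x / tokens.length);
}

const staticEmbedding = embedStatic("Fast CPU model", tokenVectors, 4);
console.log(staticEmbedding);
```

There is no attention and no matrix multiplication anywhere in this path, which is the intuition behind the speed numbers above.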

Check out the full blogpost if you'd like to 1) use these lightning-fast models or 2) learn how to train them with consumer-level hardware: https://huggingface.co/blog/static-embeddings

The blogpost contains a lengthy list of possible advancements; I'm very confident that our 2 models are only the tip of the iceberg, and we may be able to get even better performance.

Alternatively, check out the models:
* sentence-transformers/static-retrieval-mrl-en-v1
* sentence-transformers/static-similarity-mrl-multilingual-v1

danielhanchen
posted an update 2 months ago
We fixed many bugs in Phi-4 & uploaded fixed GGUF + 4-bit versions! ✨

Our fixed versions are even higher on the Open LLM Leaderboard than Microsoft's!

GGUFs: unsloth/phi-4-GGUF
Dynamic 4-bit: unsloth/phi-4-unsloth-bnb-4bit

You can also now finetune Phi-4 for free on Colab: https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Phi_4-Conversational.ipynb

Read our blogpost for more details on bug fixes etc: https://unsloth.ai/blog/phi4

danielhanchen
posted an update 2 months ago

Xenova
posted an update 2 months ago
First project of 2025: Vision Transformer Explorer

I built a web app to interactively explore the self-attention maps produced by ViTs. It shows what the model is focusing on when making predictions, and provides insights into its inner workings! 🤯

Try it out yourself! 👇
webml-community/attention-visualization

Source code: https://github.com/huggingface/transformers.js-examples/tree/main/attention-visualization

tomaarsen
posted an update 2 months ago
That didn't take long! Nomic AI has finetuned the new ModernBERT-base encoder model into a strong embedding model for search, classification, clustering and more!

Details:
🤖 Based on ModernBERT-base with 149M parameters.
📊 Outperforms both nomic-embed-text-v1 and nomic-embed-text-v1.5 on MTEB!
🏎️ Immediate FA2 and unpacking support for super efficient inference.
🪆 Trained with Matryoshka support, i.e. 2 valid output dimensionalities: 768 and 256.
➡️ Maximum sequence length of 8192 tokens!
2️⃣ Trained in 2 stages: unsupervised contrastive data -> high quality labeled datasets.
➕ Integrated in Sentence Transformers, Transformers, LangChain, LlamaIndex, Haystack, etc.
🏛️ Apache 2.0 licensed: fully commercially permissible

Try it out here: nomic-ai/modernbert-embed-base

Very nice work by Zach Nussbaum and colleagues at Nomic AI.