ONNX weights for transformers.js
Hey, awesome work!
Could you add the ONNX weights like in https://huggingface.co/minishlab/M2V_base_output/tree/main?
Hey @do-me, thanks! I can try to create some ONNX weights (Xenova created them last time). I think a few changes might be required to make it work, since inference is now fully NumPy-based. Do you perhaps have some example code that uses the ONNX model from M2V_base_output, so I can test whether everything works?
Sure, this was the conversion code: https://github.com/MinishLab/model2vec/issues/75#issuecomment-2408746794
Unfortunately, that code no longer works, since it dates from when we were still using Torch for inference. Do the old ONNX models from M2V_base_output still work for you (and if so, how are you using them)? If you have a code example, I can see if I can make it work with the new NumPy inference.
Yes, they work perfectly with the latest transformers.js version. Xenova posted the example code in the GitHub issue. I will push an app later this evening that you can use for reference!
@Pringled
here you go: https://jsfiddle.net/wohd0gsj/1 (just have a look at the console, where the embeddings are logged).
If you're curious, I built an app based on model2vec: https://do-me.github.io/semantic-similarity-table/ (but I will only announce it next week). I'd love to add the option to switch between models, either for higher quality (potion) or for multilingual support.
Unfortunately the new model doesn't work yet, though that might be down to the way I'm calling it:
import { AutoModel, AutoTokenizer, Tensor } from 'https://cdn.jsdelivr.net/npm/@huggingface/[email protected]';
const model = await AutoModel.from_pretrained('minishlab/potion-base-8M', {
    config: { model_type: 'model2vec' },
    dtype: 'fp32'
});
const tokenizer = await AutoTokenizer.from_pretrained('minishlab/potion-base-8M');
const texts = ['hello', 'hello world'];
const { input_ids } = await tokenizer(texts, { add_special_tokens: false, return_tensor: false });
const cumsum = arr => arr.reduce((acc, num, i) => [...acc, num + (acc[i - 1] || 0)], []);
const offsets = [0, ...cumsum(input_ids.slice(0, -1).map(x => x.length))];
const flattened_input_ids = input_ids.flat();
const model_inputs = {
    input_ids: new Tensor('int64', flattened_input_ids, [flattened_input_ids.length]),
    offsets: new Tensor('int64', offsets, [offsets.length]),
};
const { embeddings } = await model(model_inputs);
console.log(embeddings.tolist()); // output matches python version
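For anyone reading along: the snippet above passes the model one flat list of token ids plus an `offsets` list marking where each text starts, a layout that resembles the input of PyTorch's `EmbeddingBag`. Here is a minimal sketch of just that bookkeeping in plain JavaScript. The helper name `flattenWithOffsets` and the token ids are made up for illustration and are not part of transformers.js.

```javascript
// Hypothetical helper (not a transformers.js API): flatten per-text token ids
// and record the index at which each text starts in the flattened array.
function flattenWithOffsets(inputIds) {
    const flat = [];
    const offsets = [];
    for (const ids of inputIds) {
        offsets.push(flat.length); // start position of this text's tokens
        flat.push(...ids);
    }
    return { flat, offsets };
}

// Made-up token ids standing in for tokenizer output of two texts.
const exampleIds = [[7592], [7592, 2088]];
const { flat, offsets } = flattenWithOffsets(exampleIds);
console.log(flat);    // [ 7592, 7592, 2088 ]
console.log(offsets); // [ 0, 1 ]
```

This should give the same `offsets` as the `cumsum`-based one-liner above. The loop version just makes it easier to see that each offset is the running token count before that text.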
Seems like the missing (?) tokenizer files are at fault:
Ah, I see, those are needed for Transformers compatibility. I can add those as well; I'll ping you once they are added.
Awesome, great to hear! I'll also add these files to the other POTION models as well as the multilingual model so that you can use those as well :).