Mex Ivanov (MexIvanov)
2 followers · 8 following
AI & ML interests
NLP, Coding, Quantum Computing and more.
Recent Activity
reacted to singhsidhukuldeep's post · 2 days ago:
Exciting News in AI: JinaAI Releases JINA-CLIP-v2!

The team at Jina AI has just released a groundbreaking multilingual multimodal embedding model that's pushing the boundaries of text-image understanding. Here's why this is a big deal:

Technical Highlights:
- Dual encoder architecture combining a 561M parameter Jina XLM-RoBERTa text encoder and a 304M parameter EVA02-L14 vision encoder
- Supports 89 languages with 8,192 token context length
- Processes images up to 512×512 pixels with 14×14 patch size
- Implements FlashAttention2 for text and xFormers for vision processing
- Uses Matryoshka Representation Learning for efficient vector storage

Under The Hood:
- Multi-stage training process with progressive resolution scaling (224→384→512)
- Contrastive learning using InfoNCE loss in both directions
- Trained on a massive multilingual dataset including 400M English and 400M multilingual image-caption pairs
- Incorporates specialized datasets for document understanding, scientific graphs, and infographics
- Uses hard negative mining with 7 negatives per positive sample

Performance:
- Outperforms previous models on visual document retrieval (52.65% nDCG@5)
- Achieves 89.73% image-to-text and 79.09% text-to-image retrieval on the CLIP benchmark
- Strong multilingual performance across 30 languages
- Maintains performance even with 75% dimension reduction (256D vs. 1024D)

Key Innovation:
The model solves the long-standing challenge of unifying text-only and multimodal retrieval systems while adding robust multilingual support. Perfect for building cross-lingual visual search systems!

Kudos to the research team at Jina AI for this impressive advancement in multimodal AI!
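Two training details mentioned in the post (bidirectional InfoNCE and Matryoshka Representation Learning) are easy to illustrate in isolation. Below is a minimal PyTorch sketch, not Jina's actual code: the function names (`bidirectional_info_nce`, `matryoshka_truncate`), the temperature value, and the toy data are assumptions for illustration only.

```python
# A minimal sketch (not Jina's training code) of two ideas from the post:
# bidirectional InfoNCE over paired text/image embeddings, and Matryoshka-style
# truncation of embeddings to a smaller dimension before retrieval.
import torch
import torch.nn.functional as F

def bidirectional_info_nce(text_emb: torch.Tensor,
                           image_emb: torch.Tensor,
                           temperature: float = 0.05) -> torch.Tensor:
    """InfoNCE in both directions: text -> image and image -> text.

    text_emb, image_emb: (batch, dim) embeddings where row i of each is a
    matching text/image pair; the other rows in the batch act as negatives.
    """
    text_emb = F.normalize(text_emb, dim=-1)
    image_emb = F.normalize(image_emb, dim=-1)
    logits = text_emb @ image_emb.T / temperature  # (batch, batch) cosine similarities
    targets = torch.arange(logits.size(0), device=logits.device)
    loss_t2i = F.cross_entropy(logits, targets)    # text -> image direction
    loss_i2t = F.cross_entropy(logits.T, targets)  # image -> text direction
    return (loss_t2i + loss_i2t) / 2

def matryoshka_truncate(emb: torch.Tensor, dim: int = 256) -> torch.Tensor:
    """Keep only the first `dim` components and re-normalize.

    Matryoshka-trained models pack the most useful information into the leading
    dimensions, so a 1024-D vector can be cut to 256-D for cheaper storage and
    search with only a small quality drop.
    """
    return F.normalize(emb[..., :dim], dim=-1)

if __name__ == "__main__":
    # Random tensors stand in for real encoder outputs.
    text = torch.randn(8, 1024)
    image = torch.randn(8, 1024)
    print("loss:", bidirectional_info_nce(text, image).item())
    print("truncated shape:", matryoshka_truncate(text, 256).shape)
```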
reacted to singhsidhukuldeep's post · 3 days ago:
Exciting breakthrough in AI: @Meta's new Byte Latent Transformer (BLT) revolutionizes language models by eliminating tokenization!

The BLT architecture introduces a groundbreaking approach that processes raw bytes instead of tokens, achieving state-of-the-art performance while being more efficient and robust. Here's what makes it special:

>> Key Innovations
Dynamic Patching: BLT groups bytes into variable-sized patches based on entropy, allocating more compute power where the data is more complex. This results in up to 50% fewer FLOPs during inference compared to traditional token-based models.
Three-Component Architecture:
• Lightweight Local Encoder that converts bytes to patch representations
• Powerful Global Latent Transformer that processes patches
• Local Decoder that converts patches back to bytes

>> Technical Advantages
• Matches performance of Llama 3 at 8B parameters while being more efficient
• Superior handling of non-English languages and rare character sequences
• Remarkable 99.9% accuracy on spelling tasks
• Better scaling properties than token-based models

>> Under the Hood
The system uses an entropy model to determine patch boundaries, cross-attention mechanisms for information flow, and hash n-gram embeddings for improved representation. The architecture allows simultaneous scaling of both patch and model size while maintaining fixed inference costs.

This is a game-changer for multilingual AI and could reshape how we build future language models. Excited to see how this technology evolves!
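The dynamic-patching idea (grow a patch until the next byte looks "hard", i.e. its predicted distribution has high entropy) can be sketched with a toy entropy model. The snippet below is a rough illustration, not Meta's BLT implementation: the bigram byte model, the 2.0-bit threshold, and the helper names (`bigram_entropy_model`, `dynamic_patches`) are assumptions standing in for BLT's small learned byte-level LM and its tuned threshold.

```python
# Toy sketch of entropy-based dynamic patching (the idea behind BLT's variable-size
# patches). The "entropy model" here is a simple bigram byte model; BLT uses a
# small learned byte LM and a tuned threshold instead.
import math
from collections import defaultdict

def bigram_entropy_model(train_bytes: bytes):
    """Fit bigram byte counts and return entropy_after(prev_byte)."""
    counts = defaultdict(lambda: defaultdict(int))
    for prev, nxt in zip(train_bytes, train_bytes[1:]):
        counts[prev][nxt] += 1

    def entropy_after(prev: int) -> float:
        dist = counts[prev]
        total = sum(dist.values())
        if total == 0:
            return 8.0  # unseen context: assume maximum uncertainty over 256 bytes
        return -sum((c / total) * math.log2(c / total) for c in dist.values())

    return entropy_after

def dynamic_patches(data: bytes, entropy_after, threshold: float = 2.0):
    """Split `data` into patches, starting a new patch whenever the entropy of
    the next-byte distribution exceeds `threshold` (i.e. the data gets harder)."""
    if not data:
        return []
    patches, current = [], bytearray([data[0]])
    for prev, nxt in zip(data, data[1:]):
        if entropy_after(prev) > threshold:
            patches.append(bytes(current))
            current = bytearray()
        current.append(nxt)
    patches.append(bytes(current))
    return patches

if __name__ == "__main__":
    corpus = b"the quick brown fox jumps over the lazy dog " * 50
    entropy_after = bigram_entropy_model(corpus)
    for patch in dynamic_patches(b"the quick brown fox says hello world", entropy_after):
        print(patch)
```

Predictable stretches (common letter sequences) end up in long patches, while surprising bytes trigger new, short patches, which is how BLT spends more compute where the data is more complex.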
liked a model · 8 days ago: CohereForAI/c4ai-command-r7b-12-2024
Organizations
None yet
MexIvanov's activity
liked a model · 8 days ago
CohereForAI/c4ai-command-r7b-12-2024 · Text Generation · Updated 6 days ago · 25.8k · 311

liked a model · 23 days ago
jinaai/jina-embeddings-v3 · Feature Extraction · Updated 23 days ago · 714k · 613

liked a dataset · 26 days ago
wikimedia/wikipedia · Viewer · Updated Jan 9 · 61.6M · 48.3k · 664

liked a model · about 1 month ago
NexaAIDev/OmniVLM-968M · Updated 10 days ago · 3.79k · 485

liked a dataset · 5 months ago
HuggingFaceTB/smollm-corpus · Viewer · Updated Sep 6 · 237M · 76.6k · 269

liked a model · 6 months ago
sentence-transformers/LaBSE · Sentence Similarity · Updated Mar 27 · 581k · 237

liked a dataset · 6 months ago
sentence-transformers/trivia-qa-triplet · Viewer · Updated Jun 21 · 52.9M · 309 · 5

liked 2 models · 7 months ago
mistralai/Mistral-7B-v0.3 · Text Generation · Updated Jul 24 · 3.2M · 409
openbmb/MiniCPM-Llama3-V-2_5 · Image-Text-to-Text · Updated Sep 25 · 28.5k · 1.38k

liked 2 models · 9 months ago
urchade/gliner_large_bio-v0.1 · Token Classification · Updated Apr 9 · 135 · 9
urchade/gliner_medium-v2.1 · Token Classification · Updated Aug 21 · 18.8k · 28

liked 9 models · 10 months ago
urchade/gliner_large-v1 · Updated Apr 10 · 1.32k · 4
urchade/gliner_medium-v2 · Updated Apr 10 · 82 · 5
urchade/gliner_large-v2 · Token Classification · Updated Jul 12 · 6.03k · 44
urchade/gliner_small-v1 · Token Classification · Updated Apr 10 · 855 · 9
urchade/gliner_small-v2 · Updated Apr 10 · 203 · 6
urchade/gliner_medium-v1 · Updated May 7 · 139 · 5
urchade/gliner_multi · Token Classification · Updated Apr 10 · 27.7k · 124
urchade/gliner_base · Token Classification · Updated Apr 10 · 3.23k · 71
sambanovasystems/SambaLingo-Russian-Chat · Text Generation · Updated Apr 16 · 197 · 52