Doula Isham Rashik Hasan (disham993)
AI & ML interests: Machine Learning, Deep Learning, Natural Language Processing
Recent Activity
commented on an article • 6 days ago • How to Use FastAPI MCP Server: A Complete Guide
liked a model • about 1 month ago • yasserrmd/Text2SQL-1.5B
Organizations
disham993's activity

commented on How to Use FastAPI MCP Server: A Complete Guide • 6 days ago
"It has nothing to do with AI architecture."

reacted to onekq's post with 🔥 • 25 days ago
Post • 1850
🐋DeepSeek-v3-0324🐋 ranks higher than v3 and is now tied with QwQ-32B
onekq-ai/WebApp1K-models-leaderboard

upvoted an article • about 1 month ago
Article • Training and Finetuning Embedding Models with Sentence Transformers v3 • 211

reacted to m-ric's post with 🚀🔥👍 • about 2 months ago
Post • 3093
Less is More for Reasoning (LIMO): a 32B model fine-tuned on just 817 examples can beat o1-preview on math reasoning! 🤯
Do we really need o1's huge RL procedure to see reasoning emerge? It seems not.
Researchers from Shanghai Jiao Tong University just demonstrated that carefully selected examples can boost math performance in large language models using SFT, with no huge datasets or RL procedures needed.
Their procedure allows Qwen2.5-32B-Instruct to jump from 6.5% to 57% on AIME and from 59% to 95% on MATH, while using only 1% of the data of previous approaches.
⚡ The Less-is-More Reasoning Hypothesis:
‣ Minimal but precise examples that showcase optimal reasoning patterns matter more than sheer quantity
‣ Pre-training knowledge plus sufficient inference-time compute is what levels up math skills
➡️ Core techniques:
‣ High-quality reasoning chains with self-verification steps
‣ 817 handpicked problems that encourage deeper reasoning
‣ Enough inference-time computation to allow extended reasoning
💪 Efficiency gains:
‣ Only 817 examples instead of 100k+
‣ 40.5% absolute improvement across 10 diverse benchmarks, outperforming models trained on 100x more data
This really challenges the notion that SFT leads to memorization rather than generalization! And it opens up reasoning research to GPU-poor researchers 🚀
Read the full paper here 👉 LIMO: Less is More for Reasoning (2502.03387)
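The recipe behind LIMO is plain supervised fine-tuning on a small curated set. Here is a minimal sketch of that setup with TRL's SFTTrainer; the dataset id (GAIR/LIMO), its column names, the smaller stand-in base model, and all hyperparameters are illustrative assumptions, not the paper's exact configuration:

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# LIMO-style training is ordinary SFT on ~817 curated problems with long,
# self-verifying reasoning chains -- no RL stage involved.
# GAIR/LIMO is assumed to be the authors' released dataset; the base model
# below is a smaller stand-in for the paper's Qwen2.5-32B-Instruct.
dataset = load_dataset("GAIR/LIMO", split="train")

def to_text(example):
    # Assumed schema: each record pairs a question with a full worked solution.
    return {"text": f"Question: {example['question']}\n\nSolution: {example['solution']}"}

dataset = dataset.map(to_text)

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-7B-Instruct",
    train_dataset=dataset,
    args=SFTConfig(
        output_dir="limo-sft",
        dataset_text_field="text",
        max_seq_length=8192,            # reasoning chains are long; give them room
        num_train_epochs=15,            # many epochs stay cheap on only 817 examples
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        learning_rate=1e-5,
        bf16=True,
    ),
)
trainer.train()
```

The point of the sketch is the data budget, not the trainer: with under a thousand examples, curation quality and long sequence lengths do the heavy lifting.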

upvoted an article • about 2 months ago
Article • Introducing the Synthetic Data Generator - Build Datasets with Natural Language • 123

disham993/electrical-classification-bert-large • Text Classification • Updated • 12
disham993/electrical-classification-bert-base • Text Classification • Updated • 1 • 1
disham993/electrical-classification-distilbert-base • Text Classification • Updated • 2 • 1
disham993/electrical-classification-ModernBERT-large • Text Classification • Updated • 43 • 1
disham993/electrical-classification-ModernBERT-base • Text Classification • Updated • 16 • 4
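These classifiers can be tried directly with the Hugging Face transformers pipeline. A minimal sketch, assuming the checkpoint is public and ships its label mapping in its config; the input sentence is made up, and ModernBERT needs a recent transformers release (>= 4.48):

```python
from transformers import pipeline

# Load one of the electrical-domain text classifiers from the Hub.
classifier = pipeline(
    "text-classification",
    model="disham993/electrical-classification-ModernBERT-base",
)

# Illustrative input; the label set depends on the model's config.
print(classifier("The circuit breaker tripped after the transformer overheated."))
# e.g. [{'label': '...', 'score': 0.97}]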

disham993/electrical-ner-distilbert-base • Token Classification • Updated • 8
disham993/electrical-ner-bert-large • Token Classification • Updated • 4 • 1
disham993/electrical-ner-bert-base • Token Classification • Updated • 14 • 1
disham993/electrical-ner-ModernBERT-large • Token Classification • Updated • 11 • 2
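The NER checkpoints follow the same pattern via the token-classification pipeline. A minimal sketch, assuming the standard setup; the example sentence and the expectation of electrical-domain entity types are assumptions:

```python
from transformers import pipeline

# aggregation_strategy="simple" merges word-piece tokens back into
# whole entity spans instead of returning per-token tags.
ner = pipeline(
    "token-classification",
    model="disham993/electrical-ner-ModernBERT-large",
    aggregation_strategy="simple",
)

# Illustrative input; entity types depend on the model's label set.
print(ner("Connect the 10k ohm resistor to the LM358 op-amp's inverting input."))
# each hit: {'entity_group': ..., 'word': ..., 'start': ..., 'end': ..., 'score': ...}
```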