Article: A Gentle Introduction to 8-bit Matrix Multiplication for transformers at scale using transformers, accelerate and bitsandbytes (Aug 17, 2022)
Article: Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA (May 24, 2023)
Paper: Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate (arXiv:2501.17703, published Jan 29, 2025)
Paper: Parameters vs FLOPs: Scaling Laws for Optimal Sparsity for Mixture-of-Experts Language Models (arXiv:2501.12370, published Jan 21, 2025)