Ken Tsui

kenhktsui

AI & ML interests

ML Engineer Lead. Researcher on small language models: building classifiers to find high-quality data, reasoning benchmarks, synthetic data.

kenhktsui's activity

upvoted a collection 3 days ago

Llama 4

Collection

Llama 4 release • 10 items • Updated 3 days ago • 399

liked 2 models 12 days ago

hon9kon9ize/CantoneseLLMChat-v1.0-7B

Text Generation • Updated 28 days ago • 2.18k • 6

alvanlii/whisper-small-cantonese

Automatic Speech Recognition • Updated Nov 12, 2024 • 4.29k • 85

upvoted a paper 13 days ago

Scaling Vision Pre-Training to 4K Resolution

Paper • 2503.19903 • Published 14 days ago • 39

liked 2 datasets 14 days ago

nvidia/Llama-Nemotron-Post-Training-Dataset-v1

Viewer • Updated 22 days ago • 15.2M • 14.2k • 346

nvidia/PhysicalAI-Robotics-GR00T-X-Embodiment-Sim

Updated 7 days ago • 34.3k • 105

liked a dataset 15 days ago

TommyChien/UltraDomain

Preview • Updated Sep 9, 2024 • 398 • 20

liked a model 20 days ago

manycore-research/SpatialLM-Llama-1B

Text Generation • Updated 19 days ago • 16k • 921

updated a model 23 days ago

kenhktsui/finefineweb-domain-fasttext-classifier

Text Classification • Updated 23 days ago • 27 • 1

updated a collection 23 days ago

FastText Model for Pretraining Data Curation

Collection

6 items • Updated 23 days ago • 2

published a model 23 days ago

kenhktsui/finefineweb-domain-fasttext-classifier

Text Classification • Updated 23 days ago • 27 • 1

updated a dataset 24 days ago

kenhktsui/FineFineWeb-First100K

Viewer • Updated 24 days ago • 6.7M • 429

published a dataset 24 days ago

kenhktsui/FineFineWeb-First100K

Viewer • Updated 24 days ago • 6.7M • 429

liked a dataset 25 days ago

m-a-p/FineFineWeb

Viewer • Updated Dec 19, 2024 • 4.89B • 364k • 46

liked a model 27 days ago

google/gemma-3-27b-it

Image-Text-to-Text • Updated 18 days ago • 1.02M • 1.13k

liked a model 28 days ago

MCG-NJU/videomae-base

Video Classification • Updated Mar 29, 2024 • 43.9k • 44

liked a dataset about 1 month ago

ontocord/lighteval_MATH-Hard

Viewer • Updated Jan 29 • 3.63k • 159 • 3

liked a Space about 1 month ago

The Ultra-Scale Playbook 🌌

The ultimate guide to training LLM on large GPU Clusters • 2.43k

liked a dataset about 1 month ago

Congliu/Chinese-DeepSeek-R1-Distill-data-110k

Viewer • Updated Feb 21 • 110k • 4.72k • 616