Huang Liang Hsun
lianghsun
AI & ML interests
Founder of Twinkle AI. Focused on applying deep learning in legal and scientific domains, with expertise in NLP and model fine-tuning.
Recent Activity
updated a model about 1 hour ago: twinkle-ai/Llama-3.2-3B-F1-Instruct
replied to their post about 18 hours ago:
With the arrival of Twinkle April — Twinkle AI’s annual open-source celebration held every April — our community is excited to unveil its very first project:
📊 Twinkle Eval (https://github.com/ai-twinkle/Eval), a next-generation evaluation tool led by our contributor @tedslin.
Unlike traditional evaluation tools like iKala’s ievals (https://github.com/ikala-ai/ievals), which can only evaluate language models (LMs) one sample at a time, Twinkle Eval is designed with Large Reasoning Models (LRMs) in mind. As reasoning time increases with more complex models, traditional tools become increasingly inefficient 😲 — for example, evaluating LRMs on the https://huggingface.co/datasets/ikala/tmmluplus benchmark could take half a day without finishing.
One question we were especially curious about:
Does shuffling multiple-choice answer order impact model accuracy? 🤔
→ See: "Changing Answer Order Can Decrease MMLU Accuracy" – arXiv:2406.19470v1
To address these challenges, Twinkle Eval brings three key innovations to the table:
1️⃣ Parallelized evaluation of samples
2️⃣ Multi-round testing for stability
3️⃣ Randomized answer order to test robustness
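A minimal sketch of what randomized answer order (3️⃣) could look like, assuming a simple four-option multiple-choice format. The function name `shuffle_choices` and the prompt layout are illustrative only, not Twinkle Eval's actual API:

```python
import random

def shuffle_choices(question: str, choices: list[str], answer_idx: int,
                    rng: random.Random) -> tuple[str, int]:
    """Shuffle a multiple-choice item's options and remap the gold label,
    so a model cannot benefit from positional bias (e.g. always picking A)."""
    order = list(range(len(choices)))
    rng.shuffle(order)
    shuffled = [choices[i] for i in order]
    # Find where the originally-correct choice landed after shuffling.
    new_answer_idx = order.index(answer_idx)
    labels = "ABCD"  # assumes at most four options
    prompt = question + "\n" + "\n".join(
        f"({labels[i]}) {c}" for i, c in enumerate(shuffled)
    )
    return prompt, new_answer_idx

prompt, gold = shuffle_choices(
    "What is 2 + 2?", ["3", "4", "5", "22"], answer_idx=1,
    rng=random.Random(0),
)
```

Scoring the same items across several such shuffles (combined with multi-round testing) exposes positional bias that a fixed A/B/C/D order can hide.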
After running experiments, we observed that Twinkle Eval can speed up evaluation by up to 15× 🚀🚀. Interestingly, most models scored slightly lower under multi-round testing (2️⃣) and randomized answer order (3️⃣) than their claimed performance — suggesting further benchmarking is needed.
This framework also comes with additional tunable parameters and detailed logging of LM behavior per question — perfect for those who want to dive deeper. 😆
If you find Twinkle Eval useful, please ⭐ the project and help spread the word 🤗
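The parallelized-evaluation idea (1️⃣) can be sketched as follows. Because API-served LRM inference is dominated by per-request latency, issuing many samples concurrently is where most of the wall-clock speedup comes from. Here `query_model` is a placeholder stand-in, not Twinkle Eval's real client:

```python
from concurrent.futures import ThreadPoolExecutor

def query_model(sample: str) -> str:
    """Placeholder for a real model/API call; in practice this would
    send `sample` to an inference endpoint and return the completion."""
    return sample.upper()

def evaluate_parallel(samples: list[str], workers: int = 8) -> list[str]:
    """Evaluate many samples concurrently instead of one at a time.
    Threads suit I/O-bound API calls; map() preserves input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(query_model, samples))

results = evaluate_parallel(["a", "b", "c"])
```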
updated a collection 1 day ago: 🏎️ Formosa-1 Series
Organizations
lianghsun's activity
Upload tokenizer_config.json
#1 opened 15 days ago by minyichen

Upload train-00000-of-00001.parquet
#2 opened 16 days ago by lianghsun

Upload tw_instruct_R1_liang.json
#4 opened 19 days ago by lianghsun

Upload 3 files
#5 opened 21 days ago by minyichen

Upload datasets.jsonl
#2 opened 21 days ago by minyichen

Will there be a version updated for 2025?
#2 opened 23 days ago by lianghsun

Upload identity.json
#4 opened 24 days ago by minyichen

Upload 2 files
#2 opened 24 days ago by minyichen

Question About Benchmark Version in README
#9 opened 24 days ago by lianghsun

Upload validation-00000-of-00001.parquet
#2 opened 26 days ago by lianghsun

The playground is broken
#2 opened 3 months ago by metalnow
🚩 Report: Legal issue(s)
#1 opened 3 months ago by wayne1998
Dataset Viewer issue: UnexpectedError
#2 opened 4 months ago by lianghsun

free-gpt-4o-chat
#1 opened 4 months ago by avadhuta
Usage issues after converting to GGUF
#2 opened 4 months ago by AtwoodYen
Want to try Llama-3.2-Taiwan-1B? 😎
#1 opened 4 months ago by lianghsun

Want to try Llama-3.2-Taiwan-1B? 😎
#2 opened 4 months ago by lianghsun

Convert to GGUF format failed!
#1 opened 4 months ago by leotaipei
