Jingang Wang

bitwjg

Recent Activity

authored a paper 5 days ago

SampleMix: A Sample-wise Pre-training Data Mixing Strategy by Coordinating Data Quality and Diversity

liked a Space 7 days ago

nanotron/ultrascale-playbook

authored a paper 6 months ago

Large Language Models Meet Open-World Intent Discovery and Recognition: An Evaluation of ChatGPT

bitwjg's activity

authored a paper 5 days ago

SampleMix: A Sample-wise Pre-training Data Mixing Strategy by Coordinating Data Quality and Diversity

Paper • 2503.01506 • Published 6 days ago • 8

liked a Space 7 days ago

The Ultra-Scale Playbook

The ultimate guide to training LLM on large GPU Clusters

authored 12 papers 6 months ago

Large Language Models Meet Open-World Intent Discovery and Recognition: An Evaluation of ChatGPT

Paper • 2310.10176 • Published Oct 16, 2023 • 1

Lifting the Curse of Capacity Gap in Distilling Language Models

Paper • 2305.12129 • Published May 20, 2023

Retrieval-based Knowledge Transfer: An Effective Approach for Extreme Large Language Model Compression

Paper • 2310.15594 • Published Oct 24, 2023 • 1

Improving Document Representations by Generating Pseudo Query Embeddings for Dense Retrieval

Paper • 2105.03599 • Published May 8, 2021

FutureTOD: Teaching Future Knowledge to Pre-trained Language Model for Task-Oriented Dialogue

Paper • 2306.10315 • Published Jun 17, 2023 • 1

DolphCoder: Echo-Locating Code Large Language Models with Diverse and Multi-Objective Instruction Tuning

Paper • 2402.09136 • Published Feb 14, 2024 • 1

XPrompt: Exploring the Extreme of Prompt Tuning

Paper • 2210.04457 • Published Oct 10, 2022 • 1

Semi-Supervised Knowledge-Grounded Pre-training for Task-Oriented Dialog Systems

Paper • 2210.08873 • Published Oct 17, 2022 • 1

Parallel Decoding via Hidden Transfer for Lossless Large Language Model Acceleration

Paper • 2404.12022 • Published Apr 18, 2024

Speculative Decoding via Early-exiting for Faster LLM Inference with Thompson Sampling Control Mechanism

Paper • 2406.03853 • Published Jun 6, 2024

What's Wrong with Your Code Generated by Large Language Models? An Extensive Study

Paper • 2407.06153 • Published Jul 8, 2024

How Do Your Code LLMs Perform? Empowering Code Instruction Tuning with High-Quality Data

Paper • 2409.03810 • Published Sep 5, 2024 • 35

upvoted 2 papers 6 months ago

ReMamba: Equip Mamba with Effective Long-Sequence Modeling

Paper • 2408.15496 • Published Aug 28, 2024 • 12

How Do Your Code LLMs Perform? Empowering Code Instruction Tuning with High-Quality Data

Paper • 2409.03810 • Published Sep 5, 2024 • 35

authored a paper 6 months ago

ReMamba: Equip Mamba with Effective Long-Sequence Modeling

Paper • 2408.15496 • Published Aug 28, 2024 • 12