Training Dynamics

community

training-dynamics's activity

oskarvanderwal

authored 4 papers 7 days ago

Inseq: An Interpretability Toolkit for Sequence Generation Models

Paper • 2302.13942 • Published Feb 27, 2023 • 1

Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling

Paper • 2304.01373 • Published Apr 3, 2023 • 9

Identifying and Adapting Transformer-Components Responsible for Gender Bias in an English Language Model

Paper • 2310.12611 • Published Oct 19, 2023

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Paper • 2211.05100 • Published Nov 9, 2022 • 31

pietrolesci

authored a paper 7 days ago

PolyPythias: Stability and Outliers across Fifty Language Model Pre-Training Runs

Paper • 2503.09543 • Published Mar 12, 2025

pietrolesci

authored a paper about 2 months ago

Self-Training Large Language Models for Tool-Use Without Demonstrations

Paper • 2502.05867 • Published Feb 9, 2025

pietrolesci

updated a dataset about 2 months ago

EleutherAI/pile-preshuffled-seeds

Updated Feb 27 • 69 • 1

pietrolesci

authored a paper 5 months ago

Tending Towards Stability: Convergence Challenges in Small Language Models

Paper • 2410.11451 • Published Oct 15, 2024

pietrolesci

authored a paper 10 months ago

Causal Estimation of Memorisation Profiles

Paper • 2406.04327 • Published Jun 6, 2024 • 1

pietrolesci

authored a paper about 1 year ago

AnchorAL: Computationally Efficient Active Learning for Large and Imbalanced Datasets

Paper • 2404.05623 • Published Apr 8, 2024 • 3

pietrolesci

authored a paper over 1 year ago

Diable: Efficient Dialogue State Tracking as Operations on Tables

Paper • 2305.17020 • Published May 26, 2023

Team members 2