Kostis Gourgoulias

kgourgou

http://kgourgou.me

AI & ML interests

Language modeling, few-shot learning, Bayesian inference, information theory, uncertainty quantification.

Recent Activity

liked a model 12 days ago

UW/OLMo2-8B-SuperBPE-t160k

upvoted a collection 12 days ago

SuperBPE

liked a dataset 18 days ago

kgourgou/hugging-face-language-models

kgourgou's activity

upvoted a collection 12 days ago

SuperBPE

Collection

SuperBPE tokenizers and models trained with them • 7 items • Updated 16 days ago • 13

upvoted a collection about 1 month ago

Hallucination detection

Collection

ModernBERT models (base and large) trained to detect hallucinations in LLM responses. The models are trained as token classifiers. • 4 items • Updated about 1 month ago • 15

upvoted 2 papers about 2 months ago

An Open Recipe: Adapting Language-Specific LLMs to a Reasoning Model in One Day via Model Merging

Paper • 2502.09056 • Published Feb 13 • 30

Agency Is Frame-Dependent

Paper • 2502.04403 • Published Feb 6 • 23

upvoted an article 3 months ago

Train 400x faster Static Embedding Models with Sentence Transformers

Article • Jan 15 • 170

upvoted a paper 8 months ago

Can Large Language Models Infer Causation from Correlation?

Paper • 2306.05836 • Published Jun 9, 2023 • 6

upvoted a paper 10 months ago

TextGrad: Automatic "Differentiation" via Text

Paper • 2406.07496 • Published Jun 11, 2024 • 31

upvoted a collection 10 months ago

Models and Linearity

Collection

2 items • Updated Jun 14, 2024 • 1

upvoted 2 papers 11 months ago

Your Transformer is Secretly Linear

Paper • 2405.12250 • Published May 19, 2024 • 158

Not All Language Model Features Are Linear

Paper • 2405.14860 • Published May 23, 2024 • 41

upvoted an article 11 months ago

Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent

Article • Apr 22, 2024 • 80

upvoted 2 papers 12 months ago

Low-rank Adaptation of Large Language Model Rescoring for Parameter-Efficient Speech Recognition

Paper • 2309.15223 • Published Sep 26, 2023 • 19

Resolving Interference When Merging Models

Paper • 2306.01708 • Published Jun 2, 2023 • 14

upvoted a collection over 1 year ago

Model Merging

Collection

Model merging is a very popular technique for LLMs nowadays. Here is a chronological list of papers in the space that will help you get started with it! • 30 items • Updated Jun 12, 2024 • 236

upvoted a paper over 1 year ago

Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time

Paper • 2203.05482 • Published Mar 10, 2022 • 6

upvoted a collection over 1 year ago

Papers about model merging

Collection

referenced in the mergekit repo: https://github.com/cg123/mergekit • 4 items • Updated Feb 13, 2024 • 14

upvoted a paper over 1 year ago

LLM in a flash: Efficient Large Language Model Inference with Limited Memory

Paper • 2312.11514 • Published Dec 12, 2023 • 258

upvoted a paper almost 2 years ago

Textbooks Are All You Need

Paper • 2306.11644 • Published Jun 20, 2023 • 142