

Christopher Schröder

cschroeder


https://github.com/webis-de/small-text

AI & ML interests

NLP, Active Learning, Text Representations, PyTorch

Recent Activity

updated a model 9 days ago

small-text/word2vec-google-news-300

published a model 9 days ago

small-text/word2vec-google-news-300



cschroeder's activity

upvoted a paper 29 days ago

EuroBERT: Scaling Multilingual Encoders for European Languages

Paper • 2503.05500 • Published Mar 7 • 76

upvoted a paper about 1 month ago

NeoBERT: A Next-Generation BERT

Paper • 2502.19587 • Published Feb 26 • 39

upvoted a collection 5 months ago

Models for dataset curation

9 items • Updated Dec 5, 2024 • 17

upvoted a paper 5 months ago

Self-Training for Sample-Efficient Active Learning for Text Classification with Pre-Trained Language Models

Paper • 2406.09206 • Published Jun 13, 2024 • 1

upvoted a collection 5 months ago

OpenCulture

A multilingual dataset of public domain books and newspapers. • 27 items • Updated Nov 6, 2024 • 125

upvoted a collection 6 months ago

EU20-Benchmarks

Evaluation Benchmarks for 20 European languages. • 5 items • Updated Oct 11, 2024 • 8

upvoted an article 7 months ago

AI Policy @🤗: Open ML Considerations in the EU AI Act

Article • Jul 24, 2023 • 2

upvoted a paper 8 months ago

Multi-Layer Transformers Gradient Can be Approximated in Almost Linear Time

Paper • 2408.13233 • Published Aug 23, 2024 • 25

upvoted 3 papers 9 months ago

Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies

Paper • 2407.13623 • Published Jul 18, 2024 • 57

RETVec: Resilient and Efficient Text Vectorizer

Paper • 2302.09207 • Published Feb 18, 2023 • 3

LLM-jp: A Cross-organizational Project for the Research and Development of Fully Open Japanese LLMs

Paper • 2407.03963 • Published Jul 4, 2024 • 19

upvoted a paper 10 months ago

AnchorAL: Computationally Efficient Active Learning for Large and Imbalanced Datasets

Paper • 2404.05623 • Published Apr 8, 2024 • 3

upvoted a collection 11 months ago

🎧AI Podcasts and Talks!

🤗Cool stuff to listen to at any time! • 10 items • Updated Oct 6, 2023 • 5

upvoted a paper 12 months ago

Small-Text: Active Learning for Text Classification in Python

Paper • 2107.10314 • Published Jul 21, 2021 • 1