hw4_tokenizer_20k / README.md
library_name: transformers
tags: []

Tokenizer

A tokenizer with a vocab size of 20k for Intro to Deep Learning Homework 4 on Language Modelling and Automatic Speech Recognition.

The tokenizer was trained on the LibriSpeech LM text corpus.
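
A minimal sketch of how the tokenizer might be loaded and checked, assuming it is published in a format that `AutoTokenizer` from `transformers` can read. The repo id `alexgichamba/hw4_tokenizer_20k` is an assumption inferred from the page title and uploader name, and `load_tokenizer` and `round_trip` are hypothetical helper names used only for illustration.

```python
# Assumption: repo id inferred from the page title and uploader name;
# replace it if the tokenizer is hosted under a different id.
REPO_ID = "alexgichamba/hw4_tokenizer_20k"


def load_tokenizer(repo_id: str = REPO_ID):
    """Load the tokenizer from the Hugging Face Hub (needs network access)."""
    # Imported lazily so round_trip() below works without transformers installed.
    from transformers import AutoTokenizer

    return AutoTokenizer.from_pretrained(repo_id)


def round_trip(tokenizer, text: str):
    """Encode text to token ids, then decode those ids back to a string."""
    ids = tokenizer.encode(text)
    return ids, tokenizer.decode(ids)
```

With network access, `tok = load_tokenizer()` followed by `round_trip(tok, "THE QUICK BROWN FOX")` returns the token ids and the decoded string, and `len(tok)` reports the vocabulary size, which should be close to 20k if the assumptions above hold.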