Geneformer/geneformer/tokenizer.py

Commit History

Add option for variable input_size and to add CLS/SEP Tokens (#299)
aa25cd2

ctheodoris hchen725 committed on

edit docstring format to highlight options
e3330a6

Christina Theodoris committed on

change doc formatting
17f036a

Christina Theodoris committed on

add sphinx docs
2a0dcbe

Christina Theodoris committed on

Add option for modified batch size for loom tokenizer
0960cf6

Christina Theodoris committed on

Add option for modifying chunk size for anndata tokenizer
fd93ebf

Christina Theodoris committed on

Add error for no files found and suppress loompy import warning
abdf980

Christina Theodoris committed on

Update tokenizer to allow tokenization without custom cell attributes
57b9778

Christina Theodoris committed on

Modify tokenizer to allow renaming attr names btwn loom and .dataset
e78c44d

Christina Theodoris committed on

Add further explanation regarding input file format for transcriptome tokenizer
c34ead6

Christina Theodoris committed on

Add further explanation to tokenizer example script and updated tokenizer to match loompy raised error
78dd83b

Christina Theodoris committed on

Fix bug with metadata when processing multiple .loom files (#3)
044d737

ctheodoris davidjwen committed on

Add data collator for cell classification and example for cell classification
088ea6e

Christina Theodoris committed on

Add Geneformer tokenizer and updated model card
5426788

Christina Theodoris committed on