Commit 3796f48 by pszemraj
1 parent: 8eaf377

Update README.md

Files changed (1): README.md (+13 -1)
@@ -13,6 +13,18 @@ library_name: transformers
 After minutes of hard work, it is now available.


-```
+```python
+from transformers import AutoTokenizer
+tokenizer = AutoTokenizer.from_pretrained("BEE-spoke-data/BeeTokenizer")
+
+test_string = "When dealing with Varroa destructor mites, it's crucial to administer the right acaricides during the late autumn months, but only after ensuring that the worker bee population is free from pesticide contamination."

+output = tokenizer(test_string)
+print(f"Test string: {test_string}")
+print(f"Tokens:\n\t{output.input_ids}")
 ```
+
+
+## Notes
+
+- the default tokenizer (on branch `main`) has a vocab size of 32128
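
Review note: the added snippet prints raw token IDs only, so a reader cannot easily relate the output to the vocab size mentioned under Notes. The following is a minimal sketch, not part of the commit, of a helper that reports both size figures and the decoded token strings. `describe_tokenizer` and `ToyTokenizer` are hypothetical names introduced here; the toy class exists only so the sketch runs offline and stands in for a real tokenizer object.

```python
from typing import Any, Dict, List


def describe_tokenizer(tokenizer: Any, text: str) -> Dict[str, Any]:
    """Collect basic facts about a Hugging Face-style tokenizer (hypothetical helper)."""
    input_ids = tokenizer(text)["input_ids"]
    return {
        "vocab_size": tokenizer.vocab_size,  # base vocabulary only
        "total_size": len(tokenizer),        # base vocabulary plus any added tokens
        "num_tokens": len(input_ids),
        "tokens": tokenizer.convert_ids_to_tokens(input_ids),
    }


class ToyTokenizer:
    """Hypothetical whitespace tokenizer, used only so this sketch runs offline."""

    def __init__(self, words: List[str]) -> None:
        self._vocab = {word: idx for idx, word in enumerate(words)}
        self._inverse = {idx: word for word, idx in self._vocab.items()}

    @property
    def vocab_size(self) -> int:
        return len(self._vocab)

    def __len__(self) -> int:
        return len(self._vocab)

    def __call__(self, text: str) -> Dict[str, List[int]]:
        # Split on whitespace and look each word up in the toy vocabulary.
        return {"input_ids": [self._vocab[word] for word in text.split()]}

    def convert_ids_to_tokens(self, ids: List[int]) -> List[str]:
        return [self._inverse[idx] for idx in ids]


toy = ToyTokenizer(["worker", "bee", "population"])
report = describe_tokenizer(toy, "worker bee")
print(report)
```

With `transformers` installed and network access, the same helper should accept the object returned by `AutoTokenizer.from_pretrained("BEE-spoke-data/BeeTokenizer")`. In that case one of the two size figures would be expected to match the 32128 stated in the Notes, though which one depends on whether the tokenizer carries added tokens.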