zhongwang committed on
Commit 78375dc · verified · 1 Parent(s): 52b4176

Update README.md

  1. README.md +8 -0
README.md CHANGED
@@ -1,5 +1,13 @@
 ---
 license: bsd
+language:
+- en
+tags:
+- biology
+- genomics
+- metagenomics
+- DNA
+- microbiome
 ---
 
 This is the base model of GenomeOcean-100M. It is trained with Causal Language Modeling (CLM) and uses a BPE tokenizer with 4096 tokens. It supports a maximum sequence length of 1024 tokens (~5kbp).
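
The README's context-length figures (1024 tokens, roughly 5 kbp) imply an average of a little under 5 base pairs per BPE token. The sketch below only works through that arithmetic to estimate how many full-length windows a longer sequence would need. It does not load the model or tokenizer; the names `approx_bp_per_token` and `estimate_windows` are hypothetical helpers, and the 5,000 bp figure is the README's approximation, so real token counts will vary with sequence content.

```python
import math

MAX_TOKENS = 1024              # maximum sequence length stated in the README
APPROX_BP_PER_WINDOW = 5_000   # the README's "~5kbp" (an approximation)


def approx_bp_per_token() -> float:
    """Average base pairs per token implied by the README's figures."""
    return APPROX_BP_PER_WINDOW / MAX_TOKENS


def estimate_windows(sequence_length_bp: int) -> int:
    """Rough number of non-overlapping full-context windows for a sequence.

    Assumes every window covers about APPROX_BP_PER_WINDOW base pairs,
    which is only an estimate; the actual tokenizer decides the real count.
    """
    if sequence_length_bp < 0:
        raise ValueError("sequence_length_bp must be non-negative")
    return math.ceil(sequence_length_bp / APPROX_BP_PER_WINDOW)


if __name__ == "__main__":
    print(f"about {approx_bp_per_token():.2f} bp per token")
    print(estimate_windows(12_000), "windows for a 12 kbp contig")
```

An estimate like this is only useful for planning how to split long contigs; for exact limits, tokenize the sequence and check the token count against 1024.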