manu committed on
Commit
161527f
1 Parent(s): c51c9e2

Create README.md

Files changed (1)
  1. README.md +40 -0
README.md ADDED
@@ -0,0 +1,40 @@
+ ---
+ license: mit
+ datasets:
+ - cerebras/SlimPajama-627B
+ - oscar-corpus/OSCAR-2301
+ - bigcode/starcoderdata
+ language:
+ - fr
+ - en
+ pipeline_tag: text-generation
+ tags:
+ - legal
+ - art
+ - code
+ - finance
+ - medical
+ - text-generation-inference
+ ---
+
+ # CroissantLLM: A not so flaky bilingual 1.3B model
+
+ An experimental model trained on a small subset of the final data.
+
+ ### Usage
+
+ ```python
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_name = "croissantllm/base_50k"
+
+ # Load the model in half precision; device_map="auto" requires the `accelerate` package
+ model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16, device_map="auto")
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+
+ # Few-shot translation prompt (English -> French)
+ inputs = tokenizer("His name is Bob. -> Il s'appelle Bob.\nHe is heading to the market. -> Il va au marché.\nWe are heading to the beach, let's go together. ->", return_tensors="pt").to(model.device)
+ tokens = model.generate(**inputs, max_length=100, do_sample=True, top_p=0.95, top_k=60, temperature=0.5)
+ print(tokenizer.decode(tokens[0]))
+
+ # Few-shot completion prompt, tokenized without special tokens so that no BOS token is prepended
+ inputs = tokenizer("France -> Paris, Italie -> Rome, Allemagne -> Berlin, Espagne ->", return_tensors="pt", add_special_tokens=False).to(model.device)
+ tokens = model.generate(**inputs, max_length=250, do_sample=True, top_p=0.95, top_k=60)
+ print(tokenizer.decode(tokens[0]))
+ ```