---
license: mit
datasets:
- cerebras/SlimPajama-627B
- oscar-corpus/OSCAR-2301
- bigcode/starcoderdata
language:
- fr
- en
pipeline_tag: text-generation
tags:
- legal
- art
- code
- finance
- medical
- text-generation-inference
---

# CroissantLLM: A not so flaky bilingual 1.3B model

An experimental model trained on a small subsplit of the final data.

### Usage

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "croissantllm/base_50k"
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Few-shot English -> French translation: the prompt ends with "->" so the model completes the translation.
inputs = tokenizer("His name is Bob. -> Il s'appelle Bob.\nHe is heading to the market. -> Il va au marché.\nWe are heading to the beach, let's go together. ->", return_tensors="pt").to(model.device)
tokens = model.generate(**inputs, max_length=100, do_sample=True, top_p=0.95, top_k=60, temperature=0.5)
print(tokenizer.decode(tokens[0]))

# Same idea without the BOS token: add_special_tokens=False stops the tokenizer from adding special tokens.
inputs = tokenizer("France -> Paris, Italie -> Rome, Allemagne -> Berlin, Espagne ->", return_tensors="pt", add_special_tokens=False).to(model.device)
tokens = model.generate(**inputs, max_length=250, do_sample=True, top_p=0.95, top_k=60)
print(tokenizer.decode(tokens[0]))
```
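
Both prompts in the usage snippet follow the same few-shot pattern: a series of `source -> target` pairs, then a final source left open after the arrow. The sketch below shows one way to assemble such prompts programmatically. The helper name `build_few_shot_prompt` and its parameters are hypothetical and not part of this repository or of `transformers`; it only builds a string and does not load the model.

```python
def build_few_shot_prompt(examples, query, arrow=" -> ", sep="\n"):
    """Build a few-shot prompt from (source, target) pairs and an open query.

    Hypothetical helper for illustration only; not provided by the model repo.
    """
    # One "source -> target" line per solved example.
    shots = [f"{src}{arrow}{tgt}" for src, tgt in examples]
    # The last item ends with a bare arrow (no trailing space),
    # leaving the target for the model to generate.
    shots.append(f"{query}{arrow.rstrip()}")
    return sep.join(shots)


# Rebuilds the translation prompt from the usage snippet.
translation_prompt = build_few_shot_prompt(
    [
        ("His name is Bob.", "Il s'appelle Bob."),
        ("He is heading to the market.", "Il va au marché."),
    ],
    "We are heading to the beach, let's go together.",
)

# Rebuilds the capital-city prompt, which separates examples with ", ".
capitals_prompt = build_few_shot_prompt(
    [("France", "Paris"), ("Italie", "Rome"), ("Allemagne", "Berlin")],
    "Espagne",
    sep=", ",
)
```

Either string can be passed to `tokenizer(...)` in place of the hard-coded prompts.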