CT-LLM-Base

🌐 Homepage | 🤗 MAP-CC | 🤗 CHC-Bench | 🤗 CT-LLM | 📖 arXiv | GitHub

CT-LLM-Base is the first Chinese-centric large language model: it is both pre-trained and fine-tuned primarily on Chinese corpora. It offers significant insights into potential biases, Chinese language ability, and multilingual adaptability.

Uses

from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = '<your-model-path>'

tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=False, trust_remote_code=True)

model = AutoModelForCausalLM.from_pretrained(
    model_path,
    device_map="auto",
    torch_dtype='auto'
).eval()

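# Chinese prompt meaning "Once upon a time,"; a base model simply continues the text
input_text = "很久很久以前,"

# Tokenize the prompt and move the tensors to the model's device
inputs = tokenizer(input_text, return_tensors='pt').to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=20)
# The generated sequence contains the prompt tokens followed by the new tokens
response = tokenizer.decode(output_ids[0], skip_special_tokens=True)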

print(response)
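
The snippet above prints the prompt together with its continuation, because generate on a causal language model returns the prompt tokens followed by the newly generated ones. If only the continuation is wanted, one possible approach is to slice off the prompt tokens before decoding. The following is a minimal sketch, not part of the CT-LLM code base: the helper names split_continuation and generate_continuation are hypothetical, and it assumes model and tokenizer were loaded as shown above.

```python
def split_continuation(output_ids, prompt_length):
    """Return only the newly generated token ids.

    Hypothetical helper: assumes `output_ids` holds the prompt tokens
    followed by the generated tokens, as a list or a 1-D tensor.
    """
    if prompt_length < 0 or prompt_length > len(output_ids):
        raise ValueError("prompt_length must be between 0 and len(output_ids)")
    return output_ids[prompt_length:]


def generate_continuation(model, tokenizer, text, max_new_tokens=20):
    """Generate from `text` and decode only the continuation (hypothetical helper)."""
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Number of prompt tokens, read from the tokenized input
    prompt_length = inputs["input_ids"].shape[1]
    new_ids = split_continuation(output_ids[0], prompt_length)
    return tokenizer.decode(new_ids, skip_special_tokens=True)
```

With the model and tokenizer from the snippet above, generate_continuation(model, tokenizer, "很久很久以前,") would return the generated text without the prompt.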