Update README.md
README.md
CHANGED
@@ -31,37 +31,36 @@ CodeShell is a multi-language code LLM developed by the [Knowledge Computing Lab
|
|
31 |
|
32 |
## Quickstart
|
33 |
|
34 |
-
### Code Generation
|
35 |
-
|
36 |
CodeShell offers a model in the Hugging Face format; developers can load and use it with the following code.
|
39 |
|
40 |
```python
|
|
|
41 |
import torch
|
42 |
from transformers import AutoModelForCausalLM, AutoTokenizer
|
43 |
-
tokenizer = AutoTokenizer.from_pretrained("codeshell", trust_remote_code=True)
|
44 |
-
model = AutoModelForCausalLM.from_pretrained("codeshell", trust_remote_code=True).cuda()
|
45 |
-
inputs = tokenizer('def print_hello_world():', return_tensors='pt').cuda()
|
46 |
-
outputs = model.generate(inputs)
|
47 |
-
print(tokenizer.decode(outputs[0]))
|
48 |
-
```
|
49 |
|
50 |
-
|
|
|
|
|
51 |
|
52 |
-
|
|
|
|
|
|
|
|
|
53 |
|
54 |
-
|
55 |
-
|
56 |
-
|
57 |
-
|
58 |
-
inputs = tokenizer(input_text, return_tensors='pt').cuda()
|
59 |
-
outputs = model.generate(inputs)
|
60 |
-
print(tokenizer.decode(outputs[0]))
|
61 |
```
|
62 |
|
63 |
-
|
|
|
|
|
64 |
|
|
|
65 |
|
66 |
CodeShell uses GPT-2 as its base architecture and adopts techniques such as Grouped-Query Attention and RoPE relative position encoding.
|
67 |
|
|
|
31 |
|
32 |
## Quickstart
|
33 |
|
|
|
|
|
34 |
CodeShell offers a model in the Hugging Face format; developers can load and use it with the following code.
|
37 |
|
38 |
```python
|
39 |
+
import time
|
40 |
import torch
|
41 |
from transformers import AutoModelForCausalLM, AutoTokenizer
|
|
|
|
|
|
|
|
|
|
|
|
|
42 |
|
43 |
+
device = torch.device('cuda:0')
|
44 |
+
model = AutoModelForCausalLM.from_pretrained('WisdomShell/CodeShell-7B-Chat', trust_remote_code=True).to(device)
|
45 |
+
tokenizer = AutoTokenizer.from_pretrained('WisdomShell/CodeShell-7B-Chat')
|
46 |
|
47 |
+
history = []
|
48 |
+
query = 'Who are you?'
|
49 |
+
response = model.chat(query, history, tokenizer)
|
50 |
+
print(response)
|
51 |
+
history.append((query, response))  # list.append takes one argument, so store the turn as a tuple
|
52 |
|
53 |
+
query = 'Write an HTTP server in Python'
|
54 |
+
response = model.chat(query, history, tokenizer)
|
55 |
+
print(response)
|
56 |
+
history.append((query, response))  # list.append takes one argument, so store the turn as a tuple
|
|
|
|
|
|
|
57 |
```
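
The multi-turn pattern in the example above can be factored into a small helper. The following is a minimal sketch, assuming the history is a list of `(query, response)` tuples; `chat_turn` and `echo_chat` are hypothetical names introduced here for illustration and are not part of CodeShell or `transformers`. The stub stands in for the model so the sketch runs without a GPU or weights.

```python
from typing import Callable, List, Tuple

# Assumption: one history entry is a (query, response) tuple.
History = List[Tuple[str, str]]


def chat_turn(chat_fn: Callable[[str, History], str], query: str, history: History) -> str:
    """Run one chat turn and record it in the history (hypothetical helper)."""
    response = chat_fn(query, history)
    # Append a single tuple; list.append accepts exactly one argument.
    history.append((query, response))
    return response


def echo_chat(query: str, history: History) -> str:
    """Stand-in for a real model: reports the turn number and echoes the query."""
    return f"turn {len(history) + 1}: {query}"


history: History = []
first = chat_turn(echo_chat, "Who are you?", history)
second = chat_turn(echo_chat, "Write an HTTP server in Python", history)
print(first)
print(second)
```

With the real model, the callable could plausibly be `lambda q, h: model.chat(q, h, tokenizer)`, assuming `model.chat` keeps the `(query, history, tokenizer)` argument order shown in the example above.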
|
58 |
|
59 |
+
Developers can also interact with CodeShell-7B-Chat through the VS Code and JetBrains plugins. For details, see the [VS Code Plugin Repository](https://github.com/WisdomShell/codeshell-vscode) and the [IntelliJ Plugin Repository](https://github.com/WisdomShell/codeshell-intellij).
|
62 |
|
63 |
+
## Model Details
|
64 |
|
65 |
CodeShell uses GPT-2 as its base architecture and adopts techniques such as Grouped-Query Attention and RoPE relative position encoding.
|
66 |
|
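
To make the two techniques named above concrete, here is a minimal pure-Python sketch. It is not CodeShell's implementation. It assumes the interleaved-pair formulation of RoPE with a base of 10000, and the head counts in the example are hypothetical. The helper names `rope`, `dot`, and `kv_head_for` are introduced here for illustration only.

```python
import math
from typing import List


def rope(vec: List[float], position: int, base: float = 10000.0) -> List[float]:
    """Rotate consecutive pairs (x0, x1), (x2, x3), ... by position-dependent angles."""
    dim = len(vec)
    assert dim % 2 == 0, "RoPE needs an even dimension"
    out: List[float] = []
    for i in range(0, dim, 2):
        # Each pair gets its own frequency; later pairs rotate more slowly.
        theta = position * base ** (-i / dim)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out


def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def kv_head_for(query_head: int, num_query_heads: int, num_kv_heads: int) -> int:
    """Grouped-Query Attention: consecutive query heads share one key/value head."""
    assert num_query_heads % num_kv_heads == 0
    group_size = num_query_heads // num_kv_heads
    return query_head // group_size


q = [1.0, 2.0, 3.0, 4.0]
k = [0.5, -1.0, 2.0, 0.25]

# The attention score depends only on the relative offset (2 in both cases).
score_near = dot(rope(q, 5), rope(k, 3))
score_far = dot(rope(q, 12), rope(k, 10))
print(abs(score_near - score_far) < 1e-9)

# Hypothetical sizes: 8 query heads sharing 2 key/value heads.
print([kv_head_for(h, 8, 2) for h in range(8)])
```

The first check illustrates why RoPE is called a relative position encoding: shifting both positions by the same amount leaves the query-key score unchanged. The second shows how grouping reduces the number of key/value heads that must be stored.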