Merge branch 'main' of hf.co:Qwen/Qwen2-72B-Instruct
README.md CHANGED
@@ -16,7 +16,7 @@ Compared with the state-of-the-art opensource language models, including the pre
 
 Qwen2-72B-Instruct supports a context length of up to 131,072 tokens, enabling the processing of extensive inputs. Please refer to [this section](#processing-long-texts) for detailed instructions on how to deploy Qwen2 for handling long texts.
 
-For more details, please refer to our [blog
+For more details, please refer to our [blog](https://qwenlm.github.io/blog/qwen2/) and [GitHub](https://github.com/QwenLM/Qwen2).
 <br>
 
 ## Model Details
@@ -117,7 +117,7 @@ For deployment, we recommend using vLLM. You can enable long-context capabilitie
 }'
 ```
 
-For further usage instructions of vLLM, please refer to
+For further usage instructions of vLLM, please refer to our [GitHub](https://github.com/QwenLM/Qwen2).
 
 **Note**: Presently, vLLM only supports static YaRN, which means the scaling factor remains constant regardless of input length, potentially impacting performance on shorter texts. We advise adding the `rope_scaling` configuration only when processing long contexts is required.
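For orientation, the `}'` context line in the hunk above is the tail of the README's example request to a vLLM OpenAI-compatible server. A minimal sketch of such a call, assuming the server is already running on vLLM's default port with this model loaded (the message contents here are illustrative, not taken from the diff):

```bash
# Query a vLLM OpenAI-compatible server hosting Qwen2-72B-Instruct.
# Assumes the server is listening on localhost:8000 (vLLM's default).
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen2-72B-Instruct",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Tell me something about large language models."}
    ]
  }'
```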
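As for the `rope_scaling` configuration the note mentions: the Qwen2 README enables static YaRN by adding a block like the one below to the checkpoint's `config.json` before launching vLLM. The factor of 4.0 corresponds to scaling the 32,768-token native context up to the 131,072 tokens cited above; the ellipsis stands for the checkpoint's existing config keys, and the exact fields should be verified against the vLLM version you deploy.

```json
{
  ...,
  "rope_scaling": {
    "factor": 4.0,
    "original_max_position_embeddings": 32768,
    "type": "yarn"
  }
}
```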