Fix README linting & add details about auth
README.md
CHANGED
@@ -117,28 +117,35 @@ You can change things like the parameters, or customize the preprompt to better
 ### Running your own models using a custom endpoint
 
-If you want to, you can even run your own models, by having a look at our endpoint project, [text-generation-inference](https://github.com/huggingface/text-generation-inference). You can then add your own
 
 ```
-
-
-"
 ```
 
 ### Custom endpoint authorization
 
-Custom endpoints may require authorization
 
 `echo -n "USER:PASS" | base64`
 
 > VVNFUjpQQVNT
 
 You can then add the generated information and the `authorization` parameter to your `.env.local`.
 
 ```
-"endpoints": [
 {
-"url": "https://HOST:PORT/generate_stream",
 "authorization": "Basic VVNFUjpQQVNT",
 }
 ]
@@ -146,16 +153,16 @@ You can then add the generated information and the `authorization` parameter to
 ### Models hosted on multiple custom endpoints
 
-If the model being hosted will be available on multiple servers/instances add the `weight` parameter to your `.env.local`.
 
 ```
-"endpoints": [
 {
-"url": "https://HOST:PORT/generate_stream",
 "weight": 1
 }
 {
-"url": "https://HOST:PORT/generate_stream",
 "weight": 2
 }
 ...
 ### Running your own models using a custom endpoint
 
+If you want to, you can even run your own models locally, by having a look at our endpoint project, [text-generation-inference](https://github.com/huggingface/text-generation-inference). You can then add your own endpoints to the `MODELS` variable in `.env.local`, by adding an `"endpoints"` key for each model in `MODELS`.
 
 ```
+{
+// rest of the model config here
+"endpoints": [{"url": "https://HOST:PORT/generate_stream"}]
+}
 ```
 
+If `endpoints` is left unspecified, ChatUI will look for the model on the hosted Hugging Face inference API using the model name.
+
 ### Custom endpoint authorization
 
+Custom endpoints may require authorization, depending on how you configure them. Authentication will usually be set either with `Basic` or `Bearer`.
+
+For `Basic` we will need to generate a base64 encoding of the username and password.
 
 `echo -n "USER:PASS" | base64`
 
 > VVNFUjpQQVNT
 
+For `Bearer` you can use a token, which can be grabbed from [here](https://huggingface.co/settings/tokens).
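
As an illustration of the two schemes described above, here is a minimal Python sketch that builds the string expected by the `authorization` parameter. It is not part of the README or of ChatUI; the helper names `basic_authorization` and `bearer_authorization` and the token value `hf_example_token` are hypothetical, chosen only for this example.

```python
import base64


def basic_authorization(user: str, password: str) -> str:
    """Build a Basic value: base64 of "USER:PASS" (hypothetical helper)."""
    raw = f"{user}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def bearer_authorization(token: str) -> str:
    """Build a Bearer value from an access token (hypothetical helper)."""
    return "Bearer " + token


# Same result as `echo -n "USER:PASS" | base64`, prefixed with the scheme.
print(basic_authorization("USER", "PASS"))  # Basic VVNFUjpQQVNT

# "hf_example_token" is a placeholder, not a real token.
print(bearer_authorization("hf_example_token"))
```

Either string can then be used as the value of `authorization` in the endpoint entry.
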
+
 You can then add the generated information and the `authorization` parameter to your `.env.local`.
 
 ```
+"endpoints": [
 {
+"url": "https://HOST:PORT/generate_stream",
 "authorization": "Basic VVNFUjpQQVNT",
 }
 ]
 
 ### Models hosted on multiple custom endpoints
 
+If the model being hosted will be available on multiple servers/instances add the `weight` parameter to your `.env.local`. The `weight` will be used to determine the probability of requesting a particular endpoint.
 
 ```
+"endpoints": [
 {
+"url": "https://HOST:PORT/generate_stream",
 "weight": 1
 }
 {
+"url": "https://HOST:PORT/generate_stream",
 "weight": 2
 }
 ...
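
The README only says that `weight` determines the probability of requesting an endpoint. One plausible reading is that each endpoint is chosen with probability proportional to its weight. The Python sketch below illustrates that reading. It is an assumption, not ChatUI's actual selection code; `pick_endpoint`, the `HOST_A`/`HOST_B` placeholders, and the default weight of 1 are hypothetical.

```python
import random

# Two hypothetical endpoints for the same model; HOST_A/HOST_B are placeholders.
ENDPOINTS = [
    {"url": "https://HOST_A:PORT/generate_stream", "weight": 1},
    {"url": "https://HOST_B:PORT/generate_stream", "weight": 2},
]


def pick_endpoint(endpoints, rng=random):
    """Pick one endpoint with probability proportional to its weight.

    Assumption: an endpoint without a "weight" key counts as weight 1.
    """
    weights = [endpoint.get("weight", 1) for endpoint in endpoints]
    return rng.choices(endpoints, weights=weights, k=1)[0]


# Under this reading, HOST_B is requested about twice as often as HOST_A.
print(pick_endpoint(ENDPOINTS)["url"])
```

With weights 1 and 2, the first endpoint would receive roughly one third of the requests and the second roughly two thirds.
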