Add readme instructions on SageMaker
README.md
CHANGED
@@ -28,11 +28,15 @@ cd falconlite-dev/script
 ./start_falconlite.sh
 ```
 ### Perform inference
-```
+```bash
 # after FalconLite has been completely started
 pip install -r requirements-client.txt
 python falconlite_client.py
 ```
+
+### *New!* Amazon SageMaker Deployment ###
+To deploy FalconLite on SageMaker endpoint, please follow [this notebook](https://github.com/awslabs/extending-the-context-length-of-open-source-llms/blob/main/custom-tgi-ecr/deploy.ipynb).
+
 **Important** - When using FalconLite for inference for the first time, it may require a brief 'warm-up' period that can take 10s of seconds. However, subsequent inferences should be faster and return results in a more timely manner. This warm-up period is normal and should not affect the overall performance of the system once the initialisation period has been completed.
 
 ## Evalution Result ##