# Chill Watcher

Candidate platforms for deployment:

- Hugging Face Inference Endpoints
- Replicate API
- lightning.ai

# platform comparison

> all three platforms support autoscaling

|platform|prediction speed|charges|ease of deployment|
|-|-|-|-|
|huggingface|fast: 20s|high: $0.6/hr (without autoscaling)|easy: `git push`|
|replicate|fast if used frequently: 30s; slow if it needs initialization: 5min|low: $0.02 per generation|difficult: build a docker image and upload it|
|lightning.ai|fast while the app is running: 20s; slow if idle: XXs|low: $30 free per month, $0.18 per init, $0.02 per run|easy: one command|
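
To compare the charges in the table for a given workload, the prices can be put into a small calculator. This is only a sketch: the function names are hypothetical, prices are kept in whole cents to avoid floating-point rounding, and it assumes the $30 on lightning.ai is a monthly credit subtracted from usage.

```python
# All prices are in US cents and come from the comparison table.
HF_CENTS_PER_HOUR = 60         # huggingface: $0.6/hr while the endpoint is up
REPLICATE_CENTS_PER_RUN = 2    # replicate: $0.02 per generation
LIGHTNING_CENTS_PER_INIT = 18  # lightning.ai: $0.18 per init
LIGHTNING_CENTS_PER_RUN = 2    # lightning.ai: $0.02 per run
LIGHTNING_FREE_CENTS = 3000    # assumption: $30 monthly credit


def huggingface_monthly(hours_up: int) -> int:
    """Cost in cents of keeping the endpoint up for `hours_up` hours."""
    return HF_CENTS_PER_HOUR * hours_up


def replicate_monthly(generations: int) -> int:
    """Cost in cents of `generations` predictions."""
    return REPLICATE_CENTS_PER_RUN * generations


def lightning_monthly(inits: int, runs: int) -> int:
    """Cost in cents after subtracting the assumed monthly credit."""
    usage = LIGHTNING_CENTS_PER_INIT * inits + LIGHTNING_CENTS_PER_RUN * runs
    return max(0, usage - LIGHTNING_FREE_CENTS)


# Generations per hour at which pay-per-run matches the hourly rate.
BREAK_EVEN_RUNS_PER_HOUR = HF_CENTS_PER_HOUR // REPLICATE_CENTS_PER_RUN
```

At these rates an always-on huggingface endpoint (720 hours in a 30-day month) comes to $432, and pay-per-generation pricing only catches up at about 30 generations per hour.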

# platform deploy options

## huggingface

> [docs](https://huggingface.co/docs/inference-endpoints/guides/custom_handler)

- requirements: list pip packages in `requirements.txt`
- `init()` and `predict()` functions: implement the `EndpointHandler` class in `handler.py` (its `__init__()` and `__call__()` methods)
- more: modify `handler.py` to customize request handling and inference, and to explore more highly customized features
- deploy: `git push` (with git-lfs) the whole directory, including models, weights, etc., to a huggingface repository, then deploy it with Inference Endpoints. Deployment is automatic after a few clicks, so this is very simple.
- call api: once the endpoint is ready (built, initialized and in a "running" state), make a POST request to the url provided by Inference Endpoints, using the request schema defined in `handler.py`
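
A minimal sketch of what `handler.py` can look like. The class name and the two method signatures follow the custom handler docs linked above; the "model" is a placeholder that only measures text length, and the `inputs` key is an assumed request schema, not necessarily the one this repository uses.

```python
from typing import Any, Dict, List


class EndpointHandler:
    """Sketch of a custom handler; the model here is a placeholder."""

    def __init__(self, path: str = "") -> None:
        # Runs once when the endpoint starts. `path` is the repository
        # directory, where a real handler would load its weights from.
        self.model_dir = path
        # Placeholder model (assumption): reports the input length.
        self.model = lambda text: {"length": len(text)}

    def __call__(self, data: Dict[str, Any]) -> List[Dict[str, Any]]:
        # Runs for every request. `data` is the parsed request body.
        inputs = data.get("inputs", "")
        return [self.model(inputs)]
```

The handler can be exercised locally before pushing, for example `EndpointHandler(path=".")({"inputs": "hello"})`, which makes it easy to check the request schema without waiting for an endpoint build.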

## replicate

> [docs](https://replicate.com/docs/guides/push-a-model)

- requirements: specify all requirements (pip packages, system packages, python version, cuda, etc.) in `cog.yaml`
- `init()` and `predict()` functions: implement the `Predictor` class in `predict.py` (its `setup()` and `predict()` methods)
- more: modify `predict.py`
- deploy:
  1. get a linux GPU machine with 60GB of disk space
  2. install [cog](https://replicate.com/docs/guides/push-a-model) and [docker](https://docs.docker.com/engine/install/ubuntu/#set-up-the-repository)
  3. `git pull` the current repository from huggingface, including the large model files
  4. once `predict.py` and `cog.yaml` are correct, run `cog login` and then `cog push`. cog builds a docker image locally and pushes it to replicate. The image can take around 30GB of disk space, so the push uses a lot of network bandwidth.
- call api: if everything runs successfully and the docker image is pushed to replicate, you will see a web-ui and an API example directly in your replicate repository
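
A minimal sketch of what `predict.py` can look like. `BasePredictor` and `Input` come from cog; the fallback stubs, the `text` parameter and the upper-casing "model" are assumptions added so the sketch also runs on a machine without cog installed.

```python
try:
    from cog import BasePredictor, Input
except ImportError:
    # Fallback stubs (assumption): only for trying the class locally
    # when cog is not installed. They are not part of cog.
    class BasePredictor:
        pass

    def Input(default=None, **kwargs):
        return default


class Predictor(BasePredictor):
    def setup(self) -> None:
        # Runs once when the container starts: load weights here.
        # Placeholder model (assumption): upper-cases the text.
        self.model = str.upper

    def predict(self, text: str = Input(description="Text to process")) -> str:
        # Runs for every prediction request.
        return self.model(text)
```

Calling `setup()` and then `predict(text=...)` by hand is a quick check before starting the much slower `cog push` image build.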

## lightning.ai

> docs: [code](https://lightning.ai/docs/app/stable/levels/basic/real_lightning_component_implementations.html), [deploy](https://lightning.ai/docs/app/stable/workflows/run_app_on_cloud/)

- requirements:
  - pip packages are listed in `requirements.txt`. Some requirements differ from those used for huggingface, so modify the relevant lines in `requirements.txt` as the comments in that file describe
  - other pip packages, system packages and download commands for big model weight files can be listed in a custom build config; see `class CustomBuildConfig(BuildConfig)` in `app.py`. A custom build config can run many linux commands, such as `wget` and `sudo apt-get update`. The custom build config is executed in the `__init__()` of the `PythonServer` class
- `init()` and `predict()` functions: implement the `PythonServer` class in `app.py`. Note:
  - some packages are not yet installed when the file is first loaded (they may only be installed when `__init__()` is called), so put the imports for those packages inside the functions that use them, not at the top of the file, or you may get import errors
  - you can't store your own values on the `PythonServer` instance unless they are predefined attributes, so don't assign any self-defined variables to `self`
  - if you use the custom build config, you have to implement `PythonServer`'s `__init__()` yourself, so don't forget to use the correct function signature
- more: ...
- deploy:
  - `pip install lightning`
  - prepare the directory on your local computer (no GPU needed)
  - list big files in the `.lightningignore` file so they are not uploaded, which shortens deploy time
  - run `lightning run app app.py --cloud` in a local terminal; it uploads the files in the directory to lightning cloud and starts deploying there
  - check error logs in the web-ui under `all logs`
- call api: a valid url appears in the `settings` page of the web-ui only once the app has started successfully. Open that url to see the api
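
The note about import placement can be illustrated with a plain class. This is a sketch of the pattern only: `LazyImportServer` is a hypothetical stand-in, not lightning's `PythonServer`, and the standard-library `json` module stands in for a heavy package that is installed by the build step.

```python
from typing import Any, Dict


class LazyImportServer:
    """Hypothetical stand-in that shows imports deferred to call time."""

    def predict(self, request: Dict[str, Any]) -> Dict[str, Any]:
        # Import inside the method, not at the top of the file: loading
        # the file then never touches the package, and the import only
        # happens once a request arrives, after the build has finished.
        import json  # stands in for a heavy dependency (assumption)

        return {"echo": json.dumps(request, sort_keys=True)}
```

The trade-off is that a missing package shows up on the first request instead of at startup, so it is worth sending one test request right after deploying.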

### installation references:

install docker:

- https://docs.docker.com/engine/install/ubuntu/#set-up-the-repository

install git-lfs:

- https://github.com/git-lfs/git-lfs/blob/main/INSTALLING.md

linux:

```
curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash

sudo apt-get install git-lfs
```

---
license: apache-2.0
---