Chill Watcher
Deployment platforms considered:
- Hugging Face Inference Endpoints
- Replicate API
- Lightning.ai
platform comparison
All three platforms support autoscaling.
platform | prediction speed | charges | ease of deployment |
---|---|---|---|
huggingface | fast: 20s | high: $0.6/hr (without autoscaling) | easy: git push |
replicate | fast if used frequently: 30s; slow if it needs initialization: 5min | low: $0.02 per generation | difficult: build an image and upload it |
lightning.ai | fast with the app running: 20s; slow if idle: XXs | low: free $30 per month, $0.18 per init, $0.02 per run | easy: one command |
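
To make the charges column easier to compare, here is a rough cost sketch using only the prices in the table. It assumes the Hugging Face endpoint runs continuously, that a month has 720 hours, and that the "free $30 per month" on lightning.ai is a credit deducted from usage; these are assumptions for illustration, not billing rules confirmed by the platforms. Amounts are kept in integer cents to avoid floating-point rounding.

```python
# Prices from the comparison table, in cents.
HF_CENTS_PER_HOUR = 60          # $0.6/hr, endpoint kept running
REPLICATE_CENTS_PER_RUN = 2     # $0.02 per generation
LIGHTNING_CENTS_PER_INIT = 18   # $0.18 per init
LIGHTNING_CENTS_PER_RUN = 2     # $0.02 per run
LIGHTNING_FREE_CENTS = 3000     # $30 per month, assumed to be a usage credit

HOURS_PER_MONTH = 720           # assumption: 30 days x 24 hours


def huggingface_monthly_cents(hours: int = HOURS_PER_MONTH) -> int:
    """Cost of keeping one endpoint up for `hours` hours."""
    return HF_CENTS_PER_HOUR * hours


def replicate_monthly_cents(runs: int) -> int:
    """Cost of `runs` generations billed per generation."""
    return REPLICATE_CENTS_PER_RUN * runs


def lightning_monthly_cents(inits: int, runs: int) -> int:
    """Cost of `inits` cold starts and `runs` predictions after the assumed credit."""
    usage = LIGHTNING_CENTS_PER_INIT * inits + LIGHTNING_CENTS_PER_RUN * runs
    return max(0, usage - LIGHTNING_FREE_CENTS)


def breakeven_runs_per_hour() -> float:
    """Generations per hour above which an always-on endpoint is cheaper than per-run billing."""
    return HF_CENTS_PER_HOUR / REPLICATE_CENTS_PER_RUN


if __name__ == "__main__":
    print("huggingface, always on:", huggingface_monthly_cents() / 100, "USD")
    print("replicate, 2000 runs:", replicate_monthly_cents(2000) / 100, "USD")
    print("lightning.ai, 10 inits + 2000 runs:", lightning_monthly_cents(10, 2000) / 100, "USD")
    print("break-even runs per hour:", breakeven_runs_per_hour())
```

Under these assumptions, per-generation billing stays cheaper than an always-on endpoint until usage reaches about 30 generations per hour.
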
platform deploy options
huggingface
- requirements: list the pip packages in `requirements.txt`
- `init()` and `predict()` functions: use `handler.py`, implement the `EndpointHandler` class
- more: modify `handler.py` to customize request handling and inference, and to explore more highly customized features
- deploy: git (lfs) push to the huggingface repository (the whole directory, including models, weights, etc.), then use inference endpoints to deploy. Deployment is click-and-deploy and automatic, so it is very simple.
- call api: use the url provided by inference endpoints after the endpoint is ready (built, initialized and in a "running" state), and make a POST request to that url using the request schema defined in `handler.py`
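
The handler contract described above can be sketched as follows. This is a minimal sketch, assuming the usual custom-handler layout in which `EndpointHandler.__init__` receives the repository path and `__call__` receives the parsed request body with an `inputs` key. The echo "model" and the `model_dir` field are placeholders for illustration, not this project's actual inference code.

```python
from typing import Any, Dict, List


class EndpointHandler:
    """Sketch of the class that inference endpoints load from handler.py."""

    def __init__(self, path: str = ""):
        # `path` points at the pushed repository (code plus weights).
        # A real handler would load the model from `path` here; this
        # placeholder only remembers the directory so the sketch runs anywhere.
        self.model_dir = path

    def __call__(self, data: Dict[str, Any]) -> List[Dict[str, Any]]:
        # The request schema is whatever this method chooses to read.
        # Here we assume {"inputs": ..., "parameters": {...}}.
        inputs = data.get("inputs", "")
        parameters = data.get("parameters", {})
        # Placeholder inference: echo the request instead of running a model.
        return [{"input": inputs, "parameters": parameters, "model_dir": self.model_dir}]


if __name__ == "__main__":
    handler = EndpointHandler(path=".")
    print(handler({"inputs": "a photo of a cat", "parameters": {"steps": 20}}))
```

Calling the handler locally like this is a quick way to check the request schema before pushing; the same JSON body is what the POST request to the endpoint url would carry.
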
replicate
- requirements: specify all requirements (pip packages, system packages, python version, cuda, etc.) in `cog.yaml`
- `init()` and `predict()` functions: use `predict.py`, implement the `Predictor` class
- more: modify `predict.py`
- deploy:
  - get a linux GPU machine with 60GB of disk space
  - install cog and docker
  - `git pull` the current repository from huggingface, including the large model files
  - after `predict.py` and `cog.yaml` are correctly coded, run `cog login`, then `cog push`. Cog builds a docker image locally and pushes it to replicate. As the image can take 30GB or so of disk space, the push uses a lot of network bandwidth.
- call api: if everything runs successfully and the docker image is pushed to replicate, you will see a web-ui and an API example directly in your replicate repository
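
The shape of `predict.py` can be sketched as follows. This is a minimal sketch, assuming Cog's `BasePredictor` interface with a one-time `setup()` and a per-request `predict()`. The `prompt` parameter and the string-prefix "model" are hypothetical placeholders, and the `ImportError` fallback exists only so the sketch can be run on a machine without cog installed.

```python
try:
    from cog import BasePredictor
except ImportError:
    # Fallback for running this sketch outside a Cog environment.
    BasePredictor = object


class Predictor(BasePredictor):
    def setup(self) -> None:
        # Runs once when the container starts: load model weights here.
        # Placeholder state standing in for a loaded model.
        self.prefix = "generated: "

    def predict(self, prompt: str) -> str:
        # Runs for every request: replace with the real inference call.
        return self.prefix + prompt


if __name__ == "__main__":
    predictor = Predictor()
    predictor.setup()
    print(predictor.predict("a photo of a cat"))
```

Keeping the expensive loading in `setup()` matters here because, per the comparison table, a cold start on replicate can take minutes while a warm prediction takes seconds.
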
lightning.ai
- requirements:
  - pip packages are listed in `requirements.txt`. Note that some requirements differ from those for huggingface, so you need to modify some lines in `requirements.txt` according to the comments in that file
  - other pip packages, system packages and download commands for some big model weight files can be listed in a custom build config. See `class CustomBuildConfig(BuildConfig)` in `app.py`. In a custom build config you can use many linux commands such as `wget` and `sudo apt-get update`. The custom build config is executed in the `__init__()` of the `PythonServer` class
- `init()` and `predict()` functions: use `app.py`, implement the `PythonServer` class. Note:
  - some packages haven't been installed yet when the file is loaded (they may only be installed when `__init__()` is called), so some imports should go inside the functions, not at the top of the file, or you may get import errors
  - you can't save your own values on `PythonServer`'s `self` unless the attribute is predefined, so don't assign any self-defined variables to `self`
  - if you use the custom build config, you have to implement `PythonServer`'s `__init__()` yourself, so don't forget to use the correct function signature
- more: ...
- deploy:
  - `pip install lightning`
  - prepare the directory on your local computer (no need to have a GPU)
  - list big files in the `.lightningignore` file to avoid uploading them and to save deploy time
  - run `lightning run app app.py --cloud` in the local terminal. It uploads the files in the directory to lightning cloud and starts deploying on the cloud
  - check error logs on the web-ui, use `all logs`
- call api: only if the app starts successfully can you see a valid url in the `settings` page of the web-ui. Open that url to see the api
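
Two of the notes above (import inside the function, and don't store your own values on `self`) can be illustrated with plain Python. This is a sketch under stated assumptions: `ServerSketch` is a hypothetical stand-in for the `PythonServer` subclass in `app.py` and does not use the Lightning API, `json` stands in for a heavy package that is only installed by the build commands, and the module-level cache is one possible workaround rather than the approach taken in `app.py`.

```python
from typing import Any, Callable, Dict, Optional

# Module-level cache used instead of a self-defined attribute on `self`.
_model: Optional[Callable[[Dict[str, Any]], str]] = None


def load_model() -> Callable[[Dict[str, Any]], str]:
    """Load the placeholder model on first use and reuse it afterwards."""
    global _model
    if _model is None:
        # Deferred import: at module import time the package may not be
        # installed yet, so it is imported only when first needed.
        import json

        def run(payload: Dict[str, Any]) -> str:
            # Placeholder inference: serialize the request deterministically.
            return json.dumps(payload, sort_keys=True)

        _model = run
    return _model


class ServerSketch:
    """Hypothetical stand-in for the PythonServer subclass in app.py."""

    def predict(self, request: Dict[str, Any]) -> Dict[str, str]:
        # No attribute is assigned on `self`; state lives in the module cache.
        model = load_model()
        return {"result": model(request)}


if __name__ == "__main__":
    print(ServerSketch().predict({"prompt": "a photo of a cat"}))
```

The first call pays the import and loading cost, and later calls reuse the cached callable, which mirrors the slow-if-idle and fast-when-running behaviour noted in the comparison table.
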
some stackoverflow:
install docker:
install git-lfs:
curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash
sudo apt-get install git-lfs