# flux-schnell-edge-inference

This repository holds the baseline for the FLUX Schnell NVIDIA GeForce RTX 4090 contest. It can be forked freely and optimized.

Some recommendations are as follows:
- Install dependencies via `pyproject.toml`, including git dependencies
- Specify HuggingFace models in the `models` array in the `pyproject.toml` file; they will be downloaded before benchmarking
- The pipeline does **not** have internet access, so all dependencies and models must be included in the `pyproject.toml`
- Host compiled models on HuggingFace and include them in the `models` array in the `pyproject.toml` (rather than compiling during loading). Loading time matters far more than file size
- Avoid changing `src/main.py`, as it contains mostly protocol logic. Most changes should go in `models` and `src/pipeline.py`
- Ensure the entire repository (excluding dependencies and HuggingFace models) is under 16 MB
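For reference, a `pyproject.toml` following the points above might look like the sketch below. The table holding the `models` array and both package names are assumptions for illustration; mirror whatever schema the baseline file already uses:

```toml
[project]
name = "flux-schnell-edge-inference"
version = "0.1.0"
# Declare all dependencies here, including git dependencies --
# the pipeline has no internet access at benchmark time.
dependencies = [
    "diffusers",
    "some-package @ git+https://github.com/example/some-package",  # hypothetical git dependency
]

[tool.example-contest]  # assumed table name; copy it from the baseline file
models = [
    "black-forest-labs/FLUX.1-schnell",
    "your-username/your-compiled-model",  # hypothetical precompiled artifact hosted on HuggingFace
]
```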
For testing, you need a Docker container with PyTorch and Ubuntu 22.04.

You can download your listed dependencies with `uv`, installed with:
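One way to get such a container is sketched below. The specific image tag is an assumption; any CUDA-enabled PyTorch image built on Ubuntu 22.04 works:

```bash
# Pull a PyTorch image (assumed tag -- substitute the Ubuntu 22.04
# based image you prefer).
docker pull pytorch/pytorch:latest

# Run it with GPU access and the repository mounted at /workspace.
docker run --gpus all -it \
    -v "$(pwd):/workspace" \
    -w /workspace \
    pytorch/pytorch:latest \
    bash
```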
```bash
pipx ensurepath
pipx install uv
```

You can then relock with `uv lock`, and run with `uv run start_inference`.