# flux-schnell-edge-inference
This repository holds the baseline for the FLUX Schnell NVIDIA GeForce RTX 4090 contest; fork it freely and optimize.
Some recommendations:
- Declare dependencies, including git dependencies, in `pyproject.toml` (sketched after this list)
- Specify HuggingFace models in the `models` array of `pyproject.toml`; they will be downloaded before benchmarking
- The pipeline does **not** have internet access, so all dependencies and models must be listed in `pyproject.toml`
- Host compiled models on HuggingFace and include them in the `models` array of `pyproject.toml` (rather than compiling during loading); loading time matters far more than file size
- Avoid changing `src/main.py`, as it mostly contains protocol logic; most changes belong in the `models` array and `src/pipeline.py` (see the sketches after this list)
- Keep the entire repository (excluding dependencies and HuggingFace models) under 16MB
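A sketch of how those pieces might fit together in `pyproject.toml`. Apart from the standard `[project]` table, the names below are assumptions to check against the baseline file: the table holding the `models` array, the script target, and both example repositories are hypothetical.

```toml
[project]
name = "flux-schnell-edge-inference"
version = "1.0.0"
dependencies = [
    "torch",
    "diffusers",
    # git dependencies go inline as PEP 508 direct references (hypothetical package)
    "example-package @ git+https://github.com/example/example-package",
]

[project.scripts]
start_inference = "main:main"  # assumed target; keep whatever the baseline maps here

# Assumed table name — mirror the baseline's pyproject.toml
[tool.edge-maxxing]
models = [
    "black-forest-labs/FLUX.1-schnell",
    "your-username/flux-schnell-compiled",  # hypothetical precompiled-artifact repo
]
```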
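And a minimal sketch of `src/pipeline.py`, assuming a two-hook shape (a one-time loader plus a per-request inference function) driven by `src/main.py`; the actual function names and request fields should be taken from the baseline.

```python
# Hypothetical shape of src/pipeline.py — function names and request fields
# are assumptions; mirror whatever src/main.py actually calls.
import torch
from diffusers import FluxPipeline


def load_pipeline():
    """Runs once before benchmarking, so all loading cost lands here."""
    pipeline = FluxPipeline.from_pretrained(
        "black-forest-labs/FLUX.1-schnell",
        torch_dtype=torch.bfloat16,
    ).to("cuda")
    return pipeline


def infer(request, pipeline):
    """Runs per benchmark request; keep this path as fast as possible."""
    generator = torch.Generator("cuda").manual_seed(request.seed)
    return pipeline(
        request.prompt,
        num_inference_steps=4,  # Schnell is distilled for very few steps
        generator=generator,
    ).images[0]
```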
For testing, you need a Docker container with PyTorch and Ubuntu 22.04.
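A minimal sketch of launching such a container; the image tag is an assumption (any PyTorch image built on Ubuntu 22.04 should do):

```bash
# Assumed tag — substitute any PyTorch image based on Ubuntu 22.04
docker run --gpus all -it --rm \
    -v "$PWD":/workspace -w /workspace \
    pytorch/pytorch:2.2.2-cuda12.1-cudnn8-devel bash
```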
You can download your listed dependencies with `uv`, which can be installed with:
```bash
pipx ensurepath
pipx install uv
```
You can then re-lock the dependencies with `uv lock` and start the benchmark with `uv run start_inference`.
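Putting that together:

```bash
uv lock                 # re-resolve the lockfile after editing pyproject.toml
uv run start_inference  # launch the benchmark entry point
```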