
Model Card for VAR (Visual AutoRegressive) Transformers πŸ”₯

VAR is a new visual generation framework that makes GPT-style models surpass diffusion models in image generation for the first time πŸš€, and it exhibits clear power-law scaling laws πŸ“ˆ like large language models (LLMs).
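As a quick illustration of what a power-law scaling law means: test loss falls as a power of model size, L(N) ≈ a · N^(−b), which is a straight line in log-log space. The sketch below uses made-up constants (a = 5.0, b = 0.3) purely for demonstration; the actual VAR scaling coefficients are reported in the paper.

```python
import math

# Hypothetical power law: loss(N) = a * N**(-b) for parameter count N.
# a and b are illustrative constants, NOT VAR's measured coefficients.
a, b = 5.0, 0.3
sizes = [1e6, 1e7, 1e8, 1e9]                  # hypothetical model sizes
losses = [a * n ** (-b) for n in sizes]

# A power law is linear in log-log space, so the slope between any two
# points recovers the exponent -b.
slope = (math.log(losses[-1]) - math.log(losses[0])) / (
    math.log(sizes[-1]) - math.log(sizes[0])
)
print(round(-slope, 3))  # recovers b = 0.3
```

The same log-log fit is how scaling-law exponents are typically estimated from empirical (model size, loss) pairs.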

VAR redefines autoregressive learning on images as coarse-to-fine "next-scale prediction" (or "next-resolution prediction"), diverging from the standard raster-scan "next-token prediction".
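A minimal sketch of the difference between the two orderings, assuming an illustrative scale schedule of token-map side lengths (the actual schedule used by each VAR checkpoint is defined in the GitHub repo):

```python
# Example scale schedule: side lengths of the token maps predicted at each
# autoregressive step, from coarse to fine. This schedule is an assumption
# for illustration, not necessarily the one your checkpoint uses.
scales = [1, 2, 3, 4, 5, 6, 8, 10, 13, 16]

# Raster-scan next-token prediction on the finest 16x16 map:
# every token is one sequential step.
raster_steps = 16 * 16                       # 256 sequential steps

# Next-scale prediction: one autoregressive step per scale; each step
# emits an entire r x r token map in parallel.
var_steps = len(scales)                      # 10 sequential steps
total_tokens = sum(r * r for r in scales)    # tokens emitted across all scales

print(raster_steps, var_steps, total_tokens)  # 256 10 680
```

The key point is that the number of sequential steps grows with the number of scales, not with the number of tokens, which is where VAR's inference speedup over raster-scan autoregression comes from.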

This repo hosts VAR's checkpoints.

For more details and tutorials, see https://github.com/FoundationVision/VAR.

