Yu's MemoCapsule

Autoregression

Generating Images Like Texts

Can we generate images the same way an autoregressive language model generates text? Although this sounds simpler than diffusion models, it still runs into serious computational cost problems. But don’t worry too much: there are several brilliant methods that try to make this idea more competitive.

Taming Transformer -> Patrick Esser, et al. CVPR 2021

The key challenge of autoregressive generation is how to handle the quadratically increasing cost of self-attention over image sequences, which are much longer than text sequences.
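To get a feel for the scale of the problem, here is a rough back-of-the-envelope sketch. It assumes full self-attention, where every position attends to every other position, so the number of scored pairs grows with the square of the sequence length. The 256x256 resolution, the 512-token text length, and the 16x downsampling factor are illustrative assumptions rather than figures from the text above, and the helper names are hypothetical.

```python
def image_seq_len(height: int, width: int, downsample: int = 1) -> int:
    """Sequence length when an image is flattened into one token per grid cell.

    downsample=1 means one token per pixel; a larger value means the image
    is first compressed to a coarser grid (an assumed preprocessing step).
    """
    return (height // downsample) * (width // downsample)


def attention_pairs(seq_len: int) -> int:
    """Number of query-key pairs scored by full self-attention."""
    return seq_len * seq_len


# Illustrative settings (assumptions, not taken from the post).
text_len = 512                                       # a typical text context
pixel_len = image_seq_len(256, 256)                  # one token per pixel
grid_len = image_seq_len(256, 256, downsample=16)    # one token per 16x16 patch

for name, n in [("text", text_len), ("pixels", pixel_len), ("16x grid", grid_len)]:
    print(f"{name:>9}: length={n:>6}, attention pairs={attention_pairs(n):>13,}")
```

Under these assumptions, the raw pixel sequence needs about 16,000 times as many attention pairs as the text sequence, while the downsampled grid needs fewer pairs than the text does. That gap is why shortening the image sequence matters so much for this family of methods.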