Transformers
2017-Attention is all you need
See [VSP+17], the paper that introduced the Transformer architecture.
2020-An image is worth 16x16 words: Transformers for image recognition at scale
See [DBK+20], known as ViT (Vision Transformer).
- DBK+20
Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, and others. An image is worth 16x16 words: transformers for image recognition at scale. arXiv preprint arXiv:2010.11929, 2020. URL: https://arxiv.org/pdf/2010.11929.
- VSP+17
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. Advances in neural information processing systems, 2017. URL: https://arxiv.org/pdf/1706.03762.