Abstract: Vision Transformer (ViT), a radically different architecture from convolutional neural networks, offers multiple advantages including design simplicity, robustness and state-of-the-art ...