Implementation of Image Classification using Visual Transformers in Amazon SageMaker based on the ideas from research paper - Visual Transformers: Token-based Image Representation and Processing for Computer Vision.

aws-samples/amazon-sagemaker-visual-transformer

Image Classification using Visual Transformers

In standard image classification algorithms such as ResNet and InceptionNet, images are represented as pixel arrays on which a series of convolution operations is performed. Although these algorithms achieve high accuracy, the convolution operation is computationally expensive. In this notebook we therefore look at an alternative way to perform image classification, using the ideas from the research paper Visual Transformers: Token-based Image Representation and Processing for Computer Vision.

Diagram of a Visual Transformer (VT).
For a given image, we first apply convolutional layers to extract low-level features. The output feature map is then fed to the VT, which works in three steps. First, a tokenizer groups pixels into a small number of visual tokens. Second, transformers model the relationships between tokens. Third, the visual tokens are either used directly for image classification or projected back to the feature map for semantic segmentation.
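The three steps in the diagram can be sketched in PyTorch as follows. This is a minimal illustration under stated assumptions, not the notebook's actual model: the class names `FilterTokenizer` and `TinyVisualTransformer`, the layer sizes, and the mean-pooling of tokens before the classifier are all hypothetical choices made for the example. The tokenizer follows the general idea of scoring every pixel against a small set of learned filters and taking a softmax-weighted average of the pixels for each token.

```python
import torch
import torch.nn as nn


class FilterTokenizer(nn.Module):
    """Groups the pixels of a feature map into a small number of visual tokens."""

    def __init__(self, channels: int, num_tokens: int):
        super().__init__()
        # One learned linear filter per token; each filter scores every pixel.
        self.score = nn.Linear(channels, num_tokens, bias=False)

    def forward(self, feature_map: torch.Tensor) -> torch.Tensor:
        # feature_map: (batch, channels, height, width)
        pixels = feature_map.flatten(2).transpose(1, 2)   # (batch, H*W, channels)
        # Softmax over the pixel axis, so each token is a weighted average of pixels.
        attention = self.score(pixels).softmax(dim=1)      # (batch, H*W, tokens)
        return attention.transpose(1, 2) @ pixels          # (batch, tokens, channels)


class TinyVisualTransformer(nn.Module):
    """Convolutional stem -> tokenizer -> transformer -> classifier (illustrative sizes)."""

    def __init__(self, num_classes: int = 6, channels: int = 64, num_tokens: int = 8):
        super().__init__()
        # Step 0: convolutional layers extract low-level features.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, channels, kernel_size=3, stride=2, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(),
        )
        # Step 1: group pixels into visual tokens.
        self.tokenizer = FilterTokenizer(channels, num_tokens)
        # Step 2: model relationships between tokens.
        layer = nn.TransformerEncoderLayer(d_model=channels, nhead=4, dim_feedforward=128)
        self.transformer = nn.TransformerEncoder(layer, num_layers=1)
        # Step 3: classify directly from the tokens.
        self.classifier = nn.Linear(channels, num_classes)

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        tokens = self.tokenizer(self.backbone(images))     # (batch, tokens, channels)
        # The encoder expects (sequence, batch, features) by default.
        tokens = self.transformer(tokens.transpose(0, 1)).transpose(0, 1)
        # Assumption: average the tokens to get one vector per image.
        return self.classifier(tokens.mean(dim=1))         # (batch, num_classes)
```

Calling `TinyVisualTransformer()(torch.randn(2, 3, 32, 32))` returns a tensor of class scores with one row per image. The default of six output classes is an assumption chosen to match the six categories of the Intel Image Classification dataset.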

Note

  • The dataset used is Intel Image Classification from Kaggle.
  • The notebook is only an example and is not intended for production deployments.
  • If you are using Amazon SageMaker Studio, use the Python3 (PyTorch 1.6 Python 3.6 CPU Optimized) kernel and an ml.m5.large (2 vCPU + 8 GiB) instance for the notebook.
  • The notebook uses ideas and some of the pseudocode from the research paper Visual Transformers: Token-based Image Representation and Processing for Computer Vision, but it does not reproduce the results reported in the paper.

Amazon SageMaker


Amazon SageMaker is a fully managed machine learning service. With SageMaker, data scientists and developers can quickly and easily build and train machine learning models, and then directly deploy them into a production-ready hosted environment. It provides an integrated Jupyter authoring notebook instance for easy access to your data sources for exploration and analysis, so you don't have to manage servers. It also provides common machine learning algorithms that are optimized to run efficiently against extremely large data in a distributed environment. With native support for bring-your-own algorithms and frameworks, SageMaker offers flexible distributed training options that adjust to your specific workflows. You can deploy a model into a secure and scalable environment by launching it with a few clicks from SageMaker Studio or the SageMaker console. We use Amazon SageMaker Studio to run the code; for more details, see the AWS documentation.

How to run the code in Amazon SageMaker Studio?


If you haven't used Amazon SageMaker Studio before, please follow the steps mentioned in Onboard to Amazon SageMaker Studio.

To log in from the SageMaker console

  • Onboard to Amazon SageMaker Studio. If you've already onboarded, skip to the next step.
  • Open the SageMaker console.
  • Choose Amazon SageMaker Studio.
  • The Amazon SageMaker Studio Control Panel opens.
  • In the Amazon SageMaker Studio Control Panel, you'll see a list of user names.
  • Next to your user name, choose Open Studio.

Open a Studio notebook

SageMaker Studio can only open notebooks listed in the Studio file browser. In this example we will Clone a Git Repository in SageMaker Studio.

To clone the repo

  • In the left sidebar, choose the File Browser icon.
  • Choose the root folder or the folder you want to clone the repo into.
  • In the left sidebar, choose the Git icon.
  • Choose Clone a Repository.
  • Enter the URI for the repo https://github.com/aws-samples/amazon-sagemaker-visual-transformer.git.
  • Choose CLONE.
  • If the repo requires credentials, you are prompted to enter your username and password.
  • Wait for the download to finish. After the repo has been cloned, the File Browser opens to display the cloned repo.
  • Double click the repo to open it.
  • Choose the Git icon to view the Git user interface, which now tracks the cloned repo.
  • To track a different repo, open the repo in the file browser and then choose the Git icon.

To open a notebook

  • In the left sidebar, choose the File Browser icon to display the file browser.
  • Browse to a notebook file and double-click it to open the notebook in a new tab.

Security

See CONTRIBUTING for more information.

License

This library is licensed under the MIT-0 License. See the LICENSE file.
