AITemplate (AIT) is a Python framework that translates deep neural networks into CUDA (NVIDIA GPU) / HIP (AMD GPU) C++ code for fast inference serving. Highlights of AITemplate include:

  • High performance: Approaching roofline fp16 TensorCore (NVIDIA GPU)/MatrixCore (AMD GPU) performance on major models, including ResNet, MaskRCNN, BERT, VisionTransformer, Stable Diffusion, etc.
  • Unified, open, flexible: fp16 deep neural network models run seamlessly on either NVIDIA or AMD GPUs. Fully open source, with Lego-style, easily extensible high-performance primitives for supporting new models.


Hardware Requirements:

  • NVIDIA: AIT is only tested on SM80+ GPUs. Not all kernels work on older SM75/SM70 (T4/V100) GPUs.
  • AMD: AIT is only tested on CDNA2 (MI-210/250) GPUs. Older CDNA1 (MI-100) GPUs may run into compiler issues.
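
The NVIDIA requirement above can be turned into a quick pre-flight check. The following is a minimal sketch, not part of AITemplate: the helper name `classify_nvidia_sm` and its three labels are made up for illustration, and the optional device query assumes PyTorch is installed with a CUDA build. Compute capabilities that the list above does not mention are reported as such rather than guessed at.

```python
def classify_nvidia_sm(major: int, minor: int) -> str:
    """Map a CUDA compute capability to the support level listed above.

    Hypothetical helper for illustration; the labels are not AITemplate terms.
    """
    sm = major * 10 + minor
    if sm >= 80:
        return "tested"          # SM80+ is the only tested range
    if sm in (70, 75):
        return "partial"         # V100 / T4: not all kernels work
    return "not mentioned"       # the requirements say nothing about these


def main() -> None:
    # Assumption: PyTorch is available; skip the device query otherwise.
    try:
        import torch
    except ImportError:
        print("PyTorch not installed; cannot query the GPU.")
        return
    if not torch.cuda.is_available():
        print("No CUDA device visible through PyTorch.")
        return
    major, minor = torch.cuda.get_device_capability(0)
    print(f"SM{major}{minor}: {classify_nvidia_sm(major, minor)}")


if __name__ == "__main__":
    main()
```

The classification is kept separate from the device query so that it can be reused with a capability obtained some other way, for example from `nvidia-smi` output.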

Clone code

When cloning the repository, pass the --recursive flag so that the submodules are cloned at the same time:

git clone --recursive

Docker image

We strongly recommend using AITemplate with Docker to avoid accidentally using the wrong version of NVCC or HIPCC.

  • CUDA: ./docker/ cuda
  • ROCM: DOCKER_BUILDKIT=1 ./docker/ rocm

This builds a Docker image tagged ait:latest.
