TensorRT Open Source Software
This repository contains the Open Source Software (OSS) components of NVIDIA TensorRT. It includes the sources for TensorRT plugins and ONNX parser, as well as sample applications demonstrating usage and capabilities of the TensorRT platform. These open source software components are a subset of the TensorRT General Availability (GA) release with some extensions and bug-fixes.
- For code contributions to TensorRT-OSS, please see our Contribution Guide and Coding Guidelines.
- For a summary of new additions and updates shipped with TensorRT-OSS releases, please refer to the Changelog.
- For business inquiries, please contact researchinquiries@nvidia.com
- For press and other inquiries, please contact Hector Marinez at hmarinez@nvidia.com
Need enterprise support? NVIDIA global support is available for TensorRT with the NVIDIA AI Enterprise software suite. Check out NVIDIA LaunchPad for free access to a set of hands-on labs with TensorRT hosted on NVIDIA infrastructure.
Join the TensorRT and Triton community and stay current on the latest product updates, bug fixes, content, best practices, and more.
Prebuilt TensorRT Python Package
We provide the TensorRT Python package for easy installation.
To install:
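For example, the prebuilt wheel can typically be installed straight from PyPI (make sure your Python version is within the supported range listed under Prerequisites):

```bash
pip install tensorrt
```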
You can skip the Build section to enjoy TensorRT with Python.
Build
Prerequisites
To build the TensorRT-OSS components, you will first need the following software packages.
TensorRT GA build
- TensorRT v10.9.0.34
- Available from direct download links listed below
System Packages
- CUDA
  - Recommended versions:
    - cuda-12.8.0 + cuDNN-8.9
    - cuda-11.8.0 + cuDNN-8.9
- GNU make >= v4.1
- cmake >= v3.13
- python >= v3.8, <= v3.10.x
- pip >= v19.0
- Essential utilities
  - git, pkg-config, wget
Optional Packages
- Containerized build
  - Docker >= 19.03
  - NVIDIA Container Toolkit
- PyPI packages (for demo applications/tests)
  - onnx
  - onnxruntime
  - tensorflow-gpu >= 2.5.1
  - Pillow >= 9.0.1
  - pycuda < 2021.1
  - numpy
  - pytest
- Code formatting tools (for contributors)
  - Clang-format
  - Git-clang-format
NOTE: onnx-tensorrt, cub, and protobuf packages are downloaded along with TensorRT OSS, and not required to be installed.
Downloading TensorRT Build
- Download TensorRT OSS

```bash
git clone -b main https://github.com/nvidia/TensorRT TensorRT
cd TensorRT
git submodule update --init --recursive
```
- (Optional — if not using TensorRT container) Specify the TensorRT GA release build path

If using the TensorRT OSS build container, TensorRT libraries are preinstalled under `/usr/lib/x86_64-linux-gnu` and you may skip this step. Otherwise, download and extract the TensorRT GA build from NVIDIA Developer Zone with the direct links below:
- TensorRT 10.9.0.34 for CUDA 11.8, Linux x86_64
- TensorRT 10.9.0.34 for CUDA 12.8, Linux x86_64
- TensorRT 10.9.0.34 for CUDA 11.8, Windows x86_64
- TensorRT 10.9.0.34 for CUDA 12.8, Windows x86_64
Example: Ubuntu 20.04 on x86-64 with cuda-12.8
```bash
cd ~/Downloads
tar -xvzf TensorRT-10.9.0.34.Linux.x86_64-gnu.cuda-12.8.tar.gz
export TRT_LIBPATH=`pwd`/TensorRT-10.9.0.34
```
Example: Windows on x86-64 with cuda-12.8
```powershell
Expand-Archive -Path TensorRT-10.9.0.34.Windows.win10.cuda-12.8.zip
$env:TRT_LIBPATH="$pwd\TensorRT-10.9.0.34\lib"
```
Setting Up The Build Environment
For Linux platforms, we recommend that you generate a docker container for building TensorRT OSS as described below. For native builds, please install the prerequisite System Packages.
- Generate the TensorRT-OSS build container.

Example: Ubuntu 20.04 on x86-64 with cuda-12.8 (default)

```bash
./docker/build.sh --file docker/ubuntu-20.04.Dockerfile --tag tensorrt-ubuntu20.04-cuda12.8
```

Example: Rockylinux8 on x86-64 with cuda-12.8

```bash
./docker/build.sh --file docker/rockylinux8.Dockerfile --tag tensorrt-rockylinux8-cuda12.8
```

Example: Ubuntu 22.04 cross-compile for Jetson (aarch64) with cuda-12.8 (JetPack SDK)

```bash
./docker/build.sh --file docker/ubuntu-cross-aarch64.Dockerfile --tag tensorrt-jetpack-cuda12.8
```

Example: Ubuntu 22.04 on aarch64 with cuda-12.8

```bash
./docker/build.sh --file docker/ubuntu-22.04-aarch64.Dockerfile --tag tensorrt-aarch64-ubuntu22.04-cuda12.8
```
- Launch the TensorRT-OSS build container.

Example: Ubuntu 20.04 build container

```bash
./docker/launch.sh --tag tensorrt-ubuntu20.04-cuda12.8 --gpus all
```
NOTE:
1. Use the `--tag` corresponding to the build container generated in Step 1.
2. NVIDIA Container Toolkit is required for GPU access (running TensorRT applications) inside the build container.
3. The `sudo` password for Ubuntu build containers is 'nvidia'.
4. Specify the port number using `--jupyter <port>` for launching Jupyter notebooks, for example as shown below.
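For instance, a launch command that also exposes a Jupyter notebook on port 8888 (the port number is illustrative):

```bash
./docker/launch.sh --tag tensorrt-ubuntu20.04-cuda12.8 --gpus all --jupyter 8888
```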
Building TensorRT-OSS
- Generate Makefiles and build
Example: Linux (x86-64) build with default cuda-12.8
```bash
cd $TRT_OSSPATH
mkdir -p build && cd build
cmake .. -DTRT_LIB_DIR=$TRT_LIBPATH -DTRT_OUT_DIR=`pwd`/out
make -j$(nproc)
```
Example: Linux (aarch64) build with default cuda-12.8
```bash
cd $TRT_OSSPATH
mkdir -p build && cd build
cmake .. -DTRT_LIB_DIR=$TRT_LIBPATH -DTRT_OUT_DIR=`pwd`/out -DCMAKE_TOOLCHAIN_FILE=$TRT_OSSPATH/cmake/toolchains/cmake_aarch64-native.toolchain
make -j$(nproc)
```
Example: Native build on Jetson (aarch64) with cuda-12.8
```bash
cd $TRT_OSSPATH
mkdir -p build && cd build
cmake .. -DTRT_LIB_DIR=$TRT_LIBPATH -DTRT_OUT_DIR=`pwd`/out -DTRT_PLATFORM_ID=aarch64 -DCUDA_VERSION=12.8
CC=/usr/bin/gcc make -j$(nproc)
```
NOTE: C compiler must be explicitly specified via CC= for native aarch64 builds of protobuf.
Example: Ubuntu 22.04 Cross-Compile for Jetson (aarch64) with cuda-12.8 (JetPack)
```bash
cd $TRT_OSSPATH
mkdir -p build && cd build
cmake .. -DCMAKE_TOOLCHAIN_FILE=$TRT_OSSPATH/cmake/toolchains/cmake_aarch64.toolchain -DCUDA_VERSION=12.8 -DCUDNN_LIB=/pdk_files/cudnn/usr/lib/aarch64-linux-gnu/libcudnn.so -DCUBLAS_LIB=/usr/local/cuda-12.8/targets/aarch64-linux/lib/stubs/libcublas.so -DCUBLASLT_LIB=/usr/local/cuda-12.8/targets/aarch64-linux/lib/stubs/libcublasLt.so -DTRT_LIB_DIR=/pdk_files/tensorrt/lib
make -j$(nproc)
```
Example: Native builds on Windows (x86) with cuda-12.8
```powershell
cd $TRT_OSSPATH
mkdir -p build ; cd build
cmake .. -DTRT_LIB_DIR="$env:TRT_LIBPATH" -DCUDNN_ROOT_DIR="$env:CUDNN_PATH" -DTRT_OUT_DIR="$pwd\\out"
msbuild TensorRT.sln /property:Configuration=Release -m:$env:NUMBER_OF_PROCESSORS
```
NOTE: The default CUDA version used by CMake is 12.8.0. To override this, for example to 11.8, append `-DCUDA_VERSION=11.8` to the cmake command.

- Required CMake build arguments are:
  - `TRT_LIB_DIR`: Path to the TensorRT installation directory containing libraries.
  - `TRT_OUT_DIR`: Output directory where generated build artifacts will be copied.
- Optional CMake build arguments:
  - `CMAKE_BUILD_TYPE`: Specify if the generated binaries are for release or debug (contain debug symbols). Values consist of [`Release`] | `Debug`.
  - `CUDA_VERSION`: The version of CUDA to target, for example [`11.7.1`].
  - `CUDNN_VERSION`: The version of cuDNN to target, for example [`8.6`].
  - `PROTOBUF_VERSION`: The version of Protobuf to use, for example [`3.0.0`]. Note: Changing this will not configure CMake to use a system version of Protobuf, it will configure CMake to download and try building that version.
  - `CMAKE_TOOLCHAIN_FILE`: The path to a toolchain file for cross compilation.
  - `BUILD_PARSERS`: Specify if the parsers should be built, for example [`ON`] | `OFF`. If turned OFF, CMake will try to find precompiled versions of the parser libraries to use in compiling samples: first in `${TRT_LIB_DIR}`, then on the system. If the build type is Debug, then it will prefer debug builds of the libraries over release versions if available.
  - `BUILD_PLUGINS`: Specify if the plugins should be built, for example [`ON`] | `OFF`. If turned OFF, CMake will try to find a precompiled version of the plugin library to use in compiling samples: first in `${TRT_LIB_DIR}`, then on the system. If the build type is Debug, then it will prefer debug builds of the libraries over release versions if available.
  - `BUILD_SAMPLES`: Specify if the samples should be built, for example [`ON`] | `OFF`.
  - `GPU_ARCHS`: GPU (SM) architectures to target. By default we generate CUDA code for all major SMs. Specific SM versions can be specified here as a quoted, space-separated list to reduce compilation time and binary size. A table of compute capabilities of NVIDIA GPUs can be found here. Examples:
    - NVidia A100: `-DGPU_ARCHS="80"`
    - Tesla T4, GeForce RTX 2080: `-DGPU_ARCHS="75"`
    - Titan V, Tesla V100: `-DGPU_ARCHS="70"`
    - Multiple SMs: `-DGPU_ARCHS="80 75"`
  - `TRT_PLATFORM_ID`: Bare-metal build (unlike containerized cross-compilation). Currently supported options: `x86_64` (default).
References
TensorRT Resources
- TensorRT Developer Home
- TensorRT QuickStart Guide
- TensorRT Developer Guide
- TensorRT Sample Support Guide
- TensorRT ONNX Tools
- TensorRT Discussion Forums
- TensorRT Release Notes
Known Issues
- Please refer to TensorRT Release Notes
Introduction
Installing TensorRT can be a challenging task, especially for those who are new to the field of deep learning and computer vision. In this article, we will provide a step-by-step guide on how to install TensorRT on your Windows system. We will cover the installation process from downloading the TensorRT package to installing the required dependencies.
Step 1: Downloading TensorRT
The first step in installing TensorRT is to download the package from the official NVIDIA website. To do this, follow these steps:
- Go to the NVIDIA website and click on the Download button.
- You might need to log in to your NVIDIA account to access the download page.
- Choose the TensorRT 10 version from the list of available versions.
- Agree to the terms and conditions and select the TensorRT 10.9 GA option.
- Scroll down and choose the TensorRT 10.9 GA for Windows 10, 11, Server 2022 and CUDA 12.0 to 12.8 ZIP Package option.
Step 2: Extracting the TensorRT Package
Once you have downloaded the TensorRT package, extract the zip file to a location on your computer. This will create a new folder containing the TensorRT package.
Step 3: Copying the TensorRT Files
The next step is to copy the TensorRT files to the correct location on your computer. To do this, follow these steps:
- Go to the lib folder inside the TensorRT package.
- Copy all the files inside the lib folder and paste them into the bin folder of your CUDA v12+ installation (e.g. C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.6\bin); a PowerShell sketch of this step follows the list.
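A minimal PowerShell sketch of this copy step, assuming the package was extracted to `C:\TensorRT-10.9.0.34` and CUDA v12.6 lives in its default location (adjust both paths to your system):

```powershell
# Hypothetical paths: change them to match your TensorRT extraction folder and CUDA version
Copy-Item -Path "C:\TensorRT-10.9.0.34\lib\*" -Destination "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.6\bin"
```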
Step 4: Installing the TensorRT Python Package
The next step is to install the TensorRT Python package. To do this, follow these steps:
- Go back to the TensorRT-10.9.0.34 extraction folder.
- Open a cmd window on that location.
- Run `pip install tensorrt-10.9.0.34-cp311-none-win_amd64.whl`, or, using your ComfyUI portable Python: `The drive + location of your ComfyUI [..] \ComfyUI_windows_portable\python_embeded\python.exe -m pip install tensorrt-10.9.0.34-cp311-none-win_amd64.whl`. A quick version check is sketched below.
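As an optional sanity check (not part of the original guide), confirm that the wheel imports and reports the expected version:

```
python -c "import tensorrt; print(tensorrt.__version__)"
```

This should print 10.9.0.34.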
Step 5: Installing the ComfyUI Requirements
The final step is to install the ComfyUI requirements. To do this, follow these steps:
- Navigate to `\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-Upscaler-Tensorrt`
- Run `pip install -r requirements.txt`, or, using your ComfyUI portable Python: `The drive + location of your ComfyUI [..] \ComfyUI_windows_portable\python_embeded\python.exe -m pip install -r requirements.txt`
Conclusion
Installing TensorRT can be a challenging task, but with this step-by-step guide, you should be able to install it on your Windows system. Remember to follow the instructions carefully and make sure to install the correct dependencies. If you encounter any issues during the installation process, feel free to leave a comment below and we will do our best to assist you.
Troubleshooting
If you encounter any issues during the installation process, here are some common troubleshooting steps you can try (a few quick version-check commands are sketched after this list):
- Make sure you have the correct version of CUDA installed on your computer.
- Make sure you have the correct version of Python installed on your computer.
- Make sure you have the correct version of the TensorRT package installed on your computer.
- Try reinstalling the TensorRT package and the ComfyUI requirements.
- Try running the installation process as an administrator.
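The following commands cover the first three checks, assuming CUDA and Python are on your PATH:

```
nvcc --version                # CUDA toolkit version
nvidia-smi                    # driver version and visible GPUs
python --version              # Python interpreter version
python -m pip show tensorrt   # installed TensorRT Python package version
```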
System Requirements
To install TensorRT, you will need the following system requirements:
- Windows 10 or later
- CUDA v12.0 or later
- Python 3.6 or later (the wheel used in this guide, cp311, requires Python 3.11)
- TensorRT 10.9 GA or later
FAQs
Here are some frequently asked questions about installing TensorRT:
- Q: What is TensorRT?
- A: TensorRT is a high-performance deep learning inference engine developed by NVIDIA.
- Q: What is the difference between TensorRT and other deep learning frameworks?
- A: TensorRT is designed to provide high-performance inference on NVIDIA GPUs, while other frameworks may provide more flexibility and customization options.
- Q: How do I install TensorRT on my Windows system?
- A: Follow the step-by-step guide above to install TensorRT on your Windows system.
References
Here are some references that you can use to learn more about installing TensorRT:
- NVIDIA TensorRT Documentation: https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html
- NVIDIA CUDA Documentation: https://docs.nvidia.com/cuda/index.html
- Python Documentation: https://docs.python.org/3/
TensorRT Q&A
================
Frequently Asked Questions
Here are some frequently asked questions about TensorRT:
Q: What is TensorRT?
A: TensorRT is a high-performance deep learning inference engine developed by NVIDIA. It is designed to provide high-performance inference on NVIDIA GPUs, making it an ideal choice for applications that require fast and accurate inference.
Q: What is the difference between TensorRT and other deep learning frameworks?
A: TensorRT is designed to provide high-performance inference on NVIDIA GPUs, while other frameworks may provide more flexibility and customization options. Other frameworks may also require more computational resources and may not provide the same level of performance as TensorRT.
Q: How do I install TensorRT on my Windows system?
A: To install TensorRT on your Windows system, follow the step-by-step guide above. Make sure you have the correct version of CUDA installed on your computer and that you have the correct version of Python installed on your computer.
Q: What are the system requirements for installing TensorRT?
A: To install TensorRT, you will need the following system requirements:
- Windows 10 or later
- CUDA v12.0 or later
- Python 3.6 or later
- TensorRT 10.9 GA or later
Q: How do I use TensorRT to optimize my deep learning models?
A: To use TensorRT to optimize your deep learning models, you will need to follow these steps:
- Install TensorRT on your system.
- Convert your deep learning model to a TensorRT engine (for example with `trtexec`, as sketched after this list).
- Use the TensorRT engine to optimize your model for inference.
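A minimal sketch of the conversion step using the `trtexec` command-line tool that ships with TensorRT (the file names are illustrative; `--fp16` is optional):

```
trtexec --onnx=model.onnx --saveEngine=model.engine --fp16
```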
Q: What are the benefits of using TensorRT?
A: The benefits of using TensorRT include:
- High-performance inference on NVIDIA GPUs
- Fast and accurate inference
- Easy to use and integrate with other deep learning frameworks
- Supports a wide range of deep learning models and frameworks
Q: How do I troubleshoot issues with TensorRT?
A: To troubleshoot issues with TensorRT, follow these steps:
- Check the TensorRT documentation for troubleshooting guides and FAQs.
- Check the NVIDIA forums for community support and troubleshooting guides.
- Contact NVIDIA support for further assistance.
Q: What are some common issues with TensorRT?
A: Some common issues with TensorRT include:
- Installation issues
- Compatibility issues with other deep learning frameworks
- Performance issues
- Troubleshooting issues
Q: How do I get started with TensorRT?
A: To get started with TensorRT, follow these steps:
- Install TensorRT on your system.
- Convert your deep learning model to a TensorRT engine.
- Use the TensorRT engine to optimize your model for inference.
- Experiment with different TensorRT features and options to optimize your model.
Q: What are some resources for learning more about TensorRT?
A: Some resources for learning more about TensorRT include:
- NVIDIA TensorRT Documentation: https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html
- NVIDIA CUDA Documentation: https://docs.nvidia.com/cuda/index.html
- Python Documentation: https://docs.python.org/3/
- NVIDIA Forums: https://forums.developer.nvidia.com/
- NVIDIA Developer Blog: https://developer.nvidia.com/blog/
System Requirements
The following system requirements are necessary to install and use TensorRT Model Optimizer — Windows:
| Requirement | Value |
|---|---|
| OS | Windows |
| Architecture | amd64 (x86_64) |
| Python | >=3.10,<3.13 |
| CUDA | >=12.0 |
| ONNX Runtime | 1.20.0 |
| Nvidia Driver | 565.90 or newer |
| Nvidia GPU | RTX 40 and 50 series |
Note
- Make sure to use a GPU-compatible driver and other dependencies (e.g. torch). For instance, support for Blackwell GPUs may require an NVIDIA 570+ driver and CUDA 12.8.
- We currently support single-GPU configurations only.
The TensorRT Model Optimizer — Windows can be used in the following ways (a hedged install sketch follows the list):
- Install ModelOpt-Windows as a Standalone Toolkit
- Install ModelOpt-Windows with Olive
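A sketch of the standalone route, assuming the toolkit is published on PyPI as the `nvidia-modelopt` package (check the official ModelOpt-Windows installation guide for the exact package name and any required extras):

```
python -m pip install --upgrade pip
python -m pip install nvidia-modelopt
```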
TensorRT Open Source Software
This repository contains the Open Source Software (OSS) components of NVIDIA TensorRT. Included are the sources for TensorRT plugins and parsers (Caffe and ONNX), as well as sample applications demonstrating usage and capabilities of the TensorRT platform. These open source software components are a subset of the TensorRT General Availability (GA) release with some extensions and bug-fixes.
- For code contributions to TensorRT-OSS, please see our Contribution Guide and Coding Guidelines.
- For a summary of new additions and updates shipped with TensorRT-OSS releases, please refer to the Changelog.
Build
Prerequisites
To build the TensorRT-OSS components, you will first need the following software packages.
TensorRT GA build
- TensorRT v7.2.1
- See Downloading TensorRT Builds for details
System Packages
- CUDA
  - Recommended versions:
    - cuda-11.1 + cuDNN-8.0
    - cuda-11.0 + cuDNN-8.0
    - cuda-10.2 + cuDNN-8.0
- GNU make >= v4.1
- cmake >= v3.13
- python >= v3.6.5
- pip >= v19.0
- Essential utilities
  - git, pkg-config, wget, zlib
Optional Packages
- Containerized build
  - Docker >= 19.03
  - NVIDIA Container Toolkit
- Toolchains and SDKs
  - (Cross compilation for Jetson platform) NVIDIA JetPack >= 4.4
  - (For Windows builds) Visual Studio 2017 Community or Enterprise edition
  - (Cross compilation for QNX platform) QNX Toolchain
- PyPI packages (for demo applications/tests)
  - numpy
  - onnx 1.6.0
  - onnxruntime >= 1.3.0
  - pytest
  - tensorflow-gpu 1.15.4
- Code formatting tools (for contributors)
  - Clang-format
  - Git-clang-format
NOTE: onnx-tensorrt, cub, and protobuf packages are downloaded along with TensorRT OSS, and not required to be installed.
Downloading TensorRT Build
- #### Download TensorRT OSS

On Linux:

```bash
git clone -b master https://github.com/nvidia/TensorRT TensorRT
cd TensorRT
git submodule update --init --recursive
export TRT_SOURCE=`pwd`
```

On Windows:

```powershell
git clone -b master https://github.com/nvidia/TensorRT TensorRT
cd TensorRT
git submodule update --init --recursive
$Env:TRT_SOURCE = $(Get-Location)
```
- #### Download TensorRT GA

To build TensorRT OSS, obtain the corresponding TensorRT GA build from NVIDIA Developer Zone.

Example: Ubuntu 18.04 on x86-64 with cuda-11.1

Download and extract the latest TensorRT 7.2.1 GA package for Ubuntu 18.04 and CUDA 11.1
```bash
cd ~/Downloads
tar -xvzf TensorRT-7.2.1.6.Ubuntu-18.04.x86_64-gnu.cuda-11.1.cudnn8.0.tar.gz
export TRT_RELEASE=`pwd`/TensorRT-7.2.1.6
```

Example: Ubuntu 18.04 on PowerPC with cuda-11.0

Download and extract the latest TensorRT 7.2.1 GA package for Ubuntu 18.04 and CUDA 11.0
```bash
cd ~/Downloads
tar -xvzf TensorRT-7.2.1.6.Ubuntu-18.04.powerpc64le-gnu.cuda-11.0.cudnn8.0.tar.gz
export TRT_RELEASE=`pwd`/TensorRT-7.2.1.6
```

Example: CentOS/RedHat 7 on x86-64 with cuda-11.0

Download and extract the TensorRT 7.2.1 GA for CentOS/RedHat 7 and CUDA 11.0 tar package
```bash
cd ~/Downloads
tar -xvzf TensorRT-7.2.1.6.CentOS-7.6.x86_64-gnu.cuda-11.0.cudnn8.0.tar.gz
export TRT_RELEASE=`pwd`/TensorRT-7.2.1.6
```

Example: Ubuntu 18.04 Cross-Compile for QNX with cuda-10.2

Download and extract the TensorRT 7.2.1 GA for QNX and CUDA 10.2 tar package
```bash
cd ~/Downloads
tar -xvzf TensorRT-7.2.1.6.Ubuntu-18.04.aarch64-qnx.cuda-10.2.cudnn7.6.tar.gz
export TRT_RELEASE=`pwd`/TensorRT-7.2.1.6
export QNX_HOST=/<path-to-qnx-toolchain>/host/linux/x86_64
export QNX_TARGET=/<path-to-qnx-toolchain>/target/qnx7
```

Example: Windows on x86-64 with cuda-11.0

Download and extract the TensorRT 7.2.1 GA for Windows and CUDA 11.0 zip package and add msbuild to PATH
```powershell
cd ~\Downloads
Expand-Archive .\TensorRT-7.2.1.6.Windows10.x86_64.cuda-11.0.cudnn8.0.zip
$Env:TRT_RELEASE = '$(Get-Location)\TensorRT-7.2.1.6'
$Env:PATH += 'C:\Program Files (x86)\Microsoft Visual Studio\2017\Professional\MSBuild\15.0\Bin\'
```
- #### (Optional) JetPack SDK for Jetson builds

Using the JetPack SDK manager, download the host components. Steps:
- Download and launch the SDK manager. Login with your developer account.
- Select the platform and target OS (example: Jetson AGX Xavier, `Linux Jetpack 4.4`), and click Continue.
- Under `Download & Install Options` change the download folder and select `Download now, Install later`. Agree to the license terms and click Continue.
- Move the extracted files into the `$TRT_SOURCE/docker/jetpack_files` folder.
Setting Up The Build Environment
For native builds, install the prerequisite System Packages. Alternatively (recommended for non-Windows builds), install Docker and generate a build container as described below:
- #### Generate the TensorRT-OSS build container.

The TensorRT-OSS build container can be generated using the Dockerfiles and build script included with TensorRT-OSS. The build container is bundled with the packages and environment required for building TensorRT OSS.

Example: Ubuntu 18.04 on x86-64 with cuda-11.1
```bash
./docker/build.sh --file docker/ubuntu.Dockerfile --tag tensorrt-ubuntu --os 18.04 --cuda 11.1
```
Example: Ubuntu 18.04 on PowerPC with cuda-11.0
```bash
./docker/build.sh --file docker/ubuntu-cross-ppc64le.Dockerfile --tag tensorrt-ubuntu-ppc --os 18.04 --cuda 11.0
```
Example: CentOS/RedHat 7 on x86-64 with cuda-11.0
```bash
./docker/build.sh --file docker/centos.Dockerfile --tag tensorrt-centos --os 7 --cuda 11.0
```
Example: Ubuntu 18.04 Cross-Compile for Jetson (arm64) with cuda-10.2 (JetPack)
```bash
./docker/build.sh --file docker/ubuntu-cross-aarch64.Dockerfile --tag tensorrt-cross-jetpack --os 18.04 --cuda 10.2
```
- #### Launch the TensorRT-OSS build container.

Example: Ubuntu 18.04 build container
```bash
./docker/launch.sh --tag tensorrt-ubuntu --gpus all --release $TRT_RELEASE --source $TRT_SOURCE
```
> NOTE:
> - Use the tag corresponding to the build container you generated in the previous step.
> - To run TensorRT/CUDA programs in the build container, install NVIDIA Container Toolkit. Docker versions < 19.03 require `nvidia-docker2` and the `--runtime=nvidia` flag for docker run commands. On versions >= 19.03, you need the `nvidia-container-toolkit` package and the `--gpus all` flag.
Building TensorRT-OSS
- Generate Makefiles or VS project (Windows) and build.

Example: Linux (x86-64) build with default cuda-11.1
```bash
cd $TRT_SOURCE
mkdir -p build && cd build
cmake .. -DTRT_LIB_DIR=$TRT_RELEASE/lib -DTRT_OUT_DIR=`pwd`/out
make -j
```
Example: Native build on Jetson (arm64) with cuda-10.2
```bash
cd $TRT_SOURCE
mkdir -p build && cd build
cmake .. -DTRT_LIB_DIR=$TRT_RELEASE/lib -DTRT_OUT_DIR=`pwd`/out -DTRT_PLATFORM_ID=aarch64 -DCUDA_VERSION=10.2
make -j
```
Example: Ubuntu 18.04 Cross-Compile for Jetson (arm64) with cuda-10.2 (JetPack)
```bash
cd $TRT_SOURCE
mkdir -p build && cd build
cmake .. -DTRT_LIB_DIR=$TRT_RELEASE/lib -DTRT_OUT_DIR=`pwd`/out -DCMAKE_TOOLCHAIN_FILE=$TRT_SOURCE/cmake/toolchains/cmake_aarch64.toolchain -DCUDA_VERSION=10.2
make -j
```
Example: Cross-Compile for QNX with cuda-10.2
```bash
cd $TRT_SOURCE
mkdir -p build && cd build
cmake .. -DTRT_LIB_DIR=$TRT_RELEASE/lib -DTRT_OUT_DIR=`pwd`/out -DCMAKE_TOOLCHAIN_FILE=$TRT_SOURCE/cmake/toolchains/cmake_qnx.toolchain -DCUDA_VERSION=10.2
make -j
```
Example: Windows (x86-64) build in Powershell
```powershell
cd $Env:TRT_SOURCE
mkdir -p build ; cd build
cmake .. -DTRT_LIB_DIR=$Env:TRT_RELEASE\lib -DTRT_OUT_DIR='$(Get-Location)\out' -DCMAKE_TOOLCHAIN_FILE=..\cmake\toolchains\cmake_x64_win.toolchain
msbuild ALL_BUILD.vcxproj
```
> NOTE:
> - The default CUDA version used by CMake is 11.1. To override this, for example to 10.2, append `-DCUDA_VERSION=10.2` to the cmake command.
> - If samples fail to link on CentOS7, create this symbolic link: `ln -s $TRT_OUT_DIR/libnvinfer_plugin.so $TRT_OUT_DIR/libnvinfer_plugin.so.7`
- Required CMake build arguments are:
  - `TRT_LIB_DIR`: Path to the TensorRT installation directory containing libraries.
  - `TRT_OUT_DIR`: Output directory where generated build artifacts will be copied.
- Optional CMake build arguments:
  - `CMAKE_BUILD_TYPE`: Specify if the generated binaries are for release or debug (contain debug symbols). Values consist of [`Release`] | `Debug`.
  - `CUDA_VERSION`: The version of CUDA to target, for example [`11.1`].
  - `CUDNN_VERSION`: The version of cuDNN to target, for example [`8.0`].
  - `NVCR_SUFFIX`: Optional nvcr/cuda image suffix. Set to "-rc" for CUDA11 RC builds until general availability. Blank by default.
  - `PROTOBUF_VERSION`: The version of Protobuf to use, for example [`3.0.0`]. Note: Changing this will not configure CMake to use a system version of Protobuf, it will configure CMake to download and try building that version.
  - `CMAKE_TOOLCHAIN_FILE`: The path to a toolchain file for cross compilation.
  - `BUILD_PARSERS`: Specify if the parsers should be built, for example [`ON`] | `OFF`. If turned OFF, CMake will try to find precompiled versions of the parser libraries to use in compiling samples: first in `${TRT_LIB_DIR}`, then on the system. If the build type is Debug, then it will prefer debug builds of the libraries over release versions if available.
  - `BUILD_PLUGINS`: Specify if the plugins should be built, for example [`ON`] | `OFF`. If turned OFF, CMake will try to find a precompiled version of the plugin library to use in compiling samples: first in `${TRT_LIB_DIR}`, then on the system. If the build type is Debug, then it will prefer debug builds of the libraries over release versions if available.
  - `BUILD_SAMPLES`: Specify if the samples should be built, for example [`ON`] | `OFF`.
  - `CUB_VERSION`: The version of CUB to use, for example [`1.8.0`].
  - `GPU_ARCHS`: GPU (SM) architectures to target. By default we generate CUDA code for all major SMs. Specific SM versions can be specified here as a quoted, space-separated list to reduce compilation time and binary size. A table of compute capabilities of NVIDIA GPUs can be found here. Examples:
    - NVidia A100: `-DGPU_ARCHS="80"`
    - Tesla T4, GeForce RTX 2080: `-DGPU_ARCHS="75"`
    - Titan V, Tesla V100: `-DGPU_ARCHS="70"`
    - Multiple SMs: `-DGPU_ARCHS="80 75"`
  - `TRT_PLATFORM_ID`: Bare-metal build (unlike containerized cross-compilation) on non Linux/x86 platforms must explicitly specify the target platform. Currently supported options: `x86_64` (default), `aarch64`
(Optional) Install TensorRT python bindings
- The TensorRT python API bindings must be installed for running TensorRT python applications.

Example: install TensorRT wheel for python 3.6
```bash
pip3 install $TRT_RELEASE/python/tensorrt-7.2.1.6-cp36-none-linux_x86_64.whl
```
References
TensorRT Resources
- TensorRT Homepage
- TensorRT Developer Guide
- TensorRT Sample Support Guide
- TensorRT Discussion Forums
- TensorRT Release Notes.
Known Issues
TensorRT 7.2.1
- None