CUDA Toolkit 11 - The most powerful software development platform for building GPU-accelerated apps

Ashish Charan
May 20, 2022

CUDA 11 adds the following features, which are essential to achieving the performance promised by the NVIDIA Ampere architecture.

  • Support for the new third-generation Tensor Cores, which accelerate mixed-precision matrix operations on several data types, including TF32 and Bfloat16

  • Programming model and API updates for task graphs, asynchronous data movement, fine-grained synchronization, and L2 cache residency control

  • Performance optimizations in CUDA libraries for linear algebra, FFTs, matrix multiplication, JPEG decoding, and more

  • Support for heterogeneous platforms that pair GPUs with x86_64, Arm64 server, and POWER CPUs

  • CUDA C++ enhancements:

      • Compiler performance and usability improvements

      • New link-time optimization capabilities

      • Support for new host compilers and language standards, including C++17

      • C++ standard library support in CUDA code through libcu++, and integration of CUB as a core CUDA C++ library in the Toolkit

  • Operating System support updates

  • Async-copy is offered as an experimental feature in CUDA 11

  • Free-of-charge licensing

  • Nsight tools - Developer tools for tracing, debugging, analyzing, and profiling CUDA applications

  • Nsight Systems 2020.3 and Nsight Compute 2020.1 - These releases are available online now and ship with CUDA 11 on 7/8. They add support for the new NVIDIA Ampere GPU architecture and improve CPU feature parity for POWER and Arm Server Base System Architecture.

  • Nsight Systems 2020.3 - Support for MPI, OpenACC, and OpenMP, as well as CLI improvements and complex data mining capabilities

  • Nsight Compute 2020.1 - Visualizations for roofline analysis, the A100 memory system, and data compression, as well as theoretical peak ("speed of light") metrics

  • cuda-gdb - Improved load times, debug information, and parallel cuda-gdb sessions

  • New Compute Sanitizer - A functional correctness checking tool that helps you identify memory and threading errors in your CUDA code

  • IDE integrations - NVIDIA® Nsight™ Visual Studio Edition, NVIDIA® Nsight™ Eclipse Edition, and more
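
To make the TF32 and Bfloat16 data types from the Tensor Core item above more concrete, here is a minimal host-side C++ sketch that emulates their reduced precision. Both formats keep the 8 exponent bits of FP32; TF32 keeps 10 mantissa bits and Bfloat16 keeps 7. The function names (truncate_mantissa, to_tf32_truncated, to_bf16_truncated, self_check) are hypothetical helpers written for this illustration, not part of any CUDA API. The sketch assumes simple truncation, whereas real hardware and library conversions typically round, so treat it as an approximation of the precision loss rather than a bit-exact model.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: view the bits of an FP32 value as a 32-bit integer.
static std::uint32_t float_to_bits(float x) {
    std::uint32_t u;
    std::memcpy(&u, &x, sizeof u);
    return u;
}

// Hypothetical helper: build an FP32 value from a 32-bit pattern.
static float bits_to_float(std::uint32_t u) {
    float x;
    std::memcpy(&x, &u, sizeof x);
    return x;
}

// Keep only the top `kept_mantissa_bits` (0..23) of the 23-bit FP32 mantissa.
// Sign and exponent are left untouched. This truncates toward zero; it is an
// assumption made for simplicity, since actual conversions usually round.
// NaN payloads that live only in the dropped bits are not handled specially.
static float truncate_mantissa(float x, int kept_mantissa_bits) {
    const std::uint32_t dropped = 23u - static_cast<std::uint32_t>(kept_mantissa_bits);
    const std::uint32_t mask = ~((1u << dropped) - 1u);
    return bits_to_float(float_to_bits(x) & mask);
}

// TF32: 8 exponent bits, 10 mantissa bits.
float to_tf32_truncated(float x) { return truncate_mantissa(x, 10); }

// Bfloat16: 8 exponent bits, 7 mantissa bits.
float to_bf16_truncated(float x) { return truncate_mantissa(x, 7); }

// Small worked example: x = 1 + 2^-10 + 2^-13.
// TF32 can hold the 2^-10 term but not the 2^-13 term.
// Bfloat16 can hold neither, so the value collapses to 1.
void self_check() {
    const float x = 1.0f + 1.0f / 1024.0f + 1.0f / 8192.0f;
    assert(to_tf32_truncated(x) == 1.0f + 1.0f / 1024.0f);
    assert(to_bf16_truncated(x) == 1.0f);
}
```

Calling self_check() from a main function exercises the worked example. The point of the comparison is that both formats trade mantissa precision for throughput while keeping the FP32 dynamic range, which is why they suit mixed-precision matrix operations where the accumulation still happens in FP32.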

Try the NVIDIA A100 GPU on E2E Cloud.