Accelerate your most demanding HPC and hyperscale data center workloads with NVIDIA® Tesla® GPUs. Data scientists and researchers can now parse petabytes of data orders of magnitude faster than they could by using traditional CPUs, in applications ranging from energy exploration to deep learning. Tesla accelerators also deliver the horsepower needed to run bigger simulations faster than ever before. Plus, Tesla delivers the highest performance and user density for virtual desktops, applications, and workstations.
What is Tesla V100?
The NVIDIA® Tesla® V100 Tensor Core is the most advanced data center GPU ever built, designed to accelerate AI, high-performance computing (HPC), data science, and graphics. It is powered by the NVIDIA Volta™ architecture, comes in 16 GB and 32 GB configurations, and offers the performance of up to 32 CPUs in a single GPU. Data scientists, researchers, and engineers can now spend less time optimizing memory usage and more time designing the next AI breakthrough.
What is Tesla P100?
Today’s data centers rely on many interconnected commodity compute nodes, which limits high-performance computing (HPC) and hyperscale workloads. NVIDIA® Tesla® P100 taps into NVIDIA Pascal™ GPU architecture to deliver a unified platform for accelerating both HPC and AI, dramatically increasing throughput while also reducing cost.
1. Hardware Comparison
| Processor | SMs | CUDA Cores | Tensor Cores | Frequency | Cache | Max. Memory | Memory Bandwidth |
|---|---|---|---|---|---|---|---|
| NVIDIA P100 | 56 | 3,584 | N/A | 1,126 MHz | 4 MB L2 | 16 GB | 720 GB/s |
| NVIDIA V100 | 80 | 5,120 | 640 | 1,530 MHz | 6 MB L2 | 16 GB | 900 GB/s |
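The raw counts in the table are related: both generations keep 64 FP32 CUDA cores per SM, so the V100's extra throughput comes mainly from having more SMs and a higher clock, alongside 25% more memory bandwidth. A quick sketch of these derived figures (pure Python, values taken from the table above):

```python
# Derived metrics from the hardware table above (no GPU required).
P100 = {"sms": 56, "cuda_cores": 3584, "mem_bw_gbs": 720}
V100 = {"sms": 80, "cuda_cores": 5120, "mem_bw_gbs": 900}

def cores_per_sm(gpu):
    """FP32 CUDA cores per streaming multiprocessor."""
    return gpu["cuda_cores"] // gpu["sms"]

print(cores_per_sm(P100))                       # 64
print(cores_per_sm(V100))                       # 64 -- same per-SM layout
print(V100["mem_bw_gbs"] / P100["mem_bw_gbs"])  # 1.25 -- 25% more bandwidth
```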
2. Benchmark Setup
| Component | P100 System | V100 System |
|---|---|---|
| CPU | 2 × Intel Xeon E5-2680 v3 | 2 × Intel Xeon E5-2686 v4 |
| GPU | NVIDIA Tesla P100 PCIe | NVIDIA Tesla V100 PCIe |
| OS | Red Hat Enterprise Linux 7.4 | Red Hat Enterprise Linux 7.4 |
| Clock boost | GPU: 1,328 MHz; memory: 715 MHz | GPU: 1,370 MHz; memory: 1,750 MHz |
3. Platform Overview
Modern high-performance computing (HPC) data centers are key to solving some of the world's most important scientific and engineering challenges. The NVIDIA® Tesla® accelerated computing platform powers these data centers, pairing industry-leading applications with GPUs that accelerate HPC and AI workloads. The Tesla P100 GPU is the engine of the modern data center, delivering breakthrough performance with fewer servers, resulting in faster insights and dramatically lower costs. Every HPC data center can benefit from the Tesla platform: over 450 HPC applications across a broad range of domains are optimized for GPUs, including all of the top 10 HPC applications and every major deep learning framework.
The NVIDIA® V100 Tensor Core GPU is the world's most powerful accelerator for deep learning, machine learning, high-performance computing (HPC), and graphics. Powered by NVIDIA Volta™, a single V100 Tensor Core GPU offers the performance of nearly 32 CPUs, enabling researchers to tackle challenges that were once considered unsolvable. The V100 topped MLPerf, the first industry-wide AI benchmark, validating it as the world's most powerful, scalable, and versatile computing platform.
4. Fundamental & Architectural Differences
| Tesla Product | Tesla V100 | Tesla P100 |
|---|---|---|
| CUDA cores per GPU | 5,120 | 3,584 |
| GPU boost clock | 1,530 MHz | 1,480 MHz |
| Tensor Cores per GPU | 640 | N/A |
| Maximum memory | 32 GB | 16 GB |
| Memory clock (effective) | 1,758 MHz | 1,430 MHz |
| Memory bandwidth | 900.1 GB/s | 720.9 GB/s |
| CUDA compute capability | 7.0 | 6.0 |
| Peak FP32 performance | 14,029 GFLOPS | 10,609 GFLOPS |
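The FP32 figures in the table follow directly from core count and clock: each CUDA core can retire one fused multiply-add (two floating-point operations) per cycle. Note that the 14,029 GFLOPS V100 figure matches the 1,370 MHz boost clock of the PCIe card used in the benchmark setup above, not the 1,530 MHz clock listed in the table. A sketch of the arithmetic:

```python
def peak_fp32_gflops(cuda_cores, clock_mhz):
    """Theoretical peak FP32 throughput: one FMA (2 flops) per core per cycle."""
    return cuda_cores * clock_mhz * 2 / 1000  # cores * MHz * 2 flops -> GFLOPS

print(round(peak_fp32_gflops(3584, 1480)))  # 10609 -- P100 at its 1,480 MHz boost clock
print(round(peak_fp32_gflops(5120, 1370)))  # 14029 -- V100 PCIe at its 1,370 MHz boost clock
```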
5. Key Features
Tesla P100 (Pascal architecture):
-Extreme performance: powers HPC, deep learning, and many other GPU-computing workloads
-NVLink™: NVIDIA's high-speed, high-bandwidth interconnect for maximum application scalability
-HBM2: fast, high-capacity, highly efficient CoWoS (Chip-on-Wafer-on-Substrate) stacked memory
-Unified Memory, compute preemption, and new AI algorithms: a significantly improved programming model and advanced AI software optimized for the Pascal architecture
-16 nm FinFET: enables more features, higher performance, and improved power efficiency
Tesla V100 (Volta architecture):
-New streaming multiprocessor (SM) architecture optimized for deep learning
-Second-generation NVIDIA NVLink™
-HBM2 memory: faster, higher efficiency
-Volta Multi-Process Service
-Enhanced Unified Memory and Address Translation Services
-Maximum Performance and Maximum Efficiency modes
-Cooperative Groups and new cooperative launch APIs
-Volta-optimized software
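One practical consequence of the architectural split above: Tensor Cores (and other Volta features such as independent thread scheduling) require CUDA compute capability 7.0, which the V100 introduced; the P100 reports 6.0. Applications typically gate on this at runtime. A minimal sketch of such a check, with the device list hard-coded for illustration (real code would query the (major, minor) pair via `cudaGetDeviceProperties`):

```python
# Hypothetical compute-capability gate; the device table below is hard-coded
# for illustration rather than queried from the CUDA runtime.
DEVICES = {"Tesla P100": (6, 0), "Tesla V100": (7, 0)}

def supports_tensor_cores(compute_capability):
    """Tensor Cores first appeared in Volta (compute capability 7.0)."""
    return compute_capability >= (7, 0)

for name, cc in DEVICES.items():
    print(name, supports_tensor_cores(cc))  # Tesla P100 False / Tesla V100 True
```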
A critical question our customers ask is: which GPU should I choose? Which cards will help me deliver results faster?
If you want maximum deep learning performance, the Tesla V100 is a great choice. Its dedicated Tensor Cores offer enormous performance potential for deep learning applications; NVIDIA has even coined a new unit, the "Tensor TFLOP," to measure this gain.
The Tesla V100 is the fastest NVIDIA GPU available on the market, and for deep learning workloads it can be up to 3x faster than the P100. If you primarily need a large amount of GPU memory for machine learning, either the Tesla P100 or the V100 will serve.
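The headline Tensor-Core figure can be reproduced from the specs, assuming each of the V100's 640 Tensor Cores performs one 4×4×4 matrix fused multiply-add per clock, i.e. 64 multiply-adds or 128 floating-point operations. At the 1,530 MHz boost clock that yields roughly 125 Tensor TFLOPS, about 12× the P100's 10.6 FP32 TFLOPS. A back-of-the-envelope check:

```python
# Back-of-the-envelope Tensor-Core peak for V100.
# Assumption: one 4x4x4 matrix FMA per Tensor Core per clock = 128 flops.
TENSOR_CORES = 640
FLOPS_PER_CORE_PER_CLOCK = 4 * 4 * 4 * 2  # 64 multiply-adds = 128 flops
BOOST_CLOCK_MHZ = 1530

tensor_tflops = TENSOR_CORES * FLOPS_PER_CORE_PER_CLOCK * BOOST_CLOCK_MHZ / 1e6
print(round(tensor_tflops, 1))  # 125.3
```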