Specifications of the Volta-generation GV100: 5376 CUDA cores, 21.1 billion transistors, and 672 additional Tensor Cores
In April of last year, Nvidia introduced the Pascal-generation GP100 GPU, containing 15 billion transistors and 3840 CUDA cores. At the time it seemed incredible, but now the GeForce GTX 1080 Ti and Titan Xp accelerators contain the nearly identical GP102 GPU, and they are readily available to buy.
So it's time to introduce a new monster, and that monster is the Volta-generation GV100 GPU. Recall that this is not the first announcement of this architecture. Back in September of last year Nvidia introduced the Xavier SoC with a Volta-generation GPU, but it will not reach the market until next year.
So, the GV100 is Nvidia's new flagship GPU. It consists of 21.1 billion transistors and includes 5376 CUDA cores! Yes, in a year or a little more the company will have consumer video cards with about the same parameters on the market, but for now the specifications of the new GPU are impressive.
Like the GP100 in its time, the GV100 is not intended for consumer graphics cards. This GPU will serve as the basis for professional accelerators for workstations, servers, and so on.
The GV100 will be manufactured on a 12-nanometer process at TSMC's facilities. The area of this monster GPU is 815 mm². For comparison, the area of the GP100 is "only" 610 mm². Apparently, the GV100 is the largest GPU in history.
Like its predecessor, the GV100 sits on the same substrate as its HBM2 memory. The memory bandwidth is now 900 GB/s, but the capacity has for some reason been reduced from 32 to 16 GB, which is strange.
The GV100 also stands out for containing additional compute units: 672 Tensor Cores, designed for the calculations involved in machine learning and deep learning. This is the first GPU architecture of its kind on the market, and the consumer counterpart, which will likely be called GV102, evidently will not receive these units. Incidentally, the stated performance in machine learning and deep learning operations is 120 TFLOPS.
As for conventional performance, it reaches 7.5 TFLOPS (FP64) and 15 TFLOPS (FP32).
As it did last year, Nvidia also introduced a Tesla accelerator, the GV100-based Tesla V100, but with only 5120 active CUDA cores and 640 Tensor Cores. It is designed as a module with a second-generation NVLink interface with a bandwidth of 300 GB/s, but a version in the form of a standard PCIe expansion card will probably follow later. It is also known that the GPU in this case runs at a frequency of up to 1455 MHz. Given the incredible complexity of the GPU, we can assume that the consumer GPU will run at even higher frequencies; that is, frequencies should increase relative to Pascal-generation products. The TDP of the accelerator is 300 watts.
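The throughput figures quoted above appear to follow from the Tesla V100's unit counts and its 1455 MHz peak frequency. The short Python sketch below reproduces that arithmetic. It rests on assumptions the article does not state: a fused multiply-add counts as 2 floating-point operations, each CUDA core completes one FP32 FMA per clock, FP64 runs at half the FP32 rate, and each Tensor Core performs 64 FMAs per clock. The function and constant names are illustrative only.

```python
# Rough peak-throughput estimate for the Tesla V100 figures in the article.
# Assumptions (not from the article): 1 FMA = 2 FLOPs, one FP32 FMA per
# CUDA core per clock, FP64 at half the FP32 rate, and 64 FMAs per
# Tensor Core per clock.

CUDA_CORES = 5120          # active CUDA cores (from the article)
TENSOR_CORES = 640         # active Tensor Cores (from the article)
PEAK_CLOCK_HZ = 1455e6     # up to 1455 MHz (from the article)

FLOPS_PER_FMA = 2          # assumption: multiply + add
TENSOR_FMAS_PER_CLOCK = 64 # assumption: per Tensor Core


def peak_tflops(units, fmas_per_unit_per_clock, clock_hz):
    """Return theoretical peak throughput in TFLOPS."""
    flops = units * fmas_per_unit_per_clock * FLOPS_PER_FMA * clock_hz
    return flops / 1e12


fp32 = peak_tflops(CUDA_CORES, 1, PEAK_CLOCK_HZ)
fp64 = fp32 / 2  # assumption: FP64 at half the FP32 rate
tensor = peak_tflops(TENSOR_CORES, TENSOR_FMAS_PER_CLOCK, PEAK_CLOCK_HZ)

print(f"FP32:   {fp32:.2f} TFLOPS")
print(f"FP64:   {fp64:.2f} TFLOPS")
print(f"Tensor: {tensor:.2f} TFLOPS")
```

Under these assumptions the results come out at roughly 14.9, 7.45, and 119.2 TFLOPS, which round to the 15, 7.5, and 120 TFLOPS cited above. These are theoretical peaks at the maximum frequency, not measured performance.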