NVIDIA H100 Tensor Core GPU
If you’re wondering how advances like ChatGPT keep coming even as Moore’s Law slows down, this is what’s happening.
Nvidia’s latest chip, the H100, is capable of 34 teraFLOPS in FP64, the 64-bit standard at which supercomputers are ranked. The same chip, however, can deliver 3,958 teraFLOPS with FP8, a format that stores each value in 8 bits instead of 64, trading numerical precision for raw throughput. Its Tensor Cores also accelerate matrix operations, in particular the matrix multiply-accumulate pattern that dominates deep learning calculations.
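To make the idea concrete, here is a minimal sketch, in plain NumPy rather than any NVIDIA library, of the multiply-accumulate pattern Tensor Cores accelerate and of how much smaller low-precision values are. The matrix size and dtypes are illustrative choices, and FP8 is shown only as a size comparison since NumPy has no 8-bit float type.

# Sketch of the matrix multiply-accumulate pattern (D = A @ B + C)
# that Tensor Cores accelerate, plus storage cost per value.
import numpy as np

n = 1024  # illustrative matrix size
A = np.random.rand(n, n).astype(np.float64)
B = np.random.rand(n, n).astype(np.float64)
C = np.random.rand(n, n).astype(np.float64)

# The core deep-learning operation: multiply two matrices and
# accumulate the result into a third.
D = A @ B + C

# Bytes needed to store one value in each format.
print("FP64:", np.dtype(np.float64).itemsize, "bytes per value")  # 8 bytes
print("FP16:", np.dtype(np.float16).itemsize, "bytes per value")  # 2 bytes
print("FP8 : 1 byte per value (hardware format, not a NumPy dtype)")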
By focusing on the low-precision operations AI workloads actually rely on, the chip’s effective speed can be increased by more than 100 times!
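That “more than 100 times” follows directly from the two spec-sheet figures quoted above; a back-of-the-envelope check:

# Speedup implied by the throughput numbers quoted above.
fp64_tflops = 34     # H100 FP64 throughput (teraFLOPS)
fp8_tflops = 3958    # H100 FP8 Tensor Core throughput (teraFLOPS)
print(f"Speedup: {fp8_tflops / fp64_tflops:.0f}x")  # roughly 116x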
The accelerated computing revolution has taken a massive leap.
Source:
https://www.nvidia.com/en-us/data-center/h100/