Figure 1. FP16 and BF16 tensor compute performance
Tensor cores, PFLOPS per chip
Bar chart showing tensor compute performance in FP16/BF16. NVIDIA chips (NVIDIA Blackwell B300, NVIDIA Blackwell B200, NVIDIA H200, and NVIDIA H20) are shown in red, and the Huawei chip (Huawei Ascend 910C) is shown in yellow. The NVIDIA Blackwell B300 and NVIDIA Blackwell B200 stand out at 4.5 PFLOPS, while the Huawei Ascend 910C delivers less than 1 PFLOPS.