Neural Network Accelerator (NPU/NNE)

^ CPU core ^ Clock ^ Cores ^ fp32 ^ fp16 ^ bfloat16 ^ int16 ^ int8 ^ int4 ^
| Qualcomm Snapdragon 835 CPU Kryo 280 | 2.45 + 1.9 GHz | 4+4 | 0.139 TFLOPS | | | | | |
| Qualcomm Snapdragon 845 CPU Kryo 385 | 2.8 + 1.77 GHz | 4+4 | 0.146 TFLOPS | 0.29 TFLOPS | | | | |
| Intel Core i9-9900K | 3.6GHz | 8 | 0.922 TFLOPS | | | | | |
| AMD Ryzen 9 3950X | 3.5GHz | 16 | 1.792 TFLOPS | | | | | |
Mobile NPU/NNE Clock core fp32 fp16 bfloat16 int16 int8 int4
Apple iPhone X Apple A11 Bionic NE 0.6 TOPS
Apple iPhone XS Apple A12 Bionic NE 5 TOPS
Google Pixel 2 VisualCore 3 TOPS ?
Google Pixel 3 VisualCore 3 TOPS ?
Huawei P20 Pro Kirin 970 NPU 1.92 TFLOPS ?
Huawei P30 Pro Kirin 980 NPU 4.22 TFLOPS ?
Samsung Exynos 9820 NPU
Google Edge TPU 4 TOPS
Intel Movidius Compute Stick Myriad 2 VPU
Intel Neural Compute Stick 2 Myriad X VPU 4 TOPS ?
^ GPU core ^ Clock ^ Cores ^ fp32 ^ fp16 ^ bfloat16 ^ int16 ^ int8 ^ int4 ^
| Google Edge TPU Vivante GC7000Lite | 1.0GHz? | 16sp? | 0.032 TFLOPS | 0.064 TFLOPS | | | | |
| NVIDIA Jetson Nano Tegra X1 Maxwell | 0.92GHz | 128sp | 0.236 TFLOPS | 0.472 TFLOPS | | | | |
| AMD RADEON Vega 56 | 1.47GHz | 3584sp | 10.54 TFLOPS | 21.09 TFLOPS | | | 42.18 TOPS | 84.35 TOPS |
| AMD RADEON Vega 64 | 1.55GHz | 4096sp | 12.67 TFLOPS | 25.33 TFLOPS | | | 50.66 TOPS | 101.32 TOPS |
| AMD RADEON VII | 1.75GHz | 3840sp | 13.8 TFLOPS | 27.7 TFLOPS | | | 55.3 TOPS | 110.7 TOPS |
| NVIDIA GeForce RTX 2060 (Turing) | 1.68GHz | 240tc 1920sp | 6.45 TFLOPS | 51.6 (12.90) TFLOPS | | | 103.2 TOPS | 206.4 TOPS |
| NVIDIA GeForce RTX 2060 Super (Turing) | 1.65GHz | 272tc 2176sp | 7.18 TFLOPS | 57.4 (14.36) TFLOPS | | | 114.9 TOPS | 229.8 TOPS |
| NVIDIA GeForce RTX 2070 (Turing) | 1.62GHz | 288tc 2304sp | 7.46 TFLOPS | 59.7 (14.93) TFLOPS | | | 119.4 TOPS | 238.9 TOPS |
| NVIDIA GeForce RTX 2070 Super (Turing) | 1.77GHz | 320tc 2560sp | 9.06 TFLOPS | 72.5 (18.12) TFLOPS | | | 145.0 TOPS | 290.0 TOPS |
| NVIDIA GeForce RTX 2080 (Turing) | 1.71GHz | 368tc 2944sp | 10.07 TFLOPS | 80.5 (20.14) TFLOPS | | | 161.1 TOPS | 322.2 TOPS |
| NVIDIA GeForce RTX 2080 Super (Turing) | 1.81GHz | 384tc 3072sp | 11.15 TFLOPS | 89.2 (22.3) TFLOPS | | | 178.4 TOPS | 356.8 TOPS |
| NVIDIA GeForce RTX 2080 Ti (Turing) | 1.55GHz | 544tc 4352sp | 13.45 TFLOPS | 107.6 (26.9) TFLOPS | | | 215.2 TOPS | 430.3 TOPS |
| NVIDIA Quadro RTX 4000 (Turing) | 1.55GHz | 288tc 2304sp | 7.12 TFLOPS | 57.0 (14.2) TFLOPS | | | 113.9 TOPS | 227.8 TOPS |
| NVIDIA Quadro RTX 5000 (Turing) | 1.81GHz | 384tc 3072sp | 11.15 TFLOPS | 89.2 (22.3) TFLOPS | | | 178.4 TOPS | 356.8 TOPS |
| NVIDIA Quadro RTX 6000/8000 (Turing) | 1.77GHz | 576tc 4608sp | 16.31 TFLOPS | 130.5 (32.6) TFLOPS | | | 261.0 TOPS | 522.0 TOPS |
| NVIDIA TITAN V (Volta) | 1.46GHz | 640tc 5120sp | 14.90 TFLOPS | 119.2 (29.8) TFLOPS | | | | |
| NVIDIA Tesla V100 (Volta) | 1.53GHz | 640tc 5120sp | 15.67 TFLOPS | 125.3 (31.3) TFLOPS | | | | |
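
The fp32 figures listed for the desktop CPUs and the GPUs are consistent with the usual peak-throughput estimate: number of execution units × FLOPs per unit per clock × clock. The following is a minimal Python sketch of that arithmetic. The function name and the per-clock constants are assumptions made for illustration (2 FLOPs per GPU shader processor per clock for one fused multiply-add, and 32 fp32 FLOPs per core per clock for a CPU with two 256-bit FMA units); they are not stated on this page.

```python
def peak_gflops(units: int, clock_ghz: float, flops_per_unit_per_clock: int) -> float:
    """Theoretical peak in GFLOPS: units x FLOPs per unit per clock x clock (GHz)."""
    return units * flops_per_unit_per_clock * clock_ghz


# Assumed per-clock figures (illustrative, not taken from the page):
GPU_FP32_PER_SP = 2       # one fused multiply-add counts as 2 FLOPs
AVX2_FP32_PER_CORE = 32   # 2 FMA units x 8 fp32 lanes x 2 FLOPs

if __name__ == "__main__":
    # Core i9-9900K: 8 cores at 3.6 GHz, listed as 0.922 TFLOPS
    print(peak_gflops(8, 3.6, AVX2_FP32_PER_CORE))
    # GeForce RTX 2060: 1920 sp at 1.68 GHz, listed as 6.45 TFLOPS
    print(peak_gflops(1920, 1.68, GPU_FP32_PER_SP))
```

These are theoretical upper bounds at the listed clock; sustained throughput on real workloads is normally lower.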

References

Related

NVIDIA TensorCore

  • Volta
  • Turing
1 TensorCore = 64 MAD per clock (128 FLOPs), so TensorCore GFLOPS = TensorCores * 128 * clock (GHz)
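
As a worked example of the rule above, here is a small Python sketch that converts a TensorCore count and clock into fp16 TFLOPS. The function names are hypothetical. The `turing_int_tops` helper encodes a pattern read off the Turing rows of the table (int8 at 2x and int4 at 4x the fp16 TensorCore rate); treat it as an assumption rather than a specification.

```python
MADS_PER_TENSORCORE_PER_CLOCK = 64  # from the rule above
FLOPS_PER_MAD = 2                   # one multiply plus one add


def tensorcore_tflops(tensor_cores: int, clock_ghz: float) -> float:
    """fp16 TensorCore peak in TFLOPS = TensorCores x 128 x GHz / 1000."""
    gflops = tensor_cores * MADS_PER_TENSORCORE_PER_CLOCK * FLOPS_PER_MAD * clock_ghz
    return gflops / 1000.0


def turing_int_tops(fp16_tflops: float, bits: int) -> float:
    """Assumed scaling taken from the Turing table rows: int8 = 2x, int4 = 4x fp16."""
    if bits not in (8, 4):
        raise ValueError("only int8 and int4 are covered by this sketch")
    return fp16_tflops * (16 // bits)


if __name__ == "__main__":
    # GeForce RTX 2060: 240 TensorCores at 1.68 GHz, listed as 51.6 TFLOPS
    print(tensorcore_tflops(240, 1.68))
    # Tesla V100: 640 TensorCores at 1.53 GHz, listed as 125.3 TFLOPS
    print(tensorcore_tflops(640, 1.53))
```

The value in parentheses in the fp16 column appears to be the non-TensorCore rate, which is twice the fp32 figure in every row.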

SBC

^ SBC ^ SoC ^ CPU core ^ ISA ^ Cores ^ CPU clock ^ CPU fp32 ^ GPU ^ GPU API ^ sp ^ GPU clock ^ GPU fp32 ^ GPU fp16 ^ ROP ^ NPU ^ NPU int16 ^ RAM ^ MEM B/W ^ ROM ^ price ^
| Coral Dev Board | NXP i.MX 8M | Cortex-A53 | ARMv8.0A | 4 | 1.5 GHz | 48 GFLOPS | Vivante GC7000 Lite | ES3.x | 16 sp | 1.0 GHz | 32 GFLOPS | 64 GFLOPS | 1 | Edge TPU | 4 TOPS | LPDDR4-3200 1GB 32bit | 12.8 GB/s | eMMC 8GB | $150 |
| NVIDIA Jetson Nano (DevKit) | Tegra X1 | Cortex-A57 | ARMv8.0A | 4 | 1.43 GHz | 46 GFLOPS | Maxwell | ES3.2/GL4.6/Vulkan/CUDA | 128 sp | 922 MHz | 236 GFLOPS | 472 GFLOPS | 16 | | | LPDDR4-3200 4GB 64bit | 25.6 GB/s | | $99 |
| Raspberry Pi 4B | BCM2711 | Cortex-A72 | ARMv8.0A | 4 | 1.5 GHz | 48 GFLOPS | VideoCore VI | ES3.x | 48? sp | 500 MHz | 32? GFLOPS | 64? GFLOPS | | | | LPDDR4-2400 4GB 32bit | 9.6 GB/s | | $55 |
| Raspberry Pi 3B+ | BCM2837B0 | Cortex-A53 | ARMv8.0A | 4 | 1.4 GHz | 45 GFLOPS | VideoCore IV | ES2.0 | 48 sp | 300 MHz | 28.8 GFLOPS | | | | | LPDDR2-900 1GB 32bit | 3.6 GB/s | | $35 |
| Raspberry Pi 3B | BCM2837 | Cortex-A53 | ARMv8.0A | 4 | 1.2 GHz | 38 GFLOPS | VideoCore IV | ES2.0 | 48 sp | 300 MHz | 28.8 GFLOPS | | | | | LPDDR2-900 1GB 32bit | 3.6 GB/s | | $35 |
| Raspberry Pi 2B v1.2 | BCM2837 | Cortex-A53 | ARMv8.0A | 4 | 0.9 GHz | 29 GFLOPS | VideoCore IV | ES2.0 | 48 sp | 300 MHz | 28.8 GFLOPS | | | | | LPDDR2-900 1GB 32bit | 3.6 GB/s | | $35 |
| Raspberry Pi 2B | BCM2836 | Cortex-A7 | ARMv7A | 4 | 0.9 GHz | 7 GFLOPS | VideoCore IV | ES2.0 | 48 sp | 250 MHz | 24.0 GFLOPS | | | | | LPDDR2-900 1GB 32bit | 3.6 GB/s | | $35 |
| Raspberry Pi 1B | BCM2835 | ARM1176JZF-S | ARMv6 | 1 | 0.7 GHz | 0.7 GFLOPS | VideoCore IV | ES2.0 | 48 sp | 250 MHz | 24.0 GFLOPS | | | | | 0.5GB | | | $35 |
| Dragonboard 410c | Snapdragon 410 | Cortex-A53 | ARMv8.0A | 4 | 1.2 GHz | 38 GFLOPS | Adreno 306 | ES3.0 | 24 sp | 450 MHz | 21.6 GFLOPS | | 2? | | | LPDDR3-1066 1GB 32bit | 4.3 GB/s | eMMC 8GB | $75 |
| ASUS Tinker Board | RK3288 | Cortex-A17 | ARMv7A | 4 | 1.8 GHz | 58 GFLOPS | Mali-T764MP4 | ES3.x | 68 sp | 600 MHz | 81.6 GFLOPS | 163.2 GFLOPS | 4 | | | LPDDR3 2GB 64bit | GB/s | | $60 |
Mali-T760  17sp/core
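
The memory bandwidth figures in the SBC table match the standard estimate of transfer rate × bus width. A minimal Python sketch follows, assuming that the number in a name such as LPDDR4-3200 is the transfer rate in MT/s; the function name is hypothetical.

```python
def mem_bandwidth_gbs(transfer_rate_mts: int, bus_width_bits: int) -> float:
    """Peak memory bandwidth in GB/s: transfers per second x bytes per transfer."""
    bytes_per_transfer = bus_width_bits / 8
    return transfer_rate_mts * bytes_per_transfer / 1000.0


if __name__ == "__main__":
    # Coral Dev Board: LPDDR4-3200 on a 32bit bus, listed as 12.8 GB/s
    print(mem_bandwidth_gbs(3200, 32))
    # Jetson Nano: LPDDR4-3200 on a 64bit bus, listed as 25.6 GB/s
    print(mem_bandwidth_gbs(3200, 64))
    # Raspberry Pi 3B: LPDDR2-900 on a 32bit bus, listed as 3.6 GB/s
    print(mem_bandwidth_gbs(900, 32))
```

The same calculation cannot fill in the ASUS Tinker Board entry, because the table gives no transfer rate for its LPDDR3.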
ai/npu.txt · Last modified: 2019/10/19 23:32 by oga