ai:npu [2019/10/08 22:42] – [SBC] oga | ai:npu [2019/10/19 18:50] – [Neural Network Accelerator (NPU/NNE)] oga |
---|
References | References |
| |
* [[https://wlog.flatlib.jp/archive/1/2019-6-16|Trying the Snapdragon 845 ARMv8.2A half-precision fp16 arithmetic instructions / Deep Learning instructions]] | |
* [[https://ark.intel.com/content/www/jp/ja/ark/products/186605/intel-core-i9-9900k-processor-16m-cache-up-to-5-00-ghz.html|Core i9-9900K]] | * [[https://ark.intel.com/content/www/jp/ja/ark/products/186605/intel-core-i9-9900k-processor-16m-cache-up-to-5-00-ghz.html|Core i9-9900K]] |
* [[https://en.wikichip.org/wiki/hisilicon/kirin/970|Kirin 970 NPU]] | * [[https://en.wikichip.org/wiki/hisilicon/kirin/970|Kirin 970 NPU]] |
* [[https://images.nvidia.com/content/pdf/tesla/Volta-Architecture-Whitepaper-v1.1-jp.pdf]] | * [[https://images.nvidia.com/content/pdf/tesla/Volta-Architecture-Whitepaper-v1.1-jp.pdf]] |
| |
| Related |
| |
| * [[https://wlog.flatlib.jp/archive/1/2019-6-16|Trying the Snapdragon 845 ARMv8.2A half-precision fp16 arithmetic instructions / Deep Learning instructions]] |
===== NVIDIA TensorCore ===== | ===== NVIDIA TensorCore ===== |
| |
| |
| |
^ SBC ^ SoC ^ CPU core ^ core ^ CPU clock ^ GPU ^ sp ^ GPU clock ^ GPU fp32 ^ GPU fp16 ^ NPU ^ NPU int16 ^ RAM ^ MEM B/W ^ ROM ^ price ^ | ^ SBC ^ SoC ^ CPU core ^ core ^ CPU clock ^ GPU ^ GPU API ^ sp ^ GPU clock ^ GPU fp32 ^ GPU fp16 ^ NPU ^ NPU int16 ^ RAM ^ MEM B/W ^ ROM ^ price ^ |
| Coral Dev Board | NXP i.MX 8M | Cortex-A53 | 4 | 1.5 GHz | Vivante GC7000 Lite | 16 sp | 1.0 GHz | 32 GFLOPS | 64 GFLOPS | Edge TPU | 4 TOPS | LPDDR4-3200 1GB | 32bit 12.8 GB/s | eMMC 8GB | $150 | | | Coral Dev Board | NXP i.MX 8M | Cortex-A53 | 4 | 1.5 GHz | Vivante GC7000 Lite | ES3.x | 16 sp | 1.0 GHz | 32 GFLOPS | 64 GFLOPS | Edge TPU | 4 TOPS | LPDDR4-3200 1GB | 32bit 12.8 GB/s | eMMC 8GB | $150 | |
| NVIDIA Jetson Nano (DevKit) | Tegra X1 | Cortex-A57 | 4 | 1.43 GHz | Maxwell | 128 sp | 922 MHz | 236 GFLOPS | 472 GFLOPS | -- | -- | LPDDR4-3200 4GB | 64bit 25.6 GB/s | -- | $99 | | | NVIDIA Jetson Nano (DevKit) | Tegra X1 | Cortex-A57 | 4 | 1.43 GHz | Maxwell | ES3.2/GL4.6/Vulkan/CUDA | 128 sp | 922 MHz | 236 GFLOPS | 472 GFLOPS | -- | -- | LPDDR4-3200 4GB | 64bit 25.6 GB/s | -- | $99 | |
| Raspberry Pi 4B | BCM2711 | Cortex-A72 | 4 | 1.5 GHz | VideoCore VI | sp | 500 MHz | GFLOPS | | -- | -- | LPDDR4-2400 4GB | 32?bit 9.6? GB/s | -- | $35 | | | Raspberry Pi 4B | BCM2711 | Cortex-A72 | 4 | 1.5 GHz | VideoCore VI | ES3.x | sp | 500 MHz | GFLOPS | | -- | -- | LPDDR4-2400 4GB | 32?bit 9.6? GB/s | -- | $55 | |
| Raspberry Pi 3B+ | BCM2837B0 | Cortex-A53 | 4 | 1.4 GHz | VideoCore IV | 48 sp | 400 MHz | 28.8 GFLOPS | -- | -- | -- | LPDDR2-900 1GB | 32bit 3.6 GB/s | -- | $35 | | | Raspberry Pi 3B+ | BCM2837B0 | Cortex-A53 | 4 | 1.4 GHz | VideoCore IV | ES2.0 | 48 sp | 300 MHz | 28.8 GFLOPS | -- | -- | -- | LPDDR2-900 1GB | 32bit 3.6 GB/s | -- | $35 | |
| Dragonboard 410c | Snapdragon 410 | Cortex-A53 | 4 | 1.2 GHz | Adreno 306 | 24 sp | 450 MHz | 21.6 GFLOPS | -- | -- | -- | LPDDR3-1066 1GB | 32bit 4.3 GB/s | eMMC 8GB | $75 | | | Dragonboard 410c | Snapdragon 410 | Cortex-A53 | 4 | 1.2 GHz | Adreno 306 | ES3.0 | 24 sp | 450 MHz | 21.6 GFLOPS | -- | -- | -- | LPDDR3-1066 1GB | 32bit 4.3 GB/s | eMMC 8GB | $75 | |
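
The GPU fp32 and MEM B/W columns in the SBC table appear to follow simple peak-rate arithmetic. The sketch below reproduces the listed values on two assumptions that the page does not state: each sp issues one fused multiply-add (2 FLOPs) per cycle, and memory bandwidth is bus width in bytes times the transfer rate in MT/s. The function names `peak_gflops` and `mem_bandwidth_gbs` are hypothetical helpers written only for this illustration.

```python
def peak_gflops(sp: int, clock_mhz: float, flops_per_cycle: int = 2) -> float:
    """Theoretical peak GFLOPS.

    Assumption: every sp retires one FMA (= 2 FLOPs) per clock.
    Pass flops_per_cycle=4 to model fp16 running at twice the fp32 rate.
    """
    return sp * flops_per_cycle * clock_mhz / 1000.0


def mem_bandwidth_gbs(bus_bits: int, mega_transfers: float) -> float:
    """Theoretical peak memory bandwidth in GB/s.

    Assumption: bandwidth = (bus width in bytes) * (transfer rate in MT/s).
    """
    return bus_bits / 8 * mega_transfers / 1000.0


if __name__ == "__main__":
    # (board, sp, GPU clock in MHz, bus width in bits, MT/s), taken from the table
    boards = [
        ("Coral Dev Board", 16, 1000, 32, 3200),
        ("Jetson Nano", 128, 922, 64, 3200),
        ("Raspberry Pi 3B+", 48, 300, 32, 900),
        ("Dragonboard 410c", 24, 450, 32, 1066),
    ]
    for name, sp, clock, bus, rate in boards:
        print(f"{name}: {peak_gflops(sp, clock):.1f} GFLOPS fp32, "
              f"{mem_bandwidth_gbs(bus, rate):.1f} GB/s")
```

Under these assumptions, the Raspberry Pi 3B+ figure of 28.8 GFLOPS matches the 300 MHz clock in the newer revision rather than the 400 MHz in the older one.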
| |
| |
| |