ai:npu [2019/07/04 23:09] – [Neural Network Accelerator (NPU/NNE)] oga | ai:npu [2019/12/10 22:14] – [SBC] oga |
---|
| AMD RADEON VII | 1.75GHz | 3840sp | 13.8 TFLOPS | 27.7 TFLOPS | | | 55.3 TOPS | 110.7 TOPS | | | AMD RADEON VII | 1.75GHz | 3840sp | 13.8 TFLOPS | 27.7 TFLOPS | | | 55.3 TOPS | 110.7 TOPS | |
| NVIDIA GeForce RTX 2060 (Turing) | 1.68GHz | 240tc 1920sp | 6.45 TFLOPS | 51.6 (12.90) TFLOPS | | | 103.2 TOPS | 206.4 TOPS | | | NVIDIA GeForce RTX 2060 (Turing) | 1.68GHz | 240tc 1920sp | 6.45 TFLOPS | 51.6 (12.90) TFLOPS | | | 103.2 TOPS | 206.4 TOPS | |
| | NVIDIA GeForce RTX 2060 Super (Turing) | 1.65GHz | 272tc 2176sp | 7.18 TFLOPS | 57.4 (14.36) TFLOPS | | | 114.9 TOPS | 229.8 TOPS | |
| NVIDIA GeForce RTX 2070 (Turing) | 1.62GHz | 288tc 2304sp | 7.46 TFLOPS | 59.7 (14.93) TFLOPS | | | 119.4 TOPS | 238.9 TOPS | | | NVIDIA GeForce RTX 2070 (Turing) | 1.62GHz | 288tc 2304sp | 7.46 TFLOPS | 59.7 (14.93) TFLOPS | | | 119.4 TOPS | 238.9 TOPS | |
| NVIDIA GeForce RTX 2070 Super (Turing) | 1.77GHz | 320tc 2560sp | 9.06 TFLOPS | 72.5 (18.12) TFLOPS | | | 145.0 TOPS | 290.0 TOPS | | | NVIDIA GeForce RTX 2070 Super (Turing) | 1.77GHz | 320tc 2560sp | 9.06 TFLOPS | 72.5 (18.12) TFLOPS | | | 145.0 TOPS | 290.0 TOPS | |
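The peak figures in the GPU table above appear to follow from unit count times clock times operations per clock. The sketch below reproduces the RTX 2060 row under assumptions that are inferred from the numbers rather than stated in the source: 2 FLOPs per shader processor per clock (one FMA), 128 fp16 FLOPs per Tensor Core per clock (64 FMAs), and int8 and int4 running at 2x and 4x the fp16 tensor rate. The function name `peak_tops` is illustrative only.

```python
def peak_tops(units: int, clock_ghz: float, ops_per_unit_per_clock: int) -> float:
    """Theoretical peak in tera-operations per second (TFLOPS or TOPS)."""
    # units * GHz * ops/clock is giga-operations per second; /1000 converts to tera.
    return units * clock_ghz * ops_per_unit_per_clock / 1000.0


# RTX 2060 row: 1920 sp, 240 Tensor Cores (tc), 1.68 GHz
CLOCK_GHZ = 1.68

fp32 = peak_tops(1920, CLOCK_GHZ, 2)          # assumed: 1 FMA = 2 FLOPs per sp per clock
fp16_shader = 2 * fp32                        # the value shown in parentheses in the table
fp16_tensor = peak_tops(240, CLOCK_GHZ, 128)  # assumed: 64 FMAs per Tensor Core per clock
int8_tensor = 2 * fp16_tensor                 # assumed: 2x the fp16 tensor rate
int4_tensor = 4 * fp16_tensor                 # assumed: 4x the fp16 tensor rate

print(f"fp32 {fp32:.2f} TFLOPS, fp16 {fp16_tensor:.1f} ({fp16_shader:.2f}) TFLOPS")
print(f"int8 {int8_tensor:.1f} TOPS, int4 {int4_tensor:.1f} TOPS")
```

These are theoretical peaks at the listed clock; sustained throughput depends on memory bandwidth and actual boost behavior.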
References consulted | References consulted |
| |
* [[https://wlog.flatlib.jp/archive/1/2019-6-16|Trying out the Snapdragon 845 ARMv8.2A half-precision fp16 arithmetic instructions / Deep Learning instructions]] | |
* [[https://ark.intel.com/content/www/jp/ja/ark/products/186605/intel-core-i9-9900k-processor-16m-cache-up-to-5-00-ghz.html|Core i9-9900K]] | * [[https://ark.intel.com/content/www/jp/ja/ark/products/186605/intel-core-i9-9900k-processor-16m-cache-up-to-5-00-ghz.html|Core i9-9900K]] |
* [[https://en.wikichip.org/wiki/hisilicon/kirin/970|Kirin 970 NPU]] | * [[https://en.wikichip.org/wiki/hisilicon/kirin/970|Kirin 970 NPU]] |
* [[https://images.nvidia.com/content/pdf/tesla/Volta-Architecture-Whitepaper-v1.1-jp.pdf]] | * [[https://images.nvidia.com/content/pdf/tesla/Volta-Architecture-Whitepaper-v1.1-jp.pdf]] |
| |
| Related |
| |
| * [[https://wlog.flatlib.jp/archive/1/2019-6-16|Trying out the Snapdragon 845 ARMv8.2A half-precision fp16 arithmetic instructions / Deep Learning instructions]] |
===== NVIDIA TensorCore ===== | ===== NVIDIA TensorCore ===== |
| |
| |
| |
^ SBC ^ SoC ^ CPU core ^ core ^ CPU clock ^ GPU ^ sp ^ GPU clock ^ GPU fp32 ^ GPU fp16 ^ NPU ^ NPU int16 ^ RAM ^ MEM B/W ^ ROM ^ | ^ SBC ^ SoC ^ CPU core ^ IA ^ core ^ CPU clock ^ CPU fp32 ^ GPU ^ GPU API ^ sp ^ GPU clock ^ GPU fp32 ^ GPU fp16 ^ ROP ^ NPU ^ NPU int16 ^ RAM ^ MEM B/W ^ ROM ^ price ^ |
| Coral Dev Board | NXP i.MX 8M | Cortex-A53 | 4 | 1.5 GHz | Vivante GC7000 Lite | 16 sp | 1.0 GHz | 32 GFLOPS | 64 GFLOPS | Edge TPU | 4 TOPS | LPDDR4-3200 1GB | 32bit 12.8 GB/s | eMMC 8GB | | | Coral Dev Board | NXP i.MX 8M | Cortex-A53 |ARMv8.0A| 4 | 1.5 GHz | 48 GFLOPS | Vivante GC7000 Lite | ES3.x | 16 sp | 1.0 GHz | 32 GFLOPS | 64 GFLOPS | 1 | Edge TPU | 4 TOPS | LPDDR4-3200 1GB | 32bit 12.8 GB/s | eMMC 8GB | $150 | |
| NVIDIA Jetson Nano | Tegra X1 | Cortex-A57 | 4 | 1.4 GHz | Maxwell | 128 sp | 0.92 GHz | 236 GFLOPS | 472 GFLOPS | -- | -- | LPDDR4-3200 4GB | 64bit 25.6 GB/s | eMMC 16GB | | | NVIDIA Jetson Nano (DevKit) | Tegra X1 | Cortex-A57 |ARMv8.0A| 4 | 1.43 GHz | 46 GFLOPS | Maxwell | ES3.2/GL4.6/Vulkan/CUDA | 128 sp | 922 MHz | 236 GFLOPS | 472 GFLOPS | 16 | -- | -- | LPDDR4-3200 4GB | 64bit 25.6 GB/s | -- | $99 | |
| Raspberry Pi 4 | BCM2711 | Cortex-A72 | 4 | 1.5 GHz | VideoCore VI | sp | 0.5 GHz | GFLOPS | | -- | -- | LPDDR4-2400 4GB | ?bit ? GB/s | -- | | | Raspberry Pi 4B | BCM2711 | Cortex-A72 |ARMv8.0A| 4 | 1.5 GHz | 48 GFLOPS | VideoCore VI | ES3.x | 48? sp | 500 MHz | 32? GFLOPS | 64? GFLOPS | | -- | -- | LPDDR4-2400 1-4GB | 32bit 9.6 GB/s | -- | $35-55 | |
| Raspberry Pi 3+ | BCM2837B0 | Cortex-A53 | 4 | 1.4 GHz | VideoCore IV | 48 sp | 0.3 GHz | 28.8 GFLOPS | | -- | -- | LPDDR2-900 1GB | 32bit 3.6 GB/s | -- | | | Raspberry Pi 3B+ | BCM2837B0 | Cortex-A53 |ARMv8.0A| 4 | 1.4 GHz | 45 GFLOPS | VideoCore IV | ES2.0 | 48 sp | 300 MHz | 28.8 GFLOPS | -- | 4 | -- | -- | LPDDR2-900 1GB | 32bit 3.6 GB/s | -- | $35 | |
| | Raspberry Pi 3B | BCM2837 | Cortex-A53 |ARMv8.0A| 4 | 1.2 GHz | 38 GFLOPS | VideoCore IV | ES2.0 | 48 sp | 300 MHz | 28.8 GFLOPS | -- | 4 | -- | -- | LPDDR2-900 1GB | 32bit 3.6 GB/s | -- | $35 | |
| | Raspberry Pi 2B v1.2 | BCM2837 | Cortex-A53 |ARMv8.0A| 4 | 0.9 GHz | 29 GFLOPS | VideoCore IV | ES2.0 | 48 sp | 300 MHz | 28.8 GFLOPS | -- | 4 | -- | -- | LPDDR2-900 1GB | 32bit 3.6 GB/s | -- | $35 | |
| | Raspberry Pi 2B | BCM2836 | Cortex-A7 |ARMv7A | 4 | 0.9 GHz | 7 GFLOPS | VideoCore IV | ES2.0 | 48 sp | 250 MHz | 24.0 GFLOPS | -- | 4 | -- | -- | LPDDR2-900 1GB | 32bit 3.6 GB/s | -- | $35 | |
| | Raspberry Pi 1B | BCM2835 | ARM1176JZF-S |ARMv6 | 1 | 0.7 GHz | 0.7 GFLOPS | VideoCore IV | ES2.0 | 48 sp | 250 MHz | 24.0 GFLOPS | -- | 4 | -- | -- | 0.5GB | | -- | $35 | |
| | Dragonboard 410c | Snapdragon 410 | Cortex-A53 |ARMv8.0A| 4 | 1.2 GHz | 38 GFLOPS | Adreno 306 | ES3.0 | 24 sp | 450 MHz | 21.6 GFLOPS | -- | 2? | -- | -- | LPDDR3-1066 1GB | 32bit 4.3 GB/s | eMMC 8GB | $75 | |
| | ASUS Tinker Board | RK3288 | Cortex-A17 |ARMv7A | 4 | 1.8 GHz | 58 GFLOPS | Mali-T764MP4 | ES3.x | 68 sp | 600 MHz | 81.6 GFLOPS | 163.2 GFLOPS | 4 | -- | -- | LPDDR3 2GB | 64bit GB/s | -- | $60 | |
| |
| |
| |
| |
| |
| * [[https://www.asus.com/jp/Single-Board-Computer/Tinker-Board/specifications/|ASUS: Tinker Board]] |
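| * [[https://www.4gamer.net/games/137/G013737/20131031011/|4Gamer: ARM announces the next-generation "Mali-T700" GPU cores: a two-product lineup with a high-end part that doubles the shader cores and an entry-level part specialized for Android]] |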
| |
| <code> |
| Mali-T760 17sp/core |
| </code> |
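The SBC table's GFLOPS columns seem to be derived the same way, and the "17sp/core" note explains the Tinker Board's 68 sp (Mali-T764MP4 has 4 cores). The sketch below checks a few rows. The per-cycle factors are assumptions inferred from the table values, not figures stated in the source: 2 FLOPs per sp per clock on the GPU side, 8 fp32 FLOPs per core per cycle for Cortex-A17/A53/A57/A72, and 2 for Cortex-A7. Function names are illustrative.

```python
def gpu_fp32_gflops(sp: int, clock_mhz: float) -> float:
    """Peak GPU fp32 GFLOPS, assuming one FMA (2 FLOPs) per sp per clock."""
    return sp * clock_mhz * 2 / 1000.0


def cpu_fp32_gflops(cores: int, clock_ghz: float, flops_per_cycle: int) -> float:
    """Peak CPU fp32 GFLOPS for a given per-core FLOPs/cycle (assumed value)."""
    return cores * clock_ghz * flops_per_cycle


# ASUS Tinker Board: Mali-T764MP4 = 4 cores x 17 sp/core at 600 MHz
tinker_sp = 17 * 4
tinker_gpu_fp32 = gpu_fp32_gflops(tinker_sp, 600)
tinker_gpu_fp16 = 2 * tinker_gpu_fp32         # table lists fp16 at twice the fp32 rate
tinker_cpu = cpu_fp32_gflops(4, 1.8, 8)       # Cortex-A17, assumed 8 FLOPs/cycle

# Jetson Nano: 128 sp at 922 MHz; Raspberry Pi 2B: Cortex-A7, assumed 2 FLOPs/cycle
nano_gpu_fp32 = gpu_fp32_gflops(128, 922)
pi2_cpu = cpu_fp32_gflops(4, 0.9, 2)

print(f"Tinker Board: {tinker_sp} sp, GPU {tinker_gpu_fp32:.1f}/{tinker_gpu_fp16:.1f} GFLOPS, CPU {tinker_cpu:.0f} GFLOPS")
print(f"Jetson Nano GPU {nano_gpu_fp32:.0f} GFLOPS, Raspberry Pi 2B CPU {pi2_cpu:.0f} GFLOPS")
```

The results match the table after rounding, which suggests the table's CPU column is a theoretical NEON peak rather than a measured value.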
| |
| |