opengl:vfpbench [2013/09/22 19:02] – [Commentary] oga | opengl:vfpbench [2013/09/23 02:58] – [Commentary] oga |
---|
====== Speed comparison of ARM VFP instructions ====== | ====== Floating-point performance comparison by ARM CPU core (VFP/NEON) ====== |
| |
| |
| |
| |
^ Instruction ^ iPad3 ^ Nexus7^ EVO 3D ^ iPhone5^ iPad4 ^ HTL21 ^ Nexus10^ iPhone5s^ | ^ Instruction ^ (1) ^ (2) ^ (3) ^ (4) ^ (5) ^ (6) ^ (7) ^ (8) ^ (9) ^ (10) ^ (11) ^ (12) ^ |
^ ::: ^ C-A9 ^ C-A9 ^Scorpion^ Swift ^ Swift ^ Krait ^ C-A15 ^ A7CPU ^ | ^ ::: ^iPhone3GS^iPod touch4^ Ziio7 ^ LTNote ^ iPad3 ^ Nexus7 ^ EVO 3D ^ iPhone5 ^ iPad4 ^ HTL21 ^ Nexus10^ iPhone5s ^ |
^ ::: ^ A5X ^ Tegra3^ MSM8660^ A6 ^ A6X ^ APQ8064^Exynos5D^ A7 ^ | ^ ::: ^ C-A8 ^ C-A8 ^ C-A8 ^ C-A9 ^ C-A9 ^ C-A9 ^ Scorpion^ Swift ^ Swift ^ Krait ^ C-A15 ^ ? ^ |
^ ::: ^ ARMv7 ^ ARMv7 ^ ARMv7 ^ ARMv7 ^ ARMv7 ^ ARMv7 ^ ARMv7 ^ ARMv8 A32 ^ | ^ ::: ^ S5PC100 ^ A4 ^ ZMS-08 ^ Tegra2 ^ A5X ^ Tegra3 ^ MSM8660 ^ A6 ^ A6X ^ APQ8064 ^ Exynos5D^ A7 ^ |
^ ::: ^ VFPv3 ^ VFPv3 ^ VFPv3 ^ VFPv4 ^ VFPv4 ^ VFPv4 ^ VFPv4 ^ VFPv4 ^ | ^ ::: ^ ARMv7 ^ ARMv7 ^ ARMv7 ^ ARMv7 ^ ARMv7 ^ ARMv7 ^ ARMv7 ^ ARMv7 ^ ARMv7 ^ ARMv7 ^ ARMv7 ^ ARMv8 A32 ^ |
^ ::: ^ 1.0GHz ^ 1.2GHz ^ 1.2GHz ^ 1.3GHz ^ 1.4GHz ^ 1.5GHz ^ 1.7GHz ^ 1.?GHz ^ | ^ ::: ^ VFPv3 ^ VFPv3 ^ VFPv3 ^ VFPv3 ^ VFPv3 ^ VFPv3 ^ VFPv3 ^ VFPv4 ^ VFPv4 ^ VFPv4 ^ VFPv4 ^ VFPv4 ^ |
| a:m44 vmla_A Q | 4.784 | 3.959 | 2.859 | 1.293 | 1.204 | 1.337 | 0.619 | 0.700 | | ^ ::: ^ 0.6GHz ^ 0.8GHz ^ 1.0GHz ^ 1.0GHz ^ 1.0GHz ^ 1.2GHz ^ 1.2GHz ^ 1.3GHz ^ 1.4GHz ^ 1.5GHz ^ 1.7GHz ^ 1.3GHz? ^ |
| b:m44 vmla_B Q | 2.408 | 2.002 | 1.136 | 1.359 | 1.266 | 0.931 | 0.569 | 0.670 | | | a:m44 vmla_A Q | 9.907 | 7.725 | 7.387 | ----- | 4.784 | 3.959 | 2.859 | 1.293 | 1.204 | 1.337 | 0.619 | 0.700 | |
| c:m44 vmla_A D | 4.781 | 3.980 | 3.053 | 1.669 | 1.554 | 1.889 | 0.557 | 0.649 | | | b:m44 vmla_B Q | 5.874 | 4.200 | 4.542 | ----- | 2.408 | 2.002 | 1.136 | 1.359 | 1.266 | 0.931 | 0.569 | 0.670 | |
| d:m44 vmla_B D | 2.406 | 2.003 | 1.434 | 1.329 | 1.238 | 1.532 | 0.568 | 0.745 | | | c:m44 vmla_A D | 9.816 | 7.052 | 7.331 | ----- | 4.781 | 3.980 | 3.053 | 1.669 | 1.554 | 1.889 | 0.557 | 0.649 | |
| A:m44 vfma_A Q | ----- | ----- | ----- | 1.632 | 1.519 | 1.882 | 0.746 | 0.707 | | | d:m44 vmla_B D | 5.894 | 4.162 | 4.528 | ----- | 2.406 | 2.003 | 1.434 | 1.329 | 1.238 | 1.532 | 0.568 | 0.745 | |
| B:m44 vfma_B Q | ----- | ----- | ----- | 1.594 | 1.484 | 0.695 | 0.840 | 0.699 | | | A:m44 vfma_A Q | ----- | ----- | ----- | ----- | ----- | ----- | ----- | 1.632 | 1.519 | 1.882 | 0.746 | 0.707 | |
| e:fadds A | 4.010 | 3.343 | 3.383 | 3.090 | 2.878 | 2.774 | 2.383 | 3.551 | | | B:m44 vfma_B Q | ----- | ----- | ----- | ----- | ----- | ----- | ----- | 1.594 | 1.484 | 0.695 | 0.840 | 0.699 | |
| f:fmuls A | 4.010 | 3.337 | 3.383 | 3.167 | 2.953 | 2.747 | 2.369 | 3.475 | | | e:fadds A | 50.964 | 36.793 | 49.616 | 4.611 | 4.010 | 3.343 | 3.383 | 3.090 | 2.878 | 2.774 | 2.383 | 3.551 | |
| g:fmacs A | 4.012 | 3.337 | 3.379 | 6.180 | 5.757 | 5.574 | 2.956 | 3.480 | | | f:fmuls A | 49.512 | 36.148 | 55.088 | 4.513 | 4.010 | 3.337 | 3.383 | 3.167 | 2.953 | 2.747 | 2.369 | 3.475 | |
| h:vfma.f32 A | ----- | ----- | ----- | 6.180 | 5.756 | 2.747 | 2.957 | 3.480 | | | g:fmacs A | 78.228 | 56.711 | 99.153 | 4.310 | 4.012 | 3.337 | 3.379 | 6.180 | 5.757 | 5.574 | 2.956 | 3.480 | |
| i:vadd.f32 D A | 4.111 | 3.426 | 3.377 | 3.091 | 2.877 | 2.762 | 1.183 | 1.031 | | | h:vfma.f32 A | ----- | ----- | ----- | ----- | ----- | ----- | ----- | 6.180 | 5.756 | 2.747 | 2.957 | 3.480 | |
| j:vmul.f32 D A | 4.110 | 3.421 | 3.383 | 3.168 | 2.950 | 2.746 | 1.478 | 1.545 | | | i:vadd.f32 D A | 7.331 | 5.136 | 5.509 | ----- | 4.111 | 3.426 | 3.377 | 3.091 | 2.877 | 2.762 | 1.183 | 1.031 | |
| k:vmla.f32 D A | 4.512 | 3.792 | 3.380 | 3.166 | 2.951 | 5.604 | 1.480 | 1.567 | | | j:vmul.f32 D A | 7.044 | 5.134 | 5.511 | ----- | 4.110 | 3.421 | 3.383 | 3.168 | 2.950 | 2.746 | 1.478 | 1.545 | |
| l:vadd.f32 Q A | 8.023 | 6.688 | 3.377 | 3.090 | 2.878 | 2.801 | 2.365 | 1.031 | | | k:vmla.f32 D A | 7.831 | 5.891 | 6.195 | ----- | 4.512 | 3.792 | 3.380 | 3.166 | 2.951 | 5.604 | 1.480 | 1.567 | |
| m:vmul.f32 Q A | 8.022 | 6.681 | 3.384 | 3.166 | 2.952 | 2.761 | 2.364 | 1.548 | | | l:vadd.f32 Q A | 14.065 | 10.339 | 11.016 | ----- | 8.023 | 6.688 | 3.377 | 3.090 | 2.878 | 2.801 | 2.365 | 1.031 | |
| n:vmla.f32 Q A | 8.025 | 6.681 | 3.380 | 3.167 | 2.950 | 5.606 | 2.367 | 1.574 | | | m:vmul.f32 Q A | 14.413 | 10.302 | 11.016 | ----- | 8.022 | 6.681 | 3.384 | 3.166 | 2.952 | 2.761 | 2.364 | 1.548 | |
| o:vfma.f32 D A | ----- | ----- | ----- | 3.167 | 2.494 | 2.833 | 1.479 | 1.574 | | | n:vmla.f32 Q A | 14.381 | 10.287 | 11.035 | ----- | 8.025 | 6.681 | 3.380 | 3.167 | 2.950 | 5.606 | 2.367 | 1.574 | |
| p:fadds B | 4.014 | 3.347 | 5.917 | 6.181 | 5.756 | 3.467 | 2.956 | 6.953 | | | o:vfma.f32 D A | ----- | ----- | ----- | ----- | ----- | ----- | ----- | 3.167 | 2.494 | 2.833 | 1.479 | 1.574 | |
| q:fmuls B | 5.013 | 4.195 | 5.917 | 6.180 | 5.756 | 3.556 | 3.558 | 6.652 | | | p:fadds B | 49.628 | 36.236 | 50.921 | 4.611 | 4.014 | 3.347 | 5.917 | 6.181 | 5.756 | 3.467 | 2.956 | 6.953 | |
| r:fmacs B | 8.023 | 6.688 | 8.451 | 12.361 | 11.514 | 6.298 | 5.912 | 9.867 | | | q:fmuls B | 49.711 | 36.011 | 89.488 | 5.315 | 5.013 | 4.195 | 5.917 | 6.180 | 5.756 | 3.556 | 3.558 | 6.652 | |
| s:vfma.f32 B | ----- | ----- | ----- | 12.363 | 11.513 | 3.430 | 5.910 | 9.859 | | | r:fmacs B | 78.259 | 56.706 | 172.132 | 8.033 | 8.023 | 6.688 | 8.451 | 12.361 | 11.514 | 6.298 | 5.912 | 9.867 | |
| t:vadd.f32 D B | 4.113 | 3.421 | 5.916 | 3.090 | 2.881 | 3.529 | 2.958 | 3.663 | | | s:vfma.f32 B | ----- | ----- | ----- | ----- | ----- | ----- | ----- | 12.363 | 11.513 | 3.430 | 5.910 | 9.859 | |
| u:vmul.f32 D B | 4.118 | 3.422 | 5.073 | 3.169 | 2.949 | 3.447 | 2.364 | 3.114 | | | t:vadd.f32 D B | 7.280 | 5.139 | 5.507 | ----- | 4.113 | 3.421 | 5.916 | 3.090 | 2.881 | 3.529 | 2.958 | 3.663 | |
| v:vmla.f32 D B | 9.027 | 7.561 | 8.451 | 6.180 | 5.755 | 6.293 | 4.728 | 6.185 | | | u:vmul.f32 D B | 7.044 | 5.140 | 5.511 | ----- | 4.118 | 3.422 | 5.073 | 3.169 | 2.949 | 3.447 | 2.364 | 3.114 | |
| w:vadd.f32 Q B | 8.021 | 6.705 | 5.916 | 3.090 | 2.879 | 3.457 | 2.961 | 3.659 | | | v:vmla.f32 D B | 16.371 | 11.661 | 12.392 | ----- | 9.027 | 7.561 | 8.451 | 6.180 | 5.755 | 6.293 | 4.728 | 6.185 | |
| x:vmul.f32 Q B | 8.029 | 6.683 | 5.074 | 3.167 | 2.950 | 3.428 | 2.363 | 3.101 | | | w:vadd.f32 Q B | 14.209 | 10.354 | 11.020 | ----- | 8.021 | 6.705 | 5.916 | 3.090 | 2.879 | 3.457 | 2.961 | 3.659 | |
| y:vmla.f32 Q B | 9.026 | 7.532 | 8.457 | 6.179 | 5.759 | 6.372 | 4.729 | 6.199 | | | x:vmul.f32 Q B | 13.879 | 10.365 | 11.016 | ----- | 8.029 | 6.683 | 5.074 | 3.167 | 2.950 | 3.428 | 2.363 | 3.101 | |
| z:vfma.f32 D B | ----- | ----- | ----- | 6.181 | 5.755 | 3.437 | 4.730 | 6.188 | | | y:vmla.f32 Q B | 16.026 | 11.578 | 12.394 | ----- | 9.026 | 7.532 | 8.457 | 6.179 | 5.759 | 6.372 | 4.729 | 6.199 | |
| | z:vfma.f32 D B | ----- | ----- | ----- | ----- | ----- | ----- | ----- | 6.181 | 5.755 | 3.437 | 4.730 | 6.188 | |
| |
| * ↑ Values are execution times in seconds; smaller is faster. |
| * All operations are single-precision 32-bit float. |
| |
| ^ id ^ device ^ SoC ^ CPU core ^ core ^ clock ^ arch ^ VFP/SIMD ^ OS ^ |
| | (1) | Apple iPhone 3GS | S5PC100 | Cortex-A8 | 1 | 0.6GHz | ARMv7A | VFPv3-D32 + NEON | i6.1 | |
| | (2) | Apple iPod touch 4 | A4 | Cortex-A8 | 1 | 0.8GHz | ARMv7A | VFPv3-D32 + NEON | i6.1 | |
| | (3) | Creative Ziio7 | ZMS-08 | Cortex-A8 | 1 | 1.0GHz | ARMv7A | VFPv3-D32 + NEON | A2.2 | |
| | (4) | Life Touch Note NA70W | Tegra250 | Cortex-A9 | 2 | 1.0GHz | ARMv7A | VFPv3-D16 | A2.2 | |
| | (5) | Apple iPad 3 | A5X | Cortex-A9 | 2 | 1.0GHz | ARMv7A | VFPv3-D32 + NEON | i6.1 | |
| | (6) | ASUS Nexus 7 | Tegra 3 | Cortex-A9 | 4 | 1.2GHz | ARMv7A | VFPv3-D32 + NEON | A4.2 | |
| | (7) | HTC EVO 3D ISW12HT | MSM8660 | Scorpion | 2 | 1.2GHz | ARMv7A | VFPv3-D32 + NEON | A4.0 | |
| | (8) | Apple iPhone 5 | A6 | Swift | 2 | 1.3GHz | ARMv7A | VFPv4-D32 + NEON | i6.1 | |
| | (9) | Apple iPad 4 | A6X | Swift | 2 | 1.4GHz | ARMv7A | VFPv4-D32 + NEON | i6.1 | |
| | (10) | HTC J butterfly HTL21 | APQ8064 | Krait | 4 | 1.5GHz | ARMv7A | VFPv4-D32 + NEON | A4.1 | |
| | (11) | Samsung Nexus 10 | Exynos5D | Cortex-A15 | 2 | 1.7GHz | ARMv7A | VFPv4-D32 + NEON | A4.2 | |
| | (12) | Apple iPhone 5s | A7 | ? A7 CPU | 2 | 1.3GHz | ARMv8A + AArch32 | VFPv4-D32 + NEON | i7.0 | |
| |
| |
| * For details of the test items (a: to z:), see the commentary articles listed below. |
| |
===== Commentary ===== | ===== Commentary ===== |
| |
| * [[http://wlog.flatlib.jp/archive/1/2013-9-23|2013/09/23 Floating-point performance of the iPhone 5s A7 CPU (32bit)]] |
* [[http://wlog.flatlib.jp/archive/1/2013-4-8|2013/04/08 Floating-point performance of the Nexus 10 CPU Cortex-A15]] | * [[http://wlog.flatlib.jp/archive/1/2013-4-8|2013/04/08 Floating-point performance of the Nexus 10 CPU Cortex-A15]] |
* [[http://wlog.flatlib.jp/archive/1/2013-1-9|2013/01/09 Speed of the Qualcomm APQ8064 GPU Adreno 320]] | * [[http://wlog.flatlib.jp/archive/1/2013-1-9|2013/01/09 Speed of the Qualcomm APQ8064 GPU Adreno 320]] |
* [[http://wlog.flatlib.jp/archive/1/2011-5-2|2011/05/02 The real floating-point capability of Snapdragon]] | * [[http://wlog.flatlib.jp/archive/1/2011-5-2|2011/05/02 The real floating-point capability of Snapdragon]] |
* [[http://wlog.flatlib.jp/archive/1/2009-10-4|2009/10/04 NEON on ARM Cortex-A8 and floating-point optimization]] | * [[http://wlog.flatlib.jp/archive/1/2009-10-4|2009/10/04 NEON on ARM Cortex-A8 and floating-point optimization]] |
| |
| |
| |
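
The table lists raw execution times, so devices running at different clocks are not directly comparable per cycle. One rough way to compare the cores themselves is to multiply each time by the listed clock, which gives a value proportional to the cycles consumed. The sketch below does this for the "i:vadd.f32 D A" row, using times and clocks copied from the table. It assumes each device actually ran at its listed nominal clock for the whole test, which the table does not confirm; the function names are illustrative and not part of the benchmark.

```python
# Times (seconds) from the "i:vadd.f32 D A" row, paired with the nominal
# clock (GHz) listed for each device. Assumption: the CPU really ran at
# this clock during the measurement (no throttling or frequency scaling).
VADD_D_A = {
    "iPad3 (Cortex-A9)": (4.111, 1.0),
    "Nexus7 (Cortex-A9)": (3.426, 1.2),
    "EVO 3D (Scorpion)": (3.377, 1.2),
    "iPhone5 (Swift)": (3.091, 1.3),
    "Nexus10 (Cortex-A15)": (1.183, 1.7),
}


def clock_normalized(seconds, ghz):
    """Return time x clock, a value proportional to cycles consumed."""
    return seconds * ghz


def relative_cost(table, baseline):
    """Express each device's clock-normalized cost relative to `baseline`."""
    base = clock_normalized(*table[baseline])
    return {name: clock_normalized(*entry) / base for name, entry in table.items()}


if __name__ == "__main__":
    ratios = relative_cost(VADD_D_A, "iPad3 (Cortex-A9)")
    for name, ratio in ratios.items():
        print(f"{name:22s} {ratio:.3f}")
```

A ratio near 1.0 suggests the same per-clock throughput as the baseline, and a lower ratio suggests fewer cycles for the same work. The iPhone 5s is left out because its clock is marked as uncertain ("1.3GHz?") in the table.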