opengl:vfpbench [2013/09/22 18:54] – oga | opengl:vfpbench [2014/01/26 21:18] – [Floating-point performance comparison by ARM CPU core (VFP/NEON)] oga |
---|
====== Speed comparison of ARM VFP instructions ====== | ====== Floating-point performance comparison by ARM CPU core (VFP/NEON) ====== |
| |
| |
| The benchmark app used for these measurements has been released. |
| |
| * [[:app:vfpbench|VFP Benchmark]] |
| * [[:opengl:cpuflops|Theoretical CPU FLOPS]] |
| |
^ ^ C-A9 ^ C-A9 ^Scorpion^ Swift ^ Swift ^ Krait ^ C-A15 ^ | |
^ Instruction ^ A5X ^ Tegra3^ MSM8660^ A6 ^ A6X ^ APQ8064^Exynos5D^ | |
^ ::: ^ VFPv3 ^ VFPv3 ^ VFPv3 ^ VFPv4 ^ VFPv4 ^ VFPv4 ^ VFPv4 ^ | ^ Instruction (AArch32) ^ (1) ^ (2) ^ (3) ^ (4) ^ (5) ^ (6) ^ (7) ^ (8) ^ (9) ^ (10) ^ (11) ^ (12) ^ (13) ^ (14) ^ Instruction (AArch64) ^ |
| a:m44 vmla_A Q | 4.784 | 3.959 | 2.859 | 1.293 | 1.204 | 1.337 | 0.619 | | ^ ::: ^iPhone3GS^iPod touch4^ Ziio7 ^ LTNote ^ iPad3 ^ Nexus7 ^ EVO 3D ^ iPhone5 ^ iPad4 ^ HTL21 ^ Nexus10^ iPhone5s ^ iPhone5s ^ Kin HDX7 ^ ::: ^ |
| b:m44 vmla_B Q | 2.408 | 2.002 | 1.136 | 1.359 | 1.266 | 0.931 | 0.569 | | ^ ::: ^ C-A8 ^ C-A8 ^ C-A8 ^ C-A9 ^ C-A9 ^ C-A9 ^ Scorpion^ Swift ^ Swift ^ Krait ^ C-A15 ^ Cyclone ^ Cyclone ^ Krait4 ^ ::: ^ |
| c:m44 vmla_A D | 4.781 | 3.980 | 3.053 | 1.669 | 1.554 | 1.889 | 0.557 | | ^ ::: ^ S5PC100 ^ A4 ^ ZMS-08 ^ Tegra2 ^ A5X ^ Tegra3 ^ MSM8660 ^ A6 ^ A6X ^ APQ8064 ^ Exynos5D^ A7 ^ A7 ^ MSM8974 ^ ::: ^ |
| d:m44 vmla_B D | 2.406 | 2.003 | 1.434 | 1.329 | 1.238 | 1.532 | 0.568 | | ^ ::: ^ ARMv7 ^ ARMv7 ^ ARMv7 ^ ARMv7 ^ ARMv7 ^ ARMv7 ^ ARMv7 ^ ARMv7 ^ ARMv7 ^ ARMv7 ^ ARMv7 ^ ARMv8 A32 ^ ARMv8 A64 ^ ARMv7 ^ ::: ^ |
| A:m44 vfma_A Q | ----- | ----- | ----- | 1.632 | 1.519 | 1.882 | 0.746 | | ^ ::: ^ VFPv3 ^ VFPv3 ^ VFPv3 ^ VFPv3 ^ VFPv3 ^ VFPv3 ^ VFPv3 ^ VFPv4 ^ VFPv4 ^ VFPv4 ^ VFPv4 ^ VFPv4 ^ AdvNEON ^ VFPv4 ^ ::: ^ |
| B:m44 vfma_B Q | ----- | ----- | ----- | 1.594 | 1.484 | 0.695 | 0.840 | | ^ ::: ^ 0.6GHz ^ 0.8GHz ^ 1.0GHz ^ 1.0GHz ^ 1.0GHz ^ 1.2GHz ^ 1.2GHz ^ 1.3GHz ^ 1.4GHz ^ 1.5GHz ^ 1.7GHz ^ 1.3GHz? ^ 1.3GHz? ^ 2.2GHz ^ ::: ^ |
| e:fadds A | 4.010 | 3.343 | 3.383 | 3.090 | 2.878 | 2.774 | 2.383 | | | a:m44 vmla_A Q | 9.907 | 7.725 | 7.387 | ----- | 4.784 | 3.959 | 2.859 | 1.293 | 1.204 | 1.337 | 0.619 | 0.700 | ----- | 0.661 | | |
| f:fmuls A | 4.010 | 3.337 | 3.383 | 3.167 | 2.953 | 2.747 | 2.369 | | | b:m44 vmla_B Q | 5.874 | 4.200 | 4.542 | ----- | 2.408 | 2.002 | 1.136 | 1.359 | 1.266 | 0.931 | 0.569 | 0.670 | ----- | 0.542 | | |
| g:fmacs A | 4.012 | 3.337 | 3.379 | 6.180 | 5.757 | 5.574 | 2.956 | | | c:m44 vmla_A D | 9.816 | 7.052 | 7.331 | ----- | 4.781 | 3.980 | 3.053 | 1.669 | 1.554 | 1.889 | 0.557 | 0.649 | ----- | 0.888 | | |
| h:vfma.f32 A | ----- | ----- | ----- | 6.180 | 5.756 | 2.747 | 2.957 | | | d:m44 vmla_B D | 5.894 | 4.162 | 4.528 | ----- | 2.406 | 2.003 | 1.434 | 1.329 | 1.238 | 1.532 | 0.568 | 0.745 | ----- | 0.768 | | |
| i:vadd.f32 D A | 4.111 | 3.426 | 3.377 | 3.091 | 2.877 | 2.762 | 1.183 | | | A:m44 vfma_A Q | ----- | ----- | ----- | ----- | ----- | ----- | ----- | 1.632 | 1.519 | 1.882 | 0.746 | 0.707 | 0.692 | 1.178 | fmla v AQ | |
| j:vmul.f32 D A | 4.110 | 3.421 | 3.383 | 3.168 | 2.950 | 2.746 | 1.478 | | | B:m44 vfma_B Q | ----- | ----- | ----- | ----- | ----- | ----- | ----- | 1.594 | 1.484 | 0.695 | 0.840 | 0.699 | 0.696 | 0.463 | fmla v BQ | |
| k:vmla.f32 D A | 4.512 | 3.792 | 3.380 | 3.166 | 2.951 | 5.604 | 1.480 | | | e:fadds A | 50.964 | 36.793 | 49.616 | 4.611 | 4.010 | 3.343 | 3.383 | 3.090 | 2.878 | 2.774 | 2.383 | 3.551 | 1.043 | 1.864 | fadd s A | |
| l:vadd.f32 Q A | 8.023 | 6.688 | 3.377 | 3.090 | 2.878 | 2.801 | 2.365 | | | f:fmuls A | 49.512 | 36.148 | 55.088 | 4.513 | 4.010 | 3.337 | 3.383 | 3.167 | 2.953 | 2.747 | 2.369 | 3.475 | 1.548 | 1.867 | fmul s A | |
| m:vmul.f32 Q A | 8.022 | 6.681 | 3.384 | 3.166 | 2.952 | 2.761 | 2.364 | | | g:fmacs A | 78.228 | 56.711 | 99.153 | 4.310 | 4.012 | 3.337 | 3.379 | 6.180 | 5.757 | 5.574 | 2.956 | 3.480 | ----- | 2.052 | | |
| n:vmla.f32 Q A | 8.025 | 6.681 | 3.380 | 3.167 | 2.950 | 5.606 | 2.367 | | | h:vfma.f32 A | ----- | ----- | ----- | ----- | ----- | ----- | ----- | 6.180 | 5.756 | 2.747 | 2.957 | 3.480 | 3.185 | 1.864 | fmadd s A | |
| o:vfma.f32 D A | ----- | ----- | ----- | 3.167 | 2.494 | 2.833 | 1.479 | | | i:vadd.f32 D A | 7.331 | 5.136 | 5.509 | ----- | 4.111 | 3.426 | 3.377 | 3.091 | 2.877 | 2.762 | 1.183 | 1.031 | 1.031 | 1.866 | fadd.2s D A | |
| p:fadds B | 4.014 | 3.347 | 5.917 | 6.181 | 5.756 | 3.467 | 2.956 | | | j:vmul.f32 D A | 7.044 | 5.134 | 5.511 | ----- | 4.110 | 3.421 | 3.383 | 3.168 | 2.950 | 2.746 | 1.478 | 1.545 | 1.545 | 1.864 | fmul.2s D A | |
| q:fmuls B | 5.013 | 4.195 | 5.917 | 6.180 | 5.756 | 3.556 | 3.558 | | | k:vmla.f32 D A | 7.831 | 5.891 | 6.195 | ----- | 4.512 | 3.792 | 3.380 | 3.166 | 2.951 | 5.604 | 1.480 | 1.567 | ----- | 2.051 | | |
| r:fmacs B | 8.023 | 6.688 | 8.451 | 12.361 | 11.514 | 6.298 | 5.912 | | | o:vfma.f32 D A | ----- | ----- | ----- | ----- | ----- | ----- | ----- | 3.167 | 2.494 | 2.833 | 1.479 | 1.574 | 1.753 | 1.871 | fmla.2s D A | |
| s:vfma.f32 B | ----- | ----- | ----- | 12.363 | 11.513 | 3.430 | 5.910 | | | l:vadd.f32 Q A | 14.065 | 10.339 | 11.016 | ----- | 8.023 | 6.688 | 3.377 | 3.090 | 2.878 | 2.801 | 2.365 | 1.031 | 1.039 | 1.872 | fadd.4s Q A | |
| t:vadd.f32 D B | 4.113 | 3.421 | 5.916 | 3.090 | 2.881 | 3.529 | 2.958 | | | m:vmul.f32 Q A | 14.413 | 10.302 | 11.016 | ----- | 8.022 | 6.681 | 3.384 | 3.166 | 2.952 | 2.761 | 2.364 | 1.548 | 1.548 | 1.879 | fmul.4s Q A | |
| u:vmul.f32 D B | 4.118 | 3.422 | 5.073 | 3.169 | 2.949 | 3.447 | 2.364 | | | n:vmla.f32 Q A | 14.381 | 10.287 | 11.035 | ----- | 8.025 | 6.681 | 3.380 | 3.167 | 2.950 | 5.606 | 2.367 | 1.574 | ----- | 2.059 | | |
| v:vmla.f32 D B | 9.027 | 7.561 | 8.451 | 6.180 | 5.755 | 6.293 | 4.728 | | | N:vfma.f32 Q A | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | 1.696 | ----- | fmla.4s Q A | |
| w:vadd.f32 Q B | 8.021 | 6.705 | 5.916 | 3.090 | 2.879 | 3.457 | 2.961 | | | p:fadds B | 49.628 | 36.236 | 50.921 | 4.611 | 4.014 | 3.347 | 5.917 | 6.181 | 5.756 | 3.467 | 2.956 | 6.953 | 3.663 | ----- | fadd s B | |
| x:vmul.f32 Q B | 8.029 | 6.683 | 5.074 | 3.167 | 2.950 | 3.428 | 2.363 | | | q:fmuls B | 49.711 | 36.011 | 89.488 | 5.315 | 5.013 | 4.195 | 5.917 | 6.180 | 5.756 | 3.556 | 3.558 | 6.652 | 3.296 | ----- | fmul s B | |
| y:vmla.f32 Q B | 9.026 | 7.532 | 8.457 | 6.179 | 5.759 | 6.372 | 4.729 | | | r:fmacs B | 78.259 | 56.706 | 172.132 | 8.033 | 8.023 | 6.688 | 8.451 | 12.361 | 11.514 | 6.298 | 5.912 | 9.867 | ----- | ----- | | |
| z:vfma.f32 D B | ----- | ----- | ----- | 6.181 | 5.755 | 3.437 | 4.730 | | | s:vfma.f32 B | ----- | ----- | ----- | ----- | ----- | ----- | ----- | 12.363 | 11.513 | 3.430 | 5.910 | 9.859 | 3.292 | ----- | fmadd s B | |
| | t:vadd.f32 D B | 7.280 | 5.139 | 5.507 | ----- | 4.113 | 3.421 | 5.916 | 3.090 | 2.881 | 3.529 | 2.958 | 3.663 | 3.643 | 1.865 | fadd.2s D B | |
| | u:vmul.f32 D B | 7.044 | 5.140 | 5.511 | ----- | 4.118 | 3.422 | 5.073 | 3.169 | 2.949 | 3.447 | 2.364 | 3.114 | 3.289 | 2.339 | fmul.2s D B | |
| | v:vmla.f32 D B | 16.371 | 11.661 | 12.392 | ----- | 9.027 | 7.561 | 8.451 | 6.180 | 5.755 | 6.293 | 4.728 | 6.185 | ----- | 3.773 | | |
| | z:vfma.f32 D B | ----- | ----- | ----- | ----- | ----- | ----- | ----- | 6.181 | 5.755 | 3.437 | 4.730 | 6.188 | 6.237 | 2.340 | fmla.2s D B | |
| | w:vadd.f32 Q B | 14.209 | 10.354 | 11.020 | ----- | 8.021 | 6.705 | 5.916 | 3.090 | 2.879 | 3.457 | 2.961 | 3.659 | 3.641 | 1.875 | fadd.4s Q B | |
| | x:vmul.f32 Q B | 13.879 | 10.365 | 11.016 | ----- | 8.029 | 6.683 | 5.074 | 3.167 | 2.950 | 3.428 | 2.363 | 3.101 | 3.276 | 2.340 | fmul.4s Q B | |
| | y:vmla.f32 Q B | 16.026 | 11.578 | 12.394 | ----- | 9.026 | 7.532 | 8.457 | 6.179 | 5.759 | 6.372 | 4.729 | 6.199 | ----- | 3.746 | | |
| | Y:vfma.f32 Q B | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | 6.226 | ----- | fmla.4s Q B | |
| |
| * The values above are execution times in seconds; lower is faster. |
| * All operations are single-precision (32-bit float). |
| |
| ^ id ^ device ^ SoC ^ CPU core ^ core ^ clock ^ arch ^ VFP/SIMD ^ OS ^ |
| | (1) | Apple iPhone 3GS | S5PC100 | Cortex-A8 | 1 | 0.6GHz | ARMv7A | VFPv3-D32 + NEON | i6.1 | |
| | (2) | Apple iPod touch 4 | A4 | Cortex-A8 | 1 | 0.8GHz | ARMv7A | VFPv3-D32 + NEON | i6.1 | |
| | (3) | Creative Ziio7 | ZMS-08 | Cortex-A8 | 1 | 1.0GHz | ARMv7A | VFPv3-D32 + NEON | A2.2 | |
| | (4) | Life Touch Note NA70W | Tegra250 | Cortex-A9 | 2 | 1.0GHz | ARMv7A | VFPv3-D16 | A2.2 | |
| | (5) | Apple iPad 3 | A5X | Cortex-A9 | 2 | 1.0GHz | ARMv7A | VFPv3-D32 + NEON | i6.1 | |
| | (6) | ASUS Nexus 7 | Tegra 3 | Cortex-A9 | 4 | 1.2GHz | ARMv7A | VFPv3-D32 + NEON | A4.2 | |
| | (7) | HTC EVO 3D ISW12HT | MSM8660 | Scorpion | 2 | 1.2GHz | ARMv7A | VFPv3-D32 + NEON | A4.0 | |
| | (8) | Apple iPhone 5 | A6 | Swift | 2 | 1.3GHz | ARMv7A | VFPv4-D32 + NEON | i6.1 | |
| | (9) | Apple iPad 4 | A6X | Swift | 2 | 1.4GHz | ARMv7A | VFPv4-D32 + NEON | i6.1 | |
| | (10) | HTC J butterfly HTL21 | APQ8064 | Krait | 4 | 1.5GHz | ARMv7A | VFPv4-D32 + NEON | A4.1 | |
| | (11) | Samsung Nexus 10 | Exynos5D | Cortex-A15 | 2 | 1.7GHz | ARMv7A | VFPv4-D32 + NEON | A4.2 | |
| | (12) | Apple iPhone 5s | A7 | Cyclone | 2 | 1.3GHz | ARMv8A + AArch32 | VFPv4-D32 + NEON | i7.0 | |
| | (13) | Apple iPhone 5s | A7 | Cyclone | 2 | 1.3GHz | ARMv8A + AArch64 | Advanced NEON | i7.0 | |
| | (14) | Amazon Kindle HDX 7 | MSM8974 | Krait 400 | 4 | 2.2GHz | ARMv7A | VFPv4-D32 + NEON | A4.2 | |
| |
| |
| * For details on the test items (a: through z:), see the articles listed below. |
| |
| ===== Explanatory articles ===== |
| |
| * [[http://wlog.flatlib.jp/archive/1/2013-9-24|2013/09/24 Floating-point performance of the iPhone 5s A7 CPU (2) (AArch64/64bit)]] |
| * [[http://wlog.flatlib.jp/archive/1/2013-9-23|2013/09/23 Floating-point performance of the iPhone 5s A7 CPU (32bit)]] |
| * [[http://wlog.flatlib.jp/archive/1/2013-4-8|2013/04/08 Floating-point performance of the Nexus 10 CPU, Cortex-A15]] |
| * [[http://wlog.flatlib.jp/archive/1/2013-1-9|2013/01/09 Speed of the Qualcomm APQ8064 GPU, Adreno 320]] |
| * [[http://wlog.flatlib.jp/archive/1/2012-12-23|2012/12/23 Floating-point capability of Qualcomm APQ8064 Krait and A6 Swift]] |
| * [[http://wlog.flatlib.jp/archive/1/2011-5-2|2011/05/02 The true floating-point capability of Snapdragon]] |
| * [[http://wlog.flatlib.jp/archive/1/2009-10-4|2009/10/04 NEON on the ARM Cortex-A8 and floating-point optimization]] |
| |
| |
| |
| |
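Because the table reports wall-clock seconds and the devices run at different clock speeds, it can help to compare results per clock. The sketch below is not part of the benchmark app. It multiplies each time by the nominal clock listed in the table to get a cycle-proportional figure. The names `FADDS_A`, `clock_normalized`, and `normalize_all` are hypothetical, and the calculation assumes each device actually ran at its listed clock for the whole test, which the table does not confirm (the iPhone 5s clock is itself marked "1.3GHz?").

```python
# Execution times in seconds for test "e:fadds A", copied from the table,
# paired with the nominal clock in GHz listed for each device.
FADDS_A = {
    "iPad3 (Cortex-A9, A5X)": (4.010, 1.0),
    "Nexus7 (Cortex-A9, Tegra3)": (3.343, 1.2),
    "iPhone5 (Swift, A6)": (3.090, 1.3),
    "iPad4 (Swift, A6X)": (2.878, 1.4),
}


def clock_normalized(seconds, ghz):
    """Return seconds * GHz, i.e. elapsed cycles in units of 1e9.

    Assumption: the core ran at the nominal clock for the whole test.
    """
    return seconds * ghz


def normalize_all(results):
    """Map each device name to its clock-normalized time, rounded to 3 places."""
    return {
        name: round(clock_normalized(seconds, ghz), 3)
        for name, (seconds, ghz) in results.items()
    }


if __name__ == "__main__":
    for name, gcycles in normalize_all(FADDS_A).items():
        print(f"{name}: {gcycles:.3f} Gcycles")
```

For these four devices the normalized figures all land close to 4.0, which suggests that on this particular test the differences in seconds mostly reflect clock speed rather than per-cycle throughput. Other rows, such as the NEON tests on Cortex-A15, do not normalize this evenly.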