opengl:vfpbenchlog
Differences
This page shows the differences between two versions of the page.
opengl:vfpbenchlog [2019/01/05 18:03] – [ARM Cortex-A72 (ARMv8A AArch64 arm64) FPU+NEON] oga | opengl:vfpbenchlog [2020/08/15 13:23] – [Results] oga | ||
---|---|---|---|
Line 7: | Line 7: | ||
~~NOTOC~~ | ~~NOTOC~~ | ||
+ | |||
===== Results ===== | ===== Results ===== | ||
+ | ^ Device | ||
+ | ^ ::: ^ ::: ^ ::: ^ Half-p | ||
+ | | [[https:// | ||
+ | | [[https:// | ||
+ | | [[https:// | ||
+ | | [[https:// | ||
+ | | [[https:// | ||
+ | | [[https:// | ||
+ | | [[https:// | ||
+ | | [[https:// | ||
+ | | [[https:// | ||
+ | | [[https:// | ||
+ | | [[https:// | ||
+ | | [[https:// | ||
+ | | [[https:// | ||
+ | | [[https:// | ||
+ | | [[https:// | ||
+ | | [[https:// | ||
+ | | [[https:// | ||
+ | | [[https:// | ||
+ | | [[https:// | ||
+ | | [[https:// | ||
+ | | [[https:// | ||
+ | | [[https:// | ||
+ | | [[https:// | ||
+ | | [[https:// | ||
+ | | [[https:// | ||
+ | | [[https:// | ||
+ | | [[https:// | ||
+ | | [[https:// | ||
+ | | [[https:// | ||
+ | |||
+ | |||
+ | * Half-p, Single-p, and Double-p values are in GFLOPS | ||
+ | |||
+ | |||
+ | |||
+ | ==== Old List ==== | ||
^ Device | ^ Device | ||
- | | PC AMD Ryzen 7 1800X | Win10 | AMD Ryzen 7 1800X | + | | PC AMD Ryzen 9 3950X | Win10 | AMD Ryzen 9 3950X |
- | | PC Intel Core i7-6700K | + | | PC Intel Core i7-6700K |
- | | PC Intel Core i7-4790K | + | | PC Intel Core i7-4790K |
+ | | PC AMD Ryzen 7 1800X | Win10 | AMD Ryzen 7 1800X | Zen | x64 | SSE4.2/ | ||
| Apple Mac mini Late 2012 | OSX.10 | | Apple Mac mini Late 2012 | OSX.10 | ||
| Apple MacBook Pro Late 2011 | OSX.10 | | Apple MacBook Pro Late 2011 | OSX.10 | ||
+ | | Google Pixel 3 | A10 | Snapdragon 845 | Kryo 385(A75/55) | ARMv8.2A | AArch64 | ||
+ | | Essential Phone PH-1 | A10 | Snapdragon 835 | Kryo (A73/53) | ARMv8A | AArch64 | ||
+ | | Amazon Fire HD 10 2019 | A9.0 | Mediatek MT8183 | ||
+ | | PC AMD A10-7870K | ||
| Apple MacBook Pro Late 2013 | OSX.10 | | Apple MacBook Pro Late 2013 | OSX.10 | ||
| iPhone SE | iOS9.3 | | iPhone SE | iOS9.3 | ||
+ | | Chromebook Flip C101PA | ||
| NVIDIA SHIELD Tablet | | NVIDIA SHIELD Tablet | ||
| Apple iPad A8X | i8.0 | Apple A8X | Typhoon | | Apple iPad A8X | i8.0 | Apple A8X | Typhoon | ||
Line 28: | Line 73: | ||
| NVIDIA Tegra Note 7 | A4.4 | NVIDIA Tegra 4 | Cortex-A15 | | NVIDIA Tegra Note 7 | A4.4 | NVIDIA Tegra 4 | Cortex-A15 | ||
| PC Intel N3150 Braswell | | PC Intel N3150 Braswell | ||
+ | | Raspberry Pi 4 | Ubuntu | ||
| ASUS Nexus 7 2013 | A4.4 | Qualcomm S4 APQ8064 | | ASUS Nexus 7 2013 | A4.4 | Qualcomm S4 APQ8064 | ||
| HTC J butterfly HTL21 | A4.1 | Qualcomm S4 APQ8064 | | HTC J butterfly HTL21 | A4.1 | Qualcomm S4 APQ8064 | ||
+ | | NVIDIA Jetson nano | Ubuntu | ||
| Apple TV (2015) | | Apple TV (2015) | ||
| Apple iPhone 5s | i8.0 | Apple A7 | Cyclone | | Apple iPhone 5s | i8.0 | Apple A7 | Cyclone | ||
Line 7590: | Line 7637: | ||
---- | ---- | ||
- | ---- | + | |
- | ---- | + | |
+ | |||
===== Mobile CPU 64bit ===== | ===== Mobile CPU 64bit ===== | ||
Line 8793: | Line 8842: | ||
++++ | ++++ | ||
+ | |||
+ | |||
+ | |||
+ | |||
+ | |||
+ | ==== Qualcomm Kryo (ARMv8A AArch64 arm64) FPU+NEON ==== | ||
+ | |||
+ | |||
+ | |||
+ | |||
+ | ++++ZenFone AR Snapdragon 821 big core Kryo 2.34GHz x2 ARM64 (AArch64) Android 7.0| | ||
+ | |||
+ | < | ||
+ | ARCH: ARMv8A 3 | ||
+ | FPU: AArch64 NEON | ||
+ | SingleT SP max: 18.681 GFLOPS | ||
+ | SingleT DP max: 9.263 GFLOPS | ||
+ | MultiT | ||
+ | MultiT | ||
+ | CPU core: 2 | ||
+ | FPHP : no | ||
+ | SIMDHP: no | ||
+ | |||
+ | * FPU/NEON (single fp) | ||
+ | TIME(s) | ||
+ | FPU fmul (32bit x1) n8 : 0.284 | ||
+ | FPU fadd (32bit x1) n8 : 0.141 | ||
+ | FPU fmadd (32bit x1) n8 : | ||
+ | NEON fmul.2s (32bit x2) n8 : 0.257 | ||
+ | NEON fadd.2s (32bit x2) n8 : 0.141 16984.7 | ||
+ | NEON fmla.2s (32bit x2) n8 : 0.257 18681.0 | ||
+ | NEON fmul.4s (32bit x4) n8 : 0.514 | ||
+ | NEON fadd.4s (32bit x4) n8 : 0.275 17435.6 | ||
+ | NEON fmla.4s (32bit x4) n8 : 0.520 18448.6 | ||
+ | FPU fmul (32bit x1) ns4 : | ||
+ | FPU fadd (32bit x1) ns4 : | ||
+ | FPU fmadd (32bit x1) ns4 : 0.514 | ||
+ | NEON fmul.2s (32bit x2) ns4 : | ||
+ | NEON fadd.2s (32bit x2) ns4 : | ||
+ | NEON fmla.2s (32bit x2) ns4 : | ||
+ | NEON fmul.4s (32bit x4) ns4 : | ||
+ | NEON fadd.4s (32bit x4) ns4 : | ||
+ | NEON fmla.4s (32bit x4) ns4 : | ||
+ | FPU fmul (32bit x1) n1 : 0.257 | ||
+ | FPU fadd (32bit x1) n1 : 0.141 | ||
+ | FPU fmadd (32bit x1) n1 : | ||
+ | NEON fmul.2s (32bit x2) n1 : 0.257 | ||
+ | NEON fadd.2s (32bit x2) n1 : 0.141 16986.1 | ||
+ | NEON fmla.2s (32bit x2) n1 : 2.056 | ||
+ | NEON fmul.4s (32bit x4) n1 : 0.514 | ||
+ | NEON fadd.4s (32bit x4) n1 : 0.275 17457.1 | ||
+ | NEON fmla.4s (32bit x4) n1 : 2.056 | ||
+ | NEON fmul.4s (32bit x4) n12 : | ||
+ | NEON fadd.4s (32bit x4) n12 : | ||
+ | NEON fmla.4s (32bit x4) n12 : | ||
+ | Average | ||
+ | Highest | ||
+ | |||
+ | |||
+ | * FPU/NEON (double fp) | ||
+ | TIME(s) | ||
+ | FPU fmul (64bit x1) n8 : 0.278 | ||
+ | FPU fadd (64bit x1) n8 : 0.154 | ||
+ | FPU fmadd (64bit x1) n8 : | ||
+ | NEON fmul.2d (64bit x2) n8 : 0.514 | ||
+ | NEON fadd.2d (64bit x2) n8 : 0.275 | ||
+ | NEON fmla.2d (64bit x2) n8 : 0.520 | ||
+ | FPU fmul (64bit x1) ns4 : | ||
+ | FPU fadd (64bit x1) ns4 : | ||
+ | FPU fmadd (64bit x1) ns4 : 0.514 | ||
+ | NEON fmul.2d (64bit x2) ns4 : | ||
+ | NEON fadd.2d (64bit x2) ns4 : | ||
+ | NEON fmla.2d (64bit x2) ns4 : | ||
+ | FPU fmul (64bit x1) n1 : 0.257 | ||
+ | FPU fadd (64bit x1) n1 : 0.154 | ||
+ | FPU fmadd (64bit x1) n1 : | ||
+ | NEON fmul.2d (64bit x2) n1 : 0.514 | ||
+ | NEON fadd.2d (64bit x2) n1 : 0.275 | ||
+ | NEON fmla.2d (64bit x2) n1 : 2.056 | ||
+ | NEON fmul.2d (64bit x2) n12 : | ||
+ | NEON fadd.2d (64bit x2) n12 : | ||
+ | NEON fmla.2d (64bit x2) n12 : | ||
+ | Average | ||
+ | Highest | ||
+ | |||
+ | |||
+ | * Matrix 4x4 | ||
+ | TIME(s) | ||
+ | C++ code : 0.192 | ||
+ | NEON fmla.4s 128bit A : | ||
+ | NEON fmla.4s 128bit B : | ||
+ | Average | ||
+ | Highest | ||
+ | |||
+ | |||
+ | * FPU/NEON (single fp) multi-thread | ||
+ | TIME(s) | ||
+ | FPU fmul (32bit x1) n8 : 0.289 | ||
+ | FPU fadd (32bit x1) n8 : 0.141 17003.7 | ||
+ | FPU fmadd (32bit x1) n8 : | ||
+ | NEON fmul.2s (32bit x2) n8 : 0.257 18702.1 | ||
+ | NEON fadd.2s (32bit x2) n8 : 0.141 34006.9 | ||
+ | NEON fmla.2s (32bit x2) n8 : 0.257 37406.9 | ||
+ | NEON fmul.4s (32bit x4) n8 : 0.514 18694.8 | ||
+ | NEON fadd.4s (32bit x4) n8 : 0.275 34934.6 | ||
+ | NEON fmla.4s (32bit x4) n8 : 0.520 36945.7 | ||
+ | FPU fmul (32bit x1) ns4 : | ||
+ | FPU fadd (32bit x1) ns4 : | ||
+ | FPU fmadd (32bit x1) ns4 : 0.513 | ||
+ | NEON fmul.2s (32bit x2) ns4 : | ||
+ | NEON fadd.2s (32bit x2) ns4 : | ||
+ | NEON fmla.2s (32bit x2) ns4 : | ||
+ | NEON fmul.4s (32bit x4) ns4 : | ||
+ | NEON fadd.4s (32bit x4) ns4 : | ||
+ | NEON fmla.4s (32bit x4) ns4 : | ||
+ | FPU fmul (32bit x1) n1 : 0.257 | ||
+ | FPU fadd (32bit x1) n1 : 0.141 17001.5 | ||
+ | FPU fmadd (32bit x1) n1 : | ||
+ | NEON fmul.2s (32bit x2) n1 : 0.257 18704.8 | ||
+ | NEON fadd.2s (32bit x2) n1 : 0.141 34007.3 | ||
+ | NEON fmla.2s (32bit x2) n1 : 2.053 | ||
+ | NEON fmul.4s (32bit x4) n1 : 0.513 18695.7 | ||
+ | NEON fadd.4s (32bit x4) n1 : 0.275 34960.7 | ||
+ | NEON fmla.4s (32bit x4) n1 : 2.053 | ||
+ | NEON fmul.4s (32bit x4) n12 : | ||
+ | NEON fadd.4s (32bit x4) n12 : | ||
+ | NEON fmla.4s (32bit x4) n12 : | ||
+ | Average | ||
+ | Highest | ||
+ | |||
+ | |||
+ | * FPU/NEON (double fp) multi-thread | ||
+ | TIME(s) | ||
+ | FPU fmul (64bit x1) n8 : 0.285 | ||
+ | FPU fadd (64bit x1) n8 : 0.154 15589.2 | ||
+ | FPU fmadd (64bit x1) n8 : | ||
+ | NEON fmul.2d (64bit x2) n8 : 0.513 | ||
+ | NEON fadd.2d (64bit x2) n8 : 0.275 17474.9 | ||
+ | NEON fmla.2d (64bit x2) n8 : 0.520 18473.3 | ||
+ | FPU fmul (64bit x1) ns4 : | ||
+ | FPU fadd (64bit x1) ns4 : | ||
+ | FPU fmadd (64bit x1) ns4 : 0.513 | ||
+ | NEON fmul.2d (64bit x2) ns4 : | ||
+ | NEON fadd.2d (64bit x2) ns4 : | ||
+ | NEON fmla.2d (64bit x2) ns4 : | ||
+ | FPU fmul (64bit x1) n1 : 0.257 | ||
+ | FPU fadd (64bit x1) n1 : 0.154 15584.4 | ||
+ | FPU fmadd (64bit x1) n1 : | ||
+ | NEON fmul.2d (64bit x2) n1 : 0.513 | ||
+ | NEON fadd.2d (64bit x2) n1 : 0.275 17467.3 | ||
+ | NEON fmla.2d (64bit x2) n1 : 2.056 | ||
+ | NEON fmul.2d (64bit x2) n12 : | ||
+ | NEON fadd.2d (64bit x2) n12 : | ||
+ | NEON fmla.2d (64bit x2) n12 : | ||
+ | Average | ||
+ | Highest | ||
+ | |||
+ | |||
+ | * Matrix 4x4 multi-thread | ||
+ | TIME(s) | ||
+ | C++ code : 0.202 17703.0 | ||
+ | NEON fmla.4s 128bit A : | ||
+ | NEON fmla.4s 128bit B : | ||
+ | Average | ||
+ | Highest | ||
+ | |||
+ | |||
+ | cpu0 2188800 307200 | ||
+ | cpu1 2188800 307200 | ||
+ | cpu2 2342400 307200 | ||
+ | cpu3 2342400 307200 | ||
+ | |||
+ | Processor : AArch64 Processor rev 1 (aarch64) | ||
+ | processor : 0 | ||
+ | BogoMIPS : 38.40 | ||
+ | Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 | ||
+ | CPU implementer : | ||
+ | CPU architecture: | ||
+ | CPU variant : 0x2 | ||
+ | CPU part : 0x201 | ||
+ | CPU revision : 1 | ||
+ | |||
+ | processor : 1 | ||
+ | BogoMIPS : 38.40 | ||
+ | Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 | ||
+ | CPU implementer : | ||
+ | CPU architecture: | ||
+ | CPU variant : 0x2 | ||
+ | CPU part : 0x201 | ||
+ | CPU revision : 1 | ||
+ | |||
+ | processor : 2 | ||
+ | BogoMIPS : 38.40 | ||
+ | Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 | ||
+ | CPU implementer : | ||
+ | CPU architecture: | ||
+ | CPU variant : 0x2 | ||
+ | CPU part : 0x205 | ||
+ | CPU revision : 1 | ||
+ | |||
+ | processor : 3 | ||
+ | BogoMIPS : 38.40 | ||
+ | Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 | ||
+ | CPU implementer : | ||
+ | CPU architecture: | ||
+ | CPU variant : 0x2 | ||
+ | CPU part : 0x205 | ||
+ | CPU revision : 1 | ||
+ | |||
+ | Hardware : Qualcomm Technologies, | ||
+ | |||
+ | Qualcomm Technologies, | ||
+ | |||
+ | 2019/01/05 16: | ||
+ | </ | ||
+ | |||
+ | ++++ | ||
+ | |||
+ | ++++ZenFone AR Snapdragon 821 little core Kryo 2.18GHz x2 ARM64 (AArch64) Android 7.0| | ||
+ | |||
+ | < | ||
+ | ARCH: ARMv8A 3 | ||
+ | FPU: AArch64 NEON | ||
+ | SingleT SP max: 12.599 GFLOPS | ||
+ | SingleT DP max: 6.259 GFLOPS | ||
+ | MultiT | ||
+ | MultiT | ||
+ | CPU core: 2 | ||
+ | FPHP : no | ||
+ | SIMDHP: no | ||
+ | |||
+ | * FPU/NEON (single fp) | ||
+ | TIME(s) | ||
+ | FPU fmul (32bit x1) n8 : 0.407 | ||
+ | FPU fadd (32bit x1) n8 : 0.209 | ||
+ | FPU fmadd (32bit x1) n8 : | ||
+ | NEON fmul.2s (32bit x2) n8 : 0.380 | ||
+ | NEON fadd.2s (32bit x2) n8 : 0.210 11446.6 | ||
+ | NEON fmla.2s (32bit x2) n8 : 0.381 12598.8 | ||
+ | NEON fmul.4s (32bit x4) n8 : 0.765 | ||
+ | NEON fadd.4s (32bit x4) n8 : 0.409 11736.7 | ||
+ | NEON fmla.4s (32bit x4) n8 : 0.771 12458.3 | ||
+ | FPU fmul (32bit x1) ns4 : | ||
+ | FPU fadd (32bit x1) ns4 : | ||
+ | FPU fmadd (32bit x1) ns4 : 0.761 | ||
+ | NEON fmul.2s (32bit x2) ns4 : | ||
+ | NEON fadd.2s (32bit x2) ns4 : | ||
+ | NEON fmla.2s (32bit x2) ns4 : | ||
+ | NEON fmul.4s (32bit x4) ns4 : | ||
+ | NEON fadd.4s (32bit x4) ns4 : | ||
+ | NEON fmla.4s (32bit x4) ns4 : | ||
+ | FPU fmul (32bit x1) n1 : 0.381 | ||
+ | FPU fadd (32bit x1) n1 : 0.209 | ||
+ | FPU fmadd (32bit x1) n1 : | ||
+ | NEON fmul.2s (32bit x2) n1 : 0.381 | ||
+ | NEON fadd.2s (32bit x2) n1 : 0.210 11438.4 | ||
+ | NEON fmla.2s (32bit x2) n1 : 3.046 | ||
+ | NEON fmul.4s (32bit x4) n1 : 0.761 | ||
+ | NEON fadd.4s (32bit x4) n1 : 0.408 11771.3 | ||
+ | NEON fmla.4s (32bit x4) n1 : 3.046 | ||
+ | NEON fmul.4s (32bit x4) n12 : | ||
+ | NEON fadd.4s (32bit x4) n12 : | ||
+ | NEON fmla.4s (32bit x4) n12 : | ||
+ | Average | ||
+ | Highest | ||
+ | |||
+ | |||
+ | * FPU/NEON (double fp) | ||
+ | TIME(s) | ||
+ | FPU fmul (64bit x1) n8 : 0.402 | ||
+ | FPU fadd (64bit x1) n8 : 0.230 | ||
+ | FPU fmadd (64bit x1) n8 : | ||
+ | NEON fmul.2d (64bit x2) n8 : 0.761 | ||
+ | NEON fadd.2d (64bit x2) n8 : 0.407 | ||
+ | NEON fmla.2d (64bit x2) n8 : 0.771 | ||
+ | FPU fmul (64bit x1) ns4 : | ||
+ | FPU fadd (64bit x1) ns4 : | ||
+ | FPU fmadd (64bit x1) ns4 : 0.762 | ||
+ | NEON fmul.2d (64bit x2) ns4 : | ||
+ | NEON fadd.2d (64bit x2) ns4 : | ||
+ | NEON fmla.2d (64bit x2) ns4 : | ||
+ | FPU fmul (64bit x1) n1 : 0.383 | ||
+ | FPU fadd (64bit x1) n1 : 0.232 | ||
+ | FPU fmadd (64bit x1) n1 : | ||
+ | NEON fmul.2d (64bit x2) n1 : 0.762 | ||
+ | NEON fadd.2d (64bit x2) n1 : 0.407 | ||
+ | NEON fmla.2d (64bit x2) n1 : 3.168 | ||
+ | NEON fmul.2d (64bit x2) n12 : | ||
+ | NEON fadd.2d (64bit x2) n12 : | ||
+ | NEON fmla.2d (64bit x2) n12 : | ||
+ | Average | ||
+ | Highest | ||
+ | |||
+ | |||
+ | * Matrix 4x4 | ||
+ | TIME(s) | ||
+ | C++ code : 0.277 | ||
+ | NEON fmla.4s 128bit A : | ||
+ | NEON fmla.4s 128bit B : | ||
+ | Average | ||
+ | Highest | ||
+ | |||
+ | |||
+ | * FPU/NEON (single fp) multi-thread | ||
+ | TIME(s) | ||
+ | FPU fmul (32bit x1) n8 : 0.396 | ||
+ | FPU fadd (32bit x1) n8 : 0.214 11231.7 | ||
+ | FPU fmadd (32bit x1) n8 : | ||
+ | NEON fmul.2s (32bit x2) n8 : 0.380 12635.4 | ||
+ | NEON fadd.2s (32bit x2) n8 : 0.209 22995.6 | ||
+ | NEON fmla.2s (32bit x2) n8 : 0.379 25303.8 | ||
+ | NEON fmul.4s (32bit x4) n8 : 0.761 12608.9 | ||
+ | NEON fadd.4s (32bit x4) n8 : 0.409 23474.8 | ||
+ | NEON fmla.4s (32bit x4) n8 : 0.779 24650.8 | ||
+ | FPU fmul (32bit x1) ns4 : | ||
+ | FPU fadd (32bit x1) ns4 : | ||
+ | FPU fmadd (32bit x1) ns4 : 0.763 | ||
+ | NEON fmul.2s (32bit x2) ns4 : | ||
+ | NEON fadd.2s (32bit x2) ns4 : | ||
+ | NEON fmla.2s (32bit x2) ns4 : | ||
+ | NEON fmul.4s (32bit x4) ns4 : | ||
+ | NEON fadd.4s (32bit x4) ns4 : | ||
+ | NEON fmla.4s (32bit x4) ns4 : | ||
+ | FPU fmul (32bit x1) n1 : 0.379 | ||
+ | FPU fadd (32bit x1) n1 : 0.211 11398.3 | ||
+ | FPU fmadd (32bit x1) n1 : | ||
+ | NEON fmul.2s (32bit x2) n1 : 0.379 12652.5 | ||
+ | NEON fadd.2s (32bit x2) n1 : 0.209 23004.2 | ||
+ | NEON fmla.2s (32bit x2) n1 : 3.044 | ||
+ | NEON fmul.4s (32bit x4) n1 : 0.757 12680.9 | ||
+ | NEON fadd.4s (32bit x4) n1 : 0.407 23604.5 | ||
+ | NEON fmla.4s (32bit x4) n1 : 3.043 | ||
+ | NEON fmul.4s (32bit x4) n12 : | ||
+ | NEON fadd.4s (32bit x4) n12 : | ||
+ | NEON fmla.4s (32bit x4) n12 : | ||
+ | Average | ||
+ | Highest | ||
+ | |||
+ | |||
+ | * FPU/NEON (double fp) multi-thread | ||
+ | TIME(s) | ||
+ | FPU fmul (64bit x1) n8 : 0.414 | ||
+ | FPU fadd (64bit x1) n8 : 0.230 10431.7 | ||
+ | FPU fmadd (64bit x1) n8 : | ||
+ | NEON fmul.2d (64bit x2) n8 : 0.759 | ||
+ | NEON fadd.2d (64bit x2) n8 : 0.407 11797.4 | ||
+ | NEON fmla.2d (64bit x2) n8 : 0.769 12475.7 | ||
+ | FPU fmul (64bit x1) ns4 : | ||
+ | FPU fadd (64bit x1) ns4 : | ||
+ | FPU fmadd (64bit x1) ns4 : 0.764 | ||
+ | NEON fmul.2d (64bit x2) ns4 : | ||
+ | NEON fadd.2d (64bit x2) ns4 : | ||
+ | NEON fmla.2d (64bit x2) ns4 : | ||
+ | FPU fmul (64bit x1) n1 : 0.380 | ||
+ | FPU fadd (64bit x1) n1 : 0.229 10484.1 | ||
+ | FPU fmadd (64bit x1) n1 : | ||
+ | NEON fmul.2d (64bit x2) n1 : 0.764 | ||
+ | NEON fadd.2d (64bit x2) n1 : 0.407 11799.8 | ||
+ | NEON fmla.2d (64bit x2) n1 : 3.050 | ||
+ | NEON fmul.2d (64bit x2) n12 : | ||
+ | NEON fadd.2d (64bit x2) n12 : | ||
+ | NEON fmla.2d (64bit x2) n12 : | ||
+ | Average | ||
+ | Highest | ||
+ | |||
+ | |||
+ | * Matrix 4x4 multi-thread | ||
+ | TIME(s) | ||
+ | C++ code : 0.292 12273.5 | ||
+ | NEON fmla.4s 128bit A : | ||
+ | NEON fmla.4s 128bit B : | ||
+ | Average | ||
+ | Highest | ||
+ | |||
+ | |||
+ | cpu0 2188800 307200 | ||
+ | cpu1 2188800 307200 | ||
+ | cpu2 2342400 307200 | ||
+ | cpu3 2342400 307200 | ||
+ | |||
+ | Processor : AArch64 Processor rev 1 (aarch64) | ||
+ | processor : 0 | ||
+ | BogoMIPS : 38.40 | ||
+ | Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 | ||
+ | CPU implementer : | ||
+ | CPU architecture: | ||
+ | CPU variant : 0x2 | ||
+ | CPU part : 0x201 | ||
+ | CPU revision : 1 | ||
+ | |||
+ | processor : 1 | ||
+ | BogoMIPS : 38.40 | ||
+ | Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 | ||
+ | CPU implementer : | ||
+ | CPU architecture: | ||
+ | CPU variant : 0x2 | ||
+ | CPU part : 0x201 | ||
+ | CPU revision : 1 | ||
+ | |||
+ | processor : 2 | ||
+ | BogoMIPS : 38.40 | ||
+ | Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 | ||
+ | CPU implementer : | ||
+ | CPU architecture: | ||
+ | CPU variant : 0x2 | ||
+ | CPU part : 0x205 | ||
+ | CPU revision : 1 | ||
+ | |||
+ | processor : 3 | ||
+ | BogoMIPS : 38.40 | ||
+ | Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 | ||
+ | CPU implementer : | ||
+ | CPU architecture: | ||
+ | CPU variant : 0x2 | ||
+ | CPU part : 0x205 | ||
+ | CPU revision : 1 | ||
+ | |||
+ | Hardware : Qualcomm Technologies, | ||
+ | |||
+ | Qualcomm Technologies, | ||
+ | |||
+ | 2019/01/05 16: | ||
+ | </ | ||
+ | |||
+ | ++++ | ||
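As a rough sanity check on the two ZenFone AR logs above, the measured fmla rates can be divided by the listed core clocks to estimate floating-point operations per cycle. The following is a minimal Python sketch and not part of vfpbench: the function name `flops_per_cycle` is hypothetical, the per-row rate column is assumed to be in MFLOPS (consistent with 18681.0 appearing against the 18.681 GFLOPS summary), and the `cpu0`..`cpu3` values are assumed to be maximum clocks in kHz, with `cpu2`/`cpu3` being the big cores.

```python
def flops_per_cycle(mflops: float, clock_khz: int) -> float:
    """Estimate floating-point operations per clock cycle.

    mflops    : measured rate from a log row (assumed to be MFLOPS)
    clock_khz : core clock from the cpuN lines (assumed to be kHz)
    """
    clock_mhz = clock_khz / 1000.0
    return mflops / clock_mhz


# Single-thread "NEON fmla.4s (32bit x4) n8" rates copied from the two logs above.
big = flops_per_cycle(18448.6, 2342400)     # big core, cpu2/cpu3 clock
little = flops_per_cycle(12458.3, 2188800)  # little core, cpu0/cpu1 clock

print(f"big core:    {big:.2f} FLOPs/cycle")
print(f"little core: {little:.2f} FLOPs/cycle")
```

A value near 8 would correspond to one 4-lane fused multiply-add per cycle (4 lanes x 2 operations). A noticeably lower value may mean the core was not running at its listed maximum clock during the run, which the log alone cannot confirm.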
+ | |||
+ | |||
+ | |||
+ | ==== Qualcomm Kryo 280 (Cortex-A73 + A53) (ARMv8A AArch64 arm64) FPU+ASIMD ==== | ||
+ | |||
+ | |||
+ | ++++Essential Phone PH-1 Snapdragon 835 Kryo 280 2.45GHz x4 + 1.9GHz x4 ARM64 (AArch64) Android 9.0| | ||
+ | |||
+ | < | ||
+ | Date: 20200810 123729 | ||
+ | ARCH: ARMv8A AArch64 | ||
+ | FPU : ASIMD(AArch64 NEON) | ||
+ | Name: Qualcomm Technologies, | ||
+ | |||
+ | CPU Thread: | ||
+ | CPU Core : 8 | ||
+ | CPU Group : 2 | ||
+ | Group 0: Thread= 4 Clock=1.900800 GHz (mask:f) | ||
+ | Group 1: Thread= 4 Clock=2.457600 GHz (mask:f0) | ||
+ | NEON : yes | ||
+ | FMA : yes | ||
+ | FPHP : no | ||
+ | SIMDHP : no | ||
+ | DotProd: no | ||
+ | |||
+ | Total: | ||
+ | SingleThread HP max: - | ||
+ | SingleThread SP max: | ||
+ | SingleThread DP max: 9.776 GFLOPS | ||
+ | MultiThread | ||
+ | MultiThread | ||
+ | MultiThread | ||
+ | |||
+ | Group 0: Thread=4 | ||
+ | SingleThread HP max: - | ||
+ | SingleThread SP max: | ||
+ | SingleThread DP max: 7.401 GFLOPS | ||
+ | MultiThread | ||
+ | MultiThread | ||
+ | MultiThread | ||
+ | |||
+ | Group 1: Thread=4 | ||
+ | SingleThread HP max: - | ||
+ | SingleThread SP max: | ||
+ | SingleThread DP max: 9.776 GFLOPS | ||
+ | MultiThread | ||
+ | MultiThread | ||
+ | MultiThread | ||
+ | |||
+ | |||
+ | * Group 0: Thread=1 | ||
+ | * FPU/NEON (SP fp) | ||
+ | TIME(s) | ||
+ | FPU fmul (32bit x1) n8 : 0.335 | ||
+ | FPU fadd (32bit x1) n8 : 0.319 | ||
+ | FPU fmadd (32bit x1) n8 : | ||
+ | NEON fmul.2s (32bit x2) n8 : 0.318 | ||
+ | NEON fadd.2s (32bit x2) n8 : 0.318 | ||
+ | NEON fmla.2s (32bit x2) n8 : 0.318 14338.8 | ||
+ | NEON fmul.4s (32bit x4) n8 : 0.622 | ||
+ | NEON fadd.4s (32bit x4) n8 : 0.623 | ||
+ | NEON fmla.4s (32bit x4) n8 : 0.621 14685.5 | ||
+ | FPU fmul (32bit x1) ns4 : | ||
+ | FPU fadd (32bit x1) ns4 : | ||
+ | FPU fmadd (32bit x1) ns4 : 0.607 | ||
+ | NEON fmul.2s (32bit x2) ns4 : | ||
+ | NEON fadd.2s (32bit x2) ns4 : | ||
+ | NEON fmla.2s (32bit x2) ns4 : | ||
+ | NEON fmul.4s (32bit x4) ns4 : | ||
+ | NEON fadd.4s (32bit x4) ns4 : | ||
+ | NEON fmla.4s (32bit x4) ns4 : | ||
+ | FPU fmul (32bit x1) n1 : 0.607 | ||
+ | FPU fadd (32bit x1) n1 : 0.607 | ||
+ | FPU fmadd (32bit x1) n1 : | ||
+ | NEON fmul.2s (32bit x2) n1 : 0.607 | ||
+ | NEON fadd.2s (32bit x2) n1 : 0.607 | ||
+ | NEON fmla.2s (32bit x2) n1 : 2.428 | ||
+ | NEON fmul.4s (32bit x4) n1 : 0.623 | ||
+ | NEON fadd.4s (32bit x4) n1 : 0.623 | ||
+ | NEON fmla.4s (32bit x4) n1 : 2.429 | ||
+ | NEON fmul.4s (32bit x4) n12 : | ||
+ | NEON fadd.4s (32bit x4) n12 : | ||
+ | NEON fmla.4s (32bit x4) n12 : | ||
+ | Average | ||
+ | Highest | ||
+ | |||
+ | |||
+ | * Group 0: Thread=1 | ||
+ | * FPU/NEON (DP fp) | ||
+ | TIME(s) | ||
+ | FPU fmul (64bit x1) n8 : 0.318 | ||
+ | FPU fadd (64bit x1) n8 : 0.333 | ||
+ | FPU fmadd (64bit x1) n8 : | ||
+ | NEON fmul.2d (64bit x2) n8 : 0.622 | ||
+ | NEON fadd.2d (64bit x2) n8 : 0.622 | ||
+ | NEON fmla.2d (64bit x2) n8 : 0.623 | ||
+ | FPU fmul (64bit x1) ns4 : | ||
+ | FPU fadd (64bit x1) ns4 : | ||
+ | FPU fmadd (64bit x1) ns4 : 0.684 | ||
+ | NEON fmul.2d (64bit x2) ns4 : | ||
+ | NEON fadd.2d (64bit x2) ns4 : | ||
+ | NEON fmla.2d (64bit x2) ns4 : | ||
+ | FPU fmul (64bit x1) n1 : 0.606 | ||
+ | FPU fadd (64bit x1) n1 : 0.607 | ||
+ | FPU fmadd (64bit x1) n1 : | ||
+ | NEON fmul.2d (64bit x2) n1 : 0.621 | ||
+ | NEON fadd.2d (64bit x2) n1 : 0.621 | ||
+ | NEON fmla.2d (64bit x2) n1 : 2.425 | ||
+ | NEON fmul.2d (64bit x2) n12 : | ||
+ | NEON fadd.2d (64bit x2) n12 : | ||
+ | NEON fmla.2d (64bit x2) n12 : | ||
+ | Average | ||
+ | Highest | ||
+ | |||
+ | |||
+ | * Group 0: Thread=4 | ||
+ | * FPU/NEON (SP fp) multi-thread | ||
+ | TIME(s) | ||
+ | FPU fmul (32bit x1) n8 : 0.334 13666.0 | ||
+ | FPU fadd (32bit x1) n8 : 0.320 14246.0 | ||
+ | FPU fmadd (32bit x1) n8 : | ||
+ | NEON fmul.2s (32bit x2) n8 : 0.319 28609.4 | ||
+ | NEON fadd.2s (32bit x2) n8 : 0.318 28688.6 | ||
+ | NEON fmla.2s (32bit x2) n8 : 0.318 57306.2 | ||
+ | NEON fmul.4s (32bit x4) n8 : 0.623 29292.0 | ||
+ | NEON fadd.4s (32bit x4) n8 : 0.623 29296.3 | ||
+ | NEON fmla.4s (32bit x4) n8 : 0.622 58721.2 | ||
+ | FPU fmul (32bit x1) ns4 : | ||
+ | FPU fadd (32bit x1) ns4 : | ||
+ | FPU fmadd (32bit x1) ns4 : 0.609 14992.5 | ||
+ | NEON fmul.2s (32bit x2) ns4 : | ||
+ | NEON fadd.2s (32bit x2) ns4 : | ||
+ | NEON fmla.2s (32bit x2) ns4 : | ||
+ | NEON fmul.4s (32bit x4) ns4 : | ||
+ | NEON fadd.4s (32bit x4) ns4 : | ||
+ | NEON fmla.4s (32bit x4) ns4 : | ||
+ | FPU fmul (32bit x1) n1 : 0.609 | ||
+ | FPU fadd (32bit x1) n1 : 0.621 | ||
+ | FPU fmadd (32bit x1) n1 : | ||
+ | NEON fmul.2s (32bit x2) n1 : 0.608 15003.8 | ||
+ | NEON fadd.2s (32bit x2) n1 : 0.607 15024.3 | ||
+ | NEON fmla.2s (32bit x2) n1 : 2.425 | ||
+ | NEON fmul.4s (32bit x4) n1 : 0.621 29364.2 | ||
+ | NEON fadd.4s (32bit x4) n1 : 0.623 29273.6 | ||
+ | NEON fmla.4s (32bit x4) n1 : 2.431 15015.4 | ||
+ | NEON fmul.4s (32bit x4) n12 : | ||
+ | NEON fadd.4s (32bit x4) n12 : | ||
+ | NEON fmla.4s (32bit x4) n12 : | ||
+ | Average | ||
+ | Highest | ||
+ | |||
+ | |||
+ | * Group 0: Thread=4 | ||
+ | * FPU/NEON (DP fp) multi-thread | ||
+ | TIME(s) | ||
+ | FPU fmul (64bit x1) n8 : 0.321 14232.9 | ||
+ | FPU fadd (64bit x1) n8 : 0.333 13683.0 | ||
+ | FPU fmadd (64bit x1) n8 : | ||
+ | NEON fmul.2d (64bit x2) n8 : 0.622 14665.8 | ||
+ | NEON fadd.2d (64bit x2) n8 : 0.622 14673.4 | ||
+ | NEON fmla.2d (64bit x2) n8 : 0.623 29311.8 | ||
+ | FPU fmul (64bit x1) ns4 : | ||
+ | FPU fadd (64bit x1) ns4 : | ||
+ | FPU fmadd (64bit x1) ns4 : 0.685 13321.0 | ||
+ | NEON fmul.2d (64bit x2) ns4 : | ||
+ | NEON fadd.2d (64bit x2) ns4 : | ||
+ | NEON fmla.2d (64bit x2) ns4 : | ||
+ | FPU fmul (64bit x1) n1 : 0.607 | ||
+ | FPU fadd (64bit x1) n1 : 0.608 | ||
+ | FPU fmadd (64bit x1) n1 : | ||
+ | NEON fmul.2d (64bit x2) n1 : 0.625 14589.9 | ||
+ | NEON fadd.2d (64bit x2) n1 : 0.621 14682.3 | ||
+ | NEON fmla.2d (64bit x2) n1 : 2.427 | ||
+ | NEON fmul.2d (64bit x2) n12 : | ||
+ | NEON fadd.2d (64bit x2) n12 : | ||
+ | NEON fmla.2d (64bit x2) n12 : | ||
+ | Average | ||
+ | Highest | ||
+ | |||
+ | |||
+ | * Group 1: Thread=1 | ||
+ | * FPU/NEON (SP fp) | ||
+ | TIME(s) | ||
+ | FPU fmul (32bit x1) n8 : 0.317 | ||
+ | FPU fadd (32bit x1) n8 : 0.317 | ||
+ | FPU fmadd (32bit x1) n8 : | ||
+ | NEON fmul.2s (32bit x2) n8 : 0.318 | ||
+ | NEON fadd.2s (32bit x2) n8 : 0.317 | ||
+ | NEON fmla.2s (32bit x2) n8 : 0.317 18615.9 | ||
+ | NEON fmul.4s (32bit x4) n8 : 0.603 | ||
+ | NEON fadd.4s (32bit x4) n8 : 0.604 | ||
+ | NEON fmla.4s (32bit x4) n8 : 0.604 19545.9 | ||
+ | FPU fmul (32bit x1) ns4 : | ||
+ | FPU fadd (32bit x1) ns4 : | ||
+ | FPU fmadd (32bit x1) ns4 : 0.754 | ||
+ | NEON fmul.2s (32bit x2) ns4 : | ||
+ | NEON fadd.2s (32bit x2) ns4 : | ||
+ | NEON fmla.2s (32bit x2) ns4 : | ||
+ | NEON fmul.4s (32bit x4) ns4 : | ||
+ | NEON fadd.4s (32bit x4) ns4 : | ||
+ | NEON fmla.4s (32bit x4) ns4 : | ||
+ | FPU fmul (32bit x1) n1 : 0.317 | ||
+ | FPU fadd (32bit x1) n1 : 0.317 | ||
+ | FPU fmadd (32bit x1) n1 : | ||
+ | NEON fmul.2s (32bit x2) n1 : 0.317 | ||
+ | NEON fadd.2s (32bit x2) n1 : 0.317 | ||
+ | NEON fmla.2s (32bit x2) n1 : 1.810 | ||
+ | NEON fmul.4s (32bit x4) n1 : 0.604 | ||
+ | NEON fadd.4s (32bit x4) n1 : 0.604 | ||
+ | NEON fmla.4s (32bit x4) n1 : 1.811 | ||
+ | NEON fmul.4s (32bit x4) n12 : | ||
+ | NEON fadd.4s (32bit x4) n12 : | ||
+ | NEON fmla.4s (32bit x4) n12 : | ||
+ | Average | ||
+ | Highest | ||
+ | |||
+ | |||
+ | * Group 1: Thread=1 | ||
+ | * FPU/NEON (DP fp) | ||
+ | TIME(s) | ||
+ | FPU fmul (64bit x1) n8 : 0.317 | ||
+ | FPU fadd (64bit x1) n8 : 0.317 | ||
+ | FPU fmadd (64bit x1) n8 : | ||
+ | NEON fmul.2d (64bit x2) n8 : 0.603 | ||
+ | NEON fadd.2d (64bit x2) n8 : 0.603 | ||
+ | NEON fmla.2d (64bit x2) n8 : 0.603 | ||
+ | FPU fmul (64bit x1) ns4 : | ||
+ | FPU fadd (64bit x1) ns4 : | ||
+ | FPU fmadd (64bit x1) ns4 : 0.531 | ||
+ | NEON fmul.2d (64bit x2) ns4 : | ||
+ | NEON fadd.2d (64bit x2) ns4 : | ||
+ | NEON fmla.2d (64bit x2) ns4 : | ||
+ | FPU fmul (64bit x1) n1 : 0.317 | ||
+ | FPU fadd (64bit x1) n1 : 0.317 | ||
+ | FPU fmadd (64bit x1) n1 : | ||
+ | NEON fmul.2d (64bit x2) n1 : 0.603 | ||
+ | NEON fadd.2d (64bit x2) n1 : 0.603 | ||
+ | NEON fmla.2d (64bit x2) n1 : 1.810 | ||
+ | NEON fmul.2d (64bit x2) n12 : | ||
+ | NEON fadd.2d (64bit x2) n12 : | ||
+ | NEON fmla.2d (64bit x2) n12 : | ||
+ | Average | ||
+ | Highest | ||
+ | |||
+ | |||
+ | * Group 1: Thread=4 | ||
+ | * FPU/NEON (SP fp) multi-thread | ||
+ | TIME(s) | ||
+ | FPU fmul (32bit x1) n8 : 0.335 17600.7 | ||
+ | FPU fadd (32bit x1) n8 : 0.329 17915.7 | ||
+ | FPU fmadd (32bit x1) n8 : | ||
+ | NEON fmul.2s (32bit x2) n8 : 0.329 35832.1 | ||
+ | NEON fadd.2s (32bit x2) n8 : 0.329 35831.5 | ||
+ | NEON fmla.2s (32bit x2) n8 : 0.329 71648.0 | ||
+ | NEON fmul.4s (32bit x4) n8 : 0.627 37622.6 | ||
+ | NEON fadd.4s (32bit x4) n8 : 0.627 37624.2 | ||
+ | NEON fmla.4s (32bit x4) n8 : 0.627 75249.0 | ||
+ | FPU fmul (32bit x1) ns4 : | ||
+ | FPU fadd (32bit x1) ns4 : | ||
+ | FPU fmadd (32bit x1) ns4 : 0.784 15047.0 | ||
+ | NEON fmul.2s (32bit x2) ns4 : | ||
+ | NEON fadd.2s (32bit x2) ns4 : | ||
+ | NEON fmla.2s (32bit x2) ns4 : | ||
+ | NEON fmul.4s (32bit x4) ns4 : | ||
+ | NEON fadd.4s (32bit x4) ns4 : | ||
+ | NEON fmla.4s (32bit x4) ns4 : | ||
+ | FPU fmul (32bit x1) n1 : 0.329 17914.3 | ||
+ | FPU fadd (32bit x1) n1 : 0.329 17914.5 | ||
+ | FPU fmadd (32bit x1) n1 : | ||
+ | NEON fmul.2s (32bit x2) n1 : 0.329 35829.5 | ||
+ | NEON fadd.2s (32bit x2) n1 : 0.329 35830.3 | ||
+ | NEON fmla.2s (32bit x2) n1 : 1.881 12541.3 | ||
+ | NEON fmul.4s (32bit x4) n1 : 0.627 37625.7 | ||
+ | NEON fadd.4s (32bit x4) n1 : 0.627 37623.9 | ||
+ | NEON fmla.4s (32bit x4) n1 : 1.881 25082.9 | ||
+ | NEON fmul.4s (32bit x4) n12 : | ||
+ | NEON fadd.4s (32bit x4) n12 : | ||
+ | NEON fmla.4s (32bit x4) n12 : | ||
+ | Average | ||
+ | Highest | ||
+ | |||
+ | |||
+ | * Group 1: Thread=4 | ||
+ | * FPU/NEON (DP fp) multi-thread | ||
+ | TIME(s) | ||
+ | FPU fmul (64bit x1) n8 : 0.329 17914.1 | ||
+ | FPU fadd (64bit x1) n8 : 0.329 17915.2 | ||
+ | FPU fmadd (64bit x1) n8 : | ||
+ | NEON fmul.2d (64bit x2) n8 : 0.627 18810.7 | ||
+ | NEON fadd.2d (64bit x2) n8 : 0.627 18813.0 | ||
+ | NEON fmla.2d (64bit x2) n8 : 0.627 37620.8 | ||
+ | FPU fmul (64bit x1) ns4 : | ||
+ | FPU fadd (64bit x1) ns4 : | ||
+ | FPU fmadd (64bit x1) ns4 : 0.554 21294.7 | ||
+ | NEON fmul.2d (64bit x2) ns4 : | ||
+ | NEON fadd.2d (64bit x2) ns4 : | ||
+ | NEON fmla.2d (64bit x2) ns4 : | ||
+ | FPU fmul (64bit x1) n1 : 0.329 17915.9 | ||
+ | FPU fadd (64bit x1) n1 : 0.329 17916.5 | ||
+ | FPU fmadd (64bit x1) n1 : | ||
+ | NEON fmul.2d (64bit x2) n1 : 0.627 18812.4 | ||
+ | NEON fadd.2d (64bit x2) n1 : 0.627 18813.0 | ||
+ | NEON fmla.2d (64bit x2) n1 : 1.881 12541.2 | ||
+ | NEON fmul.2d (64bit x2) n12 : | ||
+ | NEON fadd.2d (64bit x2) n12 : | ||
+ | NEON fmla.2d (64bit x2) n12 : | ||
+ | Average | ||
+ | Highest | ||
+ | |||
+ | |||
+ | </ | ||
+ | |||
+ | ++++ | ||
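The Essential Phone PH-1 log above reports both single-thread and 4-thread rates per CPU group, so multi-thread scaling can be read off directly. The following is a minimal Python sketch, not part of vfpbench: the `GroupResult` class and its method names are hypothetical, and the rate column is assumed to be in MFLOPS. The figures are the "NEON fmla.4s (32bit x4) n8" rows of each group.

```python
from dataclasses import dataclass


@dataclass
class GroupResult:
    """One CPU group's fmla.4s n8 rates (hypothetical helper type)."""
    name: str
    threads: int
    single_mflops: float  # Thread=1 row
    multi_mflops: float   # Thread=4 row

    def speedup(self) -> float:
        # How many times faster the multi-thread run is than one thread.
        return self.multi_mflops / self.single_mflops

    def efficiency(self) -> float:
        # Fraction of ideal linear scaling (1.0 means perfect scaling).
        return self.speedup() / self.threads


groups = [
    GroupResult("Group 0 (1.9008 GHz)", 4, 14685.5, 58721.2),
    GroupResult("Group 1 (2.4576 GHz)", 4, 19545.9, 75249.0),
]

for g in groups:
    print(f"{g.name}: speedup {g.speedup():.2f}x, "
          f"efficiency {g.efficiency():.1%}")
```

Scaling below 100% in one group is not explained by the log itself; thermal or frequency limits under all-core load are one possible cause, but that is an assumption.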
+ | |||
+ | |||
+ | |||
+ | ==== Qualcomm Kryo 385 (Cortex-A75 + A55) (ARMv8.2A AArch64 arm64) FPU+ASIMD+HALFFP ==== | ||
+ | |||
+ | |||
+ | ++++Pixel 3 Snapdragon 845 Kryo 385 2.8GHz x4 + 1.77GHz x4 ARM64 (AArch64) Android 9.0| | ||
+ | |||
+ | < | ||
+ | Date: 20200808 162535 | ||
+ | ARCH: ARMv8.2A AArch64 | ||
+ | FPU : ASIMD(AArch64 NEON) FPHP ASIMDHP | ||
+ | Name: Qualcomm Technologies, | ||
+ | |||
+ | CPU Thread: | ||
+ | CPU Core : 8 | ||
+ | CPU Group : 2 | ||
+ | Group 0: Thread= 4 Clock=1.766400 GHz (mask:f) | ||
+ | Group 1: Thread= 4 Clock=2.803200 GHz (mask:f0) | ||
+ | NEON : yes | ||
+ | FMA : yes | ||
+ | FPHP : yes | ||
+ | SIMDHP : yes | ||
+ | DotProd: no | ||
+ | |||
+ | Total: | ||
+ | SingleThread HP max: | ||
+ | SingleThread SP max: | ||
+ | SingleThread DP max: | ||
+ | MultiThread | ||
+ | MultiThread | ||
+ | MultiThread | ||
+ | |||
+ | Group 0: Thread=4 | ||
+ | SingleThread HP max: | ||
+ | SingleThread SP max: | ||
+ | SingleThread DP max: 6.862 GFLOPS | ||
+ | MultiThread | ||
+ | MultiThread | ||
+ | MultiThread | ||
+ | |||
+ | Group 1: Thread=4 | ||
+ | SingleThread HP max: | ||
+ | SingleThread SP max: | ||
+ | SingleThread DP max: | ||
+ | MultiThread | ||
+ | MultiThread | ||
+ | MultiThread | ||
+ | |||
+ | |||
+ | * Group 0: Thread=1 | ||
+ | * FPU/NEON (HP fp) | ||
+ | TIME(s) | ||
+ | FPU fmul (16bit x1) n8 : 0.319 | ||
+ | FPU fadd (16bit x1) n8 : 0.334 | ||
+ | FPU fmadd (16bit x1) n8 : | ||
+ | NEON fmul.4h (16bit x4) n8 : 0.319 13302.8 | ||
+ | NEON fadd.4h (16bit x4) n8 : 0.320 13263.4 | ||
+ | NEON fmla.4h (16bit x4) n8 : 0.319 26604.2 | ||
+ | NEON fmul.8h (16bit x8) n8 : 0.623 13616.4 | ||
+ | NEON fadd.8h (16bit x8) n8 : 0.623 13619.7 | ||
+ | NEON fmla.8h (16bit x8) n8 : 0.623 27220.4 | ||
+ | FPU fmul (16bit x1) ns4 : | ||
+ | FPU fadd (16bit x1) ns4 : | ||
+ | FPU fmadd (16bit x1) ns4 : 0.608 | ||
+ | NEON fmul.4h (16bit x4) ns4 : | ||
+ | NEON fadd.4h (16bit x4) ns4 : | ||
+ | NEON fmla.4h (16bit x4) ns4 : | ||
+ | NEON fmul.8h (16bit x8) ns4 : | ||
+ | NEON fadd.8h (16bit x8) ns4 : | ||
+ | NEON fmla.8h (16bit x8) ns4 : | ||
+ | FPU fmul (16bit x1) n1 : 0.608 | ||
+ | FPU fadd (16bit x1) n1 : 0.608 | ||
+ | FPU fmadd (16bit x1) n1 : | ||
+ | NEON fmul.4h (16bit x4) n1 : 0.608 | ||
+ | NEON fadd.4h (16bit x4) n1 : 0.608 | ||
+ | NEON fmla.4h (16bit x4) n1 : 2.431 | ||
+ | NEON fmul.8h (16bit x8) n1 : 0.622 13627.2 | ||
+ | NEON fadd.8h (16bit x8) n1 : 0.623 13601.9 | ||
+ | NEON fmla.8h (16bit x8) n1 : 2.432 | ||
+ | NEON fmul.8h (16bit x8) n12 : | ||
+ | NEON fadd.8h (16bit x8) n12 : | ||
+ | NEON fmla.8h (16bit x8) n12 : | ||
+ | Average | ||
+ | Highest | ||
+ | |||
+ | |||
+ | * Group 0: Thread=1 | ||
+ | * FPU/NEON (SP fp) | ||
+ | TIME(s) | ||
+ | FPU fmul (32bit x1) n8 : 0.335 | ||
+ | FPU fadd (32bit x1) n8 : 0.319 | ||
+ | FPU fmadd (32bit x1) n8 : | ||
+ | NEON fmul.2s (32bit x2) n8 : 0.319 | ||
+ | NEON fadd.2s (32bit x2) n8 : 0.319 | ||
+ | NEON fmla.2s (32bit x2) n8 : 0.320 13261.5 | ||
+ | NEON fmul.4s (32bit x4) n8 : 0.624 | ||
+ | NEON fadd.4s (32bit x4) n8 : 0.624 | ||
+ | NEON fmla.4s (32bit x4) n8 : 0.623 13610.0 | ||
+ | FPU fmul (32bit x1) ns4 : | ||
+ | FPU fadd (32bit x1) ns4 : | ||
+ | FPU fmadd (32bit x1) ns4 : 0.608 | ||
+ | NEON fmul.2s (32bit x2) ns4 : | ||
+ | NEON fadd.2s (32bit x2) ns4 : | ||
+ | NEON fmla.2s (32bit x2) ns4 : | ||
+ | NEON fmul.4s (32bit x4) ns4 : | ||
+ | NEON fadd.4s (32bit x4) ns4 : | ||
+ | NEON fmla.4s (32bit x4) ns4 : | ||
+ | FPU fmul (32bit x1) n1 : 0.609 | ||
+ | FPU fadd (32bit x1) n1 : 0.607 | ||
+ | FPU fmadd (32bit x1) n1 : | ||
+ | NEON fmul.2s (32bit x2) n1 : 0.609 | ||
+ | NEON fadd.2s (32bit x2) n1 : 0.608 | ||
+ | NEON fmla.2s (32bit x2) n1 : 2.432 | ||
+ | NEON fmul.4s (32bit x4) n1 : 0.623 | ||
+ | NEON fadd.4s (32bit x4) n1 : 0.625 | ||
+ | NEON fmla.4s (32bit x4) n1 : 2.431 | ||
+ | NEON fmul.4s (32bit x4) n12 : | ||
+ | NEON fadd.4s (32bit x4) n12 : | ||
+ | NEON fmla.4s (32bit x4) n12 : | ||
+ | Average | ||
+ | Highest | ||
+ | |||
+ | |||
+ | * Group 0: Thread=1 | ||
+ | * FPU/NEON (DP fp) | ||
+ | TIME(s) | ||
+ | FPU fmul (64bit x1) n8 : 0.319 | ||
+ | FPU fadd (64bit x1) n8 : 0.334 | ||
+ | FPU fmadd (64bit x1) n8 : | ||
+ | NEON fmul.2d (64bit x2) n8 : 0.623 | ||
+ | NEON fadd.2d (64bit x2) n8 : 0.623 | ||
+ | NEON fmla.2d (64bit x2) n8 : 0.624 | ||
+ | FPU fmul (64bit x1) ns4 : | ||
+ | FPU fadd (64bit x1) ns4 : | ||
+ | FPU fmadd (64bit x1) ns4 : 0.608 | ||
+ | NEON fmul.2d (64bit x2) ns4 : | ||
+ | NEON fadd.2d (64bit x2) ns4 : | ||
+ | NEON fmla.2d (64bit x2) ns4 : | ||
+ | FPU fmul (64bit x1) n1 : 0.610 | ||
+ | FPU fadd (64bit x1) n1 : 0.608 | ||
+ | FPU fmadd (64bit x1) n1 : | ||
+ | NEON fmul.2d (64bit x2) n1 : 0.622 | ||
+ | NEON fadd.2d (64bit x2) n1 : 0.623 | ||
+ | NEON fmla.2d (64bit x2) n1 : 2.430 | ||
+ | NEON fmul.2d (64bit x2) n12 : | ||
+ | NEON fadd.2d (64bit x2) n12 : | ||
+ | NEON fmla.2d (64bit x2) n12 : | ||
+ | Average | ||
+ | Highest | ||
+ | |||
+ | |||
+ | * Group 0: Thread=4 | ||
+ | * FPU/NEON (HP fp) multi-thread | ||
+ | TIME(s) | ||
+ | FPU fmul (16bit x1) n8 : 0.322 13169.4 | ||
+ | FPU fadd (16bit x1) n8 : 0.339 12507.7 | ||
+ | FPU fmadd (16bit x1) n8 : | ||
+ | NEON fmul.4h (16bit x4) n8 : 0.320 52913.5 | ||
+ | NEON fadd.4h (16bit x4) n8 : 0.321 52853.6 | ||
+ | NEON fmla.4h (16bit x4) n8 : 0.321 | ||
+ | NEON fmul.8h (16bit x8) n8 : 0.625 54302.0 | ||
+ | NEON fadd.8h (16bit x8) n8 : 0.623 54438.4 | ||
+ | NEON fmla.8h (16bit x8) n8 : 0.632 | ||
+ | FPU fmul (16bit x1) ns4 : | ||
+ | FPU fadd (16bit x1) ns4 : | ||
+ | FPU fmadd (16bit x1) ns4 : 0.607 13962.3 | ||
+ | NEON fmul.4h (16bit x4) ns4 : | ||
+ | NEON fadd.4h (16bit x4) ns4 : | ||
+ | NEON fmla.4h (16bit x4) ns4 : | ||
+ | NEON fmul.8h (16bit x8) ns4 : | ||
+ | NEON fadd.8h (16bit x8) ns4 : | ||
+ | NEON fmla.8h (16bit x8) ns4 : | ||
+ | FPU fmul (16bit x1) n1 : 0.608 | ||
+ | FPU fadd (16bit x1) n1 : 0.607 | ||
+ | FPU fmadd (16bit x1) n1 : | ||
+ | NEON fmul.4h (16bit x4) n1 : 0.607 27921.8 | ||
+ | NEON fadd.4h (16bit x4) n1 : 0.608 27906.0 | ||
+ | NEON fmla.4h (16bit x4) n1 : 2.433 13938.4 | ||
+ | NEON fmul.8h (16bit x8) n1 : 0.627 54113.0 | ||
+ | NEON fadd.8h (16bit x8) n1 : 0.622 54490.1 | ||
+ | NEON fmla.8h (16bit x8) n1 : 2.436 27840.8 | ||
+ | NEON fmul.8h (16bit x8) n12 : | ||
+ | NEON fadd.8h (16bit x8) n12 : | ||
+ | NEON fmla.8h (16bit x8) n12 : | ||
+ | Average | ||
+ | Highest | ||
+ | |||
+ | |||
+ | * Group 0: Thread=4 | ||
+ | * FPU/NEON (SP fp) multi-thread | ||
+ | TIME(s) | ||
+ | FPU fmul (32bit x1) n8 : 0.336 12617.5 | ||
+ | FPU fadd (32bit x1) n8 : 0.322 13185.6 | ||
+ | FPU fmadd (32bit x1) n8 : | ||
+ | NEON fmul.2s (32bit x2) n8 : 0.320 26467.9 | ||
+ | NEON fadd.2s (32bit x2) n8 : 0.321 26401.9 | ||
+ | NEON fmla.2s (32bit x2) n8 : 0.323 52475.1 | ||
+ | NEON fmul.4s (32bit x4) n8 : 0.628 26989.3 | ||
+ | NEON fadd.4s (32bit x4) n8 : 0.626 27107.1 | ||
+ | NEON fmla.4s (32bit x4) n8 : 0.628 53977.4 | ||
+ | FPU fmul (32bit x1) ns4 : | ||
+ | FPU fadd (32bit x1) ns4 : | ||
+ | FPU fmadd (32bit x1) ns4 : 0.608 13934.5 | ||
+ | NEON fmul.2s (32bit x2) ns4 : | ||
+ | NEON fadd.2s (32bit x2) ns4 : | ||
+ | NEON fmla.2s (32bit x2) ns4 : | ||
+ | NEON fmul.4s (32bit x4) ns4 : | ||
+ | NEON fadd.4s (32bit x4) ns4 : | ||
+ | NEON fmla.4s (32bit x4) ns4 : | ||
+ | FPU fmul (32bit x1) n1 : 0.610 | ||
+ | FPU fadd (32bit x1) n1 : 0.608 | ||
+ | FPU fmadd (32bit x1) n1 : | ||
+ | NEON fmul.2s (32bit x2) n1 : 0.610 13895.0 | ||
+ | NEON fadd.2s (32bit x2) n1 : 0.608 13944.9 | ||
+ | NEON fmla.2s (32bit x2) n1 : 2.451 | ||
+ | NEON fmul.4s (32bit x4) n1 : 0.625 27142.9 | ||
+ | NEON fadd.4s (32bit x4) n1 : 0.630 26929.2 | ||
+ | NEON fmla.4s (32bit x4) n1 : 2.445 13872.3 | ||
+ | NEON fmul.4s (32bit x4) n12 : | ||
+ | NEON fadd.4s (32bit x4) n12 : | ||
+ | NEON fmla.4s (32bit x4) n12 : | ||
+ | Average | ||
+ | Highest | ||
+ | |||
+ | |||
+ | * Group 0: Thread=4 | ||
+ | * FPU/NEON (DP fp) multi-thread | ||
+ | TIME(s) | ||
+ | FPU fmul (64bit x1) n8 : 0.320 13254.5 | ||
+ | FPU fadd (64bit x1) n8 : 0.334 12685.9 | ||
+ | FPU fmadd (64bit x1) n8 : | ||
+ | NEON fmul.2d (64bit x2) n8 : 0.626 13534.5 | ||
+ | NEON fadd.2d (64bit x2) n8 : 0.628 13500.3 | ||
+ | NEON fmla.2d (64bit x2) n8 : 0.624 27196.3 | ||
+ | FPU fmul (64bit x1) ns4 : | ||
+ | FPU fadd (64bit x1) ns4 : | ||
+ | FPU fmadd (64bit x1) ns4 : 0.613 13820.5 | ||
+ | NEON fmul.2d (64bit x2) ns4 : | ||
+ | NEON fadd.2d (64bit x2) ns4 : | ||
+ | NEON fmla.2d (64bit x2) ns4 : | ||
+ | FPU fmul (64bit x1) n1 : 0.609 | ||
+ | FPU fadd (64bit x1) n1 : 0.608 | ||
+ | FPU fmadd (64bit x1) n1 : | ||
+ | NEON fmul.2d (64bit x2) n1 : 0.627 13531.7 | ||
+ | NEON fadd.2d (64bit x2) n1 : 0.623 13613.9 | ||
+ | NEON fmla.2d (64bit x2) n1 : 2.457 | ||
+ | NEON fmul.2d (64bit x2) n12 : | ||
+ | NEON fadd.2d (64bit x2) n12 : | ||
+ | NEON fmla.2d (64bit x2) n12 : | ||
+ | Average | ||
+ | Highest | ||
+ | |||
+ | |||
+ | * Group 1: Thread=1 | ||
+ | * FPU/NEON (HP fp) | ||
+ | TIME(s) | ||
+ | FPU fmul (16bit x1) n8 : 0.308 | ||
+ | FPU fadd (16bit x1) n8 : 0.307 | ||
+ | FPU fmadd (16bit x1) n8 : | ||
+ | NEON fmul.4h (16bit x4) n8 : 0.305 22089.8 | ||
+ | NEON fadd.4h (16bit x4) n8 : 0.304 22142.5 | ||
+ | NEON fmla.4h (16bit x4) n8 : 0.304 44283.6 | ||
+ | NEON fmul.8h (16bit x8) n8 : 0.608 22145.2 | ||
+ | NEON fadd.8h (16bit x8) n8 : 0.609 22110.9 | ||
+ | NEON fmla.8h (16bit x8) n8 : 0.607 44326.9 | ||
+ | FPU fmul (16bit x1) ns4 : | ||
+ | FPU fadd (16bit x1) ns4 : | ||
+ | FPU fmadd (16bit x1) ns4 : 0.476 | ||
+ | NEON fmul.4h (16bit x4) ns4 : | ||
+ | NEON fadd.4h (16bit x4) ns4 : | ||
+ | NEON fmla.4h (16bit x4) ns4 : | ||
+ | NEON fmul.8h (16bit x8) ns4 : | ||
+ | NEON fadd.8h (16bit x8) ns4 : | ||
+ | NEON fmla.8h (16bit x8) ns4 : | ||
+ | FPU fmul (16bit x1) n1 : 0.304 | ||
+ | FPU fadd (16bit x1) n1 : 0.303 | ||
+ | FPU fmadd (16bit x1) n1 : | ||
+ | NEON fmul.4h (16bit x4) n1 : 0.302 22273.0 | ||
+ | NEON fadd.4h (16bit x4) n1 : 0.302 22291.3 | ||
+ | NEON fmla.4h (16bit x4) n1 : 1.819 | ||
+ | NEON fmul.8h (16bit x8) n1 : 0.606 22201.1 | ||
+ | NEON fadd.8h (16bit x8) n1 : 0.607 22159.7 | ||
+ | NEON fmla.8h (16bit x8) n1 : 1.822 14771.4 | ||
+ | NEON fmul.8h (16bit x8) n12 : | ||
+ | NEON fadd.8h (16bit x8) n12 : | ||
+ | NEON fmla.8h (16bit x8) n12 : | ||
+ | Average | ||
+ | Highest | ||
+ | |||
+ | |||
+ | * Group 1: Thread=1 | ||
+ | * FPU/NEON (SP fp) | ||
+ | TIME(s) | ||
+ | FPU fmul (32bit x1) n8 : 0.302 | ||
+ | FPU fadd (32bit x1) n8 : 0.301 | ||
+ | FPU fmadd (32bit x1) n8 : | ||
+ | NEON fmul.2s (32bit x2) n8 : 0.302 11156.8 | ||
+ | NEON fadd.2s (32bit x2) n8 : 0.302 11130.7 | ||
+ | NEON fmla.2s (32bit x2) n8 : 0.302 22252.9 | ||
+ | NEON fmul.4s (32bit x4) n8 : 0.603 11156.5 | ||
+ | NEON fadd.4s (32bit x4) n8 : 0.605 11118.1 | ||
+ | NEON fmla.4s (32bit x4) n8 : 0.607 22171.6 | ||
+ | FPU fmul (32bit x1) ns4 : | ||
+ | FPU fadd (32bit x1) ns4 : | ||
+ | FPU fmadd (32bit x1) ns4 : 0.470 | ||
+ | NEON fmul.2s (32bit x2) ns4 : | ||
+ | NEON fadd.2s (32bit x2) ns4 : | ||
+ | NEON fmla.2s (32bit x2) ns4 : | ||
+ | NEON fmul.4s (32bit x4) ns4 : | ||
+ | NEON fadd.4s (32bit x4) ns4 : | ||
+ | NEON fmla.4s (32bit x4) ns4 : | ||
+ | FPU fmul (32bit x1) n1 : 0.305 | ||
+ | FPU fadd (32bit x1) n1 : 0.305 | ||
+ | FPU fmadd (32bit x1) n1 : | ||
+ | NEON fmul.2s (32bit x2) n1 : 0.304 11079.4 | ||
+ | NEON fadd.2s (32bit x2) n1 : 0.305 11035.2 | ||
+ | NEON fmla.2s (32bit x2) n1 : 1.816 | ||
+ | NEON fmul.4s (32bit x4) n1 : 0.609 11055.8 | ||
+ | NEON fadd.4s (32bit x4) n1 : 0.608 11067.4 | ||
+ | NEON fmla.4s (32bit x4) n1 : 1.823 | ||
+ | NEON fmul.4s (32bit x4) n12 : | ||
+ | NEON fadd.4s (32bit x4) n12 : | ||
+ | NEON fmla.4s (32bit x4) n12 : | ||
+ | Average | ||
+ | Highest | ||
+ | |||
+ | |||
+ | * Group 1: Thread=1 | ||
+ | * FPU/NEON (DP fp) | ||
+ | TIME(s) | ||
+ | FPU fmul (64bit x1) n8 : 0.301 | ||
+ | FPU fadd (64bit x1) n8 : 0.301 | ||
+ | FPU fmadd (64bit x1) n8 : | ||
+ | NEON fmul.2d (64bit x2) n8 : 0.604 | ||
+ | NEON fadd.2d (64bit x2) n8 : 0.604 | ||
+ | NEON fmla.2d (64bit x2) n8 : 0.608 11063.3 | ||
+ | FPU fmul (64bit x1) ns4 : | ||
+ | FPU fadd (64bit x1) ns4 : | ||
+ | FPU fmadd (64bit x1) ns4 : 0.476 | ||
+ | NEON fmul.2d (64bit x2) ns4 : | ||
+ | NEON fadd.2d (64bit x2) ns4 : | ||
+ | NEON fmla.2d (64bit x2) ns4 : | ||
+ | FPU fmul (64bit x1) n1 : 0.308 | ||
+ | FPU fadd (64bit x1) n1 : 0.307 | ||
+ | FPU fmadd (64bit x1) n1 : | ||
+ | NEON fmul.2d (64bit x2) n1 : 0.607 | ||
+ | NEON fadd.2d (64bit x2) n1 : 0.608 | ||
+ | NEON fmla.2d (64bit x2) n1 : 1.828 | ||
+ | NEON fmul.2d (64bit x2) n12 : | ||
+ | NEON fadd.2d (64bit x2) n12 : | ||
+ | NEON fmla.2d (64bit x2) n12 : | ||
+ | Average | ||
+ | Highest | ||
+ | |||
+ | |||
+ | * Group 1: Thread=4 | ||
+ | * FPU/NEON (HP fp) multi-thread | ||
+ | TIME(s) | ||
+ | FPU fmul (16bit x1) n8 : 0.319 21091.6 | ||
+ | FPU fadd (16bit x1) n8 : 0.319 21094.1 | ||
+ | FPU fmadd (16bit x1) n8 : | ||
+ | NEON fmul.4h (16bit x4) n8 : 0.319 84394.0 | ||
+ | NEON fadd.4h (16bit x4) n8 : 0.319 84371.8 | ||
+ | NEON fmla.4h (16bit x4) n8 : 0.319 | ||
+ | NEON fmul.8h (16bit x8) n8 : 0.638 84391.0 | ||
+ | NEON fadd.8h (16bit x8) n8 : 0.638 84381.8 | ||
+ | NEON fmla.8h (16bit x8) n8 : 0.638 | ||
+ | FPU fmul (16bit x1) ns4 : | ||
+ | FPU fadd (16bit x1) ns4 : | ||
+ | FPU fmadd (16bit x1) ns4 : 0.505 26643.9 | ||
+ | NEON fmul.4h (16bit x4) ns4 : | ||
+ | NEON fadd.4h (16bit x4) ns4 : | ||
+ | NEON fmla.4h (16bit x4) ns4 : | ||
+ | NEON fmul.8h (16bit x8) ns4 : | ||
+ | NEON fadd.8h (16bit x8) ns4 : | ||
+ | NEON fmla.8h (16bit x8) ns4 : | ||
+ | FPU fmul (16bit x1) n1 : 0.319 21092.1 | ||
+ | FPU fadd (16bit x1) n1 : 0.319 21088.6 | ||
+ | FPU fmadd (16bit x1) n1 : | ||
+ | NEON fmul.4h (16bit x4) n1 : 0.319 84362.3 | ||
+ | NEON fadd.4h (16bit x4) n1 : 0.319 84353.3 | ||
+ | NEON fmla.4h (16bit x4) n1 : 1.914 28120.1 | ||
+ | NEON fmul.8h (16bit x8) n1 : 0.638 84355.7 | ||
+ | NEON fadd.8h (16bit x8) n1 : 0.638 84365.0 | ||
+ | NEON fmla.8h (16bit x8) n1 : 1.914 56233.2 | ||
+ | NEON fmul.8h (16bit x8) n12 : | ||
+ | NEON fadd.8h (16bit x8) n12 : | ||
+ | NEON fmla.8h (16bit x8) n12 : | ||
+ | Average | ||
+ | Highest | ||
+ | |||
+ | |||
+ | * Group 1: Thread=4 | ||
+ | * FPU/NEON (SP fp) multi-thread | ||
+ | TIME(s) | ||
+ | FPU fmul (32bit x1) n8 : 0.319 21088.7 | ||
+ | FPU fadd (32bit x1) n8 : 0.319 21089.7 | ||
+ | FPU fmadd (32bit x1) n8 : | ||
+ | NEON fmul.2s (32bit x2) n8 : 0.319 42171.7 | ||
+ | NEON fadd.2s (32bit x2) n8 : 0.319 42178.9 | ||
+ | NEON fmla.2s (32bit x2) n8 : 0.320 84163.3 | ||
+ | NEON fmul.4s (32bit x4) n8 : 0.638 42178.4 | ||
+ | NEON fadd.4s (32bit x4) n8 : 0.638 42176.0 | ||
+ | NEON fmla.4s (32bit x4) n8 : 0.638 84357.5 | ||
+ | FPU fmul (32bit x1) ns4 : | ||
+ | FPU fadd (32bit x1) ns4 : | ||
+ | FPU fmadd (32bit x1) ns4 : 0.500 26910.5 | ||
+ | NEON fmul.2s (32bit x2) ns4 : | ||
+ | NEON fadd.2s (32bit x2) ns4 : | ||
+ | NEON fmla.2s (32bit x2) ns4 : | ||
+ | NEON fmul.4s (32bit x4) ns4 : | ||
+ | NEON fadd.4s (32bit x4) ns4 : | ||
+ | NEON fmla.4s (32bit x4) ns4 : | ||
+ | FPU fmul (32bit x1) n1 : 0.319 21088.5 | ||
+ | FPU fadd (32bit x1) n1 : 0.319 21086.2 | ||
+ | FPU fmadd (32bit x1) n1 : | ||
+ | NEON fmul.2s (32bit x2) n1 : 0.319 42178.8 | ||
+ | NEON fadd.2s (32bit x2) n1 : 0.319 42180.9 | ||
+ | NEON fmla.2s (32bit x2) n1 : 1.914 14059.6 | ||
+ | NEON fmul.4s (32bit x4) n1 : 0.638 42166.1 | ||
+ | NEON fadd.4s (32bit x4) n1 : 0.638 42179.5 | ||
+ | NEON fmla.4s (32bit x4) n1 : 1.914 28119.1 | ||
+ | NEON fmul.4s (32bit x4) n12 : | ||
+ | NEON fadd.4s (32bit x4) n12 : | ||
+ | NEON fmla.4s (32bit x4) n12 : | ||
+ | Average | ||
+ | Highest | ||
+ | |||
+ | |||
+ | * Group 1: Thread=4 | ||
+ | * FPU/NEON (DP fp) multi-thread | ||
+ | TIME(s) | ||
+ | FPU fmul (64bit x1) n8 : 0.319 21090.4 | ||
+ | FPU fadd (64bit x1) n8 : 0.319 21091.1 | ||
+ | FPU fmadd (64bit x1) n8 : | ||
+ | NEON fmul.2d (64bit x2) n8 : 0.638 21092.3 | ||
+ | NEON fadd.2d (64bit x2) n8 : 0.638 21084.8 | ||
+ | NEON fmla.2d (64bit x2) n8 : 0.638 42165.7 | ||
+ | FPU fmul (64bit x1) ns4 : | ||
+ | FPU fadd (64bit x1) ns4 : | ||
+ | FPU fmadd (64bit x1) ns4 : 0.494 27244.1 | ||
+ | NEON fmul.2d (64bit x2) ns4 : | ||
+ | NEON fadd.2d (64bit x2) ns4 : | ||
+ | NEON fmla.2d (64bit x2) ns4 : | ||
+ | FPU fmul (64bit x1) n1 : 0.319 21091.2 | ||
+ | FPU fadd (64bit x1) n1 : 0.319 21090.4 | ||
+ | FPU fmadd (64bit x1) n1 : | ||
+ | NEON fmul.2d (64bit x2) n1 : 0.638 21092.7 | ||
+ | NEON fadd.2d (64bit x2) n1 : 0.638 21092.3 | ||
+ | NEON fmla.2d (64bit x2) n1 : 1.914 14061.5 | ||
+ | NEON fmul.2d (64bit x2) n12 : | ||
+ | NEON fadd.2d (64bit x2) n12 : | ||
+ | NEON fmla.2d (64bit x2) n12 : | ||
+ | Average | ||
+ | Highest | ||
+ | |||
+ | |||
+ | </ | ||
+ | |||
+ | ++++ | ||
+ | |||
Line 10538: | Line 11787: | |
< | < | ||
- | ARCH: ARMv7A | + | Date: 20200808 172338 |
- | FPU: VFPv3-D32 NEON | + | ARCH: ARMv7A |
- | SingleT SP max: 16.066 GFLOPS | + | FPU : VFPv4-D32 NEON |
- | SingleT DP max: 8.027 GFLOPS | + | Name: RK3399 ChromebookFlipC101PA |
- | MultiT | + | CPU Thread: 6 |
- | MultiT | + | CPU Core : |
- | CPU core: 2 | + | CPU Group : 2 |
- | NEON: yes | + | Group 0: Thread= 4 Clock=1.512000 GHz (mask:f) |
- | FMA : no | + | Group 1: Thread= |
+ | NEON | ||
+ | FMA : yes | ||
+ | FPHP : no | ||
+ | SIMDHP : no | ||
+ | DotProd: no | ||
- | * VFP/NEON (single fp) | + | Total: |
- | | + | SingleThread HP max: - |
- | VFP fmuls (32bit x1) n8 : | + | SingleThread SP max: 16.062 GFLOPS |
- | VFP fadds (32bit x1) n8 : | + | SingleThread DP max: 8.030 GFLOPS |
- | VFP fmacs (32bit x1) n8 : | + | MultiThread |
- | VFP vfma.f32 (32bit x1) n8 : - - - - | + | MultiThread |
- | NEON vmul.f32 (32bit x2) n8 : | + | MultiThread |
- | NEON vadd.f32 (32bit x2) n8 | + | |
- | NEON vmla.f32 (32bit x2) n8 : | + | |
- | NEON vfma.f32 (32bit x2) n8 : | + | |
- | NEON vmul.f32 (32bit x4) n8 : | + | |
- | NEON vadd.f32 (32bit x4) n8 : | + | |
- | NEON vmla.f32 (32bit x4) n8 : | + | |
- | NEON vfma.f32 (32bit x4) n8 : | + | |
- | VFP fmuls (32bit x1) ns4 : 0.598 | + | |
- | VFP fadds (32bit x1) ns4 | + | |
- | VFP fmacs (32bit x1) ns4 : 1.046 | + | |
- | VFP vfma.f32 (32bit x1) ns4 : | + | |
- | NEON vmul.f32 (32bit x2) ns4 : | + | |
- | NEON vadd.f32 (32bit x2) ns4 : 0.597 | + | |
- | NEON vmla.f32 (32bit x2) ns4 : 1.046 | + | |
- | NEON vfma.f32 (32bit x2) ns4 : - - - - - | + | |
- | NEON vmul.f32 (32bit x4) ns4 : 0.597 | + | |
- | NEON vadd.f32 (32bit x4) ns4 : 0.597 | + | |
- | NEON vmla.f32 (32bit x4) ns4 : 1.046 | + | |
- | NEON vfma.f32 (32bit x4) ns4 : - - - - - | + | |
- | VFP fmuls (32bit x1) n1 : | + | |
- | VFP fadds (32bit x1) n1 : | + | |
- | VFP fmacs (32bit x1) n1 : | + | |
- | VFP vfma.f32 (32bit x1) n1 : - - - - - | + | |
- | NEON vmul.f32 (32bit x2) n1 | + | |
- | NEON vadd.f32 (32bit x2) n1 : | + | |
- | NEON vmla.f32 (32bit x2) n1 : | + | |
- | NEON vfma.f32 (32bit x2) n1 : | + | |
- | NEON vmul.f32 (32bit x4) n1 : | + | |
- | NEON vadd.f32 (32bit x4) n1 : | + | |
- | NEON vmla.f32 (32bit x4) n1 : | + | |
- | NEON vfma.f32 (32bit x4) n1 : | + | |
- | NEON vmul.f32 (32bit x4) n12 : 0.896 | + | |
- | NEON vadd.f32 (32bit x4) n12 : 0.896 | + | |
- | NEON vmla.f32 (32bit x4) n12 : 0.896 16066.1 | + | |
- | NEON vfma.f32 (32bit x4) n12 : - - - - - | + | |
- | Average | + | |
- | Highest | + | |
+ | Group 0: Thread=4 | ||
+ | SingleThread HP max: - | ||
+ | SingleThread SP max: | ||
+ | SingleThread DP max: 5.459 GFLOPS | ||
+ | MultiThread | ||
+ | MultiThread | ||
+ | MultiThread | ||
- | * VFP/NEON (double fp) | + | Group 1: |
- | TIME(s) | + | |
- | VFP fmuld (64bit x1) n8 : | + | |
- | VFP faddd (64bit x1) n8 : | + | SingleThread DP max: |
- | VFP fmacd (64bit x1) n8 : | + | |
- | VFP vfma.f64 (64bit x1) n8 : - - - - - | + | |
- | VFP fmuld (64bit x1) ns4 : 0.598 | + | MultiThread |
- | VFP faddd (64bit x1) ns4 | + | |
- | VFP fmacd (64bit x1) ns4 : 1.046 | + | |
- | VFP vfma.f64 (64bit x1) ns4 : | + | |
- | VFP fmuld (64bit x1) n1 : 0.301 | + | |
- | VFP faddd (64bit x1) n1 : | + | |
- | VFP fmacd (64bit x1) n1 : | + | |
- | VFP vfma.f64 (64bit x1) n1 | + | |
- | Average | + | |
- | Highest | + | |
- | * Matrix 4x4 | + | * Group 0: Thread=1 |
- | TIME(s) | + | * VFP/NEON (SP fp) |
- | C++ code | + | TIME(s) |
- | NEON vmla 128bit A | + | VFP fmuls (32bit x1) n8 : 0.364 |
- | NEON vmla 64bit A | + | VFP fadds (32bit x1) n8 : |
- | NEON vfma 128bit A | + | VFP fmacs (32bit x1) n8 : |
- | NEON vmla 128bit B | + | VFP vfma.f32 (32bit x1) n8 : - - - |
- | NEON vmla | + | NEON vmul.f32 (32bit x2) n8 : 0.318 |
- | NEON vfma 128bit B | + | NEON vadd.f32 (32bit x2) n8 : |
- | NEON vfma 128bit C | + | NEON vmla.f32 (32bit x2) n8 : 0.580 |
- | Average | + | NEON vfma.f32 (32bit x2) n8 : |
- | Highest | + | NEON vmul.f32 (32bit x4) n8 : |
+ | NEON vadd.f32 (32bit x4) n8 : | ||
+ | NEON vmla.f32 (32bit x4) n8 : | ||
+ | NEON vfma.f32 (32bit x4) n8 : - - - - - | ||
+ | VFP fmuls (32bit x1) ns4 : 0.606 | ||
+ | VFP fadds (32bit x1) ns4 : 0.607 | ||
+ | VFP fmacs (32bit x1) ns4 : 1.210 | ||
+ | VFP vfma.f32 (32bit x1) ns4 : | ||
+ | NEON vmul.f32 (32bit x2) ns4 | ||
+ | NEON vadd.f32 (32bit x2) ns4 : 0.605 | ||
+ | NEON vmla.f32 (32bit x2) ns4 : | ||
+ | NEON vfma.f32 (32bit x2) ns4 : - - - | ||
+ | NEON vmul.f32 (32bit x4) ns4 : 0.620 | ||
+ | NEON vadd.f32 (32bit x4) ns4 : 0.619 | ||
+ | NEON vmla.f32 (32bit x4) ns4 : 1.209 | ||
+ | NEON vfma.f32 (32bit x4) ns4 | ||
+ | VFP fmuls (32bit x1) n1 : | ||
+ | VFP fadds (32bit x1) n1 : | ||
+ | VFP fmacs (32bit x1) n1 : | ||
+ | VFP vfma.f32 (32bit x1) n1 : | ||
+ | NEON vmul.f32 (32bit x2) n1 : | ||
+ | NEON vadd.f32 (32bit x2) n1 : | ||
+ | NEON vmla.f32 (32bit x2) n1 : | ||
+ | NEON vfma.f32 (32bit x2) n1 : - - - | ||
+ | NEON vmul.f32 (32bit x4) n1 : 0.619 | ||
+ | NEON vadd.f32 (32bit x4) n1 : | ||
+ | NEON vmla.f32 | ||
+ | NEON vfma.f32 (32bit x4) n1 : | ||
+ | NEON vmul.f32 (32bit x4) n12 : 0.922 | ||
+ | NEON vadd.f32 (32bit x4) n12 : 0.922 5903.3 1475.8 | ||
+ | NEON vmla.f32 (32bit x4) n12 : 0.923 11800.8 | ||
+ | NEON vfma.f32 (32bit x4) n12 : - - - | ||
+ | Average | ||
+ | Highest | ||
- | * VFP/NEON (single | + | * Group 0: Thread=1 |
- | TIME(s) | + | * VFP/NEON (DP fp) |
- | VFP fmuls (32bit x1) n8 : | + | TIME(s) |
- | VFP fadds (32bit x1) n8 : | + | VFP fmuld (64bit x1) n8 : |
- | VFP fmacs (32bit x1) n8 : | + | VFP faddd (64bit x1) n8 : |
- | VFP vfma.f32 | + | VFP fmacd (64bit x1) n8 |
- | NEON vmul.f32 (32bit x2) n8 : 0.300 15996.9 | + | VFP vfma.f64 (64bit x1) n8 : - - - |
- | NEON vadd.f32 (32bit x2) n8 : | + | VFP fmuld (64bit x1) ns4 : 0.604 |
- | NEON vmla.f32 (32bit x2) n8 : | + | VFP faddd (64bit x1) ns4 : 0.604 |
- | NEON vfma.f32 (32bit x2) n8 : | + | VFP fmacd (64bit x1) ns4 : 1.218 |
- | NEON vmul.f32 (32bit x4) n8 : | + | VFP vfma.f64 (64bit x1) ns4 |
- | NEON vadd.f32 (32bit x4) n8 : | + | VFP fmuld (64bit x1) n1 : |
- | NEON vmla.f32 (32bit x4) n8 : 0.601 31941.5 | + | VFP faddd (64bit x1) n1 : |
- | NEON vfma.f32 (32bit x4) n8 : - - - - - | + | VFP fmacd (64bit x1) n1 |
- | VFP fmuls (32bit x1) ns4 : 0.599 | + | VFP vfma.f64 (64bit x1) n1 : - - - |
- | VFP fadds (32bit x1) ns4 : 0.606 | + | Average |
- | VFP fmacs (32bit x1) ns4 : | + | Highest |
- | VFP vfma.f32 (32bit x1) ns4 : | + | |
- | NEON vmul.f32 (32bit x2) ns4 : 0.599 | + | |
- | NEON vadd.f32 (32bit x2) ns4 : 0.601 | + | |
- | NEON vmla.f32 | + | |
- | NEON vfma.f32 (32bit x2) ns4 : - - - - - | + | |
- | NEON vmul.f32 (32bit x4) ns4 : 0.599 16014.8 2001.9 ( 8 1.0) | + | |
- | NEON vadd.f32 | + | |
- | NEON vmla.f32 (32bit x4) ns4 : 1.049 18307.6 | + | |
- | NEON vfma.f32 (32bit x4) ns4 : - - - - - | + | |
- | VFP fmuls (32bit x1) n1 : | + | |
- | VFP fadds (32bit x1) n1 : | + | |
- | VFP fmacs (32bit x1) n1 : | + | |
- | VFP vfma.f32 | + | |
- | NEON vmul.f32 (32bit x2) n1 : | + | |
- | NEON vadd.f32 (32bit x2) n1 : | + | |
- | NEON vmla.f32 (32bit x2) n1 : | + | |
- | NEON vfma.f32 (32bit x2) n1 | + | |
- | NEON vmul.f32 (32bit x4) n1 : 0.602 15955.7 | + | |
- | NEON vadd.f32 (32bit x4) n1 : | + | |
- | NEON vmla.f32 (32bit x4) n1 : | + | |
- | NEON vfma.f32 (32bit x4) n1 : | + | |
- | NEON vmul.f32 (32bit x4) n12 | + | |
- | NEON vadd.f32 (32bit x4) n12 : 0.900 15994.6 | + | |
- | NEON vmla.f32 (32bit x4) n12 : 0.898 32053.6 | + | |
- | NEON vfma.f32 (32bit x4) n12 : - - - - - | + | |
- | Average | + | |
- | Highest | + | |
- | * VFP/NEON (double | + | * Group 0: Thread=4 |
- | TIME(s) | + | * VFP/NEON (SP fp) multi-thread |
- | VFP fmuld (64bit x1) n8 : | + | TIME(s) |
- | VFP faddd (64bit x1) n8 : | + | VFP fmuls (32bit x1) n8 : |
- | VFP fmacd (64bit x1) n8 : | + | VFP fadds (32bit x1) n8 : |
- | VFP vfma.f64 (64bit x1) n8 : - - - - - | + | VFP fmacs (32bit x1) n8 : |
- | VFP fmuld (64bit x1) ns4 : 0.600 | + | VFP vfma.f32 |
- | VFP faddd (64bit x1) ns4 : 0.605 | + | NEON vmul.f32 (32bit x2) n8 : 0.322 22549.4 |
- | VFP fmacd (64bit x1) ns4 : 1.051 | + | NEON vadd.f32 (32bit x2) n8 : |
- | VFP vfma.f64 (64bit x1) ns4 : | + | NEON vmla.f32 (32bit x2) n8 : |
- | VFP fmuld (64bit x1) n1 : 0.300 | + | NEON vfma.f32 |
- | VFP faddd (64bit x1) n1 : | + | NEON vmul.f32 (32bit x4) n8 : 0.626 23194.3 |
- | VFP fmacd (64bit x1) n1 : | + | NEON vadd.f32 (32bit x4) n8 : |
- | VFP vfma.f64 (64bit x1) n1 : - - - - - | + | NEON vmla.f32 (32bit x4) n8 |
- | Average | + | NEON vfma.f32 (32bit x4) n8 |
- | Highest | + | VFP fmuls (32bit x1) ns4 : 0.613 |
+ | VFP fadds (32bit x1) ns4 : 0.608 | ||
+ | VFP fmacs (32bit x1) ns4 : 1.219 | ||
+ | VFP vfma.f32 (32bit x1) ns4 : | ||
+ | NEON vmul.f32 | ||
+ | NEON vadd.f32 (32bit x2) ns4 : 0.607 11947.7 | ||
+ | NEON vmla.f32 | ||
+ | NEON vfma.f32 (32bit x2) ns4 : - - - | ||
+ | NEON vmul.f32 (32bit x4) ns4 : 0.623 23296.9 | ||
+ | NEON vadd.f32 (32bit x4) ns4 : 0.623 23288.7 | ||
+ | NEON vmla.f32 (32bit x4) ns4 : 1.214 23903.3 | ||
+ | NEON vfma.f32 (32bit x4) ns4 : - - - | ||
+ | VFP fmuls (32bit x1) n1 : | ||
+ | VFP fadds (32bit x1) n1 : | ||
+ | VFP fmacs (32bit x1) n1 : | ||
+ | VFP vfma.f32 (32bit x1) n1 : - - - | ||
+ | NEON vmul.f32 (32bit x2) n1 : 0.609 11916.6 | ||
+ | NEON vadd.f32 (32bit x2) n1 : | ||
+ | NEON vmla.f32 (32bit x2) n1 : | ||
+ | NEON vfma.f32 (32bit x2) n1 : | ||
+ | NEON vmul.f32 (32bit x4) n1 : 0.626 23197.0 | ||
+ | NEON vadd.f32 (32bit x4) n1 : | ||
+ | NEON vmla.f32 (32bit x4) n1 : | ||
+ | NEON vfma.f32 (32bit x4) n1 : | ||
+ | NEON vmul.f32 (32bit x4) n12 : 0.929 23441.0 | ||
+ | NEON vadd.f32 (32bit x4) n12 : 0.930 23401.3 | ||
+ | NEON vmla.f32 (32bit x4) n12 : 0.928 46918.9 | ||
+ | NEON vfma.f32 (32bit x4) n12 : - - - | ||
+ | Average | ||
+ | Highest | ||
- | * Matrix 4x4 multi-thread | + | * Group 0: Thread=4 |
- | TIME(s) | + | * VFP/NEON (DP fp) multi-thread |
- | C++ code | + | TIME(s) |
- | NEON vmla 128bit A | + | VFP fmuld (64bit x1) n8 : 0.354 10241.0 |
- | NEON vmla | + | VFP faddd (64bit x1) n8 : 0.320 11325.3 |
- | NEON vfma 128bit A | + | VFP fmacd (64bit x1) n8 : 0.334 21746.4 |
- | NEON vmla 128bit B | + | VFP vfma.f64 (64bit x1) n8 |
- | NEON vmla | + | VFP fmuld (64bit x1) ns4 |
- | NEON vfma 128bit B | + | VFP faddd (64bit x1) ns4 |
- | NEON vfma 128bit C | + | VFP fmacd (64bit x1) ns4 : 1.224 |
- | Average | + | VFP vfma.f64 (64bit x1) ns4 : - - - |
- | Highest | + | VFP fmuld (64bit x1) n1 : |
+ | VFP faddd (64bit x1) n1 : | ||
+ | VFP fmacd (64bit x1) n1 : | ||
+ | VFP vfma.f64 (64bit x1) n1 | ||
+ | Average | ||
+ | Highest | ||
- | cpu0 1512000 408000 | + | * Group 1: Thread=1 |
- | cpu1 1512000 408000 | + | * VFP/NEON (SP fp) |
- | cpu2 1512000 408000 | + | TIME(s) |
- | cpu3 1512000 408000 | + | VFP fmuls (32bit x1) n8 : |
- | cpu4 2016000 408000 | + | VFP fadds (32bit x1) n8 : |
- | cpu5 2016000 408000 | + | VFP fmacs (32bit x1) n8 : |
+ | VFP vfma.f32 (32bit x1) n8 : - - - | ||
+ | NEON vmul.f32 (32bit x2) n8 : | ||
+ | NEON vadd.f32 (32bit x2) n8 : | ||
+ | NEON vmla.f32 (32bit x2) n8 : | ||
+ | NEON vfma.f32 (32bit x2) n8 : | ||
+ | NEON vmul.f32 (32bit x4) n8 : | ||
+ | NEON vadd.f32 (32bit x4) n8 : | ||
+ | NEON vmla.f32 (32bit x4) n8 : | ||
+ | NEON vfma.f32 (32bit x4) n8 : | ||
+ | VFP fmuls (32bit x1) ns4 : 0.602 | ||
+ | VFP fadds (32bit x1) ns4 : 0.602 | ||
+ | VFP fmacs (32bit x1) ns4 : 1.054 | ||
+ | VFP vfma.f32 (32bit x1) ns4 : | ||
+ | NEON vmul.f32 (32bit x2) ns4 : 0.602 | ||
+ | NEON vadd.f32 (32bit x2) ns4 : 0.602 | ||
+ | NEON vmla.f32 (32bit x2) ns4 : 1.054 | ||
+ | NEON vfma.f32 (32bit x2) ns4 : - - - | ||
+ | NEON vmul.f32 (32bit x4) ns4 : 0.602 | ||
+ | NEON vadd.f32 (32bit x4) ns4 : 0.602 | ||
+ | NEON vmla.f32 (32bit x4) ns4 : 1.055 | ||
+ | NEON vfma.f32 (32bit x4) ns4 : - - - | ||
+ | VFP fmuls (32bit x1) n1 : | ||
+ | VFP fadds (32bit x1) n1 : | ||
+ | VFP fmacs (32bit x1) n1 : | ||
+ | VFP vfma.f32 (32bit x1) n1 : - - - | ||
+ | NEON vmul.f32 (32bit x2) n1 : | ||
+ | NEON vadd.f32 (32bit x2) n1 : | ||
+ | NEON vmla.f32 (32bit x2) n1 : | ||
+ | NEON vfma.f32 (32bit x2) n1 : | ||
+ | NEON vmul.f32 (32bit x4) n1 : | ||
+ | NEON vadd.f32 (32bit x4) n1 : | ||
+ | NEON vmla.f32 (32bit x4) n1 : | ||
+ | NEON vfma.f32 (32bit x4) n1 : | ||
+ | NEON vmul.f32 (32bit x4) n12 : 0.904 | ||
+ | NEON vadd.f32 (32bit x4) n12 : 0.904 | ||
+ | NEON vmla.f32 (32bit x4) n12 : 0.904 16062.4 | ||
+ | NEON vfma.f32 (32bit x4) n12 : - - - | ||
+ | Average | ||
+ | Highest | ||
- | processor : 0 | ||
- | model name : ARMv8 Processor rev 4 (v8l) | ||
- | BogoMIPS : 48.00 | ||
- | Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt lpae evtstrm aes pmull sha1 sha2 crc32 | ||
- | CPU implementer : | ||
- | CPU architecture: | ||
- | CPU variant : 0x0 | ||
- | CPU part : 0xd03 | ||
- | CPU revision : 4 | ||
- | processor : 1 | + | * Group 1: |
- | model name : ARMv8 Processor rev 4 (v8l) | + | * VFP/NEON (DP fp) |
- | BogoMIPS : 48.00 | + | TIME(s) |
- | Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt lpae evtstrm aes pmull sha1 sha2 crc32 | + | VFP fmuld (64bit x1) n8 : |
- | CPU implementer : 0x41 | + | VFP faddd (64bit x1) n8 : 0.301 |
- | CPU architecture: 8 | + | VFP fmacd (64bit x1) n8 : |
- | CPU variant : 0x0 | + | VFP vfma.f64 (64bit x1) n8 |
- | CPU part : 0xd03 | + | VFP fmuld (64bit x1) ns4 |
- | CPU revision : 4 | + | VFP faddd (64bit x1) ns4 : 0.603 |
+ | VFP fmacd (64bit x1) ns4 : 1.054 2294.8 1147.4 | ||
+ | VFP vfma.f64 (64bit x1) ns4 : | ||
+ | VFP fmuld (64bit x1) n1 : | ||
+ | VFP faddd (64bit x1) n1 : | ||
+ | VFP fmacd (64bit x1) n1 : | ||
+ | VFP vfma.f64 (64bit x1) n1 : - - - | ||
+ | Average | ||
+ | Highest | ||
- | processor : 2 | ||
- | model name : ARMv8 Processor rev 4 (v8l) | ||
- | BogoMIPS : 48.00 | ||
- | Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt lpae evtstrm aes pmull sha1 sha2 crc32 | ||
- | CPU implementer : | ||
- | CPU architecture: | ||
- | CPU variant : 0x0 | ||
- | CPU part : 0xd03 | ||
- | CPU revision : 4 | ||
- | processor : 3 | + | * Group 1: |
- | model name : ARMv8 Processor rev 4 (v8l) | + | * VFP/NEON (SP fp) multi-thread |
- | BogoMIPS : 48.00 | + | TIME(s) |
- | Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt lpae evtstrm aes pmull sha1 sha2 crc32 | + | VFP fmuls (32bit x1) n8 : |
- | CPU implementer : 0x41 | + | VFP fadds (32bit x1) n8 : |
- | CPU architecture: 8 | + | VFP fmacs (32bit x1) n8 : |
- | CPU variant : 0x0 | + | VFP vfma.f32 (32bit x1) n8 |
- | CPU part : 0xd03 | + | NEON vmul.f32 (32bit x2) n8 : |
- | CPU revision : 4 | + | NEON vadd.f32 |
+ | NEON vmla.f32 (32bit x2) n8 : 0.302 32089.1 | ||
+ | NEON vfma.f32 (32bit x2) n8 : | ||
+ | NEON vmul.f32 (32bit x4) n8 : | ||
+ | NEON vadd.f32 (32bit x4) n8 : | ||
+ | NEON vmla.f32 (32bit x4) n8 : | ||
+ | NEON vfma.f32 (32bit x4) n8 : | ||
+ | VFP fmuls (32bit x1) ns4 | ||
+ | VFP fadds (32bit x1) ns4 : 0.603 | ||
+ | VFP fmacs (32bit x1) ns4 : 1.055 | ||
+ | VFP vfma.f32 (32bit x1) ns4 : | ||
+ | NEON vmul.f32 (32bit x2) ns4 : 0.602 | ||
+ | NEON vadd.f32 (32bit x2) ns4 : 0.603 | ||
+ | NEON vmla.f32 (32bit x2) ns4 : 1.055 | ||
+ | NEON vfma.f32 (32bit x2) ns4 : - - - | ||
+ | NEON vmul.f32 (32bit x4) ns4 : 0.603 16058.0 | ||
+ | NEON vadd.f32 (32bit x4) ns4 : 0.602 16066.4 | ||
+ | NEON vmla.f32 (32bit x4) ns4 : 1.054 18359.9 | ||
+ | NEON vfma.f32 (32bit x4) ns4 : - - - | ||
+ | VFP fmuls (32bit x1) n1 : | ||
+ | VFP fadds (32bit x1) n1 : | ||
+ | VFP fmacs (32bit x1) n1 : | ||
+ | VFP vfma.f32 (32bit x1) n1 : - - - | ||
+ | NEON vmul.f32 (32bit x2) n1 : | ||
+ | NEON vadd.f32 (32bit x2) n1 : | ||
+ | NEON vmla.f32 (32bit x2) n1 : | ||
+ | NEON vfma.f32 (32bit x2) n1 : | ||
+ | NEON vmul.f32 (32bit x4) n1 : | ||
+ | NEON vadd.f32 (32bit x4) n1 : | ||
+ | NEON vmla.f32 (32bit x4) n1 : | ||
+ | NEON vfma.f32 (32bit x4) n1 : | ||
+ | NEON vmul.f32 (32bit x4) n12 : 0.904 16062.1 | ||
+ | NEON vadd.f32 (32bit x4) n12 : 0.904 16063.5 | ||
+ | NEON vmla.f32 (32bit x4) n12 : 0.904 32117.4 | ||
+ | NEON vfma.f32 (32bit x4) n12 : - - - | ||
+ | Average | ||
+ | Highest | ||
- | processor : 4 | ||
- | model name : ARMv8 Processor rev 2 (v8l) | ||
- | BogoMIPS : 48.00 | ||
- | Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt lpae evtstrm aes pmull sha1 sha2 crc32 | ||
- | CPU implementer : | ||
- | CPU architecture: | ||
- | CPU variant : 0x0 | ||
- | CPU part : 0xd08 | ||
- | CPU revision : 2 | ||
- | processor : 5 | + | * Group 1: |
- | model name : ARMv8 Processor rev 2 (v8l) | + | * VFP/NEON (DP fp) multi-thread |
- | BogoMIPS : 48.00 | + | TIME(s) |
- | Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt lpae evtstrm aes pmull sha1 sha2 crc32 | + | VFP fmuld (64bit x1) n8 : |
- | CPU implementer : 0x41 | + | VFP faddd (64bit x1) n8 : 0.301 |
- | CPU architecture: 8 | + | VFP fmacd (64bit x1) n8 : |
- | CPU variant : 0x0 | + | VFP vfma.f64 (64bit x1) n8 |
- | CPU part : 0xd08 | + | VFP fmuld (64bit x1) ns4 |
- | CPU revision : 2 | + | VFP faddd (64bit x1) ns4 |
- | + | VFP fmacd (64bit x1) ns4 | |
- | + | VFP vfma.f64 (64bit x1) ns4 : | |
- | ARMv8 Processor rev 4 (v8l) | + | VFP fmuld (64bit x1) n1 : |
+ | VFP faddd (64bit x1) n1 : | ||
+ | VFP fmacd (64bit x1) n1 : | ||
+ | VFP vfma.f64 (64bit x1) n1 : - - - | ||
+ | Average | ||
+ | Highest | ||
- | 2019/01/05 16: | ||
</ | </ | ||
Line 12303: | Line 13618: | |
< | < | ||
- | Windows 10 1703 bash | + | Date: 20200808 132716 |
- | Skylake | + | ARCH: x64 (x86_64) |
- | RAM 32GB | + | FPU : SSE SSE2 SSSE3 SSE4.1 SSE4.2 AVX AVX2 FMA3 F16C |
+ | Name: Intel(R) | ||
+ | CPU Thread: | ||
+ | CPU Core : 4 | ||
+ | CPU Group : 1 | ||
+ | Group 0: Thread= 8 Clock=4.200000 GHz (mask:ff) | ||
+ | SSE : yes | ||
+ | AVX : yes | ||
+ | FMA : yes | ||
+ | F16C : yes | ||
+ | AVX512: no | ||
- | ARCH: x64 | + | Total: |
- | FPU: SSSE3 SSE4.1 SSE4.2 AVX FMA3 | + | SingleThread HP max: - |
- | SingleT | + | SingleThread |
- | SingleT | + | SingleThread |
- | MultiT | + | MultiThread |
- | MultiT | + | MultiThread |
- | CPU core: 8 | + | MultiThread |
- | SSE: yes | + | |
- | AVX: yes | + | |
- | FMA: yes | + | |
- | * SSE/AVX (single fp) | + | Group 0: |
- | SSE mulss (32bit x1) n8 : | + | |
- | SSE addss (32bit x1) n8 : | + | |
- | FMA vfmaddss (32bit x1) n8 : 0.152 15775.5 | + | |
- | SSE mulps (32bit x4) n8 : | + | |
- | SSE addps (32bit x4) n8 : | + | |
- | SSE mul+addps (32bit x4) n8 : | + | |
- | FMA vfmaddss | + | |
- | SSE ml+ad+addps (32bit x4) n6 : 0.171 31570.6 | + | |
- | SSE mulss (32bit x1) ns4 | + | |
- | SSE addss (32bit x1) ns4 | + | |
- | SSE mulps (32bit x4) ns4 : 0.291 16488.3 | + | |
- | SSE addps (32bit x4) ns4 : 0.292 16411.6 | + | |
- | AVX vmulps (32bit x8) n8 : 0.145 66011.1 | + | |
- | AVX vaddps (32bit x8) n8 : 0.146 65962.6 | + | |
- | AVX vmul+addps (32bit x8) n8 : | + | |
- | FMA vfmaddps (32bit x8) n8 : 0.145 | + | |
- | AVX vml+ad+adps (32bit x8) n6 : | + | |
- | Average | + | |
- | Highest | + | |
- | * SSE/AVX (double | + | * Group 0: Thread=1 |
- | SSE2 mulsd (64bit x1) n8 : 0.146 | + | * SSE/AVX (SP fp) |
- | SSE2 addsd (64bit x1) n8 : 0.144 | + | TIME(s) |
- | FMA | + | SSE mulss (32bit |
- | SSE2 mulpd (64bit x2) n8 : 0.148 16244.0 16244.0 | + | SSE addss (32bit x1) n8 |
- | SSE2 addpd (64bit x2) n8 : 0.152 15782.7 15782.7 | + | FMA vfmaddss |
- | SSE2 mul+addpd (64bit x2) n8 : 0.151 15845.7 15845.7 | + | FMA vfmaddss (32bit x1) n12 : 0.451 16747.6 |
- | FMA | + | FMA vfma+mlss (32bit x1) n12 : 0.452 12544.3 |
- | SSE2 ml+ad+dpd (64bit x2) n6 : | + | FMA vfma+adss (32bit x1) n12 : 0.446 12702.3 |
- | SSE2 mulsd (64bit x1) ns4 | + | SSE mulps (32bit x4) n8 |
- | SSE2 addsd (64bit x1) ns4 | + | SSE addps (32bit x4) n8 |
- | SSE2 mulpd (64bit x2) ns4 | + | SSE mul+addps (32bit x4) n8 |
- | SSE2 addpd (64bit x2) ns4 | + | FMA vfmaddps |
- | AVX vmulpd | + | FMA vfmaddps (32bit x4) n12 : 0.446 67744.5 |
- | AVX vaddpd | + | FMA vfma+mlps (32bit x4) n12 : 0.446 50806.3 |
- | AVX vmul+addpd (64bit x4) n8 : 0.151 31715.6 31715.6 | + | FMA vfma+adps (32bit x4) n12 : 0.449 50565.6 |
- | FMA vfmaddpd | + | SSE ml+ad+adps (32bit x4) n9 : |
- | AVX vml_ad_adpd | + | SSE mulss (32bit x1) ns4 : 0.595 |
- | Average | + | SSE addss (32bit x1) ns4 : 0.595 |
- | Highest | + | SSE mulps (32bit x4) ns4 : 0.595 16943.0 |
+ | SSE addps (32bit x4) ns4 : 0.595 16942.1 | ||
+ | AVX vmulps | ||
+ | AVX vaddps | ||
+ | AVX vmul+addps (32bit x8) n8 : 0.297 67772.0 8471.5 ( 8.0 2.0) | ||
+ | FMA vfmaddps | ||
+ | FMA vfmaddps (32bit x8) n12 : | ||
+ | FMA vfma+mlps (32bit x8) n12 : 0.447 | ||
+ | FMA vfma+adps | ||
+ | AVX vml+ad+adps (32bit x8) n9 : 0.572 39625.7 | ||
+ | Average | ||
+ | Highest | ||
- | * Matrix 4x4 | + | * Group 0: Thread=1 |
- | C++ code | + | * SSE/AVX (DP fp) |
- | C++ Intrinsic SSE 128bit | + | TIME(s) |
- | SSE mul/ | + | SSE2 mulsd (64bit x1) n8 |
- | AVX vmul/addps 256bit A : 0.104 17283.1 | + | SSE2 addsd (64bit x1) n8 : 0.298 |
- | Average | + | FMA vfmaddsd (64bit x1) n8 : |
- | Highest | + | FMA vfmaddsd (64bit x1) n12 : 0.446 16935.3 |
+ | FMA vfma+mlsd (64bit x1) n12 : 0.449 12618.1 | ||
+ | FMA vfma+adsd (64bit x1) n12 : 0.449 12642.0 | ||
+ | SSE2 mulpd (64bit x2) n8 : | ||
+ | SSE2 addpd (64bit x2) n8 : 0.298 16936.1 | ||
+ | SSE2 mul+addpd (64bit x2) n8 : | ||
+ | FMA vfmaddpd (64bit x2) n8 : | ||
+ | FMA vfmaddpd (64bit x2) n12 : 0.446 33874.7 | ||
+ | FMA vfma+mlpd (64bit x2) n12 : 0.446 25399.5 | ||
+ | FMA vfma+adpd (64bit x2) n12 : 0.446 25413.5 | ||
+ | SSE2 ml+ad+dpd (64bit x2) n9 : 0.338 16780.8 8390.4 | ||
+ | SSE2 mulsd (64bit x1) ns4 : | ||
+ | SSE2 addsd (64bit x1) ns4 : | ||
+ | SSE2 mulpd (64bit x2) ns4 : | ||
+ | SSE2 addpd (64bit x2) ns4 : | ||
+ | AVX vmulpd (64bit x4) n8 : 0.298 33863.8 | ||
+ | AVX vaddpd (64bit x4) n8 : 0.298 33858.0 | ||
+ | AVX vmul+addpd (64bit x4) n8 | ||
+ | FMA vfmaddpd (64bit x4) n8 : 0.298 67611.8 | ||
+ | FMA vfmaddpd (64bit x4) n12 : | ||
+ | FMA vfma+mlpd (64bit x4) n12 : 0.447 50713.3 | ||
+ | FMA vfma+adpd (64bit x4) n12 : 0.446 50820.7 | ||
+ | AVX vml_ad_adpd (64bit x4) n9 : 0.335 33858.6 | ||
+ | Average | ||
+ | Highest | ||
- | * SSE/AVX (single | + | * Group 0: Thread=8 |
- | SSE mulss (32bit x1) n8 : | + | * SSE/AVX (SP fp) multi-thread |
- | SSE addss (32bit x1) n8 : | + | TIME(s) |
- | FMA vfmaddss (32bit x1) n8 : 0.300 63913.1 | + | SSE mulss (32bit x1) n8 : |
- | SSE mulps (32bit x4) n8 : | + | SSE addss (32bit x1) n8 : |
- | SSE addps (32bit x4) n8 : | + | FMA vfmaddss (32bit x1) n8 : 0.608 66343.1 4146.4 |
- | SSE mul+addps (32bit x4) n8 : | + | FMA vfmaddss (32bit x1) n12 : |
- | FMA vfmaddss | + | FMA vfma+mlss (32bit x1) n12 : 0.928 48899.6 |
- | SSE ml+ad+addps (32bit x4) n6 : 0.337 | + | FMA vfma+adss (32bit x1) n12 : 0.910 49837.6 |
- | SSE mulss (32bit x1) ns4 : 0.303 31631.7 31631.7 | + | SSE mulps (32bit x4) n8 : |
- | SSE addss (32bit x1) ns4 : 0.305 31489.1 31489.1 | + | SSE addps (32bit x4) n8 : |
- | SSE mulps (32bit x4) ns4 : 0.301 | + | SSE mul+addps (32bit x4) n8 : |
- | SSE addps (32bit x4) ns4 : 0.302 | + | FMA vfmaddps |
- | AVX vmulps (32bit x8) n8 : 0.300 | + | FMA vfmaddps (32bit x4) n12 |
- | AVX vaddps (32bit x8) n8 : 0.300 | + | FMA vfma+mlps (32bit x4) n12 : 0.898 |
- | AVX vmul+addps (32bit x8) n8 : 0.301 | + | FMA vfma+adps (32bit x4) n12 : 0.898 |
- | FMA vfmaddps (32bit x8) n8 : | + | SSE ml+ad+adps (32bit x4) n9 |
- | AVX vml+ad+adps (32bit x8) n6 : 0.383 | + | SSE mulss (32bit x1) ns4 : 0.705 28580.7 3572.6 ( 8.0 0.9) |
- | Average | + | SSE addss (32bit x1) ns4 : 0.696 28953.7 3619.2 ( 8.0 0.9) |
- | Highest | + | SSE mulps (32bit x4) ns4 : 0.620 |
+ | SSE addps (32bit x4) ns4 : 0.635 | ||
+ | AVX vmulps (32bit x8) n8 : 0.622 | ||
+ | AVX vaddps (32bit x8) n8 : 0.577 | ||
+ | AVX vmul+addps (32bit x8) n8 : 0.594 | ||
+ | FMA vfmaddps (32bit x8) n8 : 0.600 | ||
+ | FMA vfmaddps (32bit x8) n12 : | ||
+ | FMA vfma+mlps (32bit x8) n12 | ||
+ | FMA vfma+adps (32bit x8) n12 : 0.860 | ||
+ | AVX vml+ad+adps (32bit x8) n9 : 0.650 | ||
+ | Average | ||
+ | Highest | ||
- | * SSE/AVX (double | + | * Group 0: Thread=8 |
- | SSE2 mulsd (64bit x1) n8 : 0.302 31776.5 31776.5 | + | * SSE/AVX (DP fp) multi-thread |
- | SSE2 addsd (64bit x1) n8 : 0.300 31957.5 31957.5 | + | TIME(s) |
- | FMA vfmaddsd (64bit x1) n8 : | + | SSE2 mulsd (64bit x1) n8 : 0.596 33802.4 4225.3 ( 8.0 1.0) |
- | SSE2 mulpd (64bit x2) n8 : 0.306 62653.0 62653.0 | + | SSE2 addsd (64bit x1) n8 : 0.595 33885.9 4235.7 ( 8.0 1.0) |
- | SSE2 addpd (64bit x2) n8 : 0.300 63899.9 63899.9 | + | FMA vfmaddsd (64bit x1) n8 : |
- | SSE2 mul+addpd (64bit x2) n8 : 0.304 63122.8 63122.8 | + | FMA vfmaddsd (64bit x1) n12 : 0.893 67747.1 |
- | FMA | + | FMA vfma+mlsd (64bit x1) n12 : 0.892 50829.3 |
- | SSE2 ml+ad+dpd (64bit x2) n6 : | + | FMA vfma+adsd (64bit x1) n12 : 0.892 50831.7 |
- | SSE2 mulsd (64bit x1) ns4 : | + | SSE2 mulpd (64bit x2) n8 : 0.595 67767.3 |
- | SSE2 addsd (64bit x1) ns4 : | + | SSE2 addpd (64bit x2) n8 : 0.595 67771.9 4235.7 ( 16.0 1.0) |
- | SSE2 mulpd (64bit x2) ns4 : | + | SSE2 mul+addpd (64bit x2) n8 : 0.595 67772.2 4235.8 ( 16.0 1.0) |
- | SSE2 addpd (64bit x2) ns4 : | + | FMA |
- | AVX vmulpd (64bit x4) n8 : 0.299 | + | FMA vfmaddpd (64bit x2) n12 : 0.892 |
- | AVX vaddpd (64bit x4) n8 : 0.300 | + | FMA vfma+mlpd (64bit x2) n12 : 0.892 |
- | AVX vmul+addpd (64bit x4) n8 : 0.300 | + | FMA vfma+adpd (64bit x2) n12 : 0.892 |
- | FMA vfmaddpd (64bit x4) n8 : 0.301 | + | SSE2 ml+ad+dpd (64bit x2) n9 : |
- | AVX vml_ad_adpd | + | SSE2 mulsd (64bit x1) ns4 : |
- | Average | + | SSE2 addsd (64bit x1) ns4 : |
- | Highest | + | SSE2 mulpd (64bit x2) ns4 : |
- | + | SSE2 addpd (64bit x2) ns4 : | |
- | + | AVX vmulpd (64bit x4) n8 : 0.594 | |
- | * Matrix 4x4 multi-thread | + | AVX vaddpd (64bit x4) n8 : 0.595 |
- | C++ code : | + | AVX vmul+addpd (64bit x4) n8 : 0.595 |
- | C++ Intrinsic SSE 128bit | + | FMA vfmaddpd (64bit x4) n8 : 0.595 |
- | SSE mul/ | + | FMA vfmaddpd |
- | AVX vmul/addps 256bit A : 0.116 | + | FMA vfma+mlpd (64bit x4) n12 |
- | Average | + | FMA vfma+adpd (64bit x4) n12 |
- | Highest | + | AVX vml_ad_adpd (64bit x4) n9 : 0.661 |
+ | Average | ||
+ | Highest | ||
</ | </ | ||
Line 12431: | Line 13788: | ||
+ | ==== Intel Ice Lake (AMD64 x86_64 x64) SSE4.2/ | ||
+ | ++++Intel Core i5-1030NG7 1.1GHz (3.5GHz) 4 core 8 thread Windows 10| | ||
- | ==== AMD Ryzen 7 1800X (AMD64 x86_64 x64) SSE4.2/ | + | < |
+ | Date: 20200810 185418 | ||
+ | ARCH: x64 (x86_64) | ||
+ | FPU : SSE SSE2 SSSE3 SSE4.1 SSE4.2 AVX AVX2 FMA3 F16C AVX512F/ | ||
+ | Name: | ||
+ | CPU Thread: | ||
+ | CPU Core : 4 | ||
+ | CPU Group : 1 | ||
+ | Group 0: Thread= 8 Clock=1.100000 GHz (mask:0) | ||
+ | SSE : yes | ||
+ | AVX : yes | ||
+ | FMA : yes | ||
+ | F16C : yes | ||
+ | AVX512: yes | ||
+ | |||
+ | Total: | ||
+ | SingleThread HP max: - | ||
+ | SingleThread SP max: 111.310 GFLOPS | ||
+ | SingleThread DP max: | ||
+ | MultiThread | ||
+ | MultiThread | ||
+ | MultiThread | ||
+ | |||
+ | Group 0: Thread=8 Clock=1.100000 GHz (mask:0) | ||
+ | SingleThread HP max: - | ||
+ | SingleThread SP max: 111.310 GFLOPS | ||
+ | SingleThread DP max: | ||
+ | MultiThread | ||
+ | MultiThread | ||
+ | MultiThread | ||
+ | |||
+ | |||
+ | * Group 0: Thread=1 | ||
+ | * SSE/AVX (SP fp) | ||
+ | TIME(s) | ||
+ | SSE mulss (32bit x1) n8 : | ||
+ | SSE addss (32bit x1) n8 : | ||
+ | FMA vfmaddss (32bit x1) n8 : 0.101 13027.6 | ||
+ | FMA vfmaddss (32bit x1) n12 : | ||
+ | FMA vfma+mlss (32bit x1) n12 : 0.143 10399.8 | ||
+ | FMA vfma+adss (32bit x1) n12 : 0.142 10437.5 | ||
+ | SSE mulps (32bit x4) n8 : | ||
+ | SSE addps (32bit x4) n8 : | ||
+ | SSE mul+addps (32bit x4) n8 : | ||
+ | FMA vfmaddps (32bit x4) n8 : 0.102 51919.4 | ||
+ | FMA vfmaddps (32bit x4) n12 : | ||
+ | FMA vfma+mlps (32bit x4) n12 : 0.142 41781.8 | ||
+ | FMA vfma+adps (32bit x4) n12 : 0.143 41652.9 | ||
+ | SSE ml+ad+adps (32bit x4) n9 : 0.108 27519.6 | ||
+ | SSE mulss (32bit x1) ns4 : 0.190 | ||
+ | SSE addss (32bit x1) ns4 : 0.190 | ||
+ | SSE mulps (32bit x4) ns4 : 0.190 13906.4 | ||
+ | SSE addps (32bit x4) ns4 : 0.190 13867.9 | ||
+ | AVX vmulps (32bit x8) n8 : 0.095 55597.1 | ||
+ | AVX vaddps (32bit x8) n8 : 0.095 55388.9 | ||
+ | AVX vmul+addps (32bit x8) n8 : 0.095 55612.9 | ||
+ | FMA vfmaddps (32bit x8) n8 : 0.122 86880.7 | ||
+ | FMA vfmaddps (32bit x8) n12 : | ||
+ | FMA vfma+mlps (32bit x8) n12 : 0.142 83413.5 | ||
+ | FMA vfma+adps (32bit x8) n12 : 0.144 82441.6 | ||
+ | AVX vml+ad+adps (32bit x8) n9 : | ||
+ | AVX512 vmulps (32bit x16) n12 : | ||
+ | AVX512 vaddps (32bit x16) n12 : | ||
+ | AVX512 vfmaddps (32bit x16) n12 : | ||
+ | AVX512 vfma+mps (32bit x16) n12 : | ||
+ | AVX512 vfma+aps (32bit x16) n12 : | ||
+ | AVX512 vmulps (32bit x8) n12 : 0.144 55154.4 | ||
+ | AVX512 vaddps (32bit x8) n12 : 0.142 55624.6 | ||
+ | AVX512 vfmaddps (32bit x8) n12 : 0.142 | ||
+ | Average | ||
+ | Highest | ||
+ | |||
+ | |||
+ | * Group 0: Thread=1 | ||
+ | * SSE/AVX (DP fp) | ||
+ | TIME(s) | ||
+ | SSE2 mulsd (64bit x1) n8 : 0.143 | ||
+ | SSE2 addsd (64bit x1) n8 : 0.102 | ||
+ | FMA vfmaddsd (64bit x1) n8 : | ||
+ | FMA vfmaddsd (64bit x1) n12 : 0.142 13910.1 | ||
+ | FMA vfma+mlsd (64bit x1) n12 : | ||
+ | FMA vfma+adsd (64bit x1) n12 : | ||
+ | SSE2 mulpd (64bit x2) n8 : 0.102 12983.3 | ||
+ | SSE2 addpd (64bit x2) n8 : 0.102 12988.4 | ||
+ | SSE2 mul+addpd (64bit x2) n8 : 0.101 13026.5 | ||
+ | FMA vfmaddpd (64bit x2) n8 : | ||
+ | FMA vfmaddpd (64bit x2) n12 : 0.143 27767.1 | ||
+ | FMA vfma+mlpd (64bit x2) n12 : | ||
+ | FMA vfma+adpd (64bit x2) n12 : | ||
+ | SSE2 ml+ad+dpd (64bit x2) n9 : 0.108 13686.9 | ||
+ | SSE2 mulsd (64bit x1) ns4 : | ||
+ | SSE2 addsd (64bit x1) ns4 : | ||
+ | SSE2 mulpd (64bit x2) ns4 : | ||
+ | SSE2 addpd (64bit x2) ns4 : | ||
+ | AVX vmulpd (64bit x4) n8 : 0.096 27464.0 | ||
+ | AVX vaddpd (64bit x4) n8 : 0.095 27868.4 | ||
+ | AVX vmul+addpd (64bit x4) n8 : 0.095 27776.9 | ||
+ | FMA vfmaddpd (64bit x4) n8 : 0.101 52105.9 | ||
+ | FMA vfmaddpd (64bit x4) n12 : | ||
+ | FMA vfma+mlpd (64bit x4) n12 : 0.143 41631.3 | ||
+ | FMA vfma+adpd (64bit x4) n12 : 0.142 41748.7 | ||
+ | AVX vml_ad_adpd (64bit x4) n9 : | ||
+ | AVX512 vmulpd (64bit x8) n12 : 0.294 26935.4 | ||
+ | AVX512 vaddpd (64bit x8) n12 : 0.294 26918.9 | ||
+ | AVX512 vfmaddpd (64bit x8) n12 : 0.294 53835.4 | ||
+ | AVX512 vfma+mpd (64bit x8) n12 : 0.293 40495.9 | ||
+ | AVX512 vfma+apd (64bit x8) n12 : 0.293 40512.9 | ||
+ | Average | ||
+ | Highest | ||
+ | |||
+ | |||
+ | * Group 0: Thread=8 | ||
+ | * SSE/AVX (SP fp) multi-thread | ||
+ | TIME(s) | ||
+ | SSE mulss (32bit x1) n8 : | ||
+ | SSE addss (32bit x1) n8 : | ||
+ | FMA vfmaddss (32bit x1) n8 : 0.207 51050.5 | ||
+ | FMA vfmaddss (32bit x1) n12 : | ||
+ | FMA vfma+mlss (32bit x1) n12 : 0.310 38279.6 | ||
+ | FMA vfma+adss (32bit x1) n12 : 0.310 38294.5 | ||
+ | SSE mulps (32bit x4) n8 : | ||
+ | SSE addps (32bit x4) n8 : | ||
+ | SSE mul+addps (32bit x4) n8 : | ||
+ | FMA vfmaddps (32bit x4) n8 : 0.207 | ||
+ | FMA vfmaddps (32bit x4) n12 : | ||
+ | FMA vfma+mlps (32bit x4) n12 : 0.310 | ||
+ | FMA vfma+adps (32bit x4) n12 : 0.310 | ||
+ | SSE ml+ad+adps (32bit x4) n9 : 0.233 | ||
+ | SSE mulss (32bit x1) ns4 : 0.231 22819.0 | ||
+ | SSE addss (32bit x1) ns4 : 0.232 22796.0 | ||
+ | SSE mulps (32bit x4) ns4 : 0.232 90991.3 | ||
+ | SSE addps (32bit x4) ns4 : 0.232 91226.8 | ||
+ | AVX vmulps (32bit x8) n8 : 0.207 | ||
+ | AVX vaddps (32bit x8) n8 : 0.207 | ||
+ | AVX vmul+addps (32bit x8) n8 : 0.207 | ||
+ | FMA vfmaddps (32bit x8) n8 : 0.207 | ||
+ | FMA vfmaddps (32bit x8) n12 : | ||
+ | FMA vfma+mlps (32bit x8) n12 : 0.311 | ||
+ | FMA vfma+adps (32bit x8) n12 : 0.310 | ||
+ | AVX vml+ad+adps (32bit x8) n9 : | ||
+ | AVX512 vmulps (32bit x16) n12 : | ||
+ | AVX512 vaddps (32bit x16) n12 : | ||
+ | AVX512 vfmaddps (32bit x16) n12 : | ||
+ | AVX512 vfma+mps (32bit x16) n12 : | ||
+ | AVX512 vfma+aps (32bit x16) n12 : | ||
+ | AVX512 vmulps (32bit x8) n12 : 0.316 | ||
+ | AVX512 vaddps (32bit x8) n12 : 0.310 | ||
+ | AVX512 vfmaddps (32bit x8) n12 : 0.306 | ||
+ | Average | ||
+ | Highest | ||
+ | |||
+ | |||
+ | * Group 0: Thread=8 | ||
+ | * SSE/AVX (DP fp) multi-thread | ||
+ | TIME(s) | ||
+ | SSE2 mulsd (64bit x1) n8 : 0.244 21634.6 | ||
+ | SSE2 addsd (64bit x1) n8 : 0.207 25508.6 | ||
+ | FMA vfmaddsd (64bit x1) n8 : | ||
+ | FMA vfmaddsd (64bit x1) n12 : 0.311 50924.6 | ||
+ | FMA vfma+mlsd (64bit x1) n12 : | ||
+ | FMA vfma+adsd (64bit x1) n12 : | ||
+ | SSE2 mulpd (64bit x2) n8 : 0.207 51029.3 | ||
+ | SSE2 addpd (64bit x2) n8 : 0.207 51025.8 | ||
+ | SSE2 mul+addpd (64bit x2) n8 : 0.207 51019.7 | ||
+ | FMA vfmaddpd (64bit x2) n8 : | ||
+ | FMA vfmaddpd (64bit x2) n12 : 0.311 | ||
+ | FMA vfma+mlpd (64bit x2) n12 : | ||
+ | FMA vfma+adpd (64bit x2) n12 : | ||
+ | SSE2 ml+ad+dpd (64bit x2) n9 : 0.233 51085.6 | ||
+ | SSE2 mulsd (64bit x1) ns4 : | ||
+ | SSE2 addsd (64bit x1) ns4 : | ||
+ | SSE2 mulpd (64bit x2) ns4 : | ||
+ | SSE2 addpd (64bit x2) ns4 : | ||
+ | AVX vmulpd (64bit x4) n8 : 0.207 | ||
+ | AVX vaddpd (64bit x4) n8 : 0.207 | ||
+ | AVX vmul+addpd (64bit x4) n8 : 0.207 | ||
+ | FMA vfmaddpd (64bit x4) n8 : 0.207 | ||
+ | FMA vfmaddpd (64bit x4) n12 : | ||
+ | FMA vfma+mlpd (64bit x4) n12 : 0.314 | ||
+ | FMA vfma+adpd (64bit x4) n12 : 0.318 | ||
+ | AVX vml_ad_adpd (64bit x4) n9 : | ||
+ | AVX512 vmulpd (64bit x8) n12 : 0.682 92879.9 | ||
+ | AVX512 vaddpd (64bit x8) n12 : 0.682 92855.7 | ||
+ | AVX512 vfmaddpd (64bit x8) n12 : 0.682 | ||
+ | AVX512 vfma+mpd (64bit x8) n12 : 0.682 | ||
+ | AVX512 vfma+apd (64bit x8) n12 : 0.682 | ||
+ | Average | ||
+ | Highest | ||
+ | |||
+ | </ | ||
+ | |||
+ | ++++ | ||
+ | |||
+ | |||
+ | |||
+ | ==== AMD Zen (AMD64 x86_64 x64) SSE4.2/ | ||
Line 12440: | Line 13994: | ||
< | < | ||
- | Windows 10 1703 bash | + | Date: 20200624 215250 |
- | RYZEN 7 1800X 3.6GHz | + | ARCH: x64 (x86_64) |
- | RAM 32GB | + | FPU : SSE SSE2 SSSE3 SSE4.1 SSE4.2 AVX AVX2 FMA3 F16C |
+ | Name: AMD Ryzen 7 1800X Eight-Core Processor | ||
+ | CPU Thread: 16 | ||
+ | CPU Core : 8 | ||
+ | CPU Group : 1 | ||
+ | Group 0: Thread=16 | ||
+ | SSE : yes | ||
+ | AVX : yes | ||
+ | FMA : yes | ||
+ | F16C : yes | ||
+ | AVX512: no | ||
- | ARCH: x64 | + | Total: |
- | FPU: SSSE3 SSE4.1 SSE4.2 AVX FMA3 | + | SingleThread HP max: - |
- | SingleT | + | SingleThread |
- | SingleT | + | SingleThread |
- | MultiT | + | MultiThread |
- | MultiT | + | MultiThread |
- | CPU core: 16 | + | MultiThread |
- | SSE: yes | + | |
- | AVX: yes | + | |
- | FMA: yes | + | |
- | * SSE/AVX (single fp) | + | Group 0: |
- | SSE mulss (32bit x1) n8 : | + | |
- | SSE addss (32bit x1) n8 : | + | |
- | FMA vfmaddss (32bit x1) n8 : 0.184 13063.9 | + | |
- | SSE mulps (32bit x4) n8 : | + | |
- | SSE addps (32bit x4) n8 : | + | |
- | SSE mul+addps (32bit x4) n8 | + | |
- | FMA vfmaddss (32bit x4) n8 | + | |
- | SSE ml+ad+addps (32bit x4) n6 : | + | |
- | SSE mulss (32bit x1) ns4 | + | |
- | SSE addss (32bit x1) ns4 | + | |
- | SSE mulps (32bit x4) ns4 : 0.222 21655.1 | + | |
- | SSE addps (32bit x4) ns4 : 0.228 | + | |
- | AVX vmulps (32bit x8) n8 : 0.295 32491.2 | + | |
- | AVX vaddps (32bit x8) n8 : 0.295 32505.0 | + | |
- | AVX vmul+addps (32bit x8) n8 : 0.148 64943.4 | + | |
- | FMA vfmaddps (32bit x8) n8 : 0.302 63654.8 | + | |
- | AVX vml+ad+adps (32bit x8) n6 : 0.302 35749.4 | + | |
- | Average | + | |
- | Highest | + | |
- | * SSE/AVX (double | + | * Group 0: Thread=1 |
- | SSE2 mulsd (64bit x1) n8 : 0.159 | + | * SSE/AVX (SP fp) |
- | SSE2 addsd (64bit x1) n8 : 0.147 | + | TIME(s) |
- | FMA | + | SSE mulss (32bit |
- | SSE2 mulpd (64bit x2) n8 | + | SSE addss (32bit x1) n8 |
- | SSE2 addpd (64bit x2) n8 : 0.148 16204.0 | + | FMA vfmaddss |
- | SSE2 mul+addpd (64bit x2) n8 : 0.148 16254.4 16254.4 | + | FMA vfmaddss (32bit x1) n12 : |
- | FMA | + | FMA vfma+mlss |
- | SSE2 ml+ad+dpd (64bit x2) n6 : | + | FMA vfma+adss (32bit x1) n12 : 0.365 13318.0 |
- | SSE2 mulsd (64bit x1) ns4 | + | SSE mulps (32bit x4) n8 |
- | SSE2 addsd (64bit x1) ns4 | + | SSE addps (32bit x4) n8 : |
- | SSE2 mulpd (64bit x2) ns4 | + | SSE mul+addps (32bit x4) n8 |
- | SSE2 addpd (64bit x2) ns4 | + | FMA vfmaddps |
- | AVX vmulpd | + | FMA vfmaddps (32bit x4) n12 : |
- | AVX vaddpd | + | FMA vfma+mlps (32bit x4) n12 : 0.436 44592.1 |
- | AVX vmul+addpd (64bit x4) n8 : 0.156 30721.8 | + | FMA vfma+adps (32bit x4) n12 : 0.367 53029.3 |
- | FMA vfmaddpd | + | SSE ml+ad+adps (32bit x4) n9 : |
- | AVX vml_ad_adpd | + | SSE mulss (32bit x1) ns4 : 0.425 |
- | Average | + | SSE addss (32bit x1) ns4 : 0.429 |
- | Highest | + | SSE mulps (32bit x4) ns4 : 0.421 20526.6 |
+ | SSE addps (32bit x4) ns4 : 0.424 20358.2 5089.6 ( 4.0 1.4) | ||
+ | AVX vmulps | ||
+ | AVX vaddps | ||
+ | AVX vmul+addps (32bit x8) n8 : 0.277 62298.1 | ||
+ | FMA vfmaddps (32bit x8) n8 : 0.572 60396.0 | ||
+ | FMA vfmaddps | ||
+ | FMA vfma+mlps (32bit x8) n12 | ||
+ | FMA vfma+adps (32bit x8) n12 : 0.646 60217.6 | ||
+ | AVX vml+ad+adps | ||
+ | Average | ||
+ | Highest | ||
- | * Matrix 4x4 | + | * Group 0: Thread=1 |
- | C++ code | + | * SSE/AVX (DP fp) |
- | C++ Intrinsic SSE 128bit | + | TIME(s) |
- | SSE mul/ | + | SSE2 mulsd (64bit x1) n8 |
- | AVX vmul/addps 256bit A : 0.120 14947.5 | + | SSE2 addsd (64bit x1) n8 : 0.281 |
- | Average | + | FMA vfmaddsd (64bit x1) n8 : |
- | Highest | + | FMA vfmaddsd (64bit x1) n12 : 0.429 15103.3 7551.7 |
+ | FMA vfma+mlsd (64bit x1) n12 : 0.460 10566.2 | ||
+ | FMA vfma+adsd (64bit x1) n12 : 0.356 13660.4 | ||
+ | SSE2 mulpd (64bit x2) n8 : 0.286 15127.0 | ||
+ | SSE2 addpd (64bit x2) n8 : 0.283 15291.6 | ||
+ | SSE2 mul+addpd (64bit x2) n8 : | ||
+ | FMA vfmaddpd (64bit x2) n8 : | ||
+ | FMA vfmaddpd (64bit x2) n12 : 0.420 30844.8 | ||
+ | FMA vfma+mlpd (64bit x2) n12 : 0.461 21077.2 | ||
+ | FMA vfma+adpd (64bit x2) n12 : 0.354 27446.3 | ||
+ | SSE2 ml+ad+dpd (64bit x2) n9 : 0.277 17524.8 | ||
+ | SSE2 mulsd (64bit x1) ns4 : | ||
+ | SSE2 addsd (64bit x1) ns4 : | ||
+ | SSE2 mulpd (64bit x2) ns4 : | ||
+ | SSE2 addpd (64bit x2) ns4 : | ||
+ | AVX vmulpd (64bit x4) n8 : 0.570 15147.5 | ||
+ | AVX vaddpd (64bit x4) n8 : 0.566 15274.7 | ||
+ | AVX vmul+addpd (64bit x4) n8 | ||
+ | FMA vfmaddpd (64bit x4) n8 : 0.566 30545.4 | ||
+ | FMA vfmaddpd (64bit x4) n12 : | ||
+ | FMA vfma+mlpd (64bit x4) n12 : 0.850 22877.5 3812.9 | ||
+ | FMA vfma+adpd (64bit x4) n12 | ||
+ | AVX vml_ad_adpd (64bit x4) n9 : 0.437 22232.3 | ||
+ | Average | ||
+ | Highest | ||
- | * SSE/AVX (single | + | * Group 0: Thread=16 |
- | SSE mulss (32bit x1) n8 : | + | * SSE/AVX (SP fp) multi-thread |
- | SSE addss (32bit x1) n8 : | + | TIME(s) |
- | FMA vfmaddss (32bit x1) n8 : 0.310 | + | SSE mulss (32bit x1) n8 : |
- | SSE mulps (32bit x4) n8 : | + | SSE addss (32bit x1) n8 : |
- | SSE addps (32bit x4) n8 : | + | FMA vfmaddss (32bit x1) n8 : 0.587 |
- | SSE mul+addps (32bit x4) n8 : | + | FMA vfmaddss (32bit x1) n12 |
- | FMA vfmaddss | + | FMA vfma+mlss (32bit x1) n12 : 0.878 88567.6 |
- | SSE ml+ad+addps (32bit x4) n6 : 0.259 | + | FMA vfma+adss (32bit x1) n12 : 1.009 77086.8 |
- | SSE mulss (32bit x1) ns4 : 0.309 62036.9 62036.9 | + | SSE mulps (32bit x4) n8 : |
- | SSE addss (32bit x1) ns4 : 0.309 62200.5 62200.5 | + | SSE addps (32bit x4) n8 : |
- | SSE mulps (32bit x4) ns4 : 0.304 | + | SSE mul+addps (32bit x4) n8 : |
- | SSE addps (32bit x4) ns4 : 0.300 | + | FMA vfmaddps |
- | AVX vmulps (32bit x8) n8 : | + | FMA vfmaddps (32bit x4) n12 |
- | AVX vaddps (32bit x8) n8 : | + | FMA vfma+mlps (32bit x4) n12 : 0.917 |
- | AVX vmul+addps (32bit x8) n8 : 0.388 | + | FMA vfma+adps (32bit x4) n12 : 1.050 |
- | FMA vfmaddps (32bit x8) n8 : 0.598 | + | SSE ml+ad+adps (32bit x4) n9 |
- | AVX vml+ad+adps (32bit x8) n6 : 0.454 | + | SSE mulss (32bit x1) ns4 : 0.589 58633.9 3664.6 ( 16.0 1.0) |
- | Average | + | SSE addss (32bit x1) ns4 : 0.593 58281.8 3642.6 ( 16.0 1.0) |
- | Highest | + | SSE mulps (32bit x4) ns4 : 0.593 |
+ | SSE addps (32bit x4) ns4 : 0.592 | ||
+ | AVX vmulps (32bit x8) n8 : | ||
+ | AVX vaddps (32bit x8) n8 : | ||
+ | AVX vmul+addps (32bit x8) n8 : 0.638 | ||
+ | FMA vfmaddps (32bit x8) n8 : | ||
+ | FMA vfmaddps (32bit x8) n12 | ||
+ | FMA vfma+mlps (32bit x8) n12 : 1.849 | ||
+ | FMA vfma+adps (32bit x8) n12 : 1.525 | ||
+ | AVX vml+ad+adps (32bit x8) n9 : 0.929 | ||
+ | Average | ||
+ | Highest | ||
- | * SSE/AVX (double | + | * Group 0: Thread=16 |
- | SSE2 mulsd (64bit x1) n8 : 0.500 38438.0 38438.0 | + | * SSE/AVX (DP fp) multi-thread |
- | SSE2 addsd (64bit x1) n8 : 0.299 64246.9 64246.9 | + | TIME(s) |
- | FMA vfmaddsd (64bit x1) n8 : | + | SSE2 mulsd (64bit x1) n8 : 0.583 59307.2 |
- | SSE2 mulpd (64bit x2) n8 : 0.305 | + | SSE2 addsd (64bit x1) n8 : 0.590 58559.5 3660.0 ( 16.0 1.0) |
- | SSE2 addpd (64bit x2) n8 : 0.293 | + | FMA vfmaddsd (64bit x1) n8 : |
- | SSE2 mul+addpd (64bit x2) n8 : 0.209 | + | FMA vfmaddsd (64bit x1) n12 : 0.908 |
- | FMA | + | FMA vfma+mlsd (64bit x1) n12 : 0.923 84260.4 |
- | SSE2 ml+ad+dpd (64bit x2) n6 : | + | FMA vfma+adsd (64bit x1) n12 : 1.072 72518.0 |
- | SSE2 mulsd (64bit x1) ns4 : | + | SSE2 mulpd (64bit x2) n8 : 0.593 |
- | SSE2 addsd (64bit x1) ns4 : | + | SSE2 addpd (64bit x2) n8 : 0.585 |
- | SSE2 mulpd (64bit x2) ns4 : | + | SSE2 mul+addpd (64bit x2) n8 : 0.368 |
- | SSE2 addpd (64bit x2) ns4 : | + | FMA |
- | AVX vmulpd (64bit x4) n8 : | + | FMA vfmaddpd (64bit x2) n12 : 0.921 |
- | AVX vaddpd (64bit x4) n8 : | + | FMA vfma+mlpd (64bit x2) n12 : 0.923 |
- | AVX vmul+addpd (64bit x4) n8 : 0.396 | + | FMA vfma+adpd (64bit x2) n12 : 1.073 |
- | FMA vfmaddpd (64bit x4) n8 : 0.579 | + | SSE2 ml+ad+dpd (64bit x2) n9 : |
- | AVX vml_ad_adpd (64bit x4) n6 : 0.420 | + | SSE2 mulsd (64bit x1) ns4 : |
- | Average | + | SSE2 addsd (64bit x1) ns4 : |
- | Highest | + | SSE2 mulpd (64bit x2) ns4 : |
+ | SSE2 addpd (64bit x2) ns4 : | ||
+ | AVX vmulpd (64bit x4) n8 : | ||
+ | AVX vaddpd (64bit x4) n8 : | ||
+ | AVX vmul+addpd (64bit x4) n8 : 0.697 | ||
+ | FMA vfmaddpd (64bit x4) n8 : | ||
+ | FMA vfmaddpd (64bit x4) n12 | ||
+ | FMA vfma+mlpd (64bit x4) n12 : 1.837 | ||
+ | FMA vfma+adpd (64bit x4) n12 : 1.534 | ||
+ | AVX vml_ad_adpd (64bit x4) n9 : 0.873 | ||
+ | Average | ||
+ | Highest | ||
+ | </ | ||
- | * Matrix 4x4 multi-thread | + | ++++ |
- | C++ code : 0.345 83031.4 | + | |
- | C++ Intrinsic | + | |
- | SSE mul/addps 128bit A : 0.201 | + | |
- | AVX vmul/ | + | |
- | Average | + | |
- | Highest | + | |
+ | |||
+ | ==== AMD Zen2 (AMD64 x86_64 x64) SSE4.2/ | ||
+ | |||
+ | |||
+ | ++++Ryzen 9 3950X 3.5GHz (4.7GHz) 16 core 32 thread Windows 10| | ||
+ | |||
+ | <code> | ||
+ | Date: 20200808 195918 | ||
+ | ARCH: x64 (x86_64) | ||
+ | FPU : SSE SSE2 SSSE3 SSE4.1 SSE4.2 AVX AVX2 FMA3 F16C | ||
+ | Name: AMD Ryzen 9 3950X 16-Core Processor | ||
+ | |||
+ | CPU Thread: 32 | ||
+ | CPU Core : 16 | ||
+ | CPU Group : 1 | ||
+ | Group 0: Thread=32 | ||
+ | SSE : yes | ||
+ | AVX : yes | ||
+ | FMA : yes | ||
+ | F16C : yes | ||
+ | AVX512: no | ||
+ | |||
+ | Total: | ||
+ | SingleThread HP max: - | ||
+ | SingleThread SP max: 128.305 GFLOPS | ||
+ | SingleThread DP max: | ||
+ | MultiThread | ||
+ | MultiThread | ||
+ | MultiThread | ||
+ | |||
+ | Group 0: Thread=32 | ||
+ | SingleThread HP max: - | ||
+ | SingleThread SP max: 128.305 GFLOPS | ||
+ | SingleThread DP max: | ||
+ | MultiThread | ||
+ | MultiThread | ||
+ | MultiThread | ||
+ | |||
+ | |||
+ | * Group 0: Thread=1 | ||
+ | * SSE/AVX (SP fp) | ||
+ | TIME(s) | ||
+ | SSE mulss (32bit x1) n8 : 0.235 | ||
+ | SSE addss (32bit x1) n8 : | ||
+ | FMA vfmaddss (32bit x1) n8 : 0.306 13713.8 | ||
+ | FMA vfmaddss (32bit x1) n12 : | ||
+ | FMA vfma+mlss (32bit x1) n12 : 0.352 13403.3 | ||
+ | FMA vfma+adss (32bit x1) n12 : 0.294 16051.8 | ||
+ | SSE mulps (32bit x4) n8 : | ||
+ | SSE addps (32bit x4) n8 : | ||
+ | SSE mul+addps (32bit x4) n8 : | ||
+ | FMA vfmaddps (32bit x4) n8 : 0.294 57079.2 | ||
+ | FMA vfmaddps (32bit x4) n12 : | ||
+ | FMA vfma+mlps (32bit x4) n12 : 0.354 53337.4 | ||
+ | FMA vfma+adps (32bit x4) n12 : 0.296 63794.3 | ||
+ | SSE ml+ad+adps (32bit x4) n9 : 0.211 44684.7 | ||
+ | SSE mulss (32bit x1) ns4 : | ||
+ | SSE addss (32bit x1) ns4 : 0.350 | ||
+ | SSE mulps (32bit x4) ns4 : 0.350 23943.3 | ||
+ | SSE addps (32bit x4) ns4 : 0.349 23994.9 | ||
+ | AVX vmulps (32bit x8) n8 : 0.248 67674.4 | ||
+ | AVX vaddps (32bit x8) n8 : 0.249 67317.9 | ||
+ | AVX vmul+addps (32bit x8) n8 : 0.152 | ||
+ | FMA vfmaddps (32bit x8) n8 : 0.306 | ||
+ | FMA vfmaddps (32bit x8) n12 : | ||
+ | FMA vfma+mlps (32bit x8) n12 : 0.391 96466.3 | ||
+ | FMA vfma+adps (32bit x8) n12 : 0.315 | ||
+ | AVX vml+ad+adps (32bit x8) n9 : 0.335 56261.1 | ||
+ | Average | ||
+ | Highest | ||
+ | |||
+ | |||
+ | * Group 0: Thread=1 | ||
+ | * SSE/AVX (DP fp) | ||
+ | TIME(s) | ||
+ | SSE2 mulsd (64bit x1) n8 : 0.237 | ||
+ | SSE2 addsd (64bit x1) n8 | ||
+ | FMA vfmaddsd (64bit x1) n8 | ||
+ | FMA vfmaddsd (64bit x1) n12 : 0.354 17776.6 | ||
+ | FMA vfma+mlsd (64bit x1) n12 : 0.357 13220.3 | ||
+ | FMA vfma+adsd (64bit x1) n12 : 0.295 15973.3 | ||
+ | SSE2 mulpd (64bit x2) n8 : 0.236 17749.4 | ||
+ | SSE2 addpd (64bit x2) n8 : 0.237 17707.4 | ||
+ | SSE2 mul+addpd (64bit x2) n8 : 0.177 23667.6 | ||
+ | FMA vfmaddpd (64bit x2) n8 | ||
+ | FMA vfmaddpd (64bit x2) n12 : 0.353 35638.1 | ||
+ | FMA vfma+mlpd (64bit x2) n12 : 0.356 26526.1 | ||
+ | FMA vfma+adpd (64bit x2) n12 : 0.296 31889.2 | ||
+ | SSE2 ml+ad+dpd (64bit x2) n9 : 0.213 22149.1 | ||
+ | SSE2 mulsd (64bit x1) ns4 : | ||
+ | SSE2 addsd (64bit x1) ns4 : | ||
+ | SSE2 mulpd (64bit x2) ns4 : | ||
+ | SSE2 addpd (64bit x2) ns4 : | ||
+ | AVX vmulpd (64bit x4) n8 : 0.250 33522.5 | ||
+ | AVX vaddpd (64bit x4) n8 : 0.250 33518.6 | ||
+ | AVX vmul+addpd (64bit x4) n8 : 0.160 52309.3 | ||
+ | FMA vfmaddpd (64bit x4) n8 : 0.307 54577.4 | ||
+ | FMA vfmaddpd (64bit x4) n12 : | ||
+ | FMA vfma+mlpd (64bit x4) n12 : 0.394 47859.6 | ||
+ | FMA vfma+adpd (64bit x4) n12 : 0.316 59672.9 | ||
+ | AVX vml_ad_adpd (64bit x4) n9 : 0.188 50150.2 | ||
+ | Average | ||
+ | Highest | ||
+ | |||
+ | |||
+ | * Group 0: Thread=32 | ||
+ | * SSE/AVX (SP fp) multi-thread | ||
+ | TIME(s) | ||
+ | SSE mulss (32bit x1) n8 : | ||
+ | SSE addss (32bit x1) n8 : | ||
+ | FMA vfmaddss (32bit x1) n8 : 0.504 | ||
+ | FMA vfmaddss (32bit x1) n12 : | ||
+ | FMA vfma+mlss (32bit x1) n12 : 0.766 | ||
+ | FMA vfma+adss (32bit x1) n12 : 0.859 | ||
+ | SSE mulps (32bit x4) n8 : | ||
+ | SSE addps (32bit x4) n8 : 0.500 | ||
+ | SSE mul+addps (32bit x4) n8 : | ||
+ | FMA vfmaddps (32bit x4) n8 : 0.506 1060612.9 | ||
+ | FMA vfmaddps (32bit x4) n12 : | ||
+ | FMA vfma+mlps (32bit x4) n12 : 0.770 | ||
+ | FMA vfma+adps (32bit x4) n12 : 0.831 | ||
+ | SSE ml+ad+adps (32bit x4) n9 : 0.386 | ||
+ | SSE mulss (32bit x1) ns4 : 0.499 | ||
+ | SSE addss (32bit x1) ns4 : 0.497 | ||
+ | SSE mulps (32bit x4) ns4 : 0.498 | ||
+ | SSE addps (32bit x4) ns4 : 0.500 | ||
+ | AVX vmulps (32bit x8) n8 : 0.514 1043773.8 | ||
+ | AVX vaddps (32bit x8) n8 : 0.518 1035798.2 | ||
+ | AVX vmul+addps (32bit x8) n8 : 0.354 1513704.0 | ||
+ | FMA vfmaddps (32bit x8) n8 : 0.568 1888789.4 | ||
+ | FMA vfmaddps (32bit x8) n12 : | ||
+ | FMA vfma+mlps (32bit x8) n12 : 0.834 1446926.4 | ||
+ | FMA vfma+adps (32bit x8) n12 : 0.689 1751695.3 | ||
+ | AVX vml+ad+adps (32bit x8) n9 : 0.456 1323814.1 | ||
+ | Average | ||
+ | Highest | ||
+ | |||
+ | |||
+ | * Group 0: Thread=32 | ||
+ | * SSE/AVX (DP fp) multi-thread | ||
+ | TIME(s) | ||
+ | SSE2 mulsd (64bit x1) n8 : 0.502 | ||
+ | SSE2 addsd (64bit x1) n8 : 0.504 | ||
+ | FMA vfmaddsd (64bit x1) n8 : | ||
+ | FMA vfmaddsd (64bit x1) n12 : 0.761 | ||
+ | FMA vfma+mlsd (64bit x1) n12 : 0.768 | ||
+ | FMA vfma+adsd (64bit x1) n12 : 0.838 | ||
+ | SSE2 mulpd (64bit x2) n8 : 0.497 | ||
+ | SSE2 addpd (64bit x2) n8 : 0.494 | ||
+ | SSE2 mul+addpd (64bit x2) n8 : 0.278 | ||
+ | FMA vfmaddpd (64bit x2) n8 : | ||
+ | FMA vfmaddpd (64bit x2) n12 : 0.757 | ||
+ | FMA vfma+mlpd (64bit x2) n12 : 0.768 | ||
+ | FMA vfma+adpd (64bit x2) n12 : 0.842 | ||
+ | SSE2 ml+ad+dpd (64bit x2) n9 : 0.386 | ||
+ | SSE2 mulsd (64bit x1) ns4 : | ||
+ | SSE2 addsd (64bit x1) ns4 : | ||
+ | SSE2 mulpd (64bit x2) ns4 : | ||
+ | SSE2 addpd (64bit x2) ns4 : | ||
+ | AVX vmulpd (64bit x4) n8 : 0.521 | ||
+ | AVX vaddpd (64bit x4) n8 : 0.527 | ||
+ | AVX vmul+addpd (64bit x4) n8 : 0.366 | ||
+ | FMA vfmaddpd (64bit x4) n8 : 0.571 | ||
+ | FMA vfmaddpd (64bit x4) n12 : | ||
+ | FMA vfma+mlpd (64bit x4) n12 : 0.839 | ||
+ | FMA vfma+adpd (64bit x4) n12 : 0.693 | ||
+ | AVX vml_ad_adpd (64bit x4) n9 : 0.370 | ||
+ | Average | ||
+ | Highest | ||
</ | </ | ||
++++ | ++++ | ||
+ | |||
+ | |||
+ | |||
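The result rows in the logs above share one layout: an instruction label, a colon, then a run of numbers ending in a parenthesized pair. The sketch below shows one way to read such a row, assuming the columns are TIME(s), MFLOPS, MOPS, and (FOP IPC); the column headers are truncated in this diff view, so that reading is an assumption inferred from rows such as `AVX vmul+addps (32bit x8) n8 : 0.297 67772.0 8471.5 ( 8.0 2.0)`, where 8471.5 × 8.0 = 67772.0. The names `parse_row` and `estimated_ipc` are hypothetical helpers and are not part of the benchmark tool.

```python
import re

# Assumed row layout (headers are truncated in the diff view):
#   <label> : TIME(s)  MFLOPS  MOPS  ( FOP  IPC )
ROW = re.compile(
    r"^(?P<name>.+?)\s*:\s*"
    r"(?P<time>\d+\.\d+)\s+(?P<mflops>\d+\.\d+)\s+(?P<mops>\d+\.\d+)"
    r"\s*\(\s*(?P<fop>\d+\.\d+)\s+(?P<ipc>\d+\.\d+)\s*\)\s*$"
)


def parse_row(line):
    """Return a dict of fields for a complete result row, or None.

    Rows whose numbers were cut off by the diff view do not match
    and are reported as None rather than guessed at.
    """
    m = ROW.match(line.strip())
    if m is None:
        return None
    row = {"name": m.group("name")}
    for key in ("time", "mflops", "mops", "fop", "ipc"):
        row[key] = float(m.group(key))
    return row


def estimated_ipc(row, clock_ghz):
    """Instructions per clock, assuming MOPS is per thread and the
    core runs at the clock printed in the log's 'Group' line."""
    return row["mops"] / (clock_ghz * 1000.0)


if __name__ == "__main__":
    sample = "AVX vmul+addps (32bit x8) n8 : 0.297 67772.0 8471.5 ( 8.0 2.0)"
    row = parse_row(sample)
    print(row)
    print(round(estimated_ipc(row, 4.2), 2))
```

Rows that the diff view cut off after the colon are skipped on purpose, so only fully recorded measurements feed any later comparison.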
opengl/vfpbenchlog.txt · Last modified: 2020/12/30 23:46 by oga