Differences

This shows the differences between two versions of the page.

--- opengl:movilecpu [2011/07/08 16:07] – oga
+++ opengl:movilecpu [2011/11/05 21:10] – [Reference pages] oga
@@ Line 1: / Line 1: @@
-====== ARM vfp ======
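+====== ARM vfp variants ======
-  * vfpv2 : FPU of the ARMv6 era. Vector mode was its distinguishing feature, but it is slated for removal in the following ARMv7.
+  * vfpv2 : ARMv6 generation. Vector mode was its distinguishing feature, but ARMv7 moves toward phasing it out. NEON was added in its place.
-  * vfpv3 : Can have up to 32 double registers. When paired with NEON, all 32 are required.
+  * vfpv3 : With no suffix, NEON is present. Can have up to 32 double registers.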
   * vfpv3-fp16 : Supports the extension instructions for converting the half type (16-bit float).
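-  * vfpv3-d16 : Looks like an extension at first glance, but it is the opposite. It means there are only 16 double registers, half of the 32 normally available. In other words, equivalent to VFPv2.
+  * vfpv3-d16 : The d16 suffix means no NEON. It means there are only 16 double registers, half the usual number. Equivalent to vfpv2.
-  * vfpv4 : half is now standard. IEEE754-compliant FMA has also been added.
+  * vfpv4 : IEEE754-compliant FMA has been added. half (fp16) is also included.
-  * vfpv4-d16 : The version with half the double registers. Probably the one without NEON. Found in embedded-oriented parts such as the Cortex-A5. half/double are supported.
+  * vfpv4-d16 : The vfpv4 version with half the double registers. The variant without NEON. Found in the Cortex-A5 and others. half/double are supported.
-  * fpv4-sp-d16 : A scalar FPU with vector mode removed(?). 16 double registers. Cortex-M4 only? The difference from vfpv4-d16 is unclear.
+  * fpv4-sp-d16 : 16 double registers. Cortex-M4 only?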
-  * RB[3:0] = D32  (1=D16, 2=D32)
-  * SP[7:4] = Single precision supported in VFP
-  * DP[11:8] = Double precision supported in VFP
-  * TE[15:12] = Trap
-  * D[19:16] = VFP hardware divide supported
-  * SR[23:20] = VFP hardware square root supported
-  * SV[27:24] = VFP short vector supported
-  * RM[31:28] = All VFP rounding modes supported
-  * FZ[3:0] = Full denormal arithmetic supported for VFP
+^           ^               ^  MVFR0                             ^^^^^^^^  MVFR1                                ^^^^^^^^
-  * DN[7:4] = Propagation of NaN values supported for VFP
+^           ^               ^     ^  VFP     ^^    ^     ^     ^    ^    ^    ^     ^  NEON             ^^^^ VFP ^  ^
-  * NLS[11:8] = Load/store instructions supported for NEON
+^           ^               ^ D32 ^ VSP ^ VDP ^ TE ^ DIV ^ SQR ^ SV ^ RM ^ FZ ^ NaN ^ NLS ^ NI ^ NSP ^ NHP ^ VHP ^ FMA ^
-  * NI[15:12] = Integer instructions supported for NEON
+| ARM1176JZF-S | vfpv2      | -   | ◎  | ◎  | ◎ | ◎  | ◎  | ◎ |    |    |     |     |    |     |     |     |     |
-  * NSP[19:16] = Single precision floating-point instructions supported for NEON
+| Cortex-A8   | vfpv3+NEON  | ◎  | ◎  | ◎  | -  | ◎  | ◎  | ◎ | ◎ | ◎ | ◎  | ◎  | ◎ | ◎  | -   | -   | -   |
-  * NHP[23:20] = NEON half-precision operations supported
+| Cortex-A9   | vfpv3-D16   | -   | ◎  | ◎  | -  | ◎  | ◎  | ◎ | ◎ | ◎ | ◎  | -   | -  | -   | -   | ◎  | -   |
-  * VHP[27:24] = VFP half-precision operations supported
+| Cortex-A9   | vfpv3+NEON  | ◎  | ◎  | ◎  | -  | ◎  | ◎  | -  | ◎ | ◎ | ◎  | ◎  | ◎ | ◎  | ◎  | ◎  | -   |
-  * FMA[31:28] = Fused Multiply Accumulate supported
+| Cortex-A5   | vfpv4-D16   | -   | ◎  | ◎  | -  | ◎  | ◎  | -  | ◎ | ◎ | ◎  | -   | -  | -   | -   | ◎  | ◎  |
+| Cortex-A5   | vfpv4+NEON  | ◎  | ◎  | ◎  | -  | ◎  | ◎  | -  | ◎ | ◎ | ◎  | ◎  | ◎ | ◎  | ◎  | ◎  | ◎  |
+| Cortex-A15  | vfpv4+NEON  | ◎  | ◎  | ◎  | -  | ◎  | ◎  | -  | ◎ | ◎ | ◎  | ◎  | ◎ | ◎  | ◎  | ◎  | ◎  |
-^           ^             ^  MVFR0                             ^^^^^^^^  MVFR1                                ^^^^^^^^
-^           ^             ^     ^  VFP     ^^    ^     ^     ^    ^    ^    ^     ^  NEON             ^^^^ VFP ^  ^
-^           ^             ^ D32 ^ VSP ^ VDP ^ TE ^ DIV ^ SQR ^ SV ^ RM ^ FZ ^ NaN ^ NLS ^ NI ^ NSP ^ NHP ^ VHP ^ FMA ^
-| ARM1176JZF-S | vfpv2    | -   | ◎  | ◎  | ◎ | ◎  | ◎  | ◎ |    |    |     |     |    |     |     |     |     |
-| Cortex-A8 | vfpv3+NEON  | ◎  | ◎  | ◎  | -  | ◎  | ◎  | ◎ | ◎ | ◎ | ◎  | ◎  | ◎ | ◎  | -   | -   | -   |
-| Cortex-A9 | vfpv3-D16   | -   | ◎  | ◎  | -  | ◎  | ◎  | ◎ | ◎ | ◎ | ◎  | -   | -  | -   | -   | ◎  | -   |
-| Cortex-A9 | vfpv3+NEON  | ◎  | ◎  | ◎  | -  | ◎  | ◎  | -  | ◎ | ◎ | ◎  | ◎  | ◎ | ◎  | ◎  | ◎  | -   |
-| Cortex-A5 | vfpv4-D16   | -   | ◎  | ◎  | -  | ◎  | ◎  | -  | ◎ | ◎ | ◎  | -   | -  | -   | -   | ◎  | ◎  |
-| Cortex-A5 | vfpv4+NEON  | ◎  | ◎  | ◎  | -  | ◎  | ◎  | -  | ◎ | ◎ | ◎  | ◎  | ◎ | ◎  | ◎  | ◎  | ◎  |
+  * D32  0- 3: D16 / D32
+  * VSP  4- 7: VFP Single precision
+  * VDP  8-11: VFP double precision
+  * TE  12-15: Trap
+  * DIV 16-19: VFP hw divide
+  * SQR 20-23: VFP hw square root
+  * SV  24-27: VFP short vector
+  * RM  28-31: VFP all rounding modes supported
+  * FZ   0- 3: VFP Full denormal arithmetic
+  * NaN  4- 7: VFP Propagation of NaN values
+  * NLS  8-11: NEON Load/store instructions
+  * NI  12-15: NEON Integer instructions
+  * NSP 16-19: NEON single precision operations
+  * NHP 20-23: NEON half-precision operations
+  * VHP 24-27: VFP half-precision operations
+  * FMA 28-31: Fused Multiply Add
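The field layout listed above can be decoded with a small helper. This is only a minimal sketch: it assumes the 32-bit MVFR0 or MVFR1 value has already been read by privileged code, and the names `mvfr_field`, `mvfr0_has_d32`, `mvfr0_has_short_vectors` and the enum constants are hypothetical, chosen to mirror the abbreviations in the list.

```c
#include <stdint.h>

/* Field indices within MVFR0; every field is 4 bits wide (index 0 = bits 0-3). */
enum mvfr0_field {
    MVFR0_D32, MVFR0_VSP, MVFR0_VDP, MVFR0_TE,
    MVFR0_DIV, MVFR0_SQR, MVFR0_SV,  MVFR0_RM
};

/* Field indices within MVFR1, in the same 4-bit layout. */
enum mvfr1_field {
    MVFR1_FZ,  MVFR1_NAN, MVFR1_NLS, MVFR1_NI,
    MVFR1_NSP, MVFR1_NHP, MVFR1_VHP, MVFR1_FMA
};

/* Extract the 4-bit field at the given index (0 = bits 0-3, 7 = bits 28-31). */
static inline unsigned mvfr_field(uint32_t reg, unsigned field)
{
    return (reg >> (field * 4u)) & 0xFu;
}

/* Nonzero when the D32 field holds 2, i.e. 32 double registers (1 = D16, 2 = D32). */
static inline int mvfr0_has_d32(uint32_t mvfr0)
{
    return mvfr_field(mvfr0, MVFR0_D32) == 2u;
}

/* Nonzero when the SV field is not zero, i.e. short vectors are reported as supported. */
static inline int mvfr0_has_short_vectors(uint32_t mvfr0)
{
    return mvfr_field(mvfr0, MVFR0_SV) != 0u;
}
```

For example, `mvfr_field(value, MVFR1_FMA)` returns the raw 4-bit FMA field of an MVFR1 value. Treating any nonzero field as "supported" is a simplification; the exact meaning of each encoding should be checked against the core's reference manual.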
-MVFR0, MVFR1
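+MVFR0/MVFR1 are accessible only from privileged code.
-If SV is not supported, setting FPSCR Len raises an exception.
+If SV is not supported, setting FPSCR Len to a nonzero value raises an exception. For compatibility, this is emulated in software.
 FPSCR can also be accessed in user mode.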
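The interaction between the SV field and FPSCR Len can be checked in software before the control register is written. The sketch below makes two assumptions: that the Len field occupies FPSCR bits 16-18, as the ARM architecture manuals describe it, and that both register values are already available as plain integers. The names `fpscr_len` and `fpscr_len_needs_emulation` are hypothetical.

```c
#include <stdint.h>

/* Assumed position of the Len field in FPSCR (bits 16-18). */
#define FPSCR_LEN_SHIFT 16u
#define FPSCR_LEN_MASK  (0x7u << FPSCR_LEN_SHIFT)

/* Return the raw Len field of an FPSCR value (0 means scalar operation). */
static inline unsigned fpscr_len(uint32_t fpscr)
{
    return (fpscr & FPSCR_LEN_MASK) >> FPSCR_LEN_SHIFT;
}

/* Nonzero when Len is nonzero but MVFR0.SV (bits 24-27) reports no
 * short-vector support, which is the case described above as raising an
 * exception and being handled by software emulation. */
static inline int fpscr_len_needs_emulation(uint32_t fpscr, uint32_t mvfr0)
{
    unsigned sv = (mvfr0 >> 24) & 0xFu;
    return fpscr_len(fpscr) != 0u && sv == 0u;
}
```

A caller that wants to avoid the exception path entirely could test `fpscr_len_needs_emulation` and fall back to scalar code when it returns nonzero.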
+===== Reference pages =====
+  * A8 http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0344k/Chdebegb.html
+  * A9 VFP-D16 http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0408e/Chdgjege.html
+  * A9 VFP NEON http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0409e/CHDEICCE.html
+  * A5 VFP NEON http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0450b/CHDEICCE.html
+  * A15 VFP NEON http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0438c/CDEFCBDC.html