Diff

Shows the differences between two versions of this page.

--- opengl:movilecpu [2011/07/08 15:47] – created oga
+++ opengl:movilecpu [2011/11/05 21:10] – [Reference pages] oga
@@ Line 1: / Line 1: @@
-==== ARM vfp ====
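+====== Types of ARM vfp ======
-  * VFPv3 is compatible with VFPv2, but a VFPv3-D32 variant also exists. Only when paired with NEON are there 32 double-precision (double) registers.
+  * vfpv2 : ARMv6 generation. Its distinguishing feature was vector mode, which is being phased out in ARMv7. NEON was added in its place.
-  * VFPv4 is upward compatible with VFPv3, but as announced, the legacy vector mode has finally been removed. Apart from that, the main differences are fp16 support and FMA support.
+  * vfpv3 : with no suffix, NEON is present. Can have up to 32 double registers.
+  * vfpv3-fp16 : supports the conversion extension instructions for the half type (16-bit float).
+  * vfpv3-d16 : the d16 suffix means no NEON. It indicates that there are only 16 double registers, half the usual number. Comparable to vfpv2.
+  * vfpv4 : adds IEEE754-compliant FMA. half (fp16) is also included.
+  * vfpv4-d16 : the vfpv4 version with half as many double registers. A variant without NEON. Found in Cortex-A5 and others. half/double are supported.
+  * fpv4-sp-d16 : 16 double registers. Cortex-M4 only?
-The name remains vfp even though vector mode has been removed.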
+^           ^               ^  MVFR0                             ^^^^^^^^  MVFR1                                ^^^^^^^^
+^           ^               ^     ^  VFP     ^^    ^     ^     ^    ^    ^    ^     ^  NEON             ^^^^ VFP ^  ^
+^           ^               ^ D32 ^ VSP ^ VDP ^ TE ^ DIV ^ SQR ^ SV ^ RM ^ FZ ^ NaN ^ NLS ^ NI ^ NSP ^ NHP ^ VHP ^ FMA ^
+| ARM1176JZF-S | vfpv2      | -   | ◎  | ◎  | ◎ | ◎  | ◎  | ◎ |    |    |     |     |    |     |     |     |     |
+| Cortex-A8   | vfpv3+NEON  | ◎  | ◎  | ◎  | -  | ◎  | ◎  | ◎ | ◎ | ◎ | ◎  | ◎  | ◎ | ◎  | -   | -   | -   |
+| Cortex-A9   | vfpv3-D16   | -   | ◎  | ◎  | -  | ◎  | ◎  | ◎ | ◎ | ◎ | ◎  | -   | -  | -   | -   | ◎  | -   |
+| Cortex-A9   | vfpv3+NEON  | ◎  | ◎  | ◎  | -  | ◎  | ◎  | -  | ◎ | ◎ | ◎  | ◎  | ◎ | ◎  | ◎  | ◎  | -   |
+| Cortex-A5   | vfpv4-D16   | -   | ◎  | ◎  | -  | ◎  | ◎  | -  | ◎ | ◎ | ◎  | -   | -  | -   | -   | ◎  | ◎  |
+| Cortex-A5   | vfpv4+NEON  | ◎  | ◎  | ◎  | -  | ◎  | ◎  | -  | ◎ | ◎ | ◎  | ◎  | ◎ | ◎  | ◎  | ◎  | ◎  |
+| Cortex-A15  | vfpv4+NEON  | ◎  | ◎  | ◎  | -  | ◎  | ◎  | -  | ◎ | ◎ | ◎  | ◎  | ◎ | ◎  | ◎  | ◎  | ◎  |
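-  * vfpv2 : the FPU of the ARMv6 era. Its distinguishing feature was vector mode, which is scheduled for removal in the following ARMv7. CTR has to use this.
-  * vfpv3 : VFPv3 support. Can have up to 32 double registers; 32 are mandatory when paired with NEON.
-  * vfpv3-fp16 : supports the conversion extension instructions for the half type (16-bit float).
-  * vfpv3-d16 : looks like an extension at first glance, but it is the exact opposite. It means there are only 16 double registers, half of the 32 normally available. In other words, comparable to VFPv2.
-  * vfpv4 : half is now included as standard. Also supports FMA, the IEEE-standardized fused multiply-add instruction.
-  * vfpv4-d16 : the version with half as many double registers. Probably the one without NEON. Found in embedded-oriented parts such as Cortex-A5. half/double are supported.
-  * fpv4-sp-d16 : a scalar FPU with vector mode removed(?). 16 double registers. Cortex-M4 only? The difference from vfpv4-d16 is unclear.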
-  * RB[3:0] = D32  (1=D16, 2=D32)
+  * D32  0- 3: D16 / D32
-  * SP[7:4] = Single precision supported in VFP
+  * VSP  4- 7: VFP Single precision
-  * DP[11:8] = Double precision supported in VFP
+  * VDP  8-11: VFP double precision
-  * TE[15:12] = Trap
+  * TE  12-15: Trap
-  * D[19:16] = VFP hardware divide supported
+  * DIV 16-19: VFP hw divide
-  * SR[23:20] = VFP hardware square root supported
+  * SQR 20-23: VFP hw square root
-  * SV[27:24] = VFP short vector supported
+  * SV  24-27: VFP short vector
-  * RM[31:28] = All VFP rounding modes supported
+  * RM  28-31: VFP all rounding modes supported
+  * FZ   0- 3: VFP Full denormal arithmetic
+  * NaN  4- 7: VFP Propagation of NaN values
+  * NLS  8-11: NEON Load/store instructions
+  * NI  12-15: NEON Integer instructions
+  * NSP 16-19: NEON single precision operations
+  * NHP 20-23: NEON half-precision operations
+  * VHP 24-27: VFP half-precision operations
+  * FMA 28-31: Fused Multiply Add
-  * FZ[3:0] = Full denormal arithmetic supported for VFP
+MVFR0/MVFR1 can be accessed only from privileged modes.
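The field layout listed above (eight 4-bit fields per register, lowest bits first) can be decoded from a raw 32-bit value as in the following minimal sketch. Because the registers themselves are readable only by privileged code, the sketch takes the value as an input. The helper names `decode_fields` and `register_count` are hypothetical, and the sample value is illustrative: it was chosen to match the Cortex-A8 row of the table and was not read from hardware.

```python
# Field names in bit order (bits 0-3 first), following the lists on this page.
MVFR0_FIELDS = ["D32", "VSP", "VDP", "TE", "DIV", "SQR", "SV", "RM"]
MVFR1_FIELDS = ["FZ", "NaN", "NLS", "NI", "NSP", "NHP", "VHP", "FMA"]


def decode_fields(value, names):
    """Split a 32-bit register value into eight 4-bit fields, lowest bits first."""
    return {name: (value >> (4 * i)) & 0xF for i, name in enumerate(names)}


def register_count(mvfr0):
    """Map the D32 field (1=D16, 2=D32) to a double register count, else None."""
    return {1: 16, 2: 32}.get(decode_fields(mvfr0, MVFR0_FIELDS)["D32"])


# Illustrative value only (hypothetical, consistent with the Cortex-A8 table row).
sample_mvfr0 = 0x11110222
print(decode_fields(sample_mvfr0, MVFR0_FIELDS))
print(register_count(sample_mvfr0))
```

A nonzero field means the feature is present, which corresponds to a ◎ mark in the table; a zero field corresponds to "-".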
-  * DN[7:4] = Propagation of NaN values supported for VFP
-  * NLS[11:8] = Load/store instructions supported for NEON
-  * NI[15:12] = Integer instructions supported for NEON
-  * NSP[19:16] = Single precision floating-point instructions supported for NEON
-  * NHP[23:20] = NEON half-precision operations supported
-  * VHP[27:24] = VFP half-precision operations supported
-  * FMA[31:28] = Fused Multiply Accumulate supported
-^           ^             ^  MVFR0                             ^^^^^^^^  MVFR1                                ^^^^^^^^
+If SV is not supported, setting FPSCR Len to a nonzero value raises an exception. To preserve compatibility, the operation is emulated in software.
-^           ^             ^     ^  VFP     ^^    ^     ^     ^    ^    ^    ^     ^  NEON             ^^^^ VFP ^  ^
+FPSCR can be accessed even in user mode.
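The rule about SV and FPSCR Len can be expressed as a small check. This is a sketch under stated assumptions: the position of the Len field (bits 16-18 of FPSCR) is taken from the ARM architecture manuals rather than from this page, the function names are hypothetical, and the register values are illustrative, not read from hardware.

```python
# Assumption: FPSCR Len occupies bits 16-18 (per the ARM architecture manuals).
FPSCR_LEN_SHIFT = 16
FPSCR_LEN_MASK = 0x7
# SV field of MVFR0, bits 24-27, as listed on this page.
MVFR0_SV_SHIFT = 24


def fpscr_len(fpscr):
    """Extract the Len field from an FPSCR value."""
    return (fpscr >> FPSCR_LEN_SHIFT) & FPSCR_LEN_MASK


def short_vectors_supported(mvfr0):
    """True when the SV field of MVFR0 is nonzero."""
    return ((mvfr0 >> MVFR0_SV_SHIFT) & 0xF) != 0


def needs_software_emulation(fpscr, mvfr0):
    """True when Len is nonzero on a core that does not support short vectors."""
    return fpscr_len(fpscr) != 0 and not short_vectors_supported(mvfr0)


# Illustrative values: Len = 3, and an MVFR0 whose SV field is 0.
print(needs_software_emulation(0x00030000, 0x10110222))
```

With Len left at 0, which is the scalar setting, the check never reports a need for emulation regardless of the SV field.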
-^           ^             ^ D32 ^ VSP ^ VDP ^ TE ^ DIV ^ SQR ^ SV ^ RM ^ FZ ^ NaN ^ NLS ^ NI ^ NSP ^ NHP ^ VHP ^ FMA ^
-| ARM1176JZF-S | vfpv2    | -   | ◎  | ◎  | ◎ | ◎  | ◎  | ◎ |    |    |     |     |    |     |     |     |     |
-| Cortex-A8 | vfpv3+NEON  | ◎  | ◎  | ◎  | -  | ◎  | ◎  | ◎ | ◎ | ◎ | ◎  | ◎  | ◎ | ◎  | -   | -   | -   |
-| Cortex-A9 | vfpv3-D16   | -   | ◎  | ◎  | -  | ◎  | ◎  | ◎ | ◎ | ◎ | ◎  | -   | -  | -   | -   | ◎  | -   |
-| Cortex-A9 | vfpv3+NEON  | ◎  | ◎  | ◎  | -  | ◎  | ◎  | -  | ◎ | ◎ | ◎  | ◎  | ◎ | ◎  | ◎  | ◎  | -   |
-| Cortex-A5 | vfpv4-D16   | -   | ◎  | ◎  | -  | ◎  | ◎  | -  | ◎ | ◎ | ◎  | -   | -  | -   | -   | ◎  | ◎  |
-| Cortex-A5 | vfpv4+NEON  | ◎  | ◎  | ◎  | -  | ◎  | ◎  | -  | ◎ | ◎ | ◎  | ◎  | ◎ | ◎  | ◎  | ◎  | ◎  |
-MVFR0, MVFR1
+===== Reference pages =====
-If SV is not supported, setting FPSCR Len raises an exception.
+  * A8 http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0344k/Chdebegb.html
+  * A9 VFP-D16 http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0408e/Chdgjege.html
-FPSCR can be accessed even in user mode.
+  * A9 VFP NEON http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0409e/CHDEICCE.html
+  * A5 VFP NEON http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0450b/CHDEICCE.html
+  * A15 VFP NEON http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0438c/CDEFCBDC.html