Multi-GPU inference with Ollama

These are results from running local LLM inference with Ollama on ordinary consumer PCs.
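The page does not say how the token/s figures were collected. One way to obtain a comparable number, assuming a local Ollama server on its default port and the documented `/api/generate` endpoint, is to divide `eval_count` by `eval_duration` (reported in nanoseconds) from a non-streaming response. The helper names below are hypothetical, and this is a sketch rather than the author's procedure.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # assumed default local endpoint


def tokens_per_second(response: dict) -> float:
    """Generation speed from an Ollama /api/generate response.

    eval_count is the number of generated tokens and eval_duration is the
    time spent generating them, in nanoseconds.
    """
    return response["eval_count"] / response["eval_duration"] * 1e9


def measure(model: str, prompt: str) -> float:
    """Send one non-streaming request and return its token/s (needs a running server)."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    request = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as reply:
        return tokens_per_second(json.load(reply))
```

With a server running, `measure("gemma2:2b", "Why is the sky blue?")` would return a single token/s sample. Averaging several runs is probably advisable, since individual runs vary.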

Multi GPU : comparison by model size

  • All models use Q4 quantization

● OS installed directly on the hardware

  • A higher token/s value means faster inference
  • Note that two different host PCs were used. When the GPU share is low, differences in host PC performance may affect the results
    • host = 3950X : Ryzen 9 3950X (Zen2), 16C 32T, DDR4-3200 51.2GB/s
    • host = 9700X : Ryzen 7 9700X (Zen5), 8C 16T, DDR5-5600 89.6GB/s
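The `mpr` column is not defined on this page. Its values are consistent with a memory-bandwidth-bound estimate, in which generating one token requires reading the whole model once: the CPU share at host memory bandwidth and the GPU share at VRAM bandwidth. The sketch below rests on that assumption. The function name is hypothetical, and the 288 GB/s figure for the RX 7600 is the card's published memory bandwidth, not a number from this page.

```python
def bandwidth_bound_tps(model_gb, cpu_share, host_gbps, gpu_share=0.0, gpu_gbps=None):
    """Upper-bound token/s if each token reads the whole model from memory once.

    model_gb  : model size in memory (the MEM column), in GB
    cpu_share : fraction of the model held in host RAM (the CPU column)
    host_gbps : host memory bandwidth in GB/s
    gpu_share : fraction of the model held in VRAM (the GPU column)
    gpu_gbps  : VRAM bandwidth in GB/s (assumed value, not from this page)
    """
    seconds_per_token = model_gb * cpu_share / host_gbps
    if gpu_share > 0:
        seconds_per_token += model_gb * gpu_share / gpu_gbps
    return 1.0 / seconds_per_token


# CPU-only llama3.3:70b (46GB) on the two hosts listed above
print(round(bandwidth_bound_tps(46, 1.0, 51.2), 2))  # 3950X, DDR4-3200
print(round(bandwidth_bound_tps(46, 1.0, 89.6), 2))  # 9700X, DDR5-5600

# RX 7600 on the 3950X: 47GB model, 83% CPU / 17% GPU, 288 GB/s assumed for the GPU
print(round(bandwidth_bound_tps(47, 0.83, 51.2, 0.17, 288.0), 2))
```

For these three cases the estimate gives 1.11, 1.95 and 1.27, matching the `mpr` values of the corresponding rows in the 70b table. If this reading is right, it would also explain why a low GPU share barely improves token/s: the host-memory term dominates the time per token.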

70b (llama3.3:70b)

VRAM(total) Processor VRAM(per-GPU) MEM CPU GPU token/s host os mpr
0GB CPU Ryzen 7 5700X (Zen3) 0GB 46GB 100% 0% 1.00 tps 5700X Win11 1.11
0GB CPU Ryzen 9 3950X (Zen2) 0GB 46GB 100% 0% 1.01 tps 3950X Linux 1.11
0GB CPU Core i7-13700 (RaptorLake) 0GB 46GB 100% 0% 1.02 tps 13700 Linux 1.11
4GB GPUx1 RADEON RX 6400 4GB 47GB 92% 8% 1.03 tps 3950X Linux 1.14
8GB GPUx1 GeForce RTX 2070 Super 8GB 47GB 85% 15% 1.04 tps 13700 Win11 WSL2 1.26
8GB GPUx2 RADEON RX 6400 + RX 6400 4+4=8GB 49GB 84% 16% 1.05 tps 3950X Linux 1.16
8GB GPUx1 RADEON RX 7600 8GB 47GB 88% 12% 1.06 tps 5700X Win11 1.21
0GB CPU Ryzen 7 7840HS (Zen4) 0GB 46GB 100% 0% 1.09 tps 7840HS Win11 WSL2 1.95
8GB GPUx1 RADEON RX 7600 8GB 47GB 83% 17% 1.12 tps 3950X Linux 1.27
0GB CPU Ryzen 7 9700X (Zen5) 0GB 46GB 100% 0% 1.20 tps 9700X Win11 1.95
0GB CPU Ryzen 7 9700X (Zen5) 0GB 46GB 100% 0% 1.22 tps 9700X Linux 1.95
8GB GPUx1 RADEON 780M 8GB 47GB 83% 17% 1.25 tps 7840HS Linux 1.91
0GB CPU Ryzen 7 7840HS (Zen4) 0GB 46GB 100% 0% 1.30 tps 7840HS Linux 1.95
16GB GPUx2 RADEON RX 7600 + RX Vega 56 8+8=16GB 48GB 65% 35% 1.30 tps 3950X Linux 1.52
8GB GPUx1 GeForce GTX 1070 8GB 47GB 83% 17% 1.31 tps 9700X Linux 2.14
0GB CPU Ryzen 7 9700X (Zen5) 0GB 46GB 100% 0% 1.32 tps 9700X Win11 WSL2 1.95
8GB GPUx1 GeForce GTX 1080 8GB 47GB 83% 17% 1.32 tps 9700X Linux 2.17
8GB GPUx1 RADEON RX Vega 64 8GB 47GB 83% 17% 1.34 tps 9700X Linux 2.21
8GB GPUx1 GeForce RTX 2070 Super 8GB 47GB 84% 16% 1.37 tps 9700X Linux 2.19
16GB GPUx2 GeForce GTX 1080 + GTX 1070 8+8=16GB 48GB 67% 33% 1.45 tps 9700X Linux 2.41
11GB GPUx1 GeForce RTX 2080Ti 11GB 47GB 77% 23% 1.48 tps 9700X Linux 2.37
24GB GPUx3 RADEON RX 7600 + RX Vega 64 + RX Vega 56 8+8+8=24GB 50GB 52% 48% 1.51 tps 3950X Linux 1.75
16GB GPUx1 GeForce RTX 4060Ti 16GB 46GB 66% 34% 1.54 tps 9700X Win11 2.54
16GB GPUx2 RADEON RX Vega 64 + RX Vega 56 8+8=16GB 48GB 65% 35% 1.55 tps 9700X Linux 2.59
16GB GPUx1 GeForce RTX 4060Ti 16GB 46GB 65% 35% 1.65 tps 9700X Linux 2.57
16GB GPUx1 GeForce RTX 4060Ti 16GB 46GB 66% 34% 1.70 tps 9700X Win11 WSL2 2.54
19GB GPUx2 GeForce RTX 2080Ti + RTX 2070 Super 11+8=19GB 48GB 62% 38% 1.71 tps 9700X Linux 2.73
27GB GPUx2 GeForce RTX 4060Ti + RTX 2080Ti 16+11=27GB 48GB 43% 57% 2.22 tps 9700X Linux 3.28
32GB GPUx2 GeForce RTX 4060Ti + RTX 4060Ti 16+16=32GB 47GB 32% 68% 2.22 tps 3950X Linux 2.47
40GB GPUx3 GeForce RTX 4060Ti + RTX 4060Ti + RTX 2070 Super 16+16+8=40GB 49GB 17% 83% 3.08 tps 3950X Linux 3.40
0GB CPU Apple M4 Pro CPU 0GB 46GB 100% 0% 4.18 tps M4 Pro CPU macOS 5.93
48GB GPUx4 GeForce RTX 4060Ti + RTX 4060Ti + RTX 2070 Super + GTX 1080 16+16+8+8=48GB 51GB 6% 94% 4.25 tps 3950X Linux 4.68
64GB GPUx1 Apple M4 Pro GPU 64GB 46GB 0% 100% 4.64 tps M4 Pro CPU macOS 5.93
56GB GPUx5 GeForce RTX 4060Ti + RTX 4060Ti + RTX 2070 Super + GTX 1080 + GTX 1070 16+16+8+8+8=56GB 55GB 0% 100% 5.10 tps 3950X Linux 5.50

32b (qwen2.5:32b)

VRAM(total) Processor VRAM(per-GPU) MEM CPU GPU token/s host os mpr
0GB CPU Ryzen 5 5560U (Zen3) 0GB 22GB 100% 0% 1.92 tps 5560U Win11 WSL2 2.33
0GB CPU Ryzen 7 5700X (Zen3) 0GB 22GB 100% 0% 2.14 tps 5700X Win11 2.33
0GB CPU Ryzen 9 3950X (Zen2) 0GB 22GB 100% 0% 2.16 tps 3950X Linux 2.33
0GB CPU Core i7-13700 (RaptorLake) 0GB 22GB 100% 0% 2.18 tps 13700 Linux 2.33
4GB GPUx1 RADEON RX 6400 4GB 22GB 83% 17% 2.29 tps 3950X Linux 2.59
8GB GPUx2 RADEON RX 6400 + RX 6400 4+4=8GB 24GB 66% 34% 2.42 tps 3950X Linux 2.68
8GB GPUx1 GeForce RTX 2070 Super 8GB 22GB 68% 32% 2.45 tps 13700 Win11 WSL2 3.25
0GB CPU Ryzen 7 9700X (Zen5) 0GB 22GB 100% 0% 2.55 tps 9700X Win11 4.07
8GB GPUx1 RADEON RX 7600 8GB 22GB 72% 28% 2.59 tps 5700X Win11 3.02
0GB CPU Ryzen 7 9700X (Zen5) 0GB 22GB 100% 0% 2.62 tps 9700X Linux 4.07
8GB GPUx1 RADEON 780M 8GB 22GB 64% 36% 2.62 tps 7840HS Linux 4.07
0GB CPU Ryzen 7 7840HS (Zen4) 0GB 22GB 100% 0% 2.77 tps 7840HS Win11 WSL2 4.07
0GB CPU Ryzen 7 9700X (Zen5) 0GB 22GB 100% 0% 2.84 tps 9700X Win11 WSL2 4.07
0GB CPU Ryzen 7 7840HS (Zen4) 0GB 22GB 100% 0% 2.86 tps 7840HS Linux 4.07
8GB GPUx1 RADEON RX 7600 8GB 22GB 63% 37% 2.86 tps 3950X Linux 3.34
8GB GPUx1 GeForce GTX 1070 8GB 22GB 64% 36% 3.18 tps 9700X Linux 5.32
8GB GPUx1 GeForce GTX 1080 8GB 22GB 64% 36% 3.27 tps 9700X Linux 5.50
8GB GPUx1 RADEON RX Vega 64 8GB 22GB 63% 37% 3.44 tps 9700X Linux 5.83
8GB GPUx1 GeForce RTX 2070 Super 8GB 22GB 65% 35% 3.51 tps 9700X Linux 5.66
16GB GPUx2 RADEON RX 7600 + RX Vega 56 8+8=16GB 23GB 31% 69% 4.34 tps 3950X Linux 5.37
11GB GPUx1 GeForce RTX 2080Ti 11GB 22GB 52% 48% 4.35 tps 9700X Linux 6.91
16GB GPUx2 GeForce GTX 1080 + GTX 1070 8+8=16GB 23GB 31% 69% 4.41 tps 9700X Linux 7.39
16GB GPUx2 RADEON RX Vega 64 + RX Vega 56 8+8=16GB 23GB 30% 70% 5.10 tps 9700X Linux 8.83
16GB GPUx1 GeForce RTX 4060Ti 16GB 21GB 28% 72% 5.81 tps 9700X Win11 8.47
16GB GPUx1 GeForce RTX 4060Ti 16GB 21GB 25% 75% 6.35 tps 9700X Linux 8.83
16GB GPUx1 GeForce RTX 4060Ti 16GB 21GB 28% 72% 6.41 tps 9700X Win11 WSL2 8.47
19GB GPUx2 GeForce RTX 2080Ti + RTX 2070 Super 11+8=19GB 23GB 20% 80% 7.68 tps 9700X Linux 11.64
0GB CPU Apple M4 Pro CPU 0GB 22GB 100% 0% 7.93 tps M4 Pro CPU macOS 12.41
24GB GPUx3 RADEON RX 7600 + RX Vega 64 + RX Vega 56 8+8+8=24GB 25GB 3% 97% 9.12 tps 3950X Linux 12.63
64GB GPUx1 Apple M4 Pro GPU 64GB 23GB 0% 100% 9.63 tps M4 Pro CPU macOS 11.87
32GB GPUx2 GeForce RTX 4060Ti + RTX 4060Ti 16+16=32GB 25GB 0% 100% 12.74 tps 3950X Linux 11.52
40GB GPUx3 GeForce RTX 4060Ti + RTX 4060Ti + RTX 2070 Super 16+16+8=40GB 26GB 0% 100% 13.41 tps 3950X Linux 11.93
27GB GPUx2 GeForce RTX 4060Ti + RTX 2080Ti 16+11=27GB 25GB 0% 100% 16.21 tps 9700X Linux 14.71

27b (gemma2:27b)

VRAM(total) Processor VRAM(per-GPU) MEM CPU GPU token/s host os mpr
0GB CPU Ryzen 5 5560U (Zen3) 0GB 19GB 100% 0% 2.19 tps 5560U Win11 WSL2 2.69
0GB CPU Ryzen 9 3950X (Zen2) 0GB 18GB 100% 0% 2.54 tps 3950X Linux 2.84
0GB CPU Ryzen 7 5700X (Zen3) 0GB 18GB 100% 0% 2.56 tps 5700X Win11 2.84
0GB CPU Core i7-13700 (RaptorLake) 0GB 18GB 100% 0% 2.65 tps 13700 Linux 2.84
4GB GPUx1 RADEON RX 6400 4GB 18GB 79% 21% 2.70 tps 3950X Linux 3.25
8GB GPUx2 RADEON RX 6400 + RX 6400 4+4=8GB 21GB 62% 38% 2.87 tps 3950X Linux 3.16
8GB GPUx1 GeForce RTX 2070 Super 8GB 18GB 61% 39% 2.92 tps 13700 Win11 WSL2 4.35
8GB GPUx1 RADEON RX 7600 8GB 18GB 68% 32% 3.05 tps 5700X Win11 3.86
0GB CPU Ryzen 7 9700X (Zen5) 0GB 18GB 100% 0% 3.12 tps 9700X Win11 4.98
0GB CPU Ryzen 7 9700X (Zen5) 0GB 18GB 100% 0% 3.14 tps 9700X Linux 4.98
0GB CPU Ryzen 7 7840HS (Zen4) 0GB 19GB 100% 0% 3.25 tps 7840HS Win11 WSL2 4.72
0GB CPU Ryzen 7 7840HS (Zen4) 0GB 19GB 100% 0% 3.40 tps 7840HS Linux 4.72
0GB CPU Ryzen 7 9700X (Zen5) 0GB 18GB 100% 0% 3.42 tps 9700X Win11 WSL2 4.98
8GB GPUx1 RADEON RX 7600 8GB 18GB 55% 45% 3.52 tps 3950X Linux 4.51
8GB GPUx1 RADEON 780M 8GB 18GB 57% 43% 3.62 tps 7840HS Linux 4.98
8GB GPUx1 GeForce GTX 1070 8GB 18GB 57% 43% 4.05 tps 9700X Linux 6.91
8GB GPUx1 GeForce GTX 1080 8GB 18GB 57% 43% 4.12 tps 9700X Linux 7.21
8GB GPUx1 GeForce RTX 2070 Super 8GB 18GB 59% 41% 4.35 tps 9700X Linux 7.41
8GB GPUx1 RADEON RX Vega 64 8GB 18GB 55% 45% 4.36 tps 9700X Linux 7.86
11GB GPUx1 GeForce RTX 2080Ti 11GB 18GB 43% 57% 5.74 tps 9700X Linux 9.70
16GB GPUx2 RADEON RX 7600 + RX Vega 56 8+8=16GB 21GB 22% 78% 6.00 tps 3950X Linux 7.21
16GB GPUx2 GeForce GTX 1080 + GTX 1070 8+8=16GB 21GB 22% 78% 6.24 tps 9700X Linux 9.16
16GB GPUx2 RADEON RX Vega 64 + RX Vega 56 8+8=16GB 21GB 21% 79% 7.45 tps 9700X Linux 11.55
16GB GPUx1 GeForce RTX 4060Ti 16GB 18GB 16% 84% 8.87 tps 9700X Win11 11.81
0GB CPU Apple M4 Pro CPU 0GB 18GB 100% 0% 9.26 tps M4 Pro CPU macOS 15.17
16GB GPUx1 GeForce RTX 4060Ti 16GB 18GB 16% 84% 9.71 tps 9700X Win11 WSL2 11.81
16GB GPUx1 GeForce RTX 4060Ti 16GB 18GB 12% 88% 10.14 tps 9700X Linux 12.64
19GB GPUx2 GeForce RTX 2080Ti + RTX 2070 Super 11+8=19GB 21GB 11% 89% 11.83 tps 9700X Linux 16.42
64GB GPUx1 Apple M4 Pro GPU 64GB 20GB 0% 100% 13.91 tps M4 Pro CPU macOS 13.65
24GB GPUx3 RADEON RX 7600 + RX Vega 64 + RX Vega 56 8+8+8=24GB 23GB 0% 100% 14.55 tps 3950X Linux 16.34
32GB GPUx2 GeForce RTX 4060Ti + RTX 4060Ti 16+16=32GB 23GB 0% 100% 15.51 tps 3950X Linux 12.52
40GB GPUx3 GeForce RTX 4060Ti + RTX 4060Ti + RTX 2070 Super 16+16+8=40GB 25GB 0% 100% 15.89 tps 3950X Linux 12.41
27GB GPUx2 GeForce RTX 4060Ti + RTX 2080Ti 16+11=27GB 23GB 0% 100% 20.15 tps 9700X Linux 15.99

27b (gemma3:27b)

VRAM(total) Processor VRAM(per-GPU) MEM CPU GPU token/s host os mpr
0GB CPU Ryzen 5 5560U (Zen3) 0GB 24GB 100% 0% 2.18 tps 5560U Win11 WSL2 2.13
0GB CPU Apple M4 Pro CPU 0GB 22GB 100% 0% 9.63 tps M4 Pro CPU macOS 12.41
64GB GPUx1 Apple M4 Pro GPU 64GB 24GB 0% 100% 10.88 tps M4 Pro CPU macOS 11.38

24b (mistral-small:24b)

VRAM(total) Processor VRAM(per-GPU) MEM CPU GPU token/s host os mpr
0GB CPU Ryzen 7 5700X (Zen3) 0GB 16GB 100% 0% 2.96 tps 5700X Win11 3.20
8GB GPUx1 GeForce RTX 2070 Super 8GB 16GB 57% 43% 3.62 tps 13700 Win11 WSL2 5.17
8GB GPUx1 RADEON RX 7600 8GB 16GB 61% 39% 3.99 tps 5700X Win11 4.71
0GB CPU Apple M4 Pro CPU 0GB 16GB 100% 0% 10.94 tps M4 Pro CPU macOS 17.06
64GB GPUx1 Apple M4 Pro GPU 64GB 16GB 0% 100% 13.55 tps M4 Pro CPU macOS 17.06
16GB GPUx1 GeForce RTX 4060Ti 16GB 15GB 3% 97% 15.38 tps 9700X Win11 18.00
16GB GPUx1 GeForce RTX 4060Ti 16GB 15GB 3% 97% 15.45 tps 9700X Win11 WSL2 18.00
16GB GPUx1 GeForce RTX 4060Ti 16GB 15GB 0% 100% 18.05 tps 3950X Linux VM 19.20

14b (phi4:14b)

VRAM(total) Processor VRAM(per-GPU) MEM CPU GPU token/s host os mpr
0GB CPU Core i5-1030NG7 0GB 11GB 100% 0% 1.72 tps 1030NG7 macOS 5.30
0GB CPU Intel N100 DDR4 0GB 11GB 100% 0% 1.89 tps N100 Win11 WSL2 2.33
16GB GPUx1 RADEON 610M (iGPU) 16GB 12GB 0% 100% 1.96 tps 9700X Linux 7.47
0GB CPU Ryzen 5 5560U (Zen3) 0GB 11GB 100% 0% 4.24 tps 5560U Win11 WSL2 4.65
0GB CPU Ryzen 9 3950X (Zen2) 0GB 10GB 100% 0% 4.63 tps 3950X Linux 5.12
0GB CPU Ryzen 7 5700X (Zen3) 0GB 10GB 100% 0% 4.63 tps 5700X Win11 5.12
0GB CPU Core i7-13700 (RaptorLake) 0GB 10GB 100% 0% 4.73 tps 13700 Linux 5.12
0GB CPU Apple M1 CPU 0GB 10GB 100% 0% 5.30 tps M1 CPU macOS 6.83
0GB CPU Ryzen 7 9700X (Zen5) 0GB 10GB 100% 0% 5.55 tps 9700X Win11 8.96
4GB GPUx1 RADEON RX 6400 4GB 10GB 61% 39% 5.60 tps 3950X Linux 6.68
0GB CPU Ryzen 7 9700X (Zen5) 0GB 10GB 100% 0% 5.73 tps 9700X Linux 8.96
0GB CPU Ryzen 7 7840HS (Zen4) 0GB 11GB 100% 0% 5.97 tps 7840HS Win11 WSL2 8.15
16GB GPUx1 Apple M1 GPU 16GB 10GB 0% 100% 6.16 tps M1 CPU macOS 6.83
0GB CPU Ryzen 7 9700X (Zen5) 0GB 10GB 100% 0% 6.21 tps 9700X Win11 WSL2 8.96
0GB CPU Ryzen 7 7840HS (Zen4) 0GB 11GB 100% 0% 6.22 tps 7840HS Linux 8.15
8GB GPUx1 RADEON 780M 8GB 10GB 20% 80% 6.59 tps 7840HS Linux 8.96
8GB GPUx2 RADEON RX 6400 + RX 6400 4+4=8GB 11GB 28% 72% 6.94 tps 3950X Linux 8.19
8GB GPUx1 GeForce RTX 2070 Super 8GB 10GB 29% 71% 8.94 tps 13700 Win11 WSL2 13.80
8GB GPUx1 RADEON RX 7600 8GB 10GB 29% 71% 9.12 tps 5700X Win11 12.30
8GB GPUx1 GeForce GTX 1070 8GB 10GB 20% 80% 10.76 tps 9700X Linux 18.67
8GB GPUx1 RADEON RX 7600 8GB 10GB 18% 82% 11.43 tps 3950X Linux 15.72
8GB GPUx1 GeForce GTX 1080 8GB 10GB 20% 80% 11.49 tps 9700X Linux 21.13
8GB GPUx1 RADEON RX Vega 64 8GB 10GB 18% 82% 13.97 tps 9700X Linux 27.00
8GB GPUx1 GeForce RTX 2070 Super 8GB 10GB 24% 76% 14.67 tps 9700X Linux 22.86
16GB GPUx2 GeForce GTX 1080 + GTX 1070 8+8=16GB 11GB 0% 100% 16.19 tps 9700X Linux 25.86
16GB GPUx2 RADEON RX 7600 + RX Vega 56 8+8=16GB 14GB 0% 100% 18.00 tps 3950X Linux 24.16
0GB CPU Apple M4 Pro CPU 0GB 10GB 100% 0% 18.43 tps M4 Pro CPU macOS 27.30
64GB GPUx1 Apple M4 Pro GPU 64GB 12GB 0% 100% 19.49 tps M4 Pro CPU macOS 22.75
16GB GPUx2 RADEON RX Vega 64 + RX Vega 56 8+8=16GB 14GB 0% 100% 22.71 tps 9700X Linux 31.69
32GB GPUx2 GeForce RTX 4060Ti + RTX 4060Ti 16+16=32GB 12GB 0% 100% 27.04 tps 3950X Linux 24.00
40GB GPUx3 GeForce RTX 4060Ti + RTX 4060Ti + RTX 2070 Super 16+16+8=40GB 12GB 0% 100% 27.14 tps 3950X Linux 25.85
16GB GPUx1 GeForce RTX 4060Ti 16GB 12GB 0% 100% 27.92 tps 9700X Win11 24.00
16GB GPUx1 GeForce RTX 4060Ti 16GB 12GB 0% 100% 29.15 tps 9700X Linux 24.00
27GB GPUx2 GeForce RTX 4060Ti + RTX 2080Ti 16+11=27GB 12GB 0% 100% 29.15 tps 9700X Linux 30.65
16GB GPUx1 GeForce RTX 4060Ti 16GB 12GB 0% 100% 31.00 tps 9700X Win11 WSL2 24.00
11GB GPUx1 GeForce RTX 2080Ti 11GB 10GB 0% 100% 51.33 tps 9700X Linux 61.60
19GB GPUx2 GeForce RTX 2080Ti + RTX 2070 Super 11+8=19GB 10GB 0% 100% 51.42 tps 9700X Linux 53.20

12b (gemma3:12b)

VRAM(total) Processor VRAM(per-GPU) MEM CPU GPU token/s host os mpr
0GB CPU Ryzen 5 5560U (Zen3) 0GB 13GB 100% 0% 4.70 tps 5560U Win11 WSL2 3.94
0GB CPU Ryzen 7 7840HS (Zen4) 0GB 13GB 100% 0% 6.23 tps 7840HS Win11 WSL2 6.89
0GB CPU Apple M1 CPU 0GB 12GB 100% 0% 6.56 tps M1 CPU macOS 5.69
16GB GPUx1 Apple M1 GPU 16GB 11GB 0% 100% 7.10 tps M1 CPU macOS 6.21
0GB CPU Apple M4 Pro CPU 0GB 12GB 100% 0% 20.51 tps M4 Pro CPU macOS 22.75
64GB GPUx1 Apple M4 Pro GPU 64GB 13GB 0% 100% 22.62 tps M4 Pro CPU macOS 21.00

9b (gemma2:9b)

VRAM(total) Processor VRAM(per-GPU) MEM CPU GPU token/s host os mpr
0GB CPU Core i5-1030NG7 0GB 9.0GB 100% 0% 2.03 tps 1030NG7 macOS 6.48
0GB CPU Intel N97 DDR5 0GB 9.0GB 100% 0% 2.37 tps N97 Win11 WSL2 4.27
0GB CPU Intel N100 DDR4 0GB 9.0GB 100% 0% 2.54 tps N100 Win11 WSL2 2.84
0GB CPU Ryzen 5 5560U (Zen3) 0GB 9.0GB 100% 0% 5.81 tps 5560U Win11 WSL2 5.69
16GB GPUx1 RADEON 610M (iGPU) 16GB 7.3GB 0% 100% 5.88 tps 9700X Linux 12.27
0GB CPU Ryzen 9 3950X (Zen2) 0GB 7.7GB 100% 0% 6.94 tps 3950X Linux 6.65
0GB CPU Ryzen 7 5700X (Zen3) 0GB 7.7GB 100% 0% 7.04 tps 5700X Win11 6.65
0GB CPU Core i7-13700 (RaptorLake) 0GB 7.7GB 100% 0% 7.25 tps 13700 Linux 6.65
4GB GPUx1 RADEON RX 6400 4GB 8.0GB 49% 51% 8.52 tps 3950X Linux 9.22
0GB CPU Ryzen 7 7840HS (Zen4) 0GB 9.0GB 100% 0% 8.59 tps 7840HS Win11 WSL2 9.96
0GB CPU Ryzen 7 9700X (Zen5) 0GB 7.7GB 100% 0% 8.66 tps 9700X Win11 11.64
0GB CPU Ryzen 7 9700X (Zen5) 0GB 7.7GB 100% 0% 8.71 tps 9700X Linux 11.64
0GB CPU Apple M1 CPU 0GB 7.7GB 100% 0% 8.97 tps M1 CPU macOS 8.87
0GB CPU Ryzen 7 7840HS (Zen4) 0GB 9.0GB 100% 0% 9.02 tps 7840HS Linux 9.96
16GB GPUx1 Apple M1 GPU 16GB 9.5GB 0% 100% 9.36 tps M1 CPU macOS 7.19
0GB CPU Ryzen 7 9700X (Zen5) 0GB 7.7GB 100% 0% 9.46 tps 9700X Win11 WSL2 11.64
8GB GPUx2 RADEON RX 6400 + RX 6400 4+4=8GB 9.9GB 15% 85% 11.70 tps 3950X Linux 10.55
8GB GPUx1 RADEON 780M 8GB 7.3GB 0% 100% 12.89 tps 7840HS Linux 12.27
8GB GPUx1 RADEON RX 7600 8GB 8.0GB 13% 87% 18.82 tps 5700X Win11 22.48
8GB GPUx1 GeForce RTX 2070 Super 8GB 8.0GB 9% 91% 25.00 tps 13700 Win11 WSL2 32.99
0GB CPU Apple M4 Pro CPU 0GB 7.7GB 100% 0% 25.24 tps M4 Pro CPU macOS 35.45
8GB GPUx1 GeForce GTX 1070 8GB 7.3GB 0% 100% 25.40 tps 9700X Linux 35.07
16GB GPUx2 GeForce GTX 1080 + GTX 1070 8+8=16GB 7.3GB 0% 100% 26.77 tps 9700X Linux 38.96
8GB GPUx1 GeForce GTX 1080 8GB 7.3GB 0% 100% 29.94 tps 9700X Linux 43.84
8GB GPUx1 GeForce RTX 2070 Super 8GB 8.0GB 9% 91% 30.26 tps 9700X Linux 41.18
8GB GPUx1 RADEON RX 7600 8GB 7.3GB 0% 100% 30.77 tps 3950X Linux 39.45
64GB GPUx1 Apple M4 Pro GPU 64GB 9.5GB 0% 100% 33.75 tps M4 Pro CPU macOS 28.74
8GB GPUx1 RADEON RX Vega 64 8GB 7.3GB 0% 100% 37.07 tps 9700X Linux 66.27
32GB GPUx2 GeForce RTX 4060Ti + RTX 4060Ti 16+16=32GB 9.4GB 0% 100% 37.26 tps 3950X Linux 30.64
16GB GPUx2 RADEON RX Vega 64 + RX Vega 56 8+8=16GB 7.3GB 0% 100% 37.71 tps 9700X Linux 60.77
16GB GPUx1 GeForce RTX 4060Ti 16GB 9.4GB 0% 100% 41.05 tps 9700X Win11 30.64
16GB GPUx1 GeForce RTX 4060Ti 16GB 9.4GB 0% 100% 43.19 tps 9700X Linux 30.64
27GB GPUx2 GeForce RTX 4060Ti + RTX 2080Ti 16+11=27GB 9.4GB 0% 100% 43.35 tps 9700X Linux 39.13
16GB GPUx1 GeForce RTX 4060Ti 16GB 9.4GB 0% 100% 45.63 tps 9700X Win11 WSL2 30.64
19GB GPUx2 GeForce RTX 2080Ti + RTX 2070 Super 11+8=19GB 9.4GB 0% 100% 69.96 tps 9700X Linux 56.60
11GB GPUx1 GeForce RTX 2080Ti 11GB 9.4GB 0% 100% 70.08 tps 9700X Linux 65.53

7b (qwen2.5:7b)

VRAM(total) Processor VRAM(per-GPU) MEM CPU GPU token/s host os mpr
0GB CPU Core i5-1030NG7 0GB 5.5GB 100% 0% 3.03 tps 1030NG7 macOS 10.60
0GB CPU Intel N100 DDR4 0GB 5.5GB 100% 0% 3.74 tps N100 Win11 WSL2 4.65
0GB CPU Intel N97 DDR5 0GB 5.5GB 100% 0% 3.91 tps N97 Win11 WSL2 6.98
0GB CPU Ryzen 5 5560U (Zen3) 0GB 5.5GB 100% 0% 8.39 tps 5560U Win11 WSL2 9.31
0GB CPU Ryzen 7 5700X (Zen3) 0GB 4.8GB 100% 0% 9.26 tps 5700X Win11 10.67
0GB CPU Ryzen 9 3950X (Zen2) 0GB 4.8GB 100% 0% 9.27 tps 3950X Linux 10.67
0GB CPU Core i7-13700 (RaptorLake) 0GB 4.8GB 100% 0% 9.42 tps 13700 Linux 10.67
0GB CPU Ryzen 7 9700X (Zen5) 0GB 4.8GB 100% 0% 11.08 tps 9700X Win11 18.67
0GB CPU Ryzen 7 9700X (Zen5) 0GB 4.8GB 100% 0% 11.40 tps 9700X Linux 18.67
0GB CPU Ryzen 7 7840HS (Zen4) 0GB 5.5GB 100% 0% 11.68 tps 7840HS Win11 WSL2 16.29
16GB GPUx1 Apple M1 GPU 16GB 6.0GB 0% 100% 11.92 tps M1 CPU macOS 11.38
0GB CPU Apple M1 CPU 0GB 4.8GB 100% 0% 12.02 tps M1 CPU macOS 14.23
0GB CPU Ryzen 7 7840HS (Zen4) 0GB 5.5GB 100% 0% 12.21 tps 7840HS Linux 16.29
0GB CPU Ryzen 7 9700X (Zen5) 0GB 4.8GB 100% 0% 12.28 tps 9700X Win11 WSL2 18.67
4GB GPUx1 RADEON RX 6400 4GB 5.9GB 31% 69% 13.05 tps 3950X Linux 14.81
8GB GPUx1 RADEON 780M 8GB 6.0GB 0% 100% 14.34 tps 7840HS Linux 14.93
8GB GPUx2 RADEON RX 6400 + RX 6400 4+4=8GB 7.7GB 0% 100% 18.52 tps 3950X Linux 16.62
8GB GPUx1 GeForce GTX 1070 8GB 6.0GB 0% 100% 28.45 tps 9700X Linux 42.67
0GB CPU Apple M4 Pro CPU 0GB 4.8GB 100% 0% 38.03 tps M4 Pro CPU macOS 56.88
64GB GPUx1 Apple M4 Pro GPU 64GB 6.0GB 0% 100% 38.41 tps M4 Pro CPU macOS 45.50
8GB GPUx1 RADEON RX 7600 8GB 6.0GB 0% 100% 43.50 tps 5700X Win11 48.00
16GB GPUx1 GeForce RTX 4060Ti 16GB 6.0GB 0% 100% 53.77 tps 9700X Win11 48.00
27GB GPUx2 GeForce RTX 4060Ti + RTX 2080Ti 16+11=27GB 6.0GB 0% 100% 56.10 tps 9700X Linux 61.30
16GB GPUx1 GeForce RTX 4060Ti 16GB 6.0GB 0% 100% 56.14 tps 9700X Linux 48.00
16GB GPUx1 GeForce RTX 4060Ti 16GB 6.0GB 0% 100% 59.08 tps 9700X Win11 WSL2 48.00
8GB GPUx1 GeForce RTX 2070 Super 8GB 6.0GB 0% 100% 64.03 tps 13700 Win11 WSL2 74.67
8GB GPUx1 GeForce RTX 2070 Super 8GB 6.0GB 0% 100% 71.02 tps 9700X Linux 74.67

4b (gemma3:4b)

VRAM(total) Processor VRAM(per-GPU) MEM CPU GPU token/s host os mpr
0GB CPU Intel N97 DDR5 0GB 6.2GB 100% 0% 6.14 tps N97 Win11 WSL2 6.19
0GB CPU Intel N100 DDR4 0GB 6.2GB 100% 0% 6.29 tps N100 Win11 WSL2 4.13
0GB CPU Ryzen 5 5560U (Zen3) 0GB 6.2GB 100% 0% 12.83 tps 5560U Win11 WSL2 8.26
0GB CPU Ryzen 7 7840HS (Zen4) 0GB 6.2GB 100% 0% 16.73 tps 7840HS Win11 WSL2 14.45
16GB GPUx1 Apple M1 GPU 16GB 6.7GB 0% 100% 18.22 tps M1 CPU macOS 10.19
0GB CPU Apple M1 CPU 0GB 5.2GB 100% 0% 18.77 tps M1 CPU macOS 13.13
0GB CPU Apple M4 Pro CPU 0GB 5.2GB 100% 0% 52.97 tps M4 Pro CPU macOS 52.50
64GB GPUx1 Apple M4 Pro GPU 64GB 6.7GB 0% 100% 56.27 tps M4 Pro CPU macOS 40.75

2b (gemma2:2b)

VRAM(total) Processor VRAM(per-GPU) MEM CPU GPU token/s host os mpr
0GB CPU Core i5-1030NG7 0GB 3.1GB 100% 0% 6.38 tps 1030NG7 macOS 18.81
0GB CPU Intel N100 DDR4 0GB 3.1GB 100% 0% 7.36 tps N100 Win11 WSL2 8.26
0GB CPU Intel N97 DDR5 0GB 3.1GB 100% 0% 8.12 tps N97 Win11 WSL2 12.39
16GB GPUx1 RADEON 610M (iGPU) 16GB 3.6GB 0% 100% 16.10 tps 9700X Linux 24.89
0GB CPU Ryzen 5 5560U (Zen3) 0GB 3.1GB 100% 0% 17.05 tps 5560U Win11 WSL2 16.52
0GB CPU Ryzen 9 3950X (Zen2) 0GB 2.1GB 100% 0% 19.73 tps 3950X Linux 24.38
0GB CPU Ryzen 7 5700X (Zen3) 0GB 2.1GB 100% 0% 21.64 tps 5700X Win11 24.38
0GB CPU Core i7-13700 (RaptorLake) 0GB 2.1GB 100% 0% 21.86 tps 13700 Linux 24.38
0GB CPU Ryzen 7 7840HS (Zen4) 0GB 3.1GB 100% 0% 24.45 tps 7840HS Win11 WSL2 28.90
16GB GPUx1 Apple M1 GPU 16GB 3.6GB 0% 100% 25.75 tps M1 CPU macOS 18.97
0GB CPU Ryzen 7 7840HS (Zen4) 0GB 3.1GB 100% 0% 26.92 tps 7840HS Linux 28.90
0GB CPU Ryzen 7 9700X (Zen5) 0GB 2.1GB 100% 0% 26.96 tps 9700X Linux 42.67
0GB CPU Ryzen 7 9700X (Zen5) 0GB 2.1GB 100% 0% 27.13 tps 9700X Win11 42.67
0GB CPU Ryzen 7 9700X (Zen5) 0GB 2.1GB 100% 0% 28.28 tps 9700X Win11 WSL2 42.67
0GB CPU Apple M1 CPU 0GB 2.1GB 100% 0% 29.47 tps M1 CPU macOS 32.52
8GB GPUx1 RADEON 780M 8GB 3.6GB 0% 100% 37.49 tps 7840HS Linux 24.89
8GB GPUx2 RADEON RX 6400 + RX 6400 4+4=8GB 3.6GB 0% 100% 40.60 tps 3950X Linux 35.56
4GB GPUx1 RADEON RX 6400 4GB 3.6GB 0% 100% 46.33 tps 3950X Linux 35.56
16GB GPUx2 GeForce GTX 1080 + GTX 1070 8+8=16GB 3.6GB 0% 100% 57.52 tps 9700X Linux 79.01
8GB GPUx1 GeForce GTX 1070 8GB 3.6GB 0% 100% 61.08 tps 9700X Linux 71.11
8GB GPUx1 RADEON RX 7600 8GB 3.6GB 0% 100% 62.90 tps 3950X Linux 80.00
8GB GPUx1 GeForce GTX 1080 8GB 3.6GB 0% 100% 65.25 tps 9700X Linux 88.89
8GB GPUx1 RADEON RX Vega 64 8GB 3.6GB 0% 100% 75.42 tps 9700X Linux 134.39
0GB CPU Apple M4 Pro CPU 0GB 2.1GB 100% 0% 75.81 tps M4 Pro CPU macOS 130.00
16GB GPUx2 RADEON RX Vega 64 + RX Vega 56 8+8=16GB 3.6GB 0% 100% 84.10 tps 9700X Linux 123.23
64GB GPUx1 Apple M4 Pro GPU 64GB 3.6GB 0% 100% 87.06 tps M4 Pro CPU macOS 75.83
8GB GPUx1 RADEON RX 7600 8GB 3.6GB 0% 100% 89.82 tps 5700X Win11 80.00
8GB GPUx1 GeForce RTX 2070 Super 8GB 3.6GB 0% 100% 97.18 tps 13700 Win11 WSL2 124.44
32GB GPUx2 GeForce RTX 4060Ti + RTX 4060Ti 16+16=32GB 3.6GB 0% 100% 97.22 tps 3950X Linux 80.00
16GB GPUx1 GeForce RTX 4060Ti 16GB 3.6GB 0% 100% 113.00 tps 9700X Win11 80.00
16GB GPUx1 GeForce RTX 4060Ti 16GB 3.6GB 0% 100% 118.49 tps 9700X Linux 80.00
27GB GPUx2 GeForce RTX 4060Ti + RTX 2080Ti 16+11=27GB 3.6GB 0% 100% 118.49 tps 9700X Linux 102.16
16GB GPUx1 GeForce RTX 4060Ti 16GB 3.6GB 0% 100% 121.12 tps 9700X Win11 WSL2 80.00
8GB GPUx1 GeForce RTX 2070 Super 8GB 3.6GB 0% 100% 127.82 tps 9700X Linux 124.44
19GB GPUx2 GeForce RTX 2080Ti + RTX 2070 Super 11+8=19GB 3.6GB 0% 100% 159.22 tps 9700X Linux 147.78
11GB GPUx1 GeForce RTX 2080Ti 11GB 3.6GB 0% 100% 159.58 tps 9700X Linux 171.11

1b (gemma3:1b)

VRAM(total) Processor VRAM(per-GPU) MEM CPU GPU token/s host os mpr
0GB CPU Intel N100 DDR4 0GB 1.6GB 100% 0% 15.78 tps N100 Win11 WSL2 16.00
0GB CPU Intel N97 DDR5 0GB 1.6GB 100% 0% 16.23 tps N97 Win11 WSL2 24.00
0GB CPU Ryzen 5 5560U (Zen3) 0GB 1.6GB 100% 0% 34.94 tps 5560U Win11 WSL2 32.00
16GB GPUx1 Apple M1 GPU 16GB 2.1GB 0% 100% 40.86 tps M1 CPU macOS 32.52
0GB CPU Ryzen 7 7840HS (Zen4) 0GB 1.6GB 100% 0% 48.95 tps 7840HS Win11 WSL2 56.00
0GB CPU Apple M1 CPU 0GB 0.752GB 100% 0% 54.39 tps M1 CPU macOS 90.82
64GB GPUx1 Apple M4 Pro GPU 64GB 2.1GB 0% 100% 129.38 tps M4 Pro CPU macOS 130.00
0GB CPU Apple M4 Pro CPU 0GB 0.752GB 100% 0% 135.03 tps M4 Pro CPU macOS 363.03

● GPU passthrough to a VM on Proxmox


Multi GPU : comparison by runtime environment

OS installed directly on the hardware

CPU (0GB): Ryzen 9 3950X (Zen2) : Linux
Model Memory CPU GPU token/s host mpr
llama3.3:70b 46GB 100% 0% 1.01 tps 3950X 1.11
qwen2.5:32b 22GB 100% 0% 2.16 tps 3950X 2.33
gemma2:27b 18GB 100% 0% 2.54 tps 3950X 2.84
phi4:14b 10GB 100% 0% 4.63 tps 3950X 5.12
gemma2:9b 7.7GB 100% 0% 6.94 tps 3950X 6.65
qwen2.5:7b 4.8GB 100% 0% 9.27 tps 3950X 10.67
gemma2:2b 2.1GB 100% 0% 19.73 tps 3950X 24.38
CPU (0GB): Ryzen 7 9700X (Zen5) : Linux
Model Memory CPU GPU token/s host mpr
llama3.3:70b 46GB 100% 0% 1.22 tps 9700X 1.95
qwen2.5:32b 22GB 100% 0% 2.62 tps 9700X 4.07
gemma2:27b 18GB 100% 0% 3.14 tps 9700X 4.98
phi4:14b 10GB 100% 0% 5.73 tps 9700X 8.96
gemma2:9b 7.7GB 100% 0% 8.71 tps 9700X 11.64
qwen2.5:7b 4.8GB 100% 0% 11.40 tps 9700X 18.67
gemma2:2b 2.1GB 100% 0% 26.96 tps 9700X 42.67
CPU (0GB): Ryzen 7 9700X (Zen5) : Win11
Model Memory CPU GPU token/s host mpr
llama3.3:70b 46GB 100% 0% 1.20 tps 9700X 1.95
qwen2.5:32b 22GB 100% 0% 2.55 tps 9700X 4.07
gemma2:27b 18GB 100% 0% 3.12 tps 9700X 4.98
phi4:14b 10GB 100% 0% 5.55 tps 9700X 8.96
gemma2:9b 7.7GB 100% 0% 8.66 tps 9700X 11.64
qwen2.5:7b 4.8GB 100% 0% 11.08 tps 9700X 18.67
gemma2:2b 2.1GB 100% 0% 27.13 tps 9700X 42.67
CPU (0GB): Ryzen 7 9700X (Zen5) : Win11 WSL2
Model Memory CPU GPU token/s host mpr
llama3.3:70b 46GB 100% 0% 1.32 tps 9700X 1.95
qwen2.5:32b 22GB 100% 0% 2.84 tps 9700X 4.07
gemma2:27b 18GB 100% 0% 3.42 tps 9700X 4.98
phi4:14b 10GB 100% 0% 6.21 tps 9700X 8.96
gemma2:9b 7.7GB 100% 0% 9.46 tps 9700X 11.64
qwen2.5:7b 4.8GB 100% 0% 12.28 tps 9700X 18.67
gemma2:2b 2.1GB 100% 0% 28.28 tps 9700X 42.67
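To make the differences between environments easier to read, the CPU-only Ryzen 7 9700X results from the three tables above can be expressed relative to Linux. This is only a convenience sketch: the dictionary layout and function name are hypothetical, and the numbers are copied from those tables.

```python
# token/s for CPU-only runs on the Ryzen 7 9700X, copied from the tables above
TPS_9700X = {
    "llama3.3:70b": {"Linux": 1.22, "Win11": 1.20, "Win11 WSL2": 1.32},
    "qwen2.5:32b": {"Linux": 2.62, "Win11": 2.55, "Win11 WSL2": 2.84},
    "gemma2:27b": {"Linux": 3.14, "Win11": 3.12, "Win11 WSL2": 3.42},
    "phi4:14b": {"Linux": 5.73, "Win11": 5.55, "Win11 WSL2": 6.21},
    "gemma2:9b": {"Linux": 8.71, "Win11": 8.66, "Win11 WSL2": 9.46},
    "qwen2.5:7b": {"Linux": 11.40, "Win11": 11.08, "Win11 WSL2": 12.28},
    "gemma2:2b": {"Linux": 26.96, "Win11": 27.13, "Win11 WSL2": 28.28},
}


def relative_to(baseline: str, results: dict) -> dict:
    """Return each environment's token/s as a ratio of the baseline environment."""
    return {
        model: {env: tps / by_env[baseline] for env, tps in by_env.items()}
        for model, by_env in results.items()
    }


for model, ratios in relative_to("Linux", TPS_9700X).items():
    print(model, {env: f"{ratio:.2f}" for env, ratio in ratios.items()})
```

With these numbers every WSL2 ratio comes out above 1.0, while native Win11 is close to Linux. The page gives no explanation for the gap, so it should be read as an observation about this particular setup.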
CPU (0GB): Core i7-13700 (RaptorLake) : Linux
Model Memory CPU GPU token/s host mpr
llama3.3:70b 46GB 100% 0% 1.02 tps 13700 1.11
qwen2.5:32b 22GB 100% 0% 2.18 tps 13700 2.33
gemma2:27b 18GB 100% 0% 2.65 tps 13700 2.84
phi4:14b 10GB 100% 0% 4.73 tps 13700 5.12
gemma2:9b 7.7GB 100% 0% 7.25 tps 13700 6.65
qwen2.5:7b 4.8GB 100% 0% 9.42 tps 13700 10.67
gemma2:2b 2.1GB 100% 0% 21.86 tps 13700 24.38
CPU (0GB): Ryzen 7 7840HS (Zen4) : Win11 WSL2
Model Memory CPU GPU token/s host mpr
llama3.3:70b 46GB 100% 0% 1.09 tps 7840HS 1.95
qwen2.5:32b 22GB 100% 0% 2.77 tps 7840HS 4.07
gemma2:27b 19GB 100% 0% 3.25 tps 7840HS 4.72
phi4:14b 11GB 100% 0% 5.97 tps 7840HS 8.15
gemma3:12b 13GB 100% 0% 6.23 tps 7840HS 6.89
gemma2:9b 9.0GB 100% 0% 8.59 tps 7840HS 9.96
qwen2.5:7b 5.5GB 100% 0% 11.68 tps 7840HS 16.29
gemma3:4b 6.2GB 100% 0% 16.73 tps 7840HS 14.45
gemma2:2b 3.1GB 100% 0% 24.45 tps 7840HS 28.90
gemma3:1b 1.6GB 100% 0% 48.95 tps 7840HS 56.00
CPU (0GB): Ryzen 7 7840HS (Zen4) : Linux
Model Memory CPU GPU token/s host mpr
llama3.3:70b 46GB 100% 0% 1.30 tps 7840HS 1.95
qwen2.5:32b 22GB 100% 0% 2.86 tps 7840HS 4.07
gemma2:27b 19GB 100% 0% 3.40 tps 7840HS 4.72
phi4:14b 11GB 100% 0% 6.22 tps 7840HS 8.15
gemma2:9b 9.0GB 100% 0% 9.02 tps 7840HS 9.96
qwen2.5:7b 5.5GB 100% 0% 12.21 tps 7840HS 16.29
gemma2:2b 3.1GB 100% 0% 26.92 tps 7840HS 28.90
GPUx1 (8GB): RADEON 780M : Linux
Model Memory CPU GPU token/s host mpr
llama3.3:70b 47GB 83% 17% 1.25 tps 7840HS 1.91
qwen2.5:32b 22GB 64% 36% 2.62 tps 7840HS 4.07
gemma2:27b 18GB 57% 43% 3.62 tps 7840HS 4.98
phi4:14b 10GB 20% 80% 6.59 tps 7840HS 8.96
gemma2:9b 7.3GB 0% 100% 12.89 tps 7840HS 12.27
qwen2.5:7b 6.0GB 0% 100% 14.34 tps 7840HS 14.93
gemma2:2b 3.6GB 0% 100% 37.49 tps 7840HS 24.89
CPU (0GB): Core i5-1030NG7 : macOS
Model Memory CPU GPU token/s host mpr
phi4:14b 11GB 100% 0% 1.72 tps 1030NG7 5.30
gemma2:9b 9.0GB 100% 0% 2.03 tps 1030NG7 6.48
qwen2.5:7b 5.5GB 100% 0% 3.03 tps 1030NG7 10.60
gemma2:2b 3.1GB 100% 0% 6.38 tps 1030NG7 18.81
CPU (0GB): Ryzen 5 5560U (Zen3) : Win11 WSL2
Model Memory CPU GPU token/s host mpr
qwen2.5:32b 22GB 100% 0% 1.92 tps 5560U 2.33
gemma3:27b 24GB 100% 0% 2.18 tps 5560U 2.13
gemma2:27b 19GB 100% 0% 2.19 tps 5560U 2.69
phi4:14b 11GB 100% 0% 4.24 tps 5560U 4.65
gemma3:12b 13GB 100% 0% 4.70 tps 5560U 3.94
gemma2:9b 9.0GB 100% 0% 5.81 tps 5560U 5.69
qwen2.5:7b 5.5GB 100% 0% 8.39 tps 5560U 9.31
gemma3:4b 6.2GB 100% 0% 12.83 tps 5560U 8.26
gemma2:2b 3.1GB 100% 0% 17.05 tps 5560U 16.52
gemma3:1b 1.6GB 100% 0% 34.94 tps 5560U 32.00
CPU (0GB): Intel N100 DDR4 : Win11 WSL2
Model Memory CPU GPU token/s host mpr
phi4:14b 11GB 100% 0% 1.89 tps N100 2.33
gemma2:9b 9.0GB 100% 0% 2.54 tps N100 2.84
qwen2.5:7b 5.5GB 100% 0% 3.74 tps N100 4.65
gemma3:4b 6.2GB 100% 0% 6.29 tps N100 4.13
gemma2:2b 3.1GB 100% 0% 7.36 tps N100 8.26
gemma3:1b 1.6GB 100% 0% 15.78 tps N100 16.00
CPU (0GB): Intel N97 DDR5 : Win11 WSL2
Model Memory CPU GPU token/s host mpr
gemma2:9b 9.0GB 100% 0% 2.37 tps N97 4.27
qwen2.5:7b 5.5GB 100% 0% 3.91 tps N97 6.98
gemma3:4b 6.2GB 100% 0% 6.14 tps N97 6.19
gemma2:2b 3.1GB 100% 0% 8.12 tps N97 12.39
gemma3:1b 1.6GB 100% 0% 16.23 tps N97 24.00
CPU (0GB): Apple M1 CPU : macOS
Model Memory CPU GPU token/s host mpr
phi4:14b 10GB 100% 0% 5.30 tps M1 CPU 6.83
gemma3:12b 12GB 100% 0% 6.56 tps M1 CPU 5.69
gemma2:9b 7.7GB 100% 0% 8.97 tps M1 CPU 8.87
qwen2.5:7b 4.8GB 100% 0% 12.02 tps M1 CPU 14.23
gemma3:4b 5.2GB 100% 0% 18.77 tps M1 CPU 13.13
gemma2:2b 2.1GB 100% 0% 29.47 tps M1 CPU 32.52
gemma3:1b 0.752GB 100% 0% 54.39 tps M1 CPU 90.82
GPUx1 (16GB): Apple M1 GPU : macOS
Model Memory CPU GPU token/s host mpr
phi4:14b 10GB 0% 100% 6.16 tps M1 CPU 6.83
gemma3:12b 11GB 0% 100% 7.10 tps M1 CPU 6.21
gemma2:9b 9.5GB 0% 100% 9.36 tps M1 CPU 7.19
qwen2.5:7b 6.0GB 0% 100% 11.92 tps M1 CPU 11.38
gemma3:4b 6.7GB 0% 100% 18.22 tps M1 CPU 10.19
gemma2:2b 3.6GB 0% 100% 25.75 tps M1 CPU 18.97
gemma3:1b 2.1GB 0% 100% 40.86 tps M1 CPU 32.52
CPU (0GB): Apple M4 Pro CPU : macOS
Model Memory CPU GPU token/s host mpr
llama3.3:70b 46GB 100% 0% 4.18 tps M4 Pro CPU 5.93
qwen2.5:32b 22GB 100% 0% 7.93 tps M4 Pro CPU 12.41
gemma3:27b 22GB 100% 0% 9.63 tps M4 Pro CPU 12.41
gemma2:27b 18GB 100% 0% 9.26 tps M4 Pro CPU 15.17
mistral-small:24b 16GB 100% 0% 10.94 tps M4 Pro CPU 17.06
phi4:14b 10GB 100% 0% 18.43 tps M4 Pro CPU 27.30
gemma3:12b 12GB 100% 0% 20.51 tps M4 Pro CPU 22.75
gemma2:9b 7.7GB 100% 0% 25.24 tps M4 Pro CPU 35.45
qwen2.5:7b 4.8GB 100% 0% 38.03 tps M4 Pro CPU 56.88
gemma3:4b 5.2GB 100% 0% 52.97 tps M4 Pro CPU 52.50
gemma2:2b 2.1GB 100% 0% 75.81 tps M4 Pro CPU 130.00
gemma3:1b 0.752GB 100% 0% 135.03 tps M4 Pro CPU 363.03
GPUx1 (64GB): Apple M4 Pro GPU : macOS
Model Memory CPU GPU token/s host mpr
llama3.3:70b 46GB 0% 100% 4.64 tps M4 Pro CPU 5.93
qwen2.5:32b 23GB 0% 100% 9.63 tps M4 Pro CPU 11.87
gemma3:27b 24GB 0% 100% 10.88 tps M4 Pro CPU 11.38
gemma2:27b 20GB 0% 100% 13.91 tps M4 Pro CPU 13.65
mistral-small:24b 16GB 0% 100% 13.55 tps M4 Pro CPU 17.06
phi4:14b 12GB 0% 100% 19.49 tps M4 Pro CPU 22.75
gemma3:12b 13GB 0% 100% 22.62 tps M4 Pro CPU 21.00
gemma2:9b 9.5GB 0% 100% 33.75 tps M4 Pro CPU 28.74
qwen2.5:7b 6.0GB 0% 100% 38.41 tps M4 Pro CPU 45.50
gemma3:4b 6.7GB 0% 100% 56.27 tps M4 Pro CPU 40.75
gemma2:2b 3.6GB 0% 100% 87.06 tps M4 Pro CPU 75.83
gemma3:1b 2.1GB 0% 100% 129.38 tps M4 Pro CPU 130.00
GPUx1 (16GB): RADEON 610M (iGPU) : Linux
Model Memory CPU GPU token/s host mpr
phi4:14b 12GB 0% 100% 1.96 tps 9700X 7.47
gemma2:9b 7.3GB 0% 100% 5.88 tps 9700X 12.27
gemma2:2b 3.6GB 0% 100% 16.10 tps 9700X 24.89
GPUx1 (4GB): RADEON RX 6400 : Linux
Model Memory CPU GPU token/s host mpr
llama3.3:70b 47GB 92% 8% 1.03 tps 3950X 1.14
qwen2.5:32b 22GB 83% 17% 2.29 tps 3950X 2.59
gemma2:27b 18GB 79% 21% 2.70 tps 3950X 3.25
phi4:14b 10GB 61% 39% 5.60 tps 3950X 6.68
gemma2:9b 8.0GB 49% 51% 8.52 tps 3950X 9.22
qwen2.5:7b 5.9GB 31% 69% 13.05 tps 3950X 14.81
gemma2:2b 3.6GB 0% 100% 46.33 tps 3950X 35.56
GPUx2 (4+4=8GB): RADEON RX 6400 + RX 6400 : Linux
Model Memory CPU GPU token/s host mpr
llama3.3:70b 49GB 84% 16% 1.05 tps 3950X 1.16
qwen2.5:32b 24GB 66% 34% 2.42 tps 3950X 2.68
gemma2:27b 21GB 62% 38% 2.87 tps 3950X 3.16
phi4:14b 11GB 28% 72% 6.94 tps 3950X 8.19
gemma2:9b 9.9GB 15% 85% 11.70 tps 3950X 10.55
qwen2.5:7b 7.7GB 0% 100% 18.52 tps 3950X 16.62
gemma2:2b 3.6GB 0% 100% 40.60 tps 3950X 35.56
GPUx1 (8GB): RADEON RX 7600 : Linux
Model Memory CPU GPU token/s host mpr
llama3.3:70b 47GB 83% 17% 1.12 tps 3950X 1.27
qwen2.5:32b 22GB 63% 37% 2.86 tps 3950X 3.34
gemma2:27b 18GB 55% 45% 3.52 tps 3950X 4.51
phi4:14b 10GB 18% 82% 11.43 tps 3950X 15.72
gemma2:9b 7.3GB 0% 100% 30.77 tps 3950X 39.45
gemma2:2b 3.6GB 0% 100% 62.90 tps 3950X 80.00
CPU (0GB): Ryzen 7 5700X (Zen3) : Win11
Model Memory CPU GPU token/s host mpr
llama3.3:70b 46GB 100% 0% 1.00 tps 5700X 1.11
qwen2.5:32b 22GB 100% 0% 2.14 tps 5700X 2.33
gemma2:27b 18GB 100% 0% 2.56 tps 5700X 2.84
mistral-small:24b 16GB 100% 0% 2.96 tps 5700X 3.20
phi4:14b 10GB 100% 0% 4.63 tps 5700X 5.12
gemma2:9b 7.7GB 100% 0% 7.04 tps 5700X 6.65
qwen2.5:7b 4.8GB 100% 0% 9.26 tps 5700X 10.67
gemma2:2b 2.1GB 100% 0% 21.64 tps 5700X 24.38
GPUx1 (8GB): RADEON RX 7600 : Win11
Model Memory CPU GPU token/s host mpr
llama3.3:70b 47GB 88% 12% 1.06 tps 5700X 1.21
qwen2.5:32b 22GB 72% 28% 2.59 tps 5700X 3.02
gemma2:27b 18GB 68% 32% 3.05 tps 5700X 3.86
mistral-small:24b 16GB 61% 39% 3.99 tps 5700X 4.71
phi4:14b 10GB 29% 71% 9.12 tps 5700X 12.30
gemma2:9b 8.0GB 13% 87% 18.82 tps 5700X 22.48
qwen2.5:7b 6.0GB 0% 100% 43.50 tps 5700X 48.00
gemma2:2b 3.6GB 0% 100% 89.82 tps 5700X 80.00
GPUx2 (8+8=16GB): RADEON RX 7600 + RX Vega 56 : Linux
Model Memory CPU GPU token/s host mpr
llama3.3:70b 48GB 65% 35% 1.30 tps 3950X 1.52
qwen2.5:32b 23GB 31% 69% 4.34 tps 3950X 5.37
gemma2:27b 21GB 22% 78% 6.00 tps 3950X 7.21
phi4:14b 14GB 0% 100% 18.00 tps 3950X 24.16
GPUx3 (8+8+8=24GB): RADEON RX 7600 + RX Vega 64 + RX Vega 56 : Linux
Model Memory CPU GPU token/s host mpr
llama3.3:70b 50GB 52% 48% 1.51 tps 3950X 1.75
qwen2.5:32b 25GB 3% 97% 9.12 tps 3950X 12.63
gemma2:27b 23GB 0% 100% 14.55 tps 3950X 16.34
GPUx1 (8GB): GeForce RTX 2070 Super : Linux
Model Memory CPU GPU token/s host mpr
llama3.3:70b 47GB 84% 16% 1.37 tps 9700X 2.19
qwen2.5:32b 22GB 65% 35% 3.51 tps 9700X 5.66
gemma2:27b 18GB 59% 41% 4.35 tps 9700X 7.41
phi4:14b 10GB 24% 76% 14.67 tps 9700X 22.86
gemma2:9b 8.0GB 9% 91% 30.26 tps 9700X 41.18
qwen2.5:7b 6.0GB 0% 100% 71.02 tps 9700X 74.67
gemma2:2b 3.6GB 0% 100% 127.82 tps 9700X 124.44
GPUx1 (8GB): GeForce RTX 2070 Super : Win11 WSL2
Model Memory CPU GPU token/s host mpr
llama3.3:70b 47GB 85% 15% 1.04 tps 13700 1.26
qwen2.5:32b 22GB 68% 32% 2.45 tps 13700 3.25
gemma2:27b 18GB 61% 39% 2.92 tps 13700 4.35
mistral-small:24b 16GB 57% 43% 3.62 tps 13700 5.17
phi4:14b 10GB 29% 71% 8.94 tps 13700 13.80
gemma2:9b 8.0GB 9% 91% 25.00 tps 13700 32.99
qwen2.5:7b 6.0GB 0% 100% 64.03 tps 13700 74.67
gemma2:2b 3.6GB 0% 100% 97.18 tps 13700 124.44
GPUx1 (11GB): GeForce RTX 2080Ti : Linux
Model Memory CPU GPU token/s host mpr
llama3.3:70b 47GB 77% 23% 1.48 tps 9700X 2.37
qwen2.5:32b 22GB 52% 48% 4.35 tps 9700X 6.91
gemma2:27b 18GB 43% 57% 5.74 tps 9700X 9.70
phi4:14b 10GB 0% 100% 51.33 tps 9700X 61.60
gemma2:9b 9.4GB 0% 100% 70.08 tps 9700X 65.53
gemma2:2b 3.6GB 0% 100% 159.58 tps 9700X 171.11
GPUx2 (11+8=19GB): GeForce RTX 2080Ti + RTX 2070 Super : Linux
Model Memory CPU GPU token/s host mpr
llama3.3:70b 48GB 62% 38% 1.71 tps 9700X 2.73
qwen2.5:32b 23GB 20% 80% 7.68 tps 9700X 11.64
gemma2:27b 21GB 11% 89% 11.83 tps 9700X 16.42
phi4:14b 10GB 0% 100% 51.42 tps 9700X 53.20
gemma2:9b 9.4GB 0% 100% 69.96 tps 9700X 56.60
gemma2:2b 3.6GB 0% 100% 159.22 tps 9700X 147.78
GPUx1 (8GB): RADEON RX Vega 64 : Linux
Model Memory CPU GPU token/s host mpr
llama3.3:70b 47GB 83% 17% 1.34 tps 9700X 2.21
qwen2.5:32b 22GB 63% 37% 3.44 tps 9700X 5.83
gemma2:27b 18GB 55% 45% 4.36 tps 9700X 7.86
phi4:14b 10GB 18% 82% 13.97 tps 9700X 27.00
gemma2:9b 7.3GB 0% 100% 37.07 tps 9700X 66.27
gemma2:2b 3.6GB 0% 100% 75.42 tps 9700X 134.39
GPUx2 (8+8=16GB): RADEON RX Vega 64 + RX Vega 56 : Linux
Model Memory CPU GPU token/s host mpr
llama3.3:70b 48GB 65% 35% 1.55 tps 9700X 2.59
qwen2.5:32b 23GB 30% 70% 5.10 tps 9700X 8.83
gemma2:27b 21GB 21% 79% 7.45 tps 9700X 11.55
phi4:14b 14GB 0% 100% 22.71 tps 9700X 31.69
gemma2:9b 7.3GB 0% 100% 37.71 tps 9700X 60.77
gemma2:2b 3.6GB 0% 100% 84.10 tps 9700X 123.23
GPUx1 (8GB): GeForce GTX 1070 : Linux
Model Memory CPU GPU token/s host mpr
llama3.3:70b 47GB 83% 17% 1.31 tps 9700X 2.14
qwen2.5:32b 22GB 64% 36% 3.18 tps 9700X 5.32
gemma2:27b 18GB 57% 43% 4.05 tps 9700X 6.91
phi4:14b 10GB 20% 80% 10.76 tps 9700X 18.67
gemma2:9b 7.3GB 0% 100% 25.40 tps 9700X 35.07
qwen2.5:7b 6.0GB 0% 100% 28.45 tps 9700X 42.67
gemma2:2b 3.6GB 0% 100% 61.08 tps 9700X 71.11
GPUx1 (8GB): GeForce GTX 1080 : Linux
Model Memory CPU GPU token/s host mpr
llama3.3:70b 47GB 83% 17% 1.32 tps 9700X 2.17
qwen2.5:32b 22GB 64% 36% 3.27 tps 9700X 5.50
gemma2:27b 18GB 57% 43% 4.12 tps 9700X 7.21
phi4:14b 10GB 20% 80% 11.49 tps 9700X 21.13
gemma2:9b 7.3GB 0% 100% 29.94 tps 9700X 43.84
gemma2:2b 3.6GB 0% 100% 65.25 tps 9700X 88.89
GPUx2 (8+8=16GB): GeForce GTX 1080 + GTX 1070 : Linux
Model Memory CPU GPU token/s host mpr
llama3.3:70b 48GB 67% 33% 1.45 tps 9700X 2.41
qwen2.5:32b 23GB 31% 69% 4.41 tps 9700X 7.39
gemma2:27b 21GB 22% 78% 6.24 tps 9700X 9.16
phi4:14b 11GB 0% 100% 16.19 tps 9700X 25.86
gemma2:9b 7.3GB 0% 100% 26.77 tps 9700X 38.96
gemma2:2b 3.6GB 0% 100% 57.52 tps 9700X 79.01
GPUx1 (16GB): GeForce RTX 4060Ti : Win11
Model Memory CPU GPU token/s host mpr
llama3.3:70b 46GB 66% 34% 1.54 tps 9700X 2.54
qwen2.5:32b 21GB 28% 72% 5.81 tps 9700X 8.47
gemma2:27b 18GB 16% 84% 8.87 tps 9700X 11.81
mistral-small:24b 15GB 3% 97% 15.38 tps 9700X 18.00
phi4:14b 12GB 0% 100% 27.92 tps 9700X 24.00
gemma2:9b 9.4GB 0% 100% 41.05 tps 9700X 30.64
qwen2.5:7b 6.0GB 0% 100% 53.77 tps 9700X 48.00
gemma2:2b 3.6GB 0% 100% 113.00 tps 9700X 80.00
GPUx1 (16GB): GeForce RTX 4060Ti : Win11 WSL2
Model Memory CPU GPU token/s host mpr
llama3.3:70b 46GB 66% 34% 1.70 tps 9700X 2.54
qwen2.5:32b 21GB 28% 72% 6.41 tps 9700X 8.47
gemma2:27b 18GB 16% 84% 9.71 tps 9700X 11.81
mistral-small:24b 15GB 3% 97% 15.45 tps 9700X 18.00
phi4:14b 12GB 0% 100% 31.00 tps 9700X 24.00
gemma2:9b 9.4GB 0% 100% 45.63 tps 9700X 30.64
qwen2.5:7b 6.0GB 0% 100% 59.08 tps 9700X 48.00
gemma2:2b 3.6GB 0% 100% 121.12 tps 9700X 80.00
GPUx1 (16GB): GeForce RTX 4060Ti : Linux
Model Memory CPU GPU token/s host mpr
llama3.3:70b 46GB 65% 35% 1.65 tps 9700X 2.57
qwen2.5:32b 21GB 25% 75% 6.35 tps 9700X 8.83
gemma2:27b 18GB 12% 88% 10.14 tps 9700X 12.64
phi4:14b 12GB 0% 100% 29.15 tps 9700X 24.00
gemma2:9b 9.4GB 0% 100% 43.19 tps 9700X 30.64
qwen2.5:7b 6.0GB 0% 100% 56.14 tps 9700X 48.00
gemma2:2b 3.6GB 0% 100% 118.49 tps 9700X 80.00
GPUx1 (16GB): GeForce RTX 4060Ti : Linux VM
Model Memory CPU GPU token/s host mpr
mistral-small:24b 15GB 0% 100% 18.05 tps 3950X 19.20
GPUx2 (16+11=27GB): GeForce RTX 4060Ti + RTX 2080Ti : Linux
Model Memory CPU GPU token/s host mpr
llama3.3:70b 48GB 43% 57% 2.22 tps 9700X 3.28
qwen2.5:32b 25GB 0% 100% 16.21 tps 9700X 14.71
gemma2:27b 23GB 0% 100% 20.15 tps 9700X 15.99
phi4:14b 12GB 0% 100% 29.15 tps 9700X 30.65
gemma2:9b 9.4GB 0% 100% 43.35 tps 9700X 39.13
qwen2.5:7b 6.0GB 0% 100% 56.10 tps 9700X 61.30
gemma2:2b 3.6GB 0% 100% 118.49 tps 9700X 102.16
GPUx2 (16+16=32GB): GeForce RTX 4060Ti + RTX 4060Ti : Linux
Model Memory CPU GPU token/s host mpr
llama3.3:70b 47GB 32% 68% 2.22 tps 3950X 2.47
qwen2.5:32b 25GB 0% 100% 12.74 tps 3950X 11.52
gemma2:27b 23GB 0% 100% 15.51 tps 3950X 12.52
phi4:14b 12GB 0% 100% 27.04 tps 3950X 24.00
gemma2:9b 9.4GB 0% 100% 37.26 tps 3950X 30.64
gemma2:2b 3.6GB 0% 100% 97.22 tps 3950X 80.00
GPUx3 (16+16+8=40GB): GeForce RTX 4060Ti + RTX 4060Ti + RTX 2070 Super : Linux
Model Memory CPU GPU token/s host mpr
llama3.3:70b 49GB 17% 83% 3.08 tps 3950X 3.40
qwen2.5:32b 26GB 0% 100% 13.41 tps 3950X 11.93
gemma2:27b 25GB 0% 100% 15.89 tps 3950X 12.41
phi4:14b 12GB 0% 100% 27.14 tps 3950X 25.85
GPUx4 (16+16+8+8=48GB): GeForce RTX 4060Ti + RTX 4060Ti + RTX 2070 Super + GTX 1080 : Linux
Model Memory CPU GPU token/s host mpr
llama3.3:70b 51GB 6% 94% 4.25 tps 3950X 4.68
GPUx5 (16+16+8+8+8=56GB): GeForce RTX 4060Ti + RTX 4060Ti + RTX 2070 Super + GTX 1080 + GTX 1070 : Linux
Model Memory CPU GPU token/s host mpr
llama3.3:70b 55GB 0% 100% 5.10 tps 3950X 5.50
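
The mpr column in the tables above looks consistent with a memory-bandwidth-bound estimate: each token reads the whole model once, the CPU share from host RAM and the GPU share from VRAM. This reading is inferred from the numbers, not stated in this section. A minimal sketch under that assumption follows; the function name `bandwidth_bound_tps` is hypothetical, the host bandwidths are the ones listed for the 3950X and 9700X hosts, and the GPU bandwidth is the RTX 4060 Ti value from the GPU table.

```python
def bandwidth_bound_tps(model_gb, cpu_share, cpu_bw, gpu_bw=None):
    """Estimate tokens/s if every token reads the whole model once.

    model_gb  : model size in GB (the Memory column)
    cpu_share : fraction of the model held in host RAM (the CPU column, 0..1)
    cpu_bw    : host memory bandwidth in GB/s
    gpu_bw    : GPU memory bandwidth in GB/s (needed when cpu_share < 1)
    """
    gpu_share = 1.0 - cpu_share
    # Time to stream the CPU-resident part of the model for one token.
    seconds = model_gb * cpu_share / cpu_bw
    if gpu_share > 0:
        if gpu_bw is None:
            raise ValueError("gpu_bw is required when part of the model is on the GPU")
        # Time to stream the GPU-resident part for the same token.
        seconds += model_gb * gpu_share / gpu_bw
    return 1.0 / seconds


# llama3.3:70b, CPU only, 3950X (51.2 GB/s)
print(round(bandwidth_bound_tps(46, 1.0, 51.2), 2))          # 1.11
# llama3.3:70b, RTX 4060Ti (288 GB/s) on 9700X (89.6 GB/s), 66% CPU / 34% GPU
print(round(bandwidth_bound_tps(46, 0.66, 89.6, 288.0), 2))  # 2.54
# gemma2:2b fully on the RTX 4060Ti
print(round(bandwidth_bound_tps(3.6, 0.0, 89.6, 288.0), 2))  # 80.0
```

The sketch takes a single GPU bandwidth, so it does not cover rows that mix GPUs with different memory bandwidths; how the model is split between such GPUs is not given here.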

Previous data (Proxmox VM)



Specs of the PCs used

PC

CPU Ryzen 9 3950X Ryzen 7 9700X (105W mode) Core i7-13700 Ryzen 7 5700X Ryzen 7 7840HS Apple M1 Core i5-1030NG7 Apple M4 Pro
Core Zen2 16C 32T Zen5 8C 16T RaptorLake 16C 24T Zen3 8C 16T Zen4 8C 16T M1 CPU 8C 8T IceLake 4C 8T M4 CPU 12C 12T
RAM DDR4-3200 96GB DDR5-5600 96GB DDR4-3200 64GB DDR4-3200 64GB DDR5-5600 64GB LPDDR4X-4266 16GB LPDDR4X-3733 16GB LPDDR5X-8533 64GB
Motherboard TUF GAMING X570-Plus TUF GAMING B650M-E B760M-ITX/D4 DeskMeet X300 GMKTec NucBox K6 MacBook Air MacBook Air Mac mini M4 Pro
GPU1 PCIe 4.0 x16 PCIe 4.0 x16 PCIe 4.0 x16 PCIe 4.0 x16 iGPU RADEON 780M iGPU iGPU
GPU2 PCIe 4.0 x4 PCIe 4.0 x4 M.2 DEG1 OCulink
GPU3 PCIe 4.0 x4 M.2 DEG1 OCulink PCIe 4.0 x4 M.2 DEG1 OCulink
GPU4 PCIe 4.0 x4 M.2 DEG1 OCulink
GPU5 PCIe 3.0 x1
GPU6 PCIe 3.0 x1
OS Ubuntu 22.04/24.04 Ubuntu 22.04/24.04/Win11 Ubuntu 24.04/Win11 Win11 Ubuntu 24.04/Win11 macOS macOS macOS

GPU

GPU Architecture VRAM Clock Mem B/W PCIe SM/CU SP Shader fp32 TensorCore TensorCore int8
GeForce RTX 4060 Ti Ada Lovelace 16GB 2540 MHz 288.0 GB/s PCIe 4.0 x8 34 4352 sp 22108.16 GFLOPS 136 176.87 TOPS
GeForce RTX 2080 Ti Turing 11GB 1545 MHz 616.0 GB/s PCIe 3.0 x16 68 4352 sp 13447.68 GFLOPS 544 215.16 TOPS
GeForce RTX 2070 Super Turing 8GB 1770 MHz 448.0 GB/s PCIe 3.0 x16 40 2560 sp 9062.40 GFLOPS 320 145.00 TOPS
GeForce GTX 1080 Pascal 8GB 1733 MHz 320.0 GB/s PCIe 3.0 x16 20 2560 sp 8872.96 GFLOPS
GeForce GTX 1070 Pascal 8GB 1683 MHz 256.0 GB/s PCIe 3.0 x16 15 1920 sp 6462.72 GFLOPS
RADEON RX 7600 RDNA3 navi33 gfx1102 8GB 2655 MHz 288.0 GB/s PCIe 4.0 x8 32 2048 sp 21749.76 GFLOPS
RADEON 780M (iGPU) RDNA3 gfx1103 8GB 2700 MHz 89.6 GB/s 12 768 sp 8294.40 GFLOPS
RADEON RX 6400 RDNA2 navi24 gfx1034 4GB 2321 MHz 128.0 GB/s PCIe 4.0 x4 12 768 sp 3565.06 GFLOPS
RADEON 610M (iGPU) RDNA2 gfx1037 16GB 1900 MHz 89.6 GB/s 4 128 sp 486.40 GFLOPS
RADEON RX Vega 64 GCN5 vega10 gfx900 8GB 1546 MHz 483.8 GB/s PCIe 3.0 x16 64 4096 sp 12664.83 GFLOPS
RADEON RX Vega 56 GCN5 vega10 gfx900 8GB 1471 MHz 409.6 GB/s PCIe 3.0 x16 56 3584 sp 10544.13 GFLOPS
Apple M1 C8/G8 68.3 GB/s
Apple M4 Pro C12/G16 273.0 GB/s
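
The Shader fp32 figures in the GPU table appear to follow the usual peak formula: shader processors × 2 FLOPs (one fused multiply-add) × clock. The RDNA3 rows (RX 7600, 780M) match only with an additional ×2 for dual-issue. A small sketch for checking the arithmetic; the function name `peak_fp32_gflops` and the `dual_issue` flag are illustrative, and the dual-issue factor is inferred from the table values.

```python
def peak_fp32_gflops(shader_processors, clock_mhz, dual_issue=False):
    """Theoretical peak fp32 throughput in GFLOPS."""
    # One fused multiply-add per shader processor per clock counts as 2 FLOPs.
    flops_per_clock = 2 * shader_processors
    if dual_issue:
        # Assumption: RDNA3 dual-issue doubles the peak, as the table values suggest.
        flops_per_clock *= 2
    # MHz * FLOPs/clock gives MFLOPS; divide by 1000 for GFLOPS.
    return flops_per_clock * clock_mhz / 1000.0


print(round(peak_fp32_gflops(4352, 2540), 2))                   # RTX 4060 Ti: 22108.16
print(round(peak_fp32_gflops(4096, 1546), 2))                   # RX Vega 64: 12664.83
print(round(peak_fp32_gflops(2048, 2655, dual_issue=True), 2))  # RX 7600: 21749.76
```

These are paper peaks only. As the benchmark tables suggest, token/s for LLM inference tracks memory bandwidth much more closely than shader throughput.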

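Notes on installing Ollama

  • On the Ryzen 7 9700X with a GeForce card attached, the integrated GPU (610M) may be detected first, causing the ROCm build to be installed. If that happens, disable the integrated GPU in UEFI (BIOS) beforehand.
  • For the RADEON 780M, set the environment variable HSA_OVERRIDE_GFX_VERSION=11.0.2.
  • For the RADEON RX 6400 and RADEON 610M, set the environment variable HSA_OVERRIDE_GFX_VERSION=10.3.0.
  • On Linux, installing ollama alone is enough for ROCm-capable GPUs to work, but the rocm-smi command is not available (it becomes available after additionally installing the RADEON driver).
  • If WSL2 runs out of memory, increase the memory allocation in WSL Settings.
  • The Linux build can be installed as-is on WSL2.
  • Speed is roughly Linux > WSL2 > Win11, although the difference may be small when most of the model runs on the GPU.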
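
The HSA_OVERRIDE_GFX_VERSION settings from the notes above can be applied when launching the server from a script. This is a minimal sketch, not a required setup: the helper names `ollama_env` and `launch_ollama` and the lookup table are hypothetical, and it assumes the `ollama` binary is on PATH and is started by hand with `ollama serve` rather than as a system service.

```python
import os
import subprocess

# Override values taken from the notes above; the table itself is illustrative.
GFX_OVERRIDES = {
    "RADEON 780M": "11.0.2",
    "RADEON RX 6400": "10.3.0",
    "RADEON 610M": "10.3.0",
}


def ollama_env(gpu_name, base_env=None):
    """Return a copy of the environment with the ROCm override for gpu_name, if any."""
    env = dict(os.environ if base_env is None else base_env)
    override = GFX_OVERRIDES.get(gpu_name)
    if override is not None:
        env["HSA_OVERRIDE_GFX_VERSION"] = override
    return env


def launch_ollama(gpu_name):
    """Start `ollama serve` with the override applied (blocks until the server exits)."""
    return subprocess.run(["ollama", "serve"], env=ollama_env(gpu_name), check=True)
```

For example, `launch_ollama("RADEON 780M")` starts the server with HSA_OVERRIDE_GFX_VERSION=11.0.2. GPUs that are not in the table are launched with the environment unchanged.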

OS and Multi GPU

GeForce Radeon
OS CUDA CUDA Multi ROCm ROCm Multi
Linux ◯ Vega or later
Windows11 ◯ RDNA2 or later ?
Windows11 WSL2
ai/ollama.txt · Last updated: 2025/03/28 21:28 by oga
