RTX 4070 SUPER

Benchmark Intel ARC

System details

NVIDIA RTX 4070 SUPER
- Processor: Intel Core i9-13900K @ 5.50GHz (24 Cores / 32 Threads)
- Motherboard: ASUS TUF GAMING Z790-PRO WIFI (1401 BIOS)
- Chipset: Intel Device 7a27
- Memory: 32GB
- Disk: 4001GB Seagate ZP4000GP304001
- Graphics: ASUS NVIDIA GeForce RTX 4070 SUPER 12GB
- Audio: Realtek ALC1220
- Monitor: ARZOPA
- Network: Intel I226-V + Intel Device 7a70
- OS: EndeavourOS rolling
- Kernel: 6.7.1-arch1-1 (x86_64)
- Desktop: KDE Plasma 5.27.10
- Display Server: X Server 1.21.1.11
- Display Driver: NVIDIA 550.40.07
- OpenGL: 4.6.0
- OpenCL: OpenCL 3.0 CUDA 12.4.74
- Compiler: GCC 13.2.1 20230801
- File-System: ext4
- Screen Resolution: 1920x1080

For the remaining systems, the table gives only the fields listed here.

Intel ARC A750
- Processor: Intel Core Ultra 9 285K @ 5.10GHz (24 Cores)
- Motherboard: MSI MEG Z890 UNIFY-X (MS-7E20) v1.0 (1.A10 BIOS)
- Chipset: Intel Device ae7f
- Memory: 2 x 16GB DDR5-6000MT/s Corsair CMH32GX5M2B6000Z30
- Disk: 1024GB Wodposit NVMe SSD
- Graphics: Intel Arc A750 DG2 8GB
- Audio: Intel DG2 Audio
- Monitor: PiKVM V3
- Network: Realtek Device 5000 + Intel Wi-Fi 7
- OS: Ubuntu 24.10
- Kernel: 6.12.1-061201-generic (x86_64)
- Desktop: GNOME Shell 47.0
- Display Server: X Server + Wayland
- OpenGL: 4.6 Mesa 24.3.1 kisak-mesa PPA
- OpenCL: OpenCL 3.0
- Compiler: GCC 14.2.0

Intel ARC A580
- Graphics: Intel Arc A580 DG2 8GB
- Screen Resolution: 1280x720

Intel ARC B580
- Graphics: Intel BMG G21 12GB
- Audio: Intel Device e2f7
- OpenGL: 4.6 Mesa 25.0.0-devel (git-1027b071f9)
- Compiler: GCC 14.2.0 + Clang 19.1.1 + LLVM 19.1.1
- Screen Resolution: 1920x1080

Intel ARC A770 8GB
- Graphics: Intel Arc A770 DG2 8GB
- Audio: Intel DG2 Audio

NVIDIA RTX 4060
- Graphics: llvmpipe
- Audio: NVIDIA AD107 HD Audio
- Kernel: 6.12.3-061203-generic (x86_64)
- Display Server: X Server 1.21.1.13 + Wayland
- Display Driver: NVIDIA 565.77
- OpenGL: 4.5 Mesa 25.0.0-devel (git-1027b071f9) (LLVM 19.1.1 256 bits)
- OpenCL: OpenCL 3.0 CUDA 12.7.33

AMD RX 7600
- Graphics: Sapphire AMD Radeon RX 7600
- Audio: AMD Navi 31 HDMI/DP
- OpenGL: 4.6 Mesa 25.0.0-devel (git-1027b071f9) (LLVM 19.1.1 DRM 3.59)
- OpenCL: OpenCL 1.1 Mesa 24.3.1 kisak-mesa PPA + OpenCL 3.0

Phoronix Test Suite

Kernel Details
- NVIDIA RTX 4070 SUPER: Transparent Huge Pages: always
- Intel ARC A750, Intel ARC A580, Intel ARC B580, Intel ARC A770 8GB, NVIDIA RTX 4060, AMD RX 7600: Transparent Huge Pages: madvise

Compiler Details
- NVIDIA RTX 4070 SUPER: --disable-libssp --disable-libstdcxx-pch --disable-werror --enable-__cxa_atexit --enable-bootstrap --enable-cet=auto --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-default-ssp --enable-gnu-indirect-function --enable-gnu-unique-object --enable-languages=ada,c,c++,d,fortran,go,lto,objc,obj-c++ --enable-libstdcxx-backtrace --enable-link-serialization=1 --enable-lto --enable-multilib --enable-plugin --enable-shared --enable-threads=posix --mandir=/usr/share/man --with-build-config=bootstrap-lto --with-linker-hash-style=gnu
- Intel ARC A750, Intel ARC A580, Intel ARC B580, Intel ARC A770 8GB, NVIDIA RTX 4060, AMD RX 7600: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- NVIDIA RTX 4070 SUPER: Scaling Governor: intel_pstate powersave (EPP: balance_performance) - CPU Microcode: 0x11d
- Intel ARC A750, Intel ARC A580, Intel ARC B580, Intel ARC A770 8GB, NVIDIA RTX 4060, AMD RX 7600: Scaling Governor: intel_pstate powersave (EPP: performance) - CPU Microcode: 0x110 - Thermald 2.5.8

Graphics Details
- NVIDIA RTX 4070 SUPER: BAR1 / Visible vRAM Size: 16384 MiB - vBIOS Version: 95.04.69.00.c1
- NVIDIA RTX 4060: BAR1 / Visible vRAM Size: 8192 MiB - vBIOS Version: 95.07.31.80.31
- AMD RX 7600: BAR1 / Visible vRAM Size: 8176 MB

Security Details
- NVIDIA RTX 4070 SUPER: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Not affected
- Intel ARC A750, Intel ARC A580, Intel ARC B580, Intel ARC A770 8GB, NVIDIA RTX 4060, AMD RX 7600: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected

Python Details
- Intel ARC A750, Intel ARC A580, Intel ARC B580, Intel ARC A770 8GB, NVIDIA RTX 4060, AMD RX 7600: Python 3.12.7

OpenCL Details
- NVIDIA RTX 4060: GPU Compute Cores: 3072

Results overview table (rows: tests; columns: NVIDIA RTX 4070 SUPER, Intel ARC A750, Intel ARC A580, Intel ARC B580, Intel ARC A770 8GB, NVIDIA RTX 4060, AMD RX 7600). Tests covered:
- opencl-benchmark: FP64 Compute; FP32 Compute; INT64 Compute; INT32 Compute; INT16 Compute; INT8 Compute; Memory Bandwidth Coalesced Read; Memory Bandwidth Coalesced Write; FP16 Compute
- realsr-ncnn: 4x - No; 4x - Yes
- waifu2x-ncnn: 2x - 3 - Yes
- vkfft: FFT + iFFT R2C / C2R; FFT + iFFT C2C 1D batched in half precision; FFT + iFFT C2C Bluestein in single precision; FFT + iFFT C2C 1D batched in double precision; FFT + iFFT C2C 1D batched in single precision; FFT + iFFT C2C multidimensional in single precision; FFT + iFFT C2C Bluestein benchmark in double precision; FFT + iFFT C2C 1D batched in single precision, no reshuffling
- hashcat: MD5; SHA1; 7-Zip; SHA-512; TrueCrypt RIPEMD160 + XTS
- cl-mem: Copy; Read; Write
- vkresample: 2x - Double; 2x - Single
- fahbench
- clpeak: Integer Compute INT; Single-Precision Float; Double-Precision Double; Global Memory Bandwidth
- financebench: Black-Scholes OpenCL
- viennacl: CPU BLAS and OpenCL BLAS, each with sCOPY; sAXPY; sDOT; dCOPY; dAXPY; dDOT; dGEMV-N; dGEMV-T; dGEMM-NN; dGEMM-NT; dGEMM-TN; dGEMM-TT
- blender (NVIDIA OptiX): BMW27; Classroom; Fishy Cat; Barbershop; Pabellon Barcelona
- indigobench: OpenCL GPU - Bedroom; OpenCL GPU - Supercar; CPU - Bedroom; CPU - Supercar
- mandelgpu: GPU
- neatbench: GPU
- tensorflow (GPU, batch sizes 1, 16 and 32): VGG-16; AlexNet; GoogLeNet; ResNet-50
- ncnn (Vulkan GPU): mobilenet; mobilenet-v2; mobilenet-v3; shufflenet-v2; mnasnet; efficientnet-b0; blazeface; googlenet; vgg16; resnet18; alexnet; resnet50; yolov4-tiny; squeezenet_ssd; regnety_400m; vision_transformer; FastestDet; mobilenetv2-yolov3
- vkpeak: fp32-scalar; fp32-vec4; fp16-scalar; fp16-vec4; fp64-scalar; fp64-vec4; int32-scalar; int32-vec4; int16-scalar; int16-vec4
- unigine-valley: 1920 x 1080 - Fullscreen - OpenGL
- unigine-heaven: 1920 x 1080 - Fullscreen - OpenGL
- openarena: 1920 x 1080
- xonotic (1920 x 1080): Low; High; Ultra; Ultimate
- paraview (1920 x 1080): Many Spheres; Wavelet Volume; Wavelet Contour
- luxmark: GPU - Hotel; CPU+GPU - Hotel; GPU - Microphone; GPU - Luxball HDR; CPU+GPU - Microphone; CPU+GPU - Luxball HDR
- specviewperf2020 (1920 x 1080): SNX-04; CREO-03; MAYA-06; CATIA-06; ENERGY-03; MEDICAL-O3; SOLIDWORKS-07
- darktable: Boat; Masskrug; Server Rack; Server Room, each with OpenCL and CPU-only
- shoc (OpenCL): S3D; Triad; FFT SP; MD5 Hash; Reduction; GEMM SGEMM_N; Max SP Flops; Bus Speed Download; Bus Speed Readback; Texture Read Bandwidth

ProjectPhysX OpenCL-Benchmark

Operation: FP64 Compute

ProjectPhysX OpenCL-Benchmark 1.2 (TFLOPs/s, More Is Better)
- NVIDIA RTX 4070 SUPER: 0.621 (SE +/- 0.000, N = 3)
- Intel ARC B580: 0.894 (SE +/- 0.001, N = 3)
- NVIDIA RTX 4060: 0.272 (SE +/- 0.001, N = 3)
1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL
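Each result is reported with a standard error (SE) over N runs. Assuming the usual definition, the sample standard deviation divided by the square root of N, the figure could be reproduced along the following lines. This is only a sketch: the three run values are hypothetical, chosen so that they average to the reported 0.621 TFLOPs/s, because the individual runs are not given here.

```python
import math
import statistics

# Hypothetical per-run FP64 results in TFLOPs/s (NOT from the source;
# the source only reports the mean and the SE).
runs = [0.620, 0.621, 0.622]


def standard_error(samples):
    """Sample standard deviation divided by sqrt(N)."""
    return statistics.stdev(samples) / math.sqrt(len(samples))


mean = statistics.mean(runs)
se = standard_error(runs)
print(f"{mean:.3f} TFLOPs/s, SE +/- {se:.3f}, N = {len(runs)}")
```

A small SE relative to the mean indicates that the runs agreed closely with each other.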

ProjectPhysX OpenCL-Benchmark

Operation: FP32 Compute

ProjectPhysX OpenCL-Benchmark 1.2 (TFLOPs/s, More Is Better)
- NVIDIA RTX 4070 SUPER: 38.594 (SE +/- 0.031, N = 3)
- Intel ARC A750: 10.228 (SE +/- 0.010, N = 3)
- Intel ARC A580: 9.002 (SE +/- 0.013, N = 3)
- Intel ARC B580: 14.243 (SE +/- 0.011, N = 3)
- Intel ARC A770 8GB: 11.828 (SE +/- 0.026, N = 3)
- NVIDIA RTX 4060: 17.008 (SE +/- 0.083, N = 3)
1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL
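To compare the FP32 results at a glance, the values can be normalized to the fastest card. The sketch below copies the six FP32 figures from the chart above. The dictionary and function names are illustrative and not part of the Phoronix Test Suite.

```python
# FP32 Compute results in TFLOPs/s, copied from the chart above.
fp32_tflops = {
    "NVIDIA RTX 4070 SUPER": 38.594,
    "Intel ARC A750": 10.228,
    "Intel ARC A580": 9.002,
    "Intel ARC B580": 14.243,
    "Intel ARC A770 8GB": 11.828,
    "NVIDIA RTX 4060": 17.008,
}


def relative_to_best(results):
    """Express each result as a fraction of the highest value (more is better)."""
    best = max(results.values())
    return {name: value / best for name, value in results.items()}


relative = relative_to_best(fp32_tflops)

# Print the cards from fastest to slowest with their share of the top result.
for name, share in sorted(relative.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:24s} {share:6.1%}")
```

The same helper applies to any "More Is Better" chart. For "Fewer Is Better" results such as the RealSR-NCNN timings, the ratio would need to be inverted.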

ProjectPhysX OpenCL-Benchmark

Operation: INT64 Compute

ProjectPhysX OpenCL-Benchmark 1.2 (TIOPs/s, More Is Better)
- NVIDIA RTX 4070 SUPER: 4.214 (SE +/- 0.015, N = 3)
- Intel ARC A750: 0.945 (SE +/- 0.038, N = 3)
- Intel ARC A580: 1.013 (SE +/- 0.004, N = 3)
- Intel ARC B580: 0.638 (SE +/- 0.000, N = 3)
- Intel ARC A770 8GB: 1.273 (SE +/- 0.007, N = 3)
- NVIDIA RTX 4060: 2.157 (SE +/- 0.003, N = 3)
1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark

Operation: INT32 Compute

ProjectPhysX OpenCL-Benchmark 1.2 (TIOPs/s, More Is Better)
- NVIDIA RTX 4070 SUPER: 19.889 (SE +/- 0.002, N = 3)
- Intel ARC A750: 5.107 (SE +/- 0.011, N = 3)
- Intel ARC A580: 4.096 (SE +/- 0.018, N = 3)
- Intel ARC B580: 4.530 (SE +/- 0.016, N = 3)
- Intel ARC A770 8GB: 6.027 (SE +/- 0.079, N = 3)
- NVIDIA RTX 4060: 8.839 (SE +/- 0.006, N = 3)
1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark

Operation: INT16 Compute

ProjectPhysX OpenCL-Benchmark 1.2 (TIOPs/s, More Is Better)
- NVIDIA RTX 4070 SUPER: 17.170 (SE +/- 0.003, N = 3)
- Intel ARC A750: 25.390 (SE +/- 1.033, N = 3)
- Intel ARC A580: 24.056 (SE +/- 0.170, N = 3)
- Intel ARC B580: 37.354 (SE +/- 0.106, N = 3)
- Intel ARC A770 8GB: 30.720 (SE +/- 0.114, N = 3)
- NVIDIA RTX 4060: 7.691 (SE +/- 0.003, N = 3)
1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark

Operation: INT8 Compute

ProjectPhysX OpenCL-Benchmark 1.2 (TIOPs/s, More Is Better)
- NVIDIA RTX 4070 SUPER: 14.307 (SE +/- 0.046, N = 3)
- Intel ARC A750: 9.470 (SE +/- 0.057, N = 3)
- Intel ARC A580: 8.701 (SE +/- 0.057, N = 3)
- Intel ARC B580: 24.173 (SE +/- 0.304, N = 3)
- Intel ARC A770 8GB: 11.471 (SE +/- 0.048, N = 3)
- NVIDIA RTX 4060: 6.468 (SE +/- 0.041, N = 3)
1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark

Operation: Memory Bandwidth Coalesced Read

ProjectPhysX OpenCL-Benchmark 1.2 (GB/s, More Is Better)
- NVIDIA RTX 4070 SUPER: 464.86 (SE +/- 0.01, N = 3)
- Intel ARC A750: 203.73 (SE +/- 0.30, N = 3)
- Intel ARC A580: 187.20 (SE +/- 0.63, N = 3)
- Intel ARC B580: 2766.40 (SE +/- 4.24, N = 3)
- Intel ARC A770 8GB: 219.12 (SE +/- 0.17, N = 3)
- NVIDIA RTX 4060: 260.89 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark

Operation: Memory Bandwidth Coalesced Write

ProjectPhysX OpenCL-Benchmark 1.2 (GB/s, More Is Better)
- NVIDIA RTX 4070 SUPER: 455.01 (SE +/- 0.14, N = 3)
- Intel ARC A750: 398.15 (SE +/- 0.56, N = 3)
- Intel ARC A580: 408.90 (SE +/- 0.52, N = 3)
- Intel ARC B580: 991.07 (SE +/- 8.57, N = 3)
- Intel ARC A770 8GB: 413.77 (SE +/- 1.35, N = 3)
- NVIDIA RTX 4060: 266.19 (SE +/- 0.04, N = 3)
1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL

RealSR-NCNN

Scale: 4x - TAA: No

RealSR-NCNN 20200818 (Seconds, Fewer Is Better)
- NVIDIA RTX 4070 SUPER: 6.323 (SE +/- 0.150, N = 15)
- Intel ARC A770 8GB: 9.808 (SE +/- 0.018, N = 3)
- Intel ARC A750: 10.356 (SE +/- 0.008, N = 3)
- Intel ARC A580: 11.099 (SE +/- 0.021, N = 3)
- Intel ARC B580: 9.048 (SE +/- 0.052, N = 3)
- NVIDIA RTX 4060: 11.626 (SE +/- 0.028, N = 3)

RealSR-NCNN

Scale: 4x - TAA: Yes

RealSR-NCNN 20200818 (Seconds, Fewer Is Better)
- NVIDIA RTX 4070 SUPER: 34.89 (SE +/- 0.02, N = 3)
- Intel ARC A770 8GB: 61.86 (SE +/- 0.02, N = 3)
- Intel ARC A750: 66.07 (SE +/- 0.01, N = 3)
- Intel ARC A580: 72.28 (SE +/- 0.01, N = 3)
- Intel ARC B580: 52.37 (SE +/- 0.01, N = 3)
- NVIDIA RTX 4060: 69.95 (SE +/- 0.01, N = 3)

Waifu2x-NCNN Vulkan

Scale: 2x - Denoise: 3 - TAA: Yes

Waifu2x-NCNN Vulkan 20200818 (Seconds, Fewer Is Better)
- NVIDIA RTX 4070 SUPER: 2.855 (SE +/- 0.014, N = 3)
- Intel ARC A770 8GB: 5.146 (SE +/- 0.019, N = 3)
- Intel ARC A750: 5.291 (SE +/- 0.006, N = 3)
- Intel ARC A580: 5.594 (SE +/- 0.011, N = 3)
- Intel ARC B580: 4.672 (SE +/- 0.010, N = 3)
- NVIDIA RTX 4060: 4.847 (SE +/- 0.017, N = 3)

VkFFT

Test: FFT + iFFT R2C / C2R

VkFFT 1.2.31 (Benchmark Score, More Is Better)
- NVIDIA RTX 4070 SUPER: 54794 (SE +/- 702.53, N = 15)
- Intel ARC A750: 32544 (SE +/- 57.59, N = 3)
- Intel ARC A580: 29941 (SE +/- 257.48, N = 15)
- Intel ARC A770 8GB: 33065 (SE +/- 398.68, N = 3)
- NVIDIA RTX 4060: 36759 (SE +/- 10.97, N = 3)
1. (CXX) g++ options: -O3 (the chart additionally shows -lrt)

VkFFT

Test: FFT + iFFT C2C 1D batched in half precision

VkFFT 1.2.31 (Benchmark Score, More Is Better)
- NVIDIA RTX 4070 SUPER: 131705 (SE +/- 159.17, N = 3)
- Intel ARC A750: 100426 (SE +/- 82.39, N = 3)
- Intel ARC A580: 69895 (SE +/- 1784.34, N = 12)
- Intel ARC A770 8GB: 79476 (SE +/- 2172.50, N = 15)
- NVIDIA RTX 4060: 84351 (SE +/- 4.48, N = 3)
1. (CXX) g++ options: -O3 (the chart additionally shows -lrt)

VkFFT

Test: FFT + iFFT C2C Bluestein in single precision

VkFFT 1.2.31 (Benchmark Score, More Is Better)
- NVIDIA RTX 4070 SUPER: 15166 (SE +/- 102.52, N = 3)
- Intel ARC A750: 5573 (SE +/- 3.76, N = 3)
- Intel ARC A580: 5093 (SE +/- 66.00, N = 12)
- Intel ARC A770 8GB: 5805 (SE +/- 44.40, N = 10)
- NVIDIA RTX 4060: 13541 (SE +/- 162.40, N = 3)
1. (CXX) g++ options: -O3 (the chart additionally shows -lrt)

VkFFT

Test: FFT + iFFT C2C 1D batched in double precision

VkFFT 1.2.31 (Benchmark Score, More Is Better)
- NVIDIA RTX 4070 SUPER: 24317 (SE +/- 146.69, N = 3)
- NVIDIA RTX 4060: 12589 (SE +/- 19.43, N = 3)
1. (CXX) g++ options: -O3 (the chart additionally shows -lrt)

VkFFT

Test: FFT + iFFT C2C 1D batched in single precision

VkFFT 1.2.31 (Benchmark Score, More Is Better)
- NVIDIA RTX 4070 SUPER: 73929 (SE +/- 7.94, N = 3)
- Intel ARC A750: 58800 (SE +/- 38.85, N = 3)
- Intel ARC A580: 58986 (SE +/- 54.76, N = 3)
- Intel ARC A770 8GB: 58620 (SE +/- 58.21, N = 3)
- NVIDIA RTX 4060: 42924 (SE +/- 1.33, N = 3)
1. (CXX) g++ options: -O3 (the chart additionally shows -lrt)

VkFFT

Test: FFT + iFFT C2C multidimensional in single precision

VkFFT 1.2.31 (Benchmark Score, More Is Better)
- NVIDIA RTX 4070 SUPER: 50299 (SE +/- 407.19, N = 15)
- Intel ARC A750: 33011 (SE +/- 17.37, N = 3)
- Intel ARC A580: 30196 (SE +/- 408.80, N = 13)
- Intel ARC A770 8GB: 35011 (SE +/- 288.54, N = 9)
- NVIDIA RTX 4060: 41513 (SE +/- 88.79, N = 3)
1. (CXX) g++ options: -O3 (the chart additionally shows -lrt)

VkFFT

Test: FFT + iFFT C2C Bluestein benchmark in double precision

VkFFT 1.2.31 (Benchmark Score, More Is Better)
- NVIDIA RTX 4070 SUPER: 4451 (SE +/- 12.55, N = 3)
- NVIDIA RTX 4060: 2513 (SE +/- 0.33, N = 3)
1. (CXX) g++ options: -O3 (the chart additionally shows -lrt)

VkFFT

Test: FFT + iFFT C2C 1D batched in single precision, no reshuffling

VkFFT 1.2.31 (Benchmark Score, More Is Better)
- NVIDIA RTX 4070 SUPER: 75078 (SE +/- 37.77, N = 3)
- Intel ARC A750: 64004 (SE +/- 48.89, N = 3)
- Intel ARC A580: 63530 (SE +/- 4.48, N = 3)
- Intel ARC A770 8GB: 63697 (SE +/- 115.25, N = 3)
- NVIDIA RTX 4060: 43645 (SE +/- 1.20, N = 3)
1. (CXX) g++ options: -O3 (the chart additionally shows -lrt)

Hashcat

Benchmark: MD5

Hashcat 6.2.4 - H/s, More Is Better
NVIDIA RTX 4070 SUPER: 67583033333 (SE +/- 22430807.19, N = 3)
Intel ARC A750: 31010500000 (SE +/- 260097949.50, N = 3)
Intel ARC A580: 26539200000 (SE +/- 254278908.55, N = 3)
Intel ARC B580: 24367333333 (SE +/- 17415063.72, N = 3)
Intel ARC A770 8GB: 37770600000 (SE +/- 312582597.08, N = 3)
NVIDIA RTX 4060: 31013133333 (SE +/- 25273262.11, N = 3)

Hashcat

Benchmark: SHA1

Hashcat 6.2.4 - H/s, More Is Better
NVIDIA RTX 4070 SUPER: 22132600000 (SE +/- 5140363.15, N = 3)
Intel ARC A750: 5401850000 (SE +/- 65365351.42, N = 4)
Intel ARC A580: 4576025000 (SE +/- 49049760.02, N = 4)
Intel ARC B580: 2404333333 (SE +/- 3626905.63, N = 3)
Intel ARC A770 8GB: 6568950000 (SE +/- 78813212.72, N = 4)
NVIDIA RTX 4060: 9887700000 (SE +/- 3637306.70, N = 3)

Hashcat

Benchmark: 7-Zip

Hashcat 6.2.4 - H/s, More Is Better
NVIDIA RTX 4070 SUPER: 1176467 (SE +/- 1991.93, N = 3)
Intel ARC A750: 246633 (SE +/- 240.37, N = 3)
Intel ARC A580: 249667 (SE +/- 66.67, N = 3)
Intel ARC B580: 236267 (SE +/- 409.61, N = 3)
Intel ARC A770 8GB: 246100 (SE +/- 503.32, N = 3)
NVIDIA RTX 4060: 563733 (SE +/- 1790.10, N = 3)

Hashcat

Benchmark: SHA-512

Hashcat 6.2.4 - H/s, More Is Better
NVIDIA RTX 4070 SUPER: 3232733333 (SE +/- 1530068.99, N = 3)
Intel ARC A750: 943466667 (SE +/- 5228554.08, N = 3)
Intel ARC A580: 807266667 (SE +/- 1166666.67, N = 3)
Intel ARC B580: 350666667 (SE +/- 648930.74, N = 3)
Intel ARC A770 8GB: 1158266667 (SE +/- 2397452.35, N = 3)
NVIDIA RTX 4060: 1261366667 (SE +/- 466666.67, N = 3)

Hashcat

Benchmark: TrueCrypt RIPEMD160 + XTS

Hashcat 6.2.4 - H/s, More Is Better
NVIDIA RTX 4070 SUPER: 802967 (SE +/- 633.33, N = 3)
Intel ARC A750: 328400 (SE +/- 200.00, N = 3)
Intel ARC A580: 277900 (SE +/- 57.74, N = 3)
Intel ARC B580: 388067 (SE +/- 523.87, N = 3)
Intel ARC A770 8GB: 407400 (SE +/- 680.69, N = 3)
NVIDIA RTX 4060: 378670 (SE +/- 2870.27, N = 10)

cl-mem

Benchmark: Copy

cl-mem 2017-01-13 - GB/s, More Is Better
NVIDIA RTX 4070 SUPER: 331.8 (SE +/- 0.03, N = 3)
Intel ARC A750: 269.6 (SE +/- 0.13, N = 3)
Intel ARC A580: 259.7 (SE +/- 0.03, N = 3)
Intel ARC A770 8GB: 300.5 (SE +/- 0.17, N = 3)
NVIDIA RTX 4060: 221.4 (SE +/- 0.09, N = 3)
1. (CC) gcc options: -O2 -flto -lOpenCL

cl-mem

Benchmark: Read

cl-mem 2017-01-13 - GB/s, More Is Better
NVIDIA RTX 4070 SUPER: 446.2 (SE +/- 0.12, N = 3)
Intel ARC A750: 153.7 (SE +/- 0.10, N = 3)
Intel ARC A580: 144.0 (SE +/- 0.03, N = 3)
Intel ARC A770 8GB: 236.4 (SE +/- 0.00, N = 3)
NVIDIA RTX 4060: 247.9 (SE +/- 0.03, N = 3)
1. (CC) gcc options: -O2 -flto -lOpenCL

cl-mem

Benchmark: Write

cl-mem 2017-01-13 - GB/s, More Is Better
NVIDIA RTX 4070 SUPER: 407.5 (SE +/- 1.11, N = 3)
Intel ARC A750: 280.1 (SE +/- 0.15, N = 3)
Intel ARC A580: 292.8 (SE +/- 0.03, N = 3)
Intel ARC A770 8GB: 407.3 (SE +/- 0.31, N = 3)
NVIDIA RTX 4060: 245.8 (SE +/- 0.12, N = 3)
1. (CC) gcc options: -O2 -flto -lOpenCL

VkResample

Upscale: 2x - Precision: Double

VkResample 1.0 - ms, Fewer Is Better
NVIDIA RTX 4070 SUPER: 339.593 (SE +/- 0.303, N = 3)
Intel ARC B580: 4.857 (SE +/- 0.221, N = 12)
NVIDIA RTX 4060: 500.006 (SE +/- 0.001, N = 3)
1. (CXX) g++ options: -O3

VkResample

Upscale: 2x - Precision: Single

VkResample 1.0 - ms, Fewer Is Better
NVIDIA RTX 4070 SUPER: 18.489 (SE +/- 0.002, N = 3)
Intel ARC A770 8GB: 18.197 (SE +/- 0.002, N = 3)
Intel ARC A750: 18.894 (SE +/- 0.036, N = 3)
Intel ARC A580: 20.412 (SE +/- 0.025, N = 3)
Intel ARC B580: 4.693 (SE +/- 0.668, N = 15)
NVIDIA RTX 4060: 32.485 (SE +/- 0.006, N = 3)
1. (CXX) g++ options: -O3

FAHBench

FAHBench 2.3.2 - Ns Per Day, More Is Better
NVIDIA RTX 4070 SUPER: 366.06 (SE +/- 0.39, N = 3)
NVIDIA RTX 4060: 240.27 (SE +/- 0.15, N = 3)

clpeak

OpenCL Test: Integer Compute INT

clpeak 1.1.2 - GIOPS, More Is Better
NVIDIA RTX 4070 SUPER: 18170.54 (SE +/- 3.14, N = 3)
Intel ARC A750: 4885.36 (SE +/- 2.34, N = 3)
Intel ARC A580: 4133.83 (SE +/- 3.17, N = 3)
Intel ARC B580: 4539.96 (SE +/- 0.41, N = 3)
Intel ARC A770 8GB: 5978.89 (SE +/- 1.37, N = 3)
NVIDIA RTX 4060: 7558.49 (SE +/- 4.68, N = 3)
1. (CXX) g++ options: -O3

clpeak

OpenCL Test: Single-Precision Float

clpeak 1.1.2 - GFLOPS, More Is Better
NVIDIA RTX 4070 SUPER: 35492.69 (SE +/- 0.99, N = 3)
Intel ARC A750: 11380.91 (SE +/- 3.31, N = 3)
Intel ARC A580: 9758.86 (SE +/- 1.40, N = 3)
Intel ARC B580: 14529.14 (SE +/- 1.47, N = 3)
Intel ARC A770 8GB: 13012.40 (SE +/- 1.67, N = 3)
NVIDIA RTX 4060: 14737.91 (SE +/- 7.09, N = 3)
1. (CXX) g++ options: -O3

clpeak

OpenCL Test: Double-Precision Double

clpeak 1.1.2 - GFLOPS, More Is Better
NVIDIA RTX 4070 SUPER: 630.11 (SE +/- 0.98, N = 3)
Intel ARC B580: 909.31 (SE +/- 0.08, N = 3)
NVIDIA RTX 4060: 275.47 (SE +/- 1.01, N = 3)
1. (CXX) g++ options: -O3

clpeak

OpenCL Test: Global Memory Bandwidth

clpeak 1.1.2 - GBPS, More Is Better
NVIDIA RTX 4070 SUPER: 437.65 (SE +/- 0.02, N = 3)
Intel ARC A750: 396.72 (SE +/- 0.13, N = 3)
Intel ARC A580: 388.85 (SE +/- 0.13, N = 3)
Intel ARC B580: 417.56 (SE +/- 0.05, N = 3)
Intel ARC A770 8GB: 390.71 (SE +/- 0.19, N = 3)
NVIDIA RTX 4060: 245.91 (SE +/- 0.02, N = 3)
1. (CXX) g++ options: -O3

FinanceBench

Benchmark: Black-Scholes OpenCL

FinanceBench 2016-07-25 - ms, Fewer Is Better
NVIDIA RTX 4070 SUPER: 5.912000 (SE +/- 0.114129, N = 15)
Intel ARC A750: 4.842000 (SE +/- 0.212949, N = 12)
Intel ARC A580: 4.816000 (SE +/- 0.279112, N = 15)
Intel ARC B580: 18.374467 (SE +/- 0.165363, N = 15)
Intel ARC A770 8GB: 5.385000 (SE +/- 0.046820, N = 3)
NVIDIA RTX 4060: 12.817000 (SE +/- 0.009770, N = 3)
1. (CXX) g++ options: -O3 -march=native -fopenmp

ViennaCL

Test: CPU BLAS - sCOPY

ViennaCL 1.7.1 - GB/s, More Is Better
NVIDIA RTX 4070 SUPER: 132.0 (SE +/- 1.20, N = 3)
Intel ARC A770 8GB: 83.9 (SE +/- 0.41, N = 3)
Intel ARC A750: 83.7 (SE +/- 0.20, N = 3)
Intel ARC A580: 83.3 (SE +/- 0.20, N = 3)
Intel ARC B580: 85.5 (SE +/- 1.18, N = 3)
NVIDIA RTX 4060: 83.2 (SE +/- 0.44, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - sAXPY

ViennaCL 1.7.1 - GB/s, More Is Better
NVIDIA RTX 4070 SUPER: 156 (SE +/- 2.19, N = 3)
Intel ARC A770 8GB: 122 (SE +/- 0.00, N = 3)
Intel ARC A750: 122 (SE +/- 0.33, N = 3)
Intel ARC A580: 122 (SE +/- 0.88, N = 3)
Intel ARC B580: 123 (SE +/- 0.88, N = 3)
NVIDIA RTX 4060: 122 (SE +/- 0.33, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - sDOT

ViennaCL 1.7.1 - GB/s, More Is Better
NVIDIA RTX 4070 SUPER: 165.0 (SE +/- 2.73, N = 3)
Intel ARC A770 8GB: 80.6 (SE +/- 0.27, N = 3)
Intel ARC A750: 81.3 (SE +/- 0.17, N = 3)
Intel ARC A580: 81.3 (SE +/- 0.92, N = 3)
Intel ARC B580: 80.5 (SE +/- 0.52, N = 3)
NVIDIA RTX 4060: 81.2 (SE +/- 1.47, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dCOPY

ViennaCL 1.7.1 - GB/s, More Is Better
NVIDIA RTX 4070 SUPER: 70.8 (SE +/- 0.32, N = 3)
Intel ARC A770 8GB: 56.8 (SE +/- 0.12, N = 3)
Intel ARC A750: 56.4 (SE +/- 0.06, N = 3)
Intel ARC A580: 56.5 (SE +/- 0.07, N = 3)
Intel ARC B580: 56.3 (SE +/- 0.06, N = 3)
NVIDIA RTX 4060: 56.4 (SE +/- 0.03, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dAXPY

ViennaCL 1.7.1 - GB/s, More Is Better
NVIDIA RTX 4070 SUPER: 87.2 (SE +/- 0.12, N = 3)
Intel ARC A770 8GB: 78.8 (SE +/- 0.03, N = 3)
Intel ARC A750: 78.7 (SE +/- 0.09, N = 3)
Intel ARC A580: 78.6 (SE +/- 0.03, N = 3)
Intel ARC B580: 78.6 (SE +/- 0.18, N = 3)
NVIDIA RTX 4060: 78.6 (SE +/- 0.03, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dDOT

ViennaCL 1.7.1 - GB/s, More Is Better
NVIDIA RTX 4070 SUPER: 96.8 (SE +/- 0.09, N = 3)
Intel ARC A770 8GB: 77.6 (SE +/- 0.53, N = 3)
Intel ARC A750: 76.7 (SE +/- 0.15, N = 3)
Intel ARC A580: 77.8 (SE +/- 0.27, N = 3)
Intel ARC B580: 77.1 (SE +/- 0.42, N = 3)
NVIDIA RTX 4060: 75.5 (SE +/- 0.57, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMV-N

ViennaCL 1.7.1 - GB/s, More Is Better
NVIDIA RTX 4070 SUPER: 102 (SE +/- 0.33, N = 3)
Intel ARC A770 8GB: 101 (SE +/- 0.33, N = 3)
Intel ARC A750: 101 (SE +/- 0.00, N = 3)
Intel ARC A580: 101 (SE +/- 0.00, N = 3)
Intel ARC B580: 101 (SE +/- 0.00, N = 3)
NVIDIA RTX 4060: 101 (SE +/- 0.33, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMV-T

ViennaCL 1.7.1 - GB/s, More Is Better
NVIDIA RTX 4070 SUPER: 109 (SE +/- 0.33, N = 3)
Intel ARC A770 8GB: 106 (SE +/- 0.33, N = 3)
Intel ARC A750: 105 (SE +/- 0.33, N = 3)
Intel ARC A580: 105 (SE +/- 0.33, N = 3)
Intel ARC B580: 106 (SE +/- 0.33, N = 3)
NVIDIA RTX 4060: 105 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMM-NN

ViennaCL 1.7.1 - GFLOPs/s, More Is Better
NVIDIA RTX 4070 SUPER: 119 (SE +/- 4.04, N = 3)
Intel ARC A770 8GB: 134 (SE +/- 0.00, N = 3)
Intel ARC A750: 134 (SE +/- 0.33, N = 3)
Intel ARC A580: 133 (SE +/- 0.67, N = 3)
Intel ARC B580: 133 (SE +/- 1.15, N = 3)
NVIDIA RTX 4060: 133 (SE +/- 0.88, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMM-NT

ViennaCL 1.7.1 - GFLOPs/s, More Is Better
NVIDIA RTX 4070 SUPER: 117 (SE +/- 2.08, N = 3)
Intel ARC A770 8GB: 127 (SE +/- 0.33, N = 3)
Intel ARC A750: 125 (SE +/- 1.53, N = 3)
Intel ARC A580: 127 (SE +/- 1.20, N = 3)
Intel ARC B580: 127 (SE +/- 0.58, N = 3)
NVIDIA RTX 4060: 128 (SE +/- 0.58, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMM-TN

ViennaCL 1.7.1 - GFLOPs/s, More Is Better
NVIDIA RTX 4070 SUPER: 115 (SE +/- 1.00, N = 2)
Intel ARC A770 8GB: 138 (SE +/- 0.33, N = 3)
Intel ARC A750: 139 (SE +/- 0.33, N = 3)
Intel ARC A580: 137 (SE +/- 1.86, N = 3)
Intel ARC B580: 137 (SE +/- 1.15, N = 3)
NVIDIA RTX 4060: 137 (SE +/- 1.15, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMM-TT

ViennaCL 1.7.1 - GFLOPs/s, More Is Better
NVIDIA RTX 4070 SUPER: 122 (SE +/- 2.08, N = 3)
Intel ARC A770 8GB: 133 (SE +/- 0.58, N = 3)
Intel ARC A750: 133 (SE +/- 0.33, N = 3)
Intel ARC A580: 133 (SE +/- 0.33, N = 3)
Intel ARC B580: 132 (SE +/- 0.58, N = 3)
NVIDIA RTX 4060: 130 (SE +/- 2.00, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - sCOPY

ViennaCL 1.7.1 - GB/s, More Is Better
NVIDIA RTX 4070 SUPER: 334.0 (SE +/- 0.33, N = 3)
Intel ARC A750: 98.0 (SE +/- 0.09, N = 3)
Intel ARC A580: 116.0 (SE +/- 1.53, N = 3)
Intel ARC B580: 172.0 (SE +/- 4.34, N = 15)
Intel ARC A770 8GB: 157.0 (SE +/- 4.11, N = 15)
NVIDIA RTX 4060: 234.0 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - sAXPY

ViennaCL 1.7.1 - GB/s, More Is Better
NVIDIA RTX 4070 SUPER: 392.0 (SE +/- 0.00, N = 3)
Intel ARC A750: 88.1 (SE +/- 0.00, N = 3)
Intel ARC A580: 94.8 (SE +/- 1.62, N = 3)
Intel ARC B580: 262.0 (SE +/- 12.61, N = 15)
Intel ARC A770 8GB: 160.0 (SE +/- 3.84, N = 15)
NVIDIA RTX 4060: 254.0 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - sDOT

ViennaCL 1.7.1 - GB/s, More Is Better
NVIDIA RTX 4070 SUPER: 370 (SE +/- 0.00, N = 3)
Intel ARC A750: 133 (SE +/- 0.33, N = 3)
Intel ARC A580: 186 (SE +/- 2.40, N = 3)
Intel ARC B580: 263 (SE +/- 13.14, N = 15)
Intel ARC A770 8GB: 208 (SE +/- 2.91, N = 15)
NVIDIA RTX 4060: 252 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dCOPY

ViennaCL 1.7.1 - GB/s, More Is Better
NVIDIA RTX 4070 SUPER: 423 (SE +/- 0.33, N = 3)
Intel ARC B580: 348 (SE +/- 0.92, N = 15)
NVIDIA RTX 4060: 253 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dAXPY

ViennaCL 1.7.1 - GB/s, More Is Better
NVIDIA RTX 4070 SUPER: 437 (SE +/- 0.00, N = 3)
Intel ARC B580: 459 (SE +/- 0.40, N = 15)
NVIDIA RTX 4060: 257 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dDOT

ViennaCL 1.7.1 - GB/s, More Is Better
NVIDIA RTX 4070 SUPER: 458 (SE +/- 0.00, N = 3)
Intel ARC B580: 520 (SE +/- 3.40, N = 15)
NVIDIA RTX 4060: 261 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMV-N

ViennaCL 1.7.1 - GB/s, More Is Better
NVIDIA RTX 4070 SUPER: 210.0 (SE +/- 0.33, N = 3)
Intel ARC B580: 83.8 (SE +/- 0.14, N = 15)
NVIDIA RTX 4060: 203.0 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMV-T

ViennaCL 1.7.1 - GB/s, More Is Better
NVIDIA RTX 4070 SUPER: 389 (SE +/- 0.00, N = 3)
Intel ARC B580: 284 (SE +/- 0.58, N = 15)
NVIDIA RTX 4060: 259 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMM-NN

ViennaCL 1.7.1 - GFLOPs/s, More Is Better
NVIDIA RTX 4070 SUPER: 577 (SE +/- 0.00, N = 3)
Intel ARC B580: 484 (SE +/- 0.27, N = 15)
NVIDIA RTX 4060: 265 (SE +/- 0.33, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMM-NT

ViennaCL 1.7.1 - GFLOPs/s, More Is Better
NVIDIA RTX 4070 SUPER: 584 (SE +/- 0.00, N = 3)
Intel ARC B580: 491 (SE +/- 0.17, N = 15)
NVIDIA RTX 4060: 260 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMM-TN

ViennaCL 1.7.1 - GFLOPs/s, More Is Better
NVIDIA RTX 4070 SUPER: 599 (SE +/- 0.00, N = 3)
Intel ARC B580: 467 (SE +/- 0.17, N = 15)
NVIDIA RTX 4060: 271 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMM-TT

ViennaCL 1.7.1 - GFLOPs/s, More Is Better
NVIDIA RTX 4070 SUPER: 613 (SE +/- 0.00, N = 3)
Intel ARC B580: 486 (SE +/- 0.19, N = 15)
NVIDIA RTX 4060: 274 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

Blender

Blend File: BMW27 - Compute: NVIDIA OptiX

Blender 4.0 - Seconds, Fewer Is Better
NVIDIA RTX 4070 SUPER: 5.57 (SE +/- 0.06, N = 13)

Blender

Blend File: Classroom - Compute: NVIDIA OptiX

Blender 4.0 - Seconds, Fewer Is Better
NVIDIA RTX 4070 SUPER: 12.60 (SE +/- 0.00, N = 3)

Blender

Blend File: Fishy Cat - Compute: NVIDIA OptiX

Blender 4.0 - Seconds, Fewer Is Better
NVIDIA RTX 4070 SUPER: 9.45 (SE +/- 0.06, N = 13)

Blender

Blend File: Barbershop - Compute: NVIDIA OptiX

Blender 4.0 - Seconds, Fewer Is Better
NVIDIA RTX 4070 SUPER: 51.30 (SE +/- 0.10, N = 3)

Blender

Blend File: Pabellon Barcelona - Compute: NVIDIA OptiX

Blender 4.0 - Seconds, Fewer Is Better
NVIDIA RTX 4070 SUPER: 14.29 (SE +/- 0.03, N = 3)

IndigoBench

Acceleration: OpenCL GPU - Scene: Bedroom

IndigoBench 4.4 - M samples/s, More Is Better
NVIDIA RTX 4070 SUPER: 19.801 (SE +/- 0.009, N = 3)
Intel ARC A750: 9.302 (SE +/- 0.011, N = 3)
Intel ARC A580: 9.257 (SE +/- 0.004, N = 3)
Intel ARC A770 8GB: 9.460 (SE +/- 0.008, N = 3)
NVIDIA RTX 4060: 10.158 (SE +/- 0.003, N = 3)

IndigoBench

Acceleration: OpenCL GPU - Scene: Supercar

IndigoBench 4.4 - M samples/s, More Is Better
NVIDIA RTX 4070 SUPER: 52.81 (SE +/- 0.03, N = 3)
Intel ARC A750: 19.71 (SE +/- 0.02, N = 3)
Intel ARC A580: 20.17 (SE +/- 0.01, N = 3)
Intel ARC A770 8GB: 20.19 (SE +/- 0.02, N = 3)
NVIDIA RTX 4060: 30.10 (SE +/- 0.01, N = 3)

MandelGPU

OpenCL Device: GPU

MandelGPU 1.3pts1 - Samples/sec, More Is Better
NVIDIA RTX 4070 SUPER: 587219538.2 (SE +/- 467034.80, N = 3)
Intel ARC A750: 279683403.2 (SE +/- 5085622.05, N = 15)
Intel ARC A580: 248630520.1 (SE +/- 600066.13, N = 3)
Intel ARC B580: 287498206.4 (SE +/- 1909218.58, N = 3)
Intel ARC A770 8GB: 294762004.6 (SE +/- 164211.82, N = 3)
NVIDIA RTX 4060: 310180509.1 (SE +/- 772976.70, N = 3)
1. (CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL

NeatBench

Acceleration: GPU

NeatBench 5 - FPS, More Is Better
NVIDIA RTX 4070 SUPER: 4070 (SE +/- 0.00, N = 3)
NVIDIA RTX 4060: 4060 (SE +/- 0.00, N = 3)

TensorFlow

Device: GPU - Batch Size: 1 - Model: VGG-16

TensorFlow 2.12 - images/sec, More Is Better
NVIDIA RTX 4070 SUPER: 1.35
Intel ARC A750: 1.70
Intel ARC A580: 1.69
Intel ARC B580: 1.72
Intel ARC A770 8GB: 1.71
NVIDIA RTX 4060: 1.68
SE as listed in the source (five entries for six results, so not attributed per GPU): +/- 0.01, N = 3; +/- 0.00, N = 3; +/- 0.01, N = 3; +/- 0.01, N = 3; +/- 0.00, N = 3

TensorFlow

Device: GPU - Batch Size: 1 - Model: AlexNet

TensorFlow 2.12 - images/sec, More Is Better
NVIDIA RTX 4070 SUPER: 13.92 (SE +/- 0.22, N = 2)
Intel ARC A750: 15.79 (SE +/- 0.16, N = 3)
Intel ARC A580: 15.78 (SE +/- 0.16, N = 3)
Intel ARC B580: 15.88 (SE +/- 0.11, N = 3)
Intel ARC A770 8GB: 15.70 (SE +/- 0.21, N = 3)
NVIDIA RTX 4060: 15.67 (SE +/- 0.22, N = 3)

TensorFlow

Device: GPU - Batch Size: 16 - Model: VGG-16

TensorFlow 2.12 - images/sec, More Is Better
NVIDIA RTX 4070 SUPER: 1.48
Intel ARC A750: 2.39
Intel ARC A580: 2.39
Intel ARC B580: 2.41
Intel ARC A770 8GB: 2.40
NVIDIA RTX 4060: 2.39
SE as listed in the source (five entries for six results, so not attributed per GPU): +/- 0.00, N = 2; +/- 0.00, N = 3; +/- 0.01, N = 3; +/- 0.00, N = 3; +/- 0.00, N = 3

TensorFlow

Device: GPU - Batch Size: 32 - Model: VGG-16

TensorFlow 2.12 - images/sec, More Is Better
NVIDIA RTX 4070 SUPER: 1.50
Intel ARC A750: 2.42
Intel ARC A580: 2.42
Intel ARC B580: 2.42
Intel ARC A770 8GB: 2.42
NVIDIA RTX 4060: 2.43
SE as listed in the source (five entries for six results, so not attributed per GPU): +/- 0.00, N = 3; +/- 0.00, N = 3; +/- 0.01, N = 3; +/- 0.01, N = 3; +/- 0.00, N = 2

TensorFlow

Device: GPU - Batch Size: 16 - Model: AlexNet

TensorFlow 2.12 - images/sec, More Is Better
NVIDIA RTX 4070 SUPER: 31.59 (SE +/- 0.17, N = 3)
Intel ARC A750: 39.83 (SE +/- 0.42, N = 3)
Intel ARC A580: 39.55 (SE +/- 0.40, N = 3)
Intel ARC B580: 39.38 (SE +/- 0.43, N = 3)
NVIDIA RTX 4060: 39.02 (SE +/- 0.18, N = 3)

TensorFlow

Device: GPU - Batch Size: 32 - Model: AlexNet

TensorFlow 2.12 - images/sec, More Is Better
NVIDIA RTX 4070 SUPER: 33.40 (SE +/- 0.15, N = 2)
Intel ARC A750: 42.79 (SE +/- 0.19, N = 3)
Intel ARC A580: 42.85 (SE +/- 0.03, N = 3)
Intel ARC B580: 43.92 (SE +/- 0.06, N = 3)
NVIDIA RTX 4060: 43.18 (SE +/- 0.43, N = 3)

TensorFlow

Device: GPU - Batch Size: 1 - Model: GoogLeNet

TensorFlow 2.12 - images/sec, More Is Better
NVIDIA RTX 4070 SUPER: 12.62
Intel ARC A750: 15.48
Intel ARC A580: 15.71
Intel ARC B580: 15.75
Intel ARC A770 8GB: 15.56
NVIDIA RTX 4060: 15.98
SE as listed in the source (five entries for six results, so not attributed per GPU): +/- 0.17, N = 2; +/- 0.13, N = 3; +/- 0.05, N = 3; +/- 0.03, N = 3; +/- 0.18, N = 3

TensorFlow

Device: GPU - Batch Size: 1 - Model: ResNet-50

TensorFlow 2.12 - images/sec, More Is Better
NVIDIA RTX 4070 SUPER: 4.35
Intel ARC A750: 5.27
Intel ARC A580: 5.23
Intel ARC B580: 5.23
NVIDIA RTX 4060: 5.21
SE as listed in the source (four entries for five results, so not attributed per GPU): +/- 0.06, N = 3; +/- 0.04, N = 8; +/- 0.04, N = 3; +/- 0.06, N = 3

TensorFlow

Device: GPU - Batch Size: 16 - Model: GoogLeNet

TensorFlow 2.12 - images/sec, More Is Better
NVIDIA RTX 4070 SUPER: 15.67 (SE +/- 0.03, N = 3)
Intel ARC A750: 24.05 (SE +/- 0.08, N = 3)
Intel ARC A580: 24.10 (SE +/- 0.20, N = 3)
Intel ARC B580: 24.21 (SE +/- 0.23, N = 3)
NVIDIA RTX 4060: 24.23 (SE +/- 0.09, N = 3)

TensorFlow

Device: GPU - Batch Size: 16 - Model: ResNet-50

TensorFlow 2.12 - images/sec, More Is Better
NVIDIA RTX 4070 SUPER: 5.46 (SE +/- 0.00, N = 2)
Intel ARC A750: 7.83 (SE +/- 0.03, N = 3)
Intel ARC A580: 7.83 (SE +/- 0.02, N = 3)
Intel ARC B580: 7.79 (SE +/- 0.03, N = 3)
NVIDIA RTX 4060: 7.87 (SE +/- 0.06, N = 3)

TensorFlow

Device: GPU - Batch Size: 32 - Model: GoogLeNet

TensorFlow 2.12 - images/sec, More Is Better
NVIDIA RTX 4070 SUPER: 15.61 (SE +/- 0.01, N = 2)
Intel ARC A750: 26.47 (SE +/- 0.04, N = 3)
Intel ARC A580: 26.46 (SE +/- 0.06, N = 3)
Intel ARC B580: 26.53 (SE +/- 0.11, N = 3)
NVIDIA RTX 4060: 26.61 (SE +/- 0.11, N = 3)

TensorFlow

Device: GPU - Batch Size: 32 - Model: ResNet-50

TensorFlow 2.12 - images/sec, More Is Better
NVIDIA RTX 4070 SUPER: 5.51 (SE +/- 0.01, N = 2)
Intel ARC A750: 8.07 (SE +/- 0.03, N = 3)
Intel ARC B580: 8.11 (SE +/- 0.02, N = 3)
NVIDIA RTX 4060: 8.12 (SE +/- 0.03, N = 3)

NCNN

Target: Vulkan GPU - Model: mobilenet

NCNN 20230517 - ms, Fewer Is Better
NVIDIA RTX 4070 SUPER: 8.62 (SE +/- 0.47, N = 9; MIN: 6.42 / MAX: 1101.3)
Intel ARC A770 8GB: 79.79 (SE +/- 0.36, N = 3; MIN: 21.72 / MAX: 84.6)
Intel ARC A750: 76.44 (SE +/- 0.84, N = 3; MIN: 9.56 / MAX: 84)
Intel ARC A580: 80.53 (SE +/- 0.40, N = 3; MIN: 12.67 / MAX: 84.4)
Intel ARC B580: 81.02 (SE +/- 0.29, N = 3; MIN: 19.19 / MAX: 85.81)
NVIDIA RTX 4060: 82.30 (SE +/- 0.20, N = 3; MIN: 9.35 / MAX: 86.4)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: mobilenet-v2

NCNN 20230517 - ms, Fewer Is Better
NVIDIA RTX 4070 SUPER: 3.03 (SE +/- 0.44, N = 9)
Intel ARC A770 8GB: 57.30 (SE +/- 5.70, N = 3)
Intel ARC A750: 38.33 (SE +/- 2.78, N = 3)
Intel ARC A580: 51.04 (SE +/- 2.81, N = 3)
Intel ARC B580: 66.82 (SE +/- 0.73, N = 3)
NVIDIA RTX 4060: 47.76 (SE +/- 9.04, N = 3)
MIN / MAX as listed in the source (five entries for six results, so not attributed per GPU): 4.06 / 71.75; 4.08 / 72.04; 4.14 / 72.47; 4.25 / 73.44; 3.99 / 73.47
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: mobilenet-v3

NCNN 20230517 - ms, Fewer Is Better
NVIDIA RTX 4070 SUPER: 2.25 (SE +/- 0.16, N = 9)
Intel ARC A770 8GB: 15.46 (SE +/- 5.47, N = 3)
Intel ARC A750: 9.37 (SE +/- 3.80, N = 3)
Intel ARC A580: 29.04 (SE +/- 13.05, N = 3)
Intel ARC B580: 13.69 (SE +/- 5.35, N = 3)
NVIDIA RTX 4060: 16.17 (SE +/- 1.27, N = 3)
MIN / MAX as listed in the source (five entries for six results, so not attributed per GPU): 4.32 / 85.45; 4.43 / 84.15; 4.39 / 85.51; 4.33 / 84.56; 4.45 / 86.83
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: shufflenet-v2

NCNN 20230517 - ms, Fewer Is Better
NVIDIA RTX 4070 SUPER: 2.31 (SE +/- 0.34, N = 8)
Intel ARC A770 8GB: 5.93 (SE +/- 0.64, N = 3)
Intel ARC A750: 5.04 (SE +/- 0.19, N = 3)
Intel ARC A580: 37.05 (SE +/- 23.69, N = 3)
Intel ARC B580: 7.05 (SE +/- 1.45, N = 3)
NVIDIA RTX 4060: 25.74 (SE +/- 11.38, N = 3)
MIN / MAX as listed in the source (five entries for six results, so not attributed per GPU): 4.68 / 91.15; 4.69 / 90.57; 4.64 / 94.33; 4.7 / 93.12; 4.79 / 96.17
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: mnasnet

NCNN 20230517 - ms, Fewer Is Better
NVIDIA RTX 4070 SUPER: 3.85 (SE +/- 1.31, N = 9; MIN: 1.89 / MAX: 1093.29)
Intel ARC A770 8GB: 17.29 (SE +/- 4.53, N = 3; MIN: 3.85 / MAX: 70.64)
Intel ARC A750: 12.00 (SE +/- 0.84, N = 3; MIN: 4.04 / MAX: 70.29)
Intel ARC A580: 15.70 (SE +/- 7.50, N = 3; MIN: 3.96 / MAX: 70.57)
Intel ARC B580: 22.14 (SE +/- 9.99, N = 3; MIN: 4.05 / MAX: 71.13)
NVIDIA RTX 4060: 27.79 (SE +/- 8.42, N = 3; MIN: 4.01 / MAX: 72.92)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: efficientnet-b0

NCNN 20230517 - ms, Fewer Is Better
NVIDIA RTX 4070 SUPER: 5.07 (SE +/- 0.97, N = 9)
Intel ARC A770 8GB: 95.55 (SE +/- 1.81, N = 3)
Intel ARC A750: 54.56 (SE +/- 2.82, N = 3)
Intel ARC A580: 106.56 (SE +/- 3.44, N = 3)
Intel ARC B580: 92.93 (SE +/- 2.65, N = 3)
NVIDIA RTX 4060: 107.64 (SE +/- 3.15, N = 3)
MIN / MAX as listed in the source (five entries for six results, so not attributed per GPU): 6.7 / 121.28; 6.64 / 119.78; 6.77 / 121.55; 6.8 / 123; 6.78 / 124.6
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: blazeface

NCNN 20230517 - ms, Fewer Is Better
NVIDIA RTX 4070 SUPER: 0.84 (SE +/- 0.04, N = 9)
Intel ARC A770 8GB: 27.01 (SE +/- 13.01, N = 3)
Intel ARC A750: 11.51 (SE +/- 2.95, N = 3)
Intel ARC A580: 24.06 (SE +/- 9.84, N = 3)
Intel ARC B580: 27.90 (SE +/- 12.01, N = 3)
NVIDIA RTX 4060: 28.89 (SE +/- 14.52, N = 3)
MIN / MAX as listed in the source (five entries for six results, so not attributed per GPU): 2.51 / 56.02; 2.57 / 56.91; 2.55 / 57; 2.6 / 57.14; 2.58 / 59.62
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: googlenet

NCNN 20230517 - ms, Fewer Is Better
NVIDIA RTX 4070 SUPER: 11.04 (SE +/- 1.21, N = 9; MIN: 5.28 / MAX: 1769.19)
Intel ARC A770 8GB: 105.35 (SE +/- 0.15, N = 3; MIN: 8.44 / MAX: 114.62)
Intel ARC A750: 93.10 (SE +/- 2.20, N = 3; MIN: 8.42 / MAX: 113.98)
Intel ARC A580: 106.32 (SE +/- 1.35, N = 3; MIN: 8.23 / MAX: 114.52)
Intel ARC B580: 106.78 (SE +/- 0.76, N = 3; MIN: 8.63 / MAX: 115.63)
NVIDIA RTX 4060: 109.68 (SE +/- 0.17, N = 3; MIN: 8.43 / MAX: 117.99)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: vgg16

NCNN 20230517 - ms, Fewer Is Better
NVIDIA RTX 4070 SUPER: 117.81 (SE +/- 29.60, N = 9; MIN: 17.16 / MAX: 647.67)
Intel ARC A770 8GB: 46.04 (SE +/- 0.14, N = 3; MIN: 28.48 / MAX: 48.53)
Intel ARC A750: 45.54 (SE +/- 0.12, N = 3; MIN: 23.39 / MAX: 48.99)
Intel ARC A580: 46.17 (SE +/- 0.28, N = 3; MIN: 29.78 / MAX: 48.34)
Intel ARC B580: 46.37 (SE +/- 0.21, N = 3; MIN: 28.14 / MAX: 48.7)
NVIDIA RTX 4060: 46.99 (SE +/- 0.14, N = 3; MIN: 29.31 / MAX: 49.11)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: resnet18

NCNN 20230517 - ms, Fewer Is Better
NVIDIA RTX 4070 SUPER: 8.97 (SE +/- 3.49, N = 9; MIN: 3.94 / MAX: 922.04)
Intel ARC A770 8GB: 47.11 (SE +/- 0.89, N = 3; MIN: 4.95 / MAX: 51.24)
Intel ARC A750: 43.81 (SE +/- 0.87, N = 3; MIN: 5.02 / MAX: 51.96)
Intel ARC A580: 47.58 (SE +/- 0.19, N = 3; MIN: 5.09 / MAX: 52.05)
Intel ARC B580: 47.79 (SE +/- 0.23, N = 3; MIN: 5 / MAX: 52.16)
NVIDIA RTX 4060: 49.36 (SE +/- 0.23, N = 3; MIN: 5.08 / MAX: 52.55)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: alexnet

NCNN 20230517 - ms, Fewer Is Better
NVIDIA RTX 4070 SUPER: 16.17 (SE +/- 5.86, N = 9; MIN: 3.52 / MAX: 436.52)
Intel ARC A770 8GB: 23.30 (SE +/- 0.18, N = 3; MIN: 3.6 / MAX: 25.53)
Intel ARC A750: 22.00 (SE +/- 0.16, N = 3; MIN: 3.54 / MAX: 25.29)
Intel ARC A580: 23.41 (SE +/- 0.05, N = 3; MIN: 3.58 / MAX: 25.27)
Intel ARC B580: 23.51 (SE +/- 0.15, N = 3; MIN: 3.56 / MAX: 25.58)
NVIDIA RTX 4060: 24.14 (SE +/- 0.14, N = 3; MIN: 3.73 / MAX: 25.74)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: resnet50

NCNN 20230517 - ms, Fewer Is Better
NVIDIA RTX 4070 SUPER: 46.26 (SE +/- 14.70, N = 9; MIN: 7.71 / MAX: 1829.99)
Intel ARC A770 8GB: 93.18 (SE +/- 1.06, N = 3; MIN: 10.54 / MAX: 100.7)
Intel ARC A750: 87.56 (SE +/- 1.03, N = 3; MIN: 10.81 / MAX: 101.73)
Intel ARC A580: 93.48 (SE +/- 0.83, N = 3; MIN: 10.45 / MAX: 101.32)
Intel ARC B580: 95.30 (SE +/- 0.67, N = 3; MIN: 10.72 / MAX: 102.36)
NVIDIA RTX 4060: 97.13 (SE +/- 0.32, N = 3; MIN: 10.48 / MAX: 103.16)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: yolov4-tiny

NCNN 20230517 - ms, Fewer Is Better
NVIDIA RTX 4070 SUPER: 63.82 (SE +/- 10.56, N = 9; MIN: 10.28 / MAX: 858.44)
Intel ARC A770 8GB: 49.15 (SE +/- 0.10, N = 3; MIN: 20.41 / MAX: 52.43)
Intel ARC A750: 48.77 (SE +/- 0.09, N = 3; MIN: 16.94 / MAX: 52.55)
Intel ARC A580: 49.56 (SE +/- 0.02, N = 3; MIN: 18.28 / MAX: 52.1)
Intel ARC B580: 49.76 (SE +/- 0.10, N = 3; MIN: 23.41 / MAX: 52.91)
NVIDIA RTX 4060: 50.38 (SE +/- 0.22, N = 3; MIN: 20.65 / MAX: 52.87)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: squeezenet_ssd

NCNN 20230517 - ms, Fewer Is Better
NVIDIA RTX 4070 SUPER: 6.86 (SE +/- 1.76, N = 9)
Intel ARC A770 8GB: 100.40 (SE +/- 0.61, N = 3)
Intel ARC A750: 89.54 (SE +/- 1.33, N = 3)
Intel ARC A580: 101.57 (SE +/- 0.58, N = 3)
Intel ARC B580: 102.28 (SE +/- 0.48, N = 3)
NVIDIA RTX 4060: 104.61 (SE +/- 0.58, N = 3)
MIN / MAX as listed in the source (five entries for six results, so not attributed per GPU): 7.63 / 107.86; 7.64 / 108.48; 7.63 / 108.58; 7.68 / 109.63; 7.74 / 110.9
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: regnety_400m

NCNN 20230517 - ms, Fewer Is Better
NVIDIA RTX 4070 SUPER: 11.11 (SE +/- 3.28, N = 9)
Intel ARC A770 8GB: 453.80 (SE +/- 11.18, N = 3)
Intel ARC A750: 243.66 (SE +/- 13.66, N = 3)
Intel ARC A580: 462.80 (SE +/- 2.21, N = 3)
Intel ARC B580: 454.93 (SE +/- 3.35, N = 3)
NVIDIA RTX 4060: 475.92 (SE +/- 2.46, N = 3)
MIN / MAX as listed in the source (five entries for six results, so not attributed per GPU): 23.74 / 528.85; 23.78 / 525.86; 23.85 / 530.99; 23.67 / 535.54; 24.24 / 546.71
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: vision_transformer

NCNN 20230517 - ms, Fewer Is Better
NVIDIA RTX 4070 SUPER: 844.61 (SE +/- 87.53, N = 9)
Intel ARC A770 8GB: 119.05 (SE +/- 0.24, N = 3)
Intel ARC A750: 116.35 (SE +/- 0.39, N = 3)
Intel ARC A580: 119.00 (SE +/- 0.39, N = 3)
Intel ARC B580: 119.05 (SE +/- 0.48, N = 3)
NVIDIA RTX 4060: 121.57 (SE +/- 0.39, N = 3)
MIN / MAX as listed in the source (a single entry for six results, so not attributed per GPU): 46.34 / 1866.93
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: FastestDet

NCNN 20230517 - ms, Fewer Is Better
NVIDIA RTX 4070 SUPER: 2.86 (SE +/- 0.29, N = 9)
Intel ARC A770 8GB: 88.99 (SE +/- 2.65, N = 3)
Intel ARC A750: 61.47 (SE +/- 4.25, N = 3)
Intel ARC A580: 93.54 (SE +/- 1.06, N = 3)
Intel ARC B580: 94.95 (SE +/- 1.73, N = 3)
NVIDIA RTX 4060: 98.49 (SE +/- 0.40, N = 3)
MIN / MAX as listed in the source (five entries for six results, so not attributed per GPU): 5.39 / 101.65; 5.39 / 102; 5.44 / 102.43; 5.37 / 102.57; 5.5 / 104.08
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

vkpeak

fp32-scalar

vkpeak 20230730 - GFLOPS, More Is Better
Intel ARC A770 8GB: 17337.54 (SE +/- 3.56, N = 3)
Intel ARC A750: 15200.62 (SE +/- 0.08, N = 3)
Intel ARC A580: 13055.92 (SE +/- 0.91, N = 3)
Intel ARC B580: 7522.12 (SE +/- 0.34, N = 3)
NVIDIA RTX 4060: 8847.00 (SE +/- 13.34, N = 3)

vkpeak

fp32-vec4

vkpeak 20230730 - GFLOPS, More Is Better
Intel ARC A770 8GB: 11171.13 (SE +/- 0.27, N = 3)
Intel ARC A750: 9779.71 (SE +/- 0.07, N = 3)
Intel ARC A580: 8389.62 (SE +/- 0.09, N = 3)
Intel ARC B580: 7285.25 (SE +/- 0.14, N = 3)
NVIDIA RTX 4060: 11702.98 (SE +/- 21.71, N = 3)

vkpeak

fp16-scalar

vkpeak 20230730 - GFLOPS, More Is Better
Intel ARC A770 8GB: 24468.27 (SE +/- 0.22, N = 3)
Intel ARC A750: 21413.29 (SE +/- 0.30, N = 3)
Intel ARC A580: 18359.49 (SE +/- 0.10, N = 3)
Intel ARC B580: 18812.67 (SE +/- 0.80, N = 3)
NVIDIA RTX 4060: 8750.25 (SE +/- 5.98, N = 3)

vkpeak

fp16-vec4

vkpeak 20230730 - fp16-vec4 (GFLOPS, more is better)
Intel ARC A770 8GB: 38575.11 (SE +/- 0.43, N = 3)
Intel ARC A750: 33769.38 (SE +/- 0.80, N = 3)
Intel ARC A580: 28957.52 (SE +/- 0.72, N = 3)
Intel ARC B580: 15029.63 (SE +/- 1.51, N = 3)
NVIDIA RTX 4060: 17491.39 (SE +/- 2.36, N = 3)

vkpeak

int32-scalar

vkpeak 20230730 - int32-scalar (GIOPS, more is better)
Intel ARC A770 8GB: 4675.06 (SE +/- 0.04, N = 3)
Intel ARC A750: 4082.11 (SE +/- 0.14, N = 3)
Intel ARC A580: 3504.13 (SE +/- 0.11, N = 3)
Intel ARC B580: 3473.88 (SE +/- 0.26, N = 3)
NVIDIA RTX 4060: 8863.41 (SE +/- 0.62, N = 3)

vkpeak

int32-vec4

vkpeak 20230730 - int32-vec4 (GIOPS, more is better)
Intel ARC A770 8GB: 4853.46 (SE +/- 0.03, N = 3)
Intel ARC A750: 4242.30 (SE +/- 0.15, N = 3)
Intel ARC A580: 3636.35 (SE +/- 0.06, N = 3)
Intel ARC B580: 3603.13 (SE +/- 0.28, N = 3)
NVIDIA RTX 4060: 8809.59 (SE +/- 13.86, N = 3)

vkpeak

int16-scalar

vkpeak 20230730 - int16-scalar (GIOPS, more is better)
Intel ARC A770 8GB: 9201.64 (SE +/- 0.19, N = 3)
Intel ARC A750: 8053.08 (SE +/- 0.05, N = 3)
Intel ARC A580: 6905.61 (SE +/- 0.07, N = 3)
Intel ARC B580: 10654.95 (SE +/- 0.79, N = 3)
NVIDIA RTX 4060: 5886.51 (SE +/- 1.37, N = 3)

vkpeak

int16-vec4

vkpeak 20230730 - int16-vec4 (GIOPS, more is better)
Intel ARC A770 8GB: 9624.66 (SE +/- 0.11, N = 3)
Intel ARC A750: 8426.59 (SE +/- 0.15, N = 3)
Intel ARC A580: 7230.54 (SE +/- 0.13, N = 3)
Intel ARC B580: 11881.63 (SE +/- 0.75, N = 3)
NVIDIA RTX 4060: 7614.64 (SE +/- 12.29, N = 3)

NCNN

Target: Vulkan GPU - Model: mobilenetv2-yolov3

NCNN 20230517 - Target: Vulkan GPU - Model: mobilenetv2-yolov3 (ms, fewer is better)
Intel ARC A770 8GB: 79.79 (SE +/- 0.36, N = 3; MIN: 21.72 / MAX: 84.6)
Intel ARC A750: 76.44 (SE +/- 0.84, N = 3; MIN: 9.56 / MAX: 84)
Intel ARC A580: 80.53 (SE +/- 0.40, N = 3; MIN: 12.67 / MAX: 84.4)
Intel ARC B580: 81.02 (SE +/- 0.29, N = 3; MIN: 19.19 / MAX: 85.81)
NVIDIA RTX 4060: 82.30 (SE +/- 0.20, N = 3; MIN: 9.35 / MAX: 86.4)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

Unigine Valley

Resolution: 1920 x 1080 - Mode: Fullscreen - Renderer: OpenGL

Unigine Valley 1.0 - Resolution: 1920 x 1080 - Mode: Fullscreen - Renderer: OpenGL (Frames Per Second, more is better)
Intel ARC A750: 226.17 (SE +/- 0.79, N = 3)
Intel ARC A580: 216.33 (SE +/- 0.15, N = 3)
Intel ARC B580: 257.31 (SE +/- 2.45, N = 3)
Intel ARC A770 8GB: 242.15 (SE +/- 0.55, N = 3)

vkpeak

fp32-scalar

vkpeak 20240505 - fp32-scalar (GFLOPS, more is better)
Intel ARC A750: 16868.93 (SE +/- 26.85, N = 3)
Intel ARC A580: 14354.21 (SE +/- 6.23, N = 3)
Intel ARC B580: 7291.00 (SE +/- 0.18, N = 3)
Intel ARC A770 8GB: 19353.08 (SE +/- 0.16, N = 3)
NVIDIA RTX 4060: 8858.91 (SE +/- 35.38, N = 3)

vkpeak

fp32-vec4

vkpeak 20240505 - fp32-vec4 (GFLOPS, more is better)
Intel ARC A750: 13898.32 (SE +/- 0.79, N = 3)
Intel ARC A580: 11908.73 (SE +/- 0.40, N = 3)
Intel ARC B580: 10725.61 (SE +/- 0.49, N = 3)
Intel ARC A770 8GB: 15881.62 (SE +/- 0.28, N = 3)
NVIDIA RTX 4060: 11436.13 (SE +/- 29.12, N = 3)

vkpeak

fp16-scalar

vkpeak 20240505 - fp16-scalar (GFLOPS, more is better)
Intel ARC A750: 21379.22 (SE +/- 1.00, N = 3)
Intel ARC A580: 18324.92 (SE +/- 0.21, N = 3)
Intel ARC B580: 21097.33 (SE +/- 1.14, N = 3)
Intel ARC A770 8GB: 24444.50 (SE +/- 2.29, N = 3)
NVIDIA RTX 4060: 8610.31 (SE +/- 12.10, N = 3)

vkpeak

fp16-vec4

vkpeak 20240505 - fp16-vec4 (GFLOPS, more is better)
Intel ARC A750: 22840.73 (SE +/- 0.55, N = 3)
Intel ARC A580: 19567.54 (SE +/- 0.47, N = 3)
Intel ARC B580: 23932.87 (SE +/- 0.75, N = 3)
Intel ARC A770 8GB: 26100.47 (SE +/- 0.11, N = 3)
NVIDIA RTX 4060: 9922.29 (SE +/- 16.19, N = 3)

VkFFT

Test: FFT + iFFT R2C / C2R

VkFFT 1.3.4 - Test: FFT + iFFT R2C / C2R (Benchmark Score, more is better)
Intel ARC A750: 31218 (SE +/- 232.00, N = 15)
Intel ARC A580: 30388 (SE +/- 118.37, N = 3)
Intel ARC A770 8GB: 32590 (SE +/- 314.58, N = 6)
NVIDIA RTX 4060: 36692 (SE +/- 2.52, N = 3)
1. (CXX) g++ options: -O3

VkFFT

Test: FFT + iFFT C2C 1D batched in half precision

VkFFT 1.3.4 - Test: FFT + iFFT C2C 1D batched in half precision (Benchmark Score, more is better)
Intel ARC A750: 70571 (SE +/- 2534.09, N = 15)
Intel ARC A580: 72346 (SE +/- 1441.23, N = 12)
Intel ARC A770 8GB: 77524 (SE +/- 901.86, N = 4)
NVIDIA RTX 4060: 84346 (SE +/- 5.78, N = 3)
1. (CXX) g++ options: -O3

VkFFT

Test: FFT + iFFT C2C Bluestein in single precision

VkFFT 1.3.4 - Test: FFT + iFFT C2C Bluestein in single precision (Benchmark Score, more is better)
Intel ARC A750: 5425 (SE +/- 58.43, N = 3)
Intel ARC A580: 5134 (SE +/- 18.21, N = 3)
NVIDIA RTX 4060: 13700 (SE +/- 11.15, N = 3)
1. (CXX) g++ options: -O3

VkFFT

Test: FFT + iFFT C2C 1D batched in double precision

VkFFT 1.3.4 - Test: FFT + iFFT C2C 1D batched in double precision (Benchmark Score, more is better)
NVIDIA RTX 4060: 12570 (SE +/- 12.66, N = 3)
1. (CXX) g++ options: -O3

VkFFT

Test: FFT + iFFT C2C 1D batched in single precision

VkFFT 1.3.4 - Test: FFT + iFFT C2C 1D batched in single precision (Benchmark Score, more is better)
Intel ARC A750: 58625 (SE +/- 70.72, N = 3)
Intel ARC A580: 58426 (SE +/- 583.21, N = 3)
NVIDIA RTX 4060: 42892 (SE +/- 18.75, N = 3)
1. (CXX) g++ options: -O3

VkFFT

Test: FFT + iFFT C2C multidimensional in single precision

VkFFT 1.3.4 - Test: FFT + iFFT C2C multidimensional in single precision (Benchmark Score, more is better)
Intel ARC A750: 31532
Intel ARC A580: 30521
Intel ARC A770 8GB: 35109
NVIDIA RTX 4060: 41434
SE values as listed (three entries for four results, so not attributed): SE +/- 251.38, N = 12; SE +/- 20.17, N = 3; SE +/- 47.16, N = 3
1. (CXX) g++ options: -O3

VkFFT

Test: FFT + iFFT C2C Bluestein benchmark in double precision

VkFFT 1.3.4 - Test: FFT + iFFT C2C Bluestein benchmark in double precision (Benchmark Score, more is better)
NVIDIA RTX 4060: 2503 (SE +/- 2.00, N = 3)
1. (CXX) g++ options: -O3

VkFFT

Test: FFT + iFFT C2C 1D batched in single precision, no reshuffling

VkFFT 1.3.4 - Test: FFT + iFFT C2C 1D batched in single precision, no reshuffling (Benchmark Score, more is better)
Intel ARC A750: 63113 (SE +/- 527.31, N = 3)
Intel ARC A580: 63157 (SE +/- 36.04, N = 3)
NVIDIA RTX 4060: 43618 (SE +/- 1.53, N = 3)
1. (CXX) g++ options: -O3

OpenArena

Resolution: 1920 x 1080

OpenArena 0.8.8 - Resolution: 1920 x 1080 (Frames Per Second, more is better)
Intel ARC A750: 469.5 (SE +/- 6.89, N = 15)
Intel ARC A580: 524.2 (SE +/- 4.78, N = 15)
Intel ARC B580: 339.9 (SE +/- 5.90, N = 15)
Intel ARC A770 8GB: 327.8 (SE +/- 3.84, N = 4)
NVIDIA RTX 4060: 37.7 (SE +/- 0.23, N = 3)
MIN: 1 is listed for four of the five results; the export does not say which.

OpenArena

Resolution: 1920 x 1080 - Total Frame Time

OpenArena 0.8.8 - Resolution: 1920 x 1080 - Total Frame Time (Milliseconds, fewer is better)
Intel ARC A750: Min: 1 / Avg: 2.05 / Max: 14
Intel ARC A580: Min: 1 / Avg: 1.99 / Max: 14
Intel ARC B580: Min: 1 / Avg: 2.81 / Max: 63
Intel ARC A770 8GB: Min: 1 / Avg: 3.03 / Max: 15
NVIDIA RTX 4060: Min: 5 / Avg: 26.19 / Max: 60

Unigine Heaven

Resolution: 1920 x 1080 - Mode: Fullscreen - Renderer: OpenGL

Unigine Heaven 4.0 - Resolution: 1920 x 1080 - Mode: Fullscreen - Renderer: OpenGL (Frames Per Second, more is better)
Intel ARC A750: 224.52 (SE +/- 0.14, N = 3)
Intel ARC A580: 205.25 (SE +/- 0.23, N = 3)
Intel ARC B580: 246.52 (SE +/- 0.78, N = 3)
Intel ARC A770 8GB: 228.95 (SE +/- 1.39, N = 3)

Xonotic

Resolution: 1920 x 1080 - Effects Quality: Low

Xonotic 0.8.6 - Resolution: 1920 x 1080 - Effects Quality: Low (Frames Per Second, more is better)
Intel ARC A750: 932.79 (SE +/- 7.95, N = 3; MIN: 597 / MAX: 1516)
Intel ARC A580: 1051.51 (SE +/- 11.78, N = 3; MIN: 704 / MAX: 1770)
Intel ARC B580: 802.21 (SE +/- 11.01, N = 15; MIN: 175 / MAX: 1379)
Intel ARC A770 8GB: 932.01 (SE +/- 1.61, N = 3; MIN: 596 / MAX: 1412)
NVIDIA RTX 4060: 33.38 (SE +/- 8.14, N = 12)

Xonotic

Resolution: 1920 x 1080 - Effects Quality: High

Xonotic 0.8.6 - Resolution: 1920 x 1080 - Effects Quality: High (Frames Per Second, more is better)
Intel ARC A750: 777.93 (SE +/- 3.18, N = 3; MIN: 437 / MAX: 1200)
Intel ARC A580: 856.72 (SE +/- 6.14, N = 15; MIN: 458 / MAX: 1375)
Intel ARC B580: 714.44 (SE +/- 10.51, N = 15; MIN: 160 / MAX: 1153)
Intel ARC A770 8GB: 713.34 (SE +/- 6.69, N = 15; MIN: 407 / MAX: 1199)

Xonotic

Resolution: 1920 x 1080 - Effects Quality: Ultra

Xonotic 0.8.6 - Resolution: 1920 x 1080 - Effects Quality: Ultra (Frames Per Second, more is better)
Intel ARC A750: 735.34 (SE +/- 8.54, N = 3; MIN: 308 / MAX: 1171)
Intel ARC A580: 809.46 (SE +/- 3.16, N = 3; MIN: 319 / MAX: 1303)
Intel ARC B580: 682.99 (SE +/- 8.37, N = 3; MIN: 261 / MAX: 1086)
Intel ARC A770 8GB: 697.08 (SE +/- 3.20, N = 3; MIN: 264 / MAX: 1174)

Xonotic

Resolution: 1920 x 1080 - Effects Quality: Ultimate

Xonotic 0.8.6 - Resolution: 1920 x 1080 - Effects Quality: Ultimate (Frames Per Second, more is better)
Intel ARC A750: 551.35 (SE +/- 2.85, N = 3; MIN: 110 / MAX: 1221)
Intel ARC A580: 575.51 (SE +/- 5.72, N = 3; MIN: 106 / MAX: 1264)
Intel ARC B580: 544.26 (SE +/- 4.54, N = 3; MIN: 101 / MAX: 1092)
Intel ARC A770 8GB: 541.20 (SE +/- 4.39, N = 3; MIN: 106 / MAX: 1117)

ParaView

Test: Many Spheres - Resolution: 1920 x 1080

ParaView 5.13 - Test: Many Spheres - Resolution: 1920 x 1080 (Frames / Sec, more is better)
Intel ARC A750: 72.27 (SE +/- 0.80, N = 5)
Intel ARC A580: 65.03 (SE +/- 0.98, N = 15)
Intel ARC B580: 47.14 (SE +/- 0.08, N = 3)
Intel ARC A770 8GB: 81.49 (SE +/- 1.49, N = 15)

ParaView

Test: Many Spheres - Resolution: 1920 x 1080

ParaView 5.13 - Test: Many Spheres - Resolution: 1920 x 1080 (MiPolys / Sec, more is better)
Intel ARC A750: 7245.45 (SE +/- 80.48, N = 5)
Intel ARC A580: 6519.45 (SE +/- 98.62, N = 15)
Intel ARC B580: 4726.02 (SE +/- 8.04, N = 3)
Intel ARC A770 8GB: 8169.69 (SE +/- 149.42, N = 15)

ParaView

Test: Wavelet Volume - Resolution: 1920 x 1080

ParaView 5.13 - Test: Wavelet Volume - Resolution: 1920 x 1080 (Frames / Sec, more is better)
Intel ARC A750: 243.83 (SE +/- 4.59, N = 15)
Intel ARC A580: 230.75 (SE +/- 2.19, N = 15)
Intel ARC B580: 378.42 (SE +/- 6.20, N = 15)
Intel ARC A770 8GB: 224.33 (SE +/- 2.76, N = 15)
NVIDIA RTX 4060: 40.75 (SE +/- 0.40, N = 6)

ParaView

Test: Wavelet Volume - Resolution: 1920 x 1080

ParaView 5.13 - Test: Wavelet Volume - Resolution: 1920 x 1080 (MiVoxels / Sec, more is better)
Intel ARC A750: 3901.28 (SE +/- 73.43, N = 15)
Intel ARC A580: 3692.08 (SE +/- 35.11, N = 15)
Intel ARC B580: 6054.66 (SE +/- 99.16, N = 15)
Intel ARC A770 8GB: 3589.28 (SE +/- 44.24, N = 15)
NVIDIA RTX 4060: 651.94 (SE +/- 6.43, N = 6)

ParaView

Test: Wavelet Contour - Resolution: 1920 x 1080

ParaView 5.13 - Test: Wavelet Contour - Resolution: 1920 x 1080 (Frames / Sec, more is better)
Intel ARC A750: 278.15 (SE +/- 0.99, N = 3)
Intel ARC A580: 252.66 (SE +/- 2.75, N = 4)
Intel ARC B580: 318.40 (SE +/- 2.56, N = 3)
Intel ARC A770 8GB: 299.42 (SE +/- 0.85, N = 3)
NVIDIA RTX 4060: 1.81 (SE +/- 0.00, N = 3)

ParaView

Test: Wavelet Contour - Resolution: 1920 x 1080

ParaView 5.13 - Test: Wavelet Contour - Resolution: 1920 x 1080 (MiPolys / Sec, more is better)
Intel ARC A750: 2898.67 (SE +/- 10.34, N = 3)
Intel ARC A580: 2633.04 (SE +/- 28.63, N = 4)
Intel ARC B580: 3318.12 (SE +/- 26.72, N = 3)
Intel ARC A770 8GB: 3120.35 (SE +/- 8.88, N = 3)
NVIDIA RTX 4060: 18.93 (SE +/- 0.03, N = 3)

IndigoBench

Acceleration: CPU - Scene: Bedroom

IndigoBench 4.4 - Acceleration: CPU - Scene: Bedroom (M samples/s, more is better)
Intel ARC A750: 4.932 (SE +/- 0.046, N = 3)
Intel ARC A580: 4.978 (SE +/- 0.015, N = 3)
Intel ARC B580: 4.910 (SE +/- 0.038, N = 3)

IndigoBench

Acceleration: CPU - Scene: Supercar

IndigoBench 4.4 - Acceleration: CPU - Scene: Supercar (M samples/s, more is better)
Intel ARC A750: 12.53 (SE +/- 0.03, N = 3)
Intel ARC A580: 12.86 (SE +/- 0.13, N = 3)
Intel ARC B580: 12.64 (SE +/- 0.05, N = 3)

LuxMark

OpenCL Device: GPU - Scene: Hotel

LuxMark 3.1 - OpenCL Device: GPU - Scene: Hotel (Score, more is better)
Intel ARC A750: 13236 (SE +/- 25.67, N = 3)
Intel ARC A580: 11015 (SE +/- 11.33, N = 3)
Intel ARC B580: 7890 (SE +/- 15.33, N = 3)
NVIDIA RTX 4060: 8992 (SE +/- 0.33, N = 3)

LuxMark

OpenCL Device: CPU+GPU - Scene: Hotel

LuxMark 3.1 - OpenCL Device: CPU+GPU - Scene: Hotel (Score, more is better)
Intel ARC A750: 13262 (SE +/- 0.33, N = 3)
Intel ARC A580: 11010 (SE +/- 1.76, N = 3)
Intel ARC B580: 7904 (SE +/- 0.33, N = 3)
NVIDIA RTX 4060: 8992 (SE +/- 0.88, N = 3)

LuxMark

OpenCL Device: GPU - Scene: Microphone

LuxMark 3.1 - OpenCL Device: GPU - Scene: Microphone (Score, more is better)
Intel ARC A750: 45988 (SE +/- 191.00, N = 3)
Intel ARC A580: 38157 (SE +/- 58.92, N = 3)
Intel ARC B580: 25097 (SE +/- 30.20, N = 3)
NVIDIA RTX 4060: 31578 (SE +/- 34.98, N = 3)

LuxMark

OpenCL Device: GPU - Scene: Luxball HDR

LuxMark 3.1 - OpenCL Device: GPU - Scene: Luxball HDR (Score, more is better)
Intel ARC A750: 60251 (SE +/- 144.44, N = 3)
Intel ARC A580: 50588 (SE +/- 167.49, N = 3)
Intel ARC B580: 32992 (SE +/- 56.75, N = 3)
NVIDIA RTX 4060: 39772 (SE +/- 103.67, N = 3)

LuxMark

OpenCL Device: CPU+GPU - Scene: Microphone

LuxMark 3.1 - OpenCL Device: CPU+GPU - Scene: Microphone (Score, more is better)
Intel ARC A750: 45844 (SE +/- 14.17, N = 3)
Intel ARC A580: 38224 (SE +/- 9.50, N = 3)
Intel ARC B580: 25003 (SE +/- 14.88, N = 3)
NVIDIA RTX 4060: 31547 (SE +/- 8.88, N = 3)

LuxMark

OpenCL Device: CPU+GPU - Scene: Luxball HDR

LuxMark 3.1 - OpenCL Device: CPU+GPU - Scene: Luxball HDR (Score, more is better)
Intel ARC A750: 60539 (SE +/- 154.09, N = 3)
Intel ARC A580: 50788 (SE +/- 62.27, N = 3)
Intel ARC B580: 32949 (SE +/- 7.13, N = 3)
NVIDIA RTX 4060: 39713 (SE +/- 32.19, N = 3)

SPECViewPerf 2020

Resolution: 1920 x 1080 - Viewset: SNX-04

SPECViewPerf 2020 3.0 - Resolution: 1920 x 1080 - Viewset: SNX-04 (Composite Score, more is better)
Intel ARC A750: 172.03 (SE +/- 0.08, N = 3)
Intel ARC A580: 168.75 (SE +/- 0.33, N = 3)
Intel ARC B580: 182.04 (SE +/- 0.59, N = 3)

SPECViewPerf 2020

Resolution: 1920 x 1080 - Viewset: CREO-03

SPECViewPerf 2020 3.0 - Resolution: 1920 x 1080 - Viewset: CREO-03 (Composite Score, more is better)
Intel ARC A750: 62.65 (SE +/- 0.06, N = 3)
Intel ARC A580: 61.80 (SE +/- 0.04, N = 3)
Intel ARC B580: 47.15 (SE +/- 0.04, N = 3)

SPECViewPerf 2020

Resolution: 1920 x 1080 - Viewset: MAYA-06

SPECViewPerf 2020 3.0 - Resolution: 1920 x 1080 - Viewset: MAYA-06 (Composite Score, more is better)
Intel ARC A750: 165.76 (SE +/- 0.32, N = 3)
Intel ARC A580: 159.42 (SE +/- 0.08, N = 3)
Intel ARC B580: 190.41 (SE +/- 2.14, N = 4)

SPECViewPerf 2020

Resolution: 1920 x 1080 - Viewset: CATIA-06

SPECViewPerf 2020 3.0 - Resolution: 1920 x 1080 - Viewset: CATIA-06 (Composite Score, more is better)
Intel ARC A750: 47.85 (SE +/- 0.05, N = 3)
Intel ARC A580: 45.84 (SE +/- 0.10, N = 3)
Intel ARC B580: 56.71 (SE +/- 0.04, N = 3)

SPECViewPerf 2020

Resolution: 1920 x 1080 - Viewset: ENERGY-03

SPECViewPerf 2020 3.0 - Resolution: 1920 x 1080 - Viewset: ENERGY-03 (Composite Score, more is better)
Intel ARC A750: 31.17 (SE +/- 0.00, N = 3)
Intel ARC A580: 27.74 (SE +/- 0.00, N = 3)
Intel ARC B580: 37.17 (SE +/- 0.01, N = 3)

SPECViewPerf 2020

Resolution: 1920 x 1080 - Viewset: MEDICAL-O3

SPECViewPerf 2020 3.0 - Resolution: 1920 x 1080 - Viewset: MEDICAL-O3 (Composite Score, more is better)
Intel ARC A750: 40.97 (SE +/- 0.00, N = 3)
Intel ARC A580: 39.13 (SE +/- 0.00, N = 3)
Intel ARC B580: 49.05 (SE +/- 0.01, N = 3)

SPECViewPerf 2020

Resolution: 1920 x 1080 - Viewset: SOLIDWORKS-07

SPECViewPerf 2020 3.0 - Resolution: 1920 x 1080 - Viewset: SOLIDWORKS-07 (Composite Score, more is better)
Intel ARC A750: 73.13 (SE +/- 0.01, N = 3)
Intel ARC A580: 70.72 (SE +/- 0.00, N = 3)
Intel ARC B580: 224.20 (SE +/- 0.02, N = 3)

Darktable

Test: Boat - Acceleration: OpenCL

Darktable 4.8.1 - Test: Boat - Acceleration: OpenCL (Seconds, fewer is better)
Intel ARC A750: 2.900 (SE +/- 0.011, N = 3)
Intel ARC A580: 2.916 (SE +/- 0.004, N = 3)
Intel ARC B580: 2.912 (SE +/- 0.003, N = 3)
NVIDIA RTX 4060: 2.899 (SE +/- 0.011, N = 3)

Darktable

Test: Boat - Acceleration: CPU-only

Darktable 4.8.1 - Test: Boat - Acceleration: CPU-only (Seconds, fewer is better)
Intel ARC A750: 2.905 (SE +/- 0.012, N = 3)
Intel ARC A580: 2.920 (SE +/- 0.005, N = 3)
Intel ARC B580: 2.913 (SE +/- 0.013, N = 3)
NVIDIA RTX 4060: 2.900 (SE +/- 0.002, N = 3)

Darktable

Test: Masskrug - Acceleration: OpenCL

Darktable 4.8.1 - Test: Masskrug - Acceleration: OpenCL (Seconds, fewer is better)
Intel ARC A750: 1.690 (SE +/- 0.005, N = 3)
Intel ARC A580: 1.680 (SE +/- 0.006, N = 3)
Intel ARC B580: 1.686 (SE +/- 0.005, N = 3)
NVIDIA RTX 4060: 1.711 (SE +/- 0.001, N = 3)

Darktable

Test: Masskrug - Acceleration: CPU-only

Darktable 4.8.1 - Test: Masskrug - Acceleration: CPU-only (Seconds, fewer is better)
Intel ARC A750: 1.689 (SE +/- 0.009, N = 3)
Intel ARC A580: 1.688 (SE +/- 0.005, N = 3)
Intel ARC B580: 1.682 (SE +/- 0.008, N = 3)
NVIDIA RTX 4060: 1.695 (SE +/- 0.021, N = 4)

Darktable

Test: Server Rack - Acceleration: OpenCL

Darktable 4.8.1 - Test: Server Rack - Acceleration: OpenCL (Seconds, fewer is better)
Intel ARC A750: 0.172 (SE +/- 0.000, N = 3)
Intel ARC A580: 0.176 (SE +/- 0.002, N = 3)
Intel ARC B580: 0.174 (SE +/- 0.001, N = 3)
NVIDIA RTX 4060: 0.172 (SE +/- 0.001, N = 3)

Darktable

Test: Server Room - Acceleration: OpenCL

Darktable 4.8.1 - Test: Server Room - Acceleration: OpenCL (Seconds, fewer is better)
Intel ARC A750: 1.362 (SE +/- 0.001, N = 3)
Intel ARC A580: 1.385 (SE +/- 0.005, N = 3)
Intel ARC B580: 1.374 (SE +/- 0.004, N = 3)
NVIDIA RTX 4060: 1.385 (SE +/- 0.019, N = 3)

Darktable

Test: Server Rack - Acceleration: CPU-only

Darktable 4.8.1 - Test: Server Rack - Acceleration: CPU-only (Seconds, fewer is better)
Intel ARC A750: 0.173 (SE +/- 0.001, N = 3)
Intel ARC A580: 0.172 (SE +/- 0.001, N = 3)
Intel ARC B580: 0.174 (SE +/- 0.000, N = 3)
NVIDIA RTX 4060: 0.174 (SE +/- 0.002, N = 4)

Darktable

Test: Server Room - Acceleration: CPU-only

Darktable 4.8.1 - Test: Server Room - Acceleration: CPU-only (Seconds, fewer is better)
Intel ARC A750: 1.381 (SE +/- 0.008, N = 3)
Intel ARC A580: 1.373 (SE +/- 0.004, N = 3)
Intel ARC B580: 1.380 (SE +/- 0.001, N = 3)
NVIDIA RTX 4060: 1.398 (SE +/- 0.008, N = 3)

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: S3D

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: S3D (GFLOPS, more is better)
Intel ARC A750: 224.72 (SE +/- 0.74, N = 3)
Intel ARC A580: 216.39 (SE +/- 0.20, N = 3)
Intel ARC B580: 106.26 (SE +/- 0.03, N = 3)
NVIDIA RTX 4060: 160.27 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Triad

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Triad (GB/s, more is better)
Intel ARC A750: 18.04 (SE +/- 0.01, N = 3)
Intel ARC A580: 17.77 (SE +/- 0.02, N = 3)
Intel ARC B580: 12.48 (SE +/- 0.03, N = 3)
NVIDIA RTX 4060: 12.35 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: FFT SP

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: FFT SP (GFLOPS, more is better)
Intel ARC A750: 1187.36 (SE +/- 9.13, N = 3)
Intel ARC A580: 1192.99 (SE +/- 12.21, N = 3)
Intel ARC B580: 1082.40 (SE +/- 0.72, N = 3)
NVIDIA RTX 4060: 743.91 (SE +/- 0.17, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: MD5 Hash

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: MD5 Hash (GHash/s, more is better)
Intel ARC A750: 22.72 (SE +/- 0.01, N = 3)
Intel ARC A580: 19.53 (SE +/- 0.03, N = 3)
Intel ARC B580: 16.77 (SE +/- 0.00, N = 3)
NVIDIA RTX 4060: 19.63 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Reduction

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Reduction (GB/s, more is better)
Intel ARC A750: 74.53 (SE +/- 0.04, N = 3)
Intel ARC A580: 69.48 (SE +/- 0.64, N = 7)
Intel ARC B580: 63.18 (SE +/- 0.11, N = 3)
NVIDIA RTX 4060: 256.11 (SE +/- 0.04, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: GEMM SGEMM_N

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: GEMM SGEMM_N (GFLOPS, more is better)
Intel ARC A750: 2049.20 (SE +/- 15.19, N = 15)
Intel ARC A580: 1777.18 (SE +/- 3.79, N = 3)
Intel ARC B580: 4966.77 (SE +/- 60.21, N = 3)
NVIDIA RTX 4060: 4843.39 (SE +/- 39.56, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Max SP Flops

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Max SP Flops (GFLOPS, more is better)
Intel ARC A750: 2804852.0 (SE +/- 132865.40, N = 12)
Intel ARC A580: 2345224.0 (SE +/- 154699.96, N = 15)
Intel ARC B580: 709378.0 (SE +/- 33666.55, N = 15)
NVIDIA RTX 4060: 17046.3 (SE +/- 80.37, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Bus Speed Download

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Bus Speed Download (GB/s, more is better)
Intel ARC A750: 18.66 (SE +/- 0.00, N = 3)
Intel ARC A580: 18.66 (SE +/- 0.00, N = 3)
Intel ARC B580: 13.45 (SE +/- 0.00, N = 3)
NVIDIA RTX 4060: 12.72 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Bus Speed Readback

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Bus Speed Readback (GB/s, more is better)
Intel ARC A750: 22.44 (SE +/- 0.00, N = 3)
Intel ARC A580: 22.43 (SE +/- 0.00, N = 3)
Intel ARC B580: 14.25 (SE +/- 0.00, N = 3)
NVIDIA RTX 4060: 13.19 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Texture Read Bandwidth

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Texture Read Bandwidth (GB/s, more is better)
Intel ARC A750: 883.46 (SE +/- 0.34, N = 3)
Intel ARC A580: 762.70 (SE +/- 0.14, N = 3)
Intel ARC B580: 812.33 (SE +/- 0.33, N = 3)
NVIDIA RTX 4060: 1828.21 (SE +/- 2.48, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

ProjectPhysX OpenCL-Benchmark

Operation: FP16 Compute

ProjectPhysX OpenCL-Benchmark 1.2 - Operation: FP16 Compute (TFLOPs/s, more is better)
Intel ARC A750: 15.62 (SE +/- 0.00, N = 3)
Intel ARC A580: 13.64 (SE +/- 0.03, N = 3)
Intel ARC B580: 26.61 (SE +/- 0.01, N = 3)
Intel ARC A770 8GB: 18.04 (SE +/- 0.07, N = 3)
1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL

vkpeak

fp64-scalar

vkpeak 20230730 - fp64-scalar (GFLOPS, more is better)
Intel ARC B580: 799.10 (SE +/- 0.03, N = 3)
NVIDIA RTX 4060: 279.09 (SE +/- 0.01, N = 3)

vkpeak

fp64-vec4

vkpeak 20230730 - fp64-vec4 (GFLOPS, more is better)
Intel ARC B580: 777.69 (SE +/- 0.06, N = 3)
NVIDIA RTX 4060: 278.72 (SE +/- 0.00, N = 3)

vkpeak

fp64-scalar

vkpeak 20240505 - fp64-scalar (GFLOPS, more is better)
Intel ARC B580: 801.44 (SE +/- 0.03, N = 3)
NVIDIA RTX 4060: 277.45 (SE +/- 0.46, N = 3)

vkpeak

fp64-vec4

vkpeak 20240505 - fp64-vec4 (GFLOPS, more is better)
Intel ARC B580: 781.86
NVIDIA RTX 4060: 277.00
SE +/- 0.03, N = 3 (listed for one result only; the export does not say which)

vkpeak

int32-scalar

vkpeak 20240505 - int32-scalar (GIOPS, more is better)
Intel ARC B580: 3029.97 (SE +/- 0.10, N = 3)
NVIDIA RTX 4060: 8861.26 (SE +/- 10.19, N = 3)

vkpeak

int32-vec4

vkpeak 20240505 - int32-vec4 (GIOPS, more is better)
Intel ARC B580: 3110.93 (SE +/- 0.12, N = 3)
NVIDIA RTX 4060: 8852.78 (SE +/- 10.04, N = 3)

vkpeak

int16-scalar

vkpeak 20240505 - int16-scalar (GIOPS, more is better)
Intel ARC B580: 10817.72 (SE +/- 0.43, N = 3)
NVIDIA RTX 4060: 5830.55 (SE +/- 5.48, N = 3)

vkpeak

int16-vec4

vkpeak 20240505 - int16-vec4 (GIOPS, more is better)
Intel ARC B580: 13545.34 (SE +/- 3.16, N = 3)
NVIDIA RTX 4060: 6345.32 (SE +/- 3.33, N = 3)

Geometric Mean Of All Test Results

Result Composite - RTX 4070 SUPER

Geometric Mean Of All Test Results - Result Composite - RTX 4070 SUPER (Geometric Mean, more is better)
NVIDIA RTX 4070 SUPER: 31.00
Intel ARC A750: 32.12
Intel ARC A580: 30.35
Intel ARC B580: 37.34
Intel ARC A770 8GB: 35.15
NVIDIA RTX 4060: 32.31
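The composite above is a geometric mean, which suits results that span very different scales (frames per second next to GFLOPS in the tens of thousands). The export does not state how fewer-is-better results or differing units are normalized before averaging, so the sketch below shows only the plain formula. The function name and input values are hypothetical.

```python
import math


def geometric_mean(values):
    """Return the n-th root of the product of n positive values.

    Computed via logarithms so that large benchmark scores do not
    overflow when multiplied together.
    """
    if not values:
        raise ValueError("need at least one value")
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean requires positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical more-is-better scores on very different scales.
scores = [250.0, 18000.0, 47.0]
print(f"geometric mean = {geometric_mean(scores):.2f}")
```

Unlike an arithmetic mean, this figure is not dominated by the single largest score, which is why one outlying test moves the composite only modestly.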

Number Of First Place Finishes

Wins - 160 Tests

NVIDIA RTX 4070 SUPER: 62 [38.8%]
Intel ARC B580: 31 [19.4%]
Intel ARC A750: 23 [14.4%]
NVIDIA RTX 4060: 20 [12.5%]
Intel ARC A580: 13 [8.1%]
Intel ARC A770 8GB: 11 [6.9%]

Number Of Last Place Finishes

Losses - 160 Tests

NVIDIA RTX 4060: 58 [36.3%]
Intel ARC A580: 36 [22.5%]
Intel ARC B580: 32 [20.0%]
NVIDIA RTX 4070 SUPER: 19 [11.9%]
Intel ARC A770 8GB: 8 [5.0%]
Intel ARC A750: 7 [4.4%]
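The bracketed percentages in the win and loss tallies are each count divided by the 160 tests. As a quick check, the sketch below recomputes them from the counts listed above. Rounding half up to one decimal place is an assumption; it happens to reproduce the listed figures (Python's built-in `round` would give 36.2 for 58 / 160 because it rounds ties to even).

```python
from decimal import Decimal, ROUND_HALF_UP

TOTAL_TESTS = 160

# Counts copied from the first-place and last-place tallies above.
wins = {
    "NVIDIA RTX 4070 SUPER": 62, "Intel ARC B580": 31, "Intel ARC A750": 23,
    "NVIDIA RTX 4060": 20, "Intel ARC A580": 13, "Intel ARC A770 8GB": 11,
}
losses = {
    "NVIDIA RTX 4060": 58, "Intel ARC A580": 36, "Intel ARC B580": 32,
    "NVIDIA RTX 4070 SUPER": 19, "Intel ARC A770 8GB": 8, "Intel ARC A750": 7,
}


def share(count, total=TOTAL_TESTS):
    """Percentage of tests, rounded half up to one decimal (assumed rule)."""
    pct = Decimal(count) * 100 / Decimal(total)
    return pct.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


for label, table in (("Wins", wins), ("Losses", losses)):
    # Every test has exactly one first place and one last place.
    assert sum(table.values()) == TOTAL_TESTS
    for name, count in table.items():
        print(f"{label}: {name}: {count} [{share(count)}%]")
```

Both tallies sum to 160, so each test contributes exactly one first place and one last place across the six result sets.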


Phoronix Test Suite v10.8.5