
GPU compute speed with Nvidia CUDA

I recently upgraded to a “GeForce GT 640” graphics card, which has an Nvidia chipset. As explained in the two previous posts, I reinstalled version 5.5 of the CUDA package from Nvidia.

I then compiled the sample programs that come with CUDA 5.5 and tested matrix product performance on the GPU, that is, the graphics card. I get 279 GFlop/s, about 30 times faster than my previous GPU, which managed about 9 GFlop/s, so CUDA lives up to its promise …

$ /usr/local/cuda-5.5/samples/bin/x86_64/linux/release/matrixMulCUBLAS  [ENTER]

[Matrix Multiply CUBLAS] – Starting…
GPU Device 0: “GeForce GT 640” with compute capability 3.0

MatrixA(320,640), MatrixB(320,640), MatrixC(320,640)
Computing result using CUBLAS…done.
Performance= 279.37 GFlop/s, Time= 0.469 msec, Size= 131072000 Ops
Computing result using host CPU…done.
Comparing CUBLAS Matrix Multiply with CPU results: PASS
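
As a quick sanity check on the figures above, here is a minimal Python sketch that recomputes the throughput from the "Size" and "Time" fields and the speedup over the old card. The numbers are copied from the output and from the text; the split of the operation count into 2 x 320 x 640 x 320 is my assumption about how the sample counts operations, picked because it reproduces the reported total. Since the printed time is rounded to three decimals, the recomputed rate will only agree with the reported 279.37 GFlop/s to within a fraction of a GFlop/s.

```python
# Figures copied from the matrixMulCUBLAS output above.
ops = 131072000        # "Size" field: floating-point operations per multiply
time_msec = 0.469      # "Time" field, as printed (rounded to 3 decimals)
reported_gflops = 279.37

# Assumption: the count is 2*M*N*K (one multiply and one add per term),
# with M=320, K=640, N=320. These sizes are chosen to match the total.
m, k, n = 320, 640, 320
assert 2 * m * n * k == ops

# Throughput = operations / seconds, expressed in units of 1e9 flop/s.
gflops = ops / (time_msec * 1e-3) / 1e9

# Speedup over the previous card, using the rounded 9 GFlop/s from the text.
old_gflops = 9.0
speedup = reported_gflops / old_gflops

print(f"recomputed: {gflops:.2f} GFlop/s")
print(f"speedup over old GPU: about {speedup:.0f}x")
```

The recomputed rate lands close to the reported one, and the ratio to the old card comes out at roughly 30, consistent with the claim in the text.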



Written by meditationatae

January 30, 2014 at 5:24 am

Posted in History

