To build:

If the configure script is missing, first run the autogen.sh script to generate it.
Then type (adjusting the --with-cuda path to your CUDA installation):

./configure --with-cuda=/usr/local/cuda50 --enable-mpi

make
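
The steps above can be collected into one small shell function. This is only a
sketch: build_project is a hypothetical helper, not something shipped with the
project, and it assumes it is called from the top-level source directory. It runs
autogen.sh only when no configure script is present, and takes the CUDA install
directory as an optional argument.

```shell
# Hypothetical helper (not part of the project): runs the build steps in order.
# Optional first argument: CUDA install directory (default: /usr/local/cuda50).
build_project() {
    cuda_dir="${1:-/usr/local/cuda50}"

    # Generate the configure script only when it is missing.
    if [ ! -f ./configure ]; then
        sh ./autogen.sh || return 1
    fi

    # Same options as the manual steps; stop if configure fails.
    ./configure --with-cuda="$cuda_dir" --enable-mpi || return 1
    make
}
```

After defining the function, call build_project (or, for example,
build_project /path/to/cuda) from the directory that contains autogen.sh.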

This builds the executable src/testHelloMpiCuda.
To run it on a multi-GPU cluster, change to the src directory and type:

mpirun -np N_PROC_MPI ./testHelloMpiCuda

where N_PROC_MPI is the number of MPI processes to launch.

