managedCuda cuDNN: problem copying data to the GPU

Jul 20, 2016 at 2:12 PM

I have a problem understanding how to copy data from host to GPU memory using a CUdeviceptr pointer.

To run cudnnConvolutionForward, the data has to be copied from the host to the device (GPU) first. According to the cuDNN library, a pointer variable of type CUdeviceptr is used for this.

But it is not clear how to write a data array to DATA_IN.

An example:

ManagedCuda.BasicTypes.SizeT Dydis = 0;

ManagedCuda.CudaDNN.CudaDNNNativeMethods.cudnnGetConvolutionForwardWorkspaceSize(CudNN, pInputDesc, pFilterDesc, pConvDesc, pOutputDesc, cudnnConvolutionFwdAlgo.GEMM, ref Size);

ManagedCuda.BasicTypes.SizeT Size = 64;
CUdeviceptr DATA_IN = new CUdeviceptr(Size); // <--- How to send data using this pointer?

ManagedCuda.CudaDNN.CudaDNNNativeMethods.cudnnConvolutionForward(CudNN, ref alpha, pInputDesc, DATA_IN, pFilterDesc, filder_data, pConvDesc, cudnnConvolutionFwdAlgo.GEMM, workSpace_Size, size, ref beta, pOutputDesc, dest_data);

The same problem applies to workSpace_Size and filder_data...
Jul 20, 2016 at 9:16 PM

CUdeviceptr is only a placeholder that stores the address of some GPU memory. It does not allocate the memory, and you cannot use a bare CUdeviceptr as is. You could allocate the memory manually using the CUDA API, e.g. cuMemAlloc(...).
A more comfortable way is to use CudaDeviceVariable from the managedCuda package, which wraps all the allocations in a C# class. Creating a new instance of this class allocates the memory, and its member methods provide the copy from host to device and back. Have a look at the simpler samples provided in the download section to see how it works.

BTW: managedCuda moved to GitHub a long time ago ;-)

Jul 21, 2016 at 9:37 AM

Thanks for the reply. I am not strong in C#, so this is a little complicated for me.
I have a working example with managedCuda that uses CopyToDevice. I can run my custom kernels on the GPU and get the results back to the host.

I just tried to use cuDNN convolutions instead of my own...

At the moment I am having trouble retrieving the address of the data on the device after CopyToDevice():

CUdeviceptr Data_Address_in_DEVICE = new CUdeviceptr(); // <--- Declaration of a CUdeviceptr, because cudnnConvolutionForward() requires one
float[] DATA = new float[64]; // <--- Declare some data storage on the host
CudaDeviceVariable<float> DATA_to_SEND = DATA; // <--- Create the CUDA variable

DATA_to_SEND.CopyToDevice(DATA_to_SEND); // Sending data to GPU

<<---------- And here I have a problem. How do I get the pointer to the data on the GPU and assign it to the CUdeviceptr? ------>>

Using unsafe did not help:
Data_Address_in_DEVICE.Pointer = &DATA_to_SEND; //<-- Get error: Cannot take the address of, get the size of, or declare a pointer to a managed type ('ManagedCuda.CudaDeviceVariable<float>')

So copying the data is easy, but it is not clear how to get the pointer to the GPU memory and assign it to CUdeviceptr Data_Address_in_DEVICE.

This matters because cudnnConvolutionForward requires a pointer of type CUdeviceptr.

Sorry for the silly question.

Best regards,
Jul 21, 2016 at 10:20 AM

When using CudaDeviceVariable you don't need to declare a CUdeviceptr explicitly; you can use the internal pointer of CudaDeviceVariable.
So in your case something like this should do it:
//Allocate memory on host:
float[] DATA = new float[64];
//TODO: fill it with some meaningful data...

//Allocate memory on device and copy the data from host to device memory:
CudaDeviceVariable<float> DATA_ON_DEVICE = DATA; //This already includes a copy step!

//To allocate memory without copying, use instead:
CudaDeviceVariable<float> DATA_ON_DEVICE2 = new CudaDeviceVariable<float>(64); //This only allocates the memory, no copying

//To copy from host to device:

//This part from your code doesn't make sense: it will copy to itself...
//DATA_to_SEND.CopyToDevice(DATA_to_SEND); // Sending data to GPU 

//To get a CUdeviceptr, use the property 'DevicePointer' of CudaDeviceVariable:
//call a kernel or a CUDNN function using DATA_ON_DEVICE.DevicePointer
Finally, don't forget to clean up your allocated device memory by calling:
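The call itself does not appear in the post as shown here. Since CudaDeviceVariable implements IDisposable, it is presumably Dispose(). Below is a minimal sketch of the whole flow under that assumption: allocate, copy, read the device pointer, clean up. The class name Example, the variable names, the device index 0 and the round-trip check are illustrative and not part of the original answer. Running it needs the managedCuda package and a CUDA-capable GPU.

```csharp
using System;
using ManagedCuda;
using ManagedCuda.BasicTypes;

class Example
{
    static void Main()
    {
        // A CUDA context must exist before device memory can be allocated.
        // Device index 0 is an assumption (first GPU in the system).
        using (CudaContext ctx = new CudaContext(0))
        {
            // Host data, filled with placeholder values.
            float[] data = new float[64];
            for (int i = 0; i < data.Length; i++)
                data[i] = i;

            // Allocate device memory only (no copy yet)...
            CudaDeviceVariable<float> dataOnDevice =
                new CudaDeviceVariable<float>(data.Length);
            try
            {
                // ...then copy host -> device explicitly.
                dataOnDevice.CopyToDevice(data);

                // This is the CUdeviceptr that native cuDNN calls such as
                // cudnnConvolutionForward expect for their data arguments.
                CUdeviceptr devicePtr = dataOnDevice.DevicePointer;

                // Copy device -> host to check the round trip.
                float[] roundTrip = new float[data.Length];
                dataOnDevice.CopyToHost(roundTrip);
                Console.WriteLine(roundTrip[data.Length - 1]);
            }
            finally
            {
                // Free the device memory (assumed to be the call meant above).
                dataOnDevice.Dispose();
            }
        }
    }
}
```

The same pattern would apply to the filter data and the workspace buffer: one CudaDeviceVariable per buffer, passing its DevicePointer to the native call.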
Jul 21, 2016 at 3:54 PM

Thanks, it is clear now and works as it should :)

Thank you very much!