struct argument in kernel...

Apr 9, 2014 at 7:24 AM
Hi,
Does anyone know how to pass a custom struct type value to a kernel, like this:
struct Dist { public double[] fC, fE, fW, fS, fN, fSE, fSW, fNE, fNW;}
static Dist ivDist_D1;
// original kernel
extern "C" __global__ void LBMStep1(Dist cd);
Thx.
Coordinator
Apr 9, 2014 at 8:39 AM
As you're using double[] as the type and not double*, I'm assuming that the size of the arrays is constant and known at compile time. If this is the case, your struct in C# would look something like this:
[StructLayout(LayoutKind.Sequential)]
unsafe struct Dist
{
    public fixed double fC[4], fE[4], fW[5], fS[5], fN[6], fSE[6], fSW[7], fNE[7], fNW[8];
}
Note the unsafe and fixed keywords.

If the array sizes vary at runtime, you can't do this in C#, because then you get only a struct of pointers in C/C++/CUDA, which cannot be mapped in C#. At least not in such a simple way...
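
For the fixed-size case, the C# layout has to match the struct on the kernel side byte for byte. A minimal C++ sketch of that counterpart follows, assuming the same illustrative array sizes as in the C# struct above (they are placeholders, not the real sizes from the original kernel). The helper name check_layout is hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Kernel-side counterpart of the C# struct above.
// The array sizes are the illustrative ones from the C# example (assumption).
struct Dist {
    double fC[4], fE[4], fW[5], fS[5], fN[6], fSE[6], fSW[7], fNE[7], fNW[8];
};

// 4+4+5+5+6+6+7+7+8 = 52 doubles, with no padding between the arrays.
static_assert(sizeof(Dist) == 52 * sizeof(double), "unexpected padding in Dist");

// Hypothetical helper: checks a few field offsets that the
// sequential C# layout is expected to reproduce.
bool check_layout() {
    return offsetof(Dist, fC) == 0
        && offsetof(Dist, fE) == 4 * sizeof(double)
        && offsetof(Dist, fW) == 8 * sizeof(double)
        && offsetof(Dist, fNW) == 44 * sizeof(double);
}
```

If the size reported on the C# side (for example via sizeof(Dist) in an unsafe context) differs from the size on the kernel side, the two layouts do not match and the kernel will read the wrong data.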

-Michael
Apr 9, 2014 at 9:13 AM
Another question:
Is the argument passed like this,
CudaDeviceVariable<Dist> temp;
temp = ivDist_D1;
kernel.Run(temp.DevicePointer);
or like this,
kernel.Run(ivDist_D1);
where the kernel argument "Dist cd" is updated on each loop iteration?
Thx.
Coordinator
Apr 9, 2014 at 10:55 AM
It's
kernel.Run(temp.DevicePointer);
But be careful: if you call this code more than once, the line
temp = ivDist_D1;
allocates new space in device memory each time. To avoid this, do the following:
//Call once:
CudaDeviceVariable<Dist> temp = new CudaDeviceVariable<Dist>("ElementCount");
//...

//Call e.g. in a loop:
temp.CopyToDevice(ivDist_D1);
kernel.Run(temp.DevicePointer);
//...do other stuff
Apr 10, 2014 at 4:42 AM
Edited Apr 10, 2014 at 5:13 AM
About CudaDeviceVariable.DevicePointer:
should the kernel argument be "struct*"?
// original kernel
extern "C" __global__ void LBMStep1(Dist cd);
vs.
extern "C" __global__ void LBMStep1(Dist* cd);
just as a CudaDeviceVariable<double> is passed as double*?
Both kernels throw exceptions:
1. (Dist cd): ErrorLaunchFailed: An exception occurred on the device while executing a kernel. Common causes include dereferencing an invalid device pointer and accessing out of bounds shared memory. The context cannot be used, so it must be destroyed (and a new one should be created). All existing device memory allocations from this context are invalid and must be reconstructed if the program is to continue using CUDA.
// I checked the DevicePointers and none of them changed. The original kernel runs fine from C/C++.
2. (Dist* cd): 716
Is something wrong?
Thx.
Coordinator
Apr 10, 2014 at 9:17 AM
You're right. When your kernel is defined as
extern "C" __global__ void LBMStep1(Dist cd);
you launch it as
kernel.Run("C#struct"); //pass by value
so you directly pass the struct built in C# in host memory.

If your kernel is defined as
extern "C" __global__ void LBMStep1(Dist* cd);
you need to pass a pointer to device memory. In managedCuda, this is a CudaDeviceVariable:
CudaDeviceVariable<type> CudaDevVar = new ...
CudaDevVar.CopyToDevice("C#struct");
kernel.Run(CudaDevVar.DevicePointer); //pass by reference
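
One practical consequence of the two signatures can be sketched as a host-side analogy in plain C++. This is not CUDA or managedCuda code: the function names step_by_value, step_by_pointer and demo are hypothetical, and the struct is cut down to a single array of assumed size. A by-value parameter is a private copy, so anything written into it is lost when the function returns. Only the pointer version can write results back.

```cpp
#include <cassert>

// Cut-down illustrative struct; the array size is an assumption.
struct Dist {
    double fC[4];
};

// Analogue of LBMStep1(Dist cd): operates on a private copy of the struct.
void step_by_value(Dist cd) {
    cd.fC[0] += 1.0;  // modifies only the local copy
}

// Analogue of LBMStep1(Dist* cd): operates on the caller's memory.
void step_by_pointer(Dist* cd) {
    cd->fC[0] += 1.0;  // visible to the caller afterwards
}

// Returns true if both calls behave as described above.
bool demo() {
    Dist d{};                            // zero-initialized
    step_by_value(d);                    // d is left untouched
    bool unchanged = (d.fC[0] == 0.0);
    step_by_pointer(&d);                 // d.fC[0] becomes 1.0
    return unchanged && d.fC[0] == 1.0;
}
```

By the same reasoning, a kernel that has to update the distribution arrays between steps would likely need the pointer form, with the results copied back from device memory after the launch.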