Struct/Class passed to CUDA

Apr 24, 2015 at 11:29 AM
Hello,

I'm Sebastian, and I'm new to CUDA coding. I have a problem.
I would like to use my C# classes in a CUDA kernel, but passing a class via "CopyToDevice" is impossible (only non-nullable types...)

So I've changed the classes to structs, but I still can't manage to use them in CUDA.

My struct in C# looks like this:
   [StructLayout(LayoutKind.Sequential)]
    public struct NodeData
    {
        public int id { get; set; }
        public int2[] e_Id { get; set; } 

        public NodeData(int2[] _e_Id) :this()
        {
            this.e_Id = _e_Id;
        }
    }
In the CUDA kernel:
typedef struct NodeData{
    public:
        std::array<int2,8> e_Id; 
        //int2 e_Id[];  <- this not working
        NodeData(std::array<int2, 8> _e_Id) : e_Id(_e_Id)
        {
        }
    }NodeData;
I have an array NodeDatList of NodeData elements, which I'm trying to copy to the GPU with these instructions:
CudaDeviceVariable<NodeData> d_NodeDataList = new CudaDeviceVariable<NodeData>(NodeDatList.Length);
             d_NodeDataList.CopyToDevice(NodeDatList);
I'm getting an error saying that I'm trying to copy an object containing non-primitive or non-copyable data.
After searching, I guess the problem is the e_Id array inside the struct. Is there any way to copy an array of arrays? Another problem is that the e_Id arrays have different sizes, but never more than 8 elements.
I really don't know what to do. I would appreciate any help.

PS. Sorry for my English.
Coordinator
Apr 24, 2015 at 1:34 PM
Hi,

when you copy arrays around, every element must have the same size and must not contain inner pointers to other memory areas. (This is unrelated to CUDA; it holds for C and C# as well.) So you can't copy an array of arrays directly.
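To make the "same size, no inner pointers" rule concrete, here is a minimal host-side C++ sketch. It is only an illustration: `Int2` is a stand-in for CUDA's `int2` so the code builds without CUDA headers, and `NodeDataFixed`, `NodeDataPtr`, `raw_copy` and `pointer_copy_shares_storage` are hypothetical names that do not appear in managedCuda.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Stand-in for CUDA's int2 so the sketch builds without CUDA headers (assumption).
struct alignas(8) Int2 { int x, y; };

// Fixed size: all 8 pairs are stored inside the struct itself.
struct NodeDataFixed { Int2 e_Id[8]; };

// Variable size: the struct stores only an address; the pairs live elsewhere.
struct NodeDataPtr { Int2* e_Id; };

static_assert(std::is_trivially_copyable<NodeDataFixed>::value,
              "a fixed-size node can be copied byte for byte");
static_assert(sizeof(NodeDataFixed) == 8 * sizeof(Int2),
              "the node is exactly its 8 pairs");

// Byte-wise copy of a whole node array, similar to how a raw transfer moves it.
std::vector<NodeDataFixed> raw_copy(const std::vector<NodeDataFixed>& src) {
    std::vector<NodeDataFixed> dst(src.size());
    if (!src.empty()) {
        std::memcpy(dst.data(), src.data(), src.size() * sizeof(NodeDataFixed));
    }
    return dst;
}

// Byte-copying the pointer variant duplicates the address, not the pairs.
bool pointer_copy_shares_storage() {
    Int2 storage[8] = {};
    NodeDataPtr a{storage};
    NodeDataPtr b{nullptr};
    std::memcpy(&b, &a, sizeof(a));
    assert(b.e_Id != nullptr);   // the address was copied...
    return b.e_Id == storage;    // ...and it still points at the original pairs
}
```

The fixed-size variant survives a raw byte copy intact, while the pointer variant ends up with a host address that would be meaningless on the device.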

You essentially have two options. The first is to make every element (NodeData) a fixed size, i.e. 8 entries, so that the struct is one contiguous memory area without inner pointers:
In CUDA:
typedef struct NodeData
{
    public:
        int2 e_Id[8];  //<-- C-syntax for a fixed sized array
}NodeData;
and in C#
[StructLayout(LayoutKind.Sequential)]
public unsafe struct NodeData1
{
    public fixed int e_Id[2*8]; //no inner struct possible, only byte, short, int etc.
}
or
[StructLayout(LayoutKind.Sequential)]
public struct NodeData2
{
    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 8)]
    public int2[] e_Id;
}
depending on your needs. The unsafe version only supports primitive data types such as byte, short, int, float etc., but it reserves the right amount of memory directly inside each NodeData, in managed memory as well. The second version only holds a reference to an array in managed memory, which you have to allocate yourself by calling "new int2[8]"...
The advantage of the unsafe version is that no marshaling is necessary, so there is no copy step between managed and unmanaged memory. The second version is easier to code, but if you copy a large array of NodeData, .NET first marshals all elements into an unmanaged buffer before they get passed to CUDA...

To fill the elements:
NodeData1 sol1 = new NodeData1();
//No "new" array is needed: the structure already has the memory space for 16 ints
unsafe
{
    sol1.e_Id[0] = 0;//simulate int2 by using 2 int...
    sol1.e_Id[1] = 0;
    sol1.e_Id[2] = 1;
    sol1.e_Id[3] = 1;
    sol1.e_Id[4] = 2;
    sol1.e_Id[5] = 2;
    sol1.e_Id[6] = 3;
    sol1.e_Id[7] = 3;
    sol1.e_Id[8] = 4;
    sol1.e_Id[9] = 4;
    sol1.e_Id[10] = 5;
    sol1.e_Id[11] = 5;
    sol1.e_Id[12] = 6;
    sol1.e_Id[13] = 6;
    sol1.e_Id[14] = 7;
    sol1.e_Id[15] = 7;
}
//When sol1 is passed to the unmanaged world, no copying is needed: the managed and unmanaged structs have the same layout

NodeData2 sol2 = new NodeData2();
sol2.e_Id = new int2[8]; //alloc memory space
sol2.e_Id[0] = new int2(0, 0);
sol2.e_Id[1] = new int2(1, 1);
sol2.e_Id[2] = new int2(2, 2);
sol2.e_Id[3] = new int2(3, 3);
sol2.e_Id[4] = new int2(4, 4);
sol2.e_Id[5] = new int2(5, 5);
sol2.e_Id[6] = new int2(6, 6);
sol2.e_Id[7] = new int2(7, 7);
//When sol2 is passed to the unmanaged world, .NET first creates an unmanaged struct with the right layout under the hood. sol2 and its array member are not in one contiguous memory area in the managed world!
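The "simulate int2 by using 2 ints" trick used for sol1 comes down to the index mapping pair i -> (e_Id[2*i], e_Id[2*i+1]). A small host-side C++ sketch of that mapping follows; `Int2` again stands in for CUDA's `int2`, `NodeDataFlat`, `NodeDataPairs`, `set_pair` and `as_pairs` are hypothetical names, and a 4-byte `int` is assumed.

```cpp
#include <cassert>
#include <cstring>

// Stand-in for CUDA's int2 (assumption: two 4-byte ints, 8-byte aligned).
struct alignas(8) Int2 { int x, y; };

// Mirrors the C# NodeData1: 16 plain ints.
struct NodeDataFlat { int e_Id[2 * 8]; };

// Mirrors the CUDA-side struct: 8 pairs.
struct NodeDataPairs { Int2 e_Id[8]; };

static_assert(sizeof(NodeDataFlat) == sizeof(NodeDataPairs),
              "both views occupy the same number of bytes");

// Write pair i into the flat layout: x at index 2*i, y at index 2*i + 1.
void set_pair(NodeDataFlat& node, int i, int x, int y) {
    assert(i >= 0 && i < 8);
    node.e_Id[2 * i]     = x;
    node.e_Id[2 * i + 1] = y;
}

// Reinterpret the same bytes as 8 pairs, as the kernel side would see them.
NodeDataPairs as_pairs(const NodeDataFlat& flat) {
    NodeDataPairs out;
    std::memcpy(&out, &flat, sizeof(out));
    return out;
}
```

With a helper like `set_pair`, the sixteen hand-written assignments collapse into a loop over the eight pairs.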
Your second option is to work with CUDA's managed memory, but I don't advise it: the implementation in C#/managedCuda is not as smooth as in C++, and I think beginners should start with the simpler concept of separate host and device memory to understand what is going on under the hood.
Apr 24, 2015 at 2:57 PM
Thank you very much for such a fast reply!
I will test this solution asap. :)

But I still have some doubts.
  1. Even if I use the unsafe struct with a 1D array, should my CUDA struct keep int2?
  2. When I had a constructor in my C# structs (or shouldn't I have any?), I had a problem initializing int2 e_Id[] in the CUDA part.
    When I tried to assign the constructor's array to the struct's array with e_Id = _e_Id, I got an error; that's why I used std::array.
  3. Should I keep my struct as simple as possible, without any constructors?
Coordinator
Apr 24, 2015 at 3:22 PM
1) Whatever suits you best; the struct in C# and the struct in CUDA are independent of each other. Only the bit pattern in memory must be identical. int2 is just two consecutive ints, like a fixed array of two ints, so these are exactly the same:
public unsafe struct a
{
    public fixed int data[2];
}

public struct b
{
    public int2 data;
}
    
public struct c
{
    public int data1;
    public int data2;
}
All these structs have exactly the same layout in memory. It doesn't matter whether you use one definition in C# and another in CUDA; it is all the same.
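The equivalence of the three definitions can be checked at compile time. The host-side C++ sketch below does that, and also shows one case worth watching: CUDA's `int2` is 8-byte aligned, so once another 4-byte field (such as the `id` from the original C# struct) precedes it, padding appears. `Int2`, `WithIdAligned` and `WithIdFlat` are hypothetical names, a 4-byte `int` is assumed, and whether the C# side pads the same way is an assumption to verify against your own definition.

```cpp
#include <cassert>
#include <cstddef>

struct A { int data[2]; };                 // like the C# struct a
struct alignas(8) Int2 { int x, y; };      // stand-in for CUDA's 8-byte-aligned int2
struct B { Int2 data; };                   // like the C# struct b
struct C { int data1; int data2; };        // like the C# struct c

// All three hold two ints back to back (assuming a 4-byte int).
static_assert(sizeof(A) == 8 && sizeof(B) == 8 && sizeof(C) == 8,
              "same size for all three definitions");
static_assert(offsetof(C, data2) == 4, "second int directly follows the first");

// Hypothetical variants with a leading id field.
struct WithIdAligned { int id; Int2 e_Id[8]; };    // pairs need 8-byte alignment
struct WithIdFlat    { int id; int  e_Id[2 * 8]; }; // plain ints need only 4

// Runtime check that the aligned variant gains 4 bytes of padding after id.
void verify_mixed_layouts() {
    assert(offsetof(WithIdAligned, e_Id) == 8);
    assert(offsetof(WithIdFlat, e_Id) == 4);
    assert(sizeof(WithIdAligned) == sizeof(WithIdFlat) + 4);
}
```

If the two sides disagree on such padding, an explicit padding field or a reordering of members on one side brings the bit patterns back in line.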

2) Again, the two structs in C# and CUDA are independent; only bits and bytes, unaware of their meaning, are transferred. You can add constructors to either of them; this won't change the data part of your struct (constructors are code shared by the type and are not stored with each element).
And
std::array<int2,8> e_Id;
and
int2 e_Id[8]; 
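have the same memory layout; the first is C++, the latter is C syntax for a fixed-size array. But if you forget the number "8", as in
int2 e_Id[];
the struct no longer contains the elements. In a function parameter this is identical to
int2* e_Id;
and inside a struct it is a flexible array member (a compiler extension in C++) that reserves no storage. Either way you only refer to a variable-sized array somewhere else in memory, which is why you'd need a constructor to allocate it...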
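The size difference between the fixed-size and the pointer variant is easy to see on the host. In this C++ sketch, `Int2` stands in for CUDA's `int2` and the struct names are hypothetical; the equality of `std::array` and C-array sizes is not spelled out by the C++ standard, but it holds on the mainstream implementations I am aware of.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Stand-in for CUDA's int2 (assumption: two 4-byte ints, 8-byte aligned).
struct alignas(8) Int2 { int x, y; };

struct WithCArray   { Int2 e_Id[8]; };               // C syntax, elements inline
struct WithStdArray { std::array<Int2, 8> e_Id; };   // C++ syntax, elements inline
struct WithPointer  { Int2* e_Id; };                 // only an address is stored

// Both fixed-size variants carry their 8 pairs inside the struct.
static_assert(sizeof(WithCArray) == 8 * sizeof(Int2), "8 pairs stored inline");
static_assert(sizeof(WithStdArray) == sizeof(WithCArray),
              "std::array adds no extra bytes on this implementation");

// The pointer variant is only as large as one address, whatever it points to.
void verify_sizes() {
    assert(sizeof(WithPointer) == sizeof(void*));
    assert(sizeof(WithPointer) < sizeof(WithCArray));
}
```

Only the two inline variants can be transferred as one flat block; the pointer variant would need a separate allocation and copy for the pairs it refers to.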

3) Doesn't matter, do whatever is best for your given task.

Cheers,
Michael