Implement Nvidia's Reduction kernel (Templates)

Jun 26, 2013 at 10:17 PM
I'd think anyone successfully writing parallel code would depend heavily on existing reduction code. NVIDIA publishes this example, but it uses templates, which look like generics to me, and I'm completely at a loss as to how to call it from managed code. Does anyone have any insight or experience with this?
Jun 26, 2013 at 11:17 PM
To start with, C++ templates are not the same as generics in C#. You can think of a template as a description (or template) that tells the compiler how to create a function. The goal of generics is more or less the same, but once a templated function has been compiled, you can no longer distinguish it from a “normal” function. And this is exactly the trick for using templates with managedCuda: you simply put a dummy host function in your .cu file to tell the compiler how to instantiate the template, and then use the result the same way as any other kernel from managed code.
But because templates are a C++ feature, you must omit ‘extern “C”’ in your source code, and you must also determine the new kernel name. The compiler encodes all the information about which instance of your template to use inside the name (a.k.a. name mangling). You can simply compile to PTX and open the file in a text editor to look up the mangled names you have to use with managedCuda.
If all that sounds too complicated, have a look at the GrabCutNPP sample; many of the kernels there are templates. For example, line 82 is the template kernel and line 111 is the dummy host function that calls it. From this you get three kernels that you can load as follows:
        ApplyMatteKernelMode0 = ctx.LoadKernelPTX(kernel, "_Z16ApplyMatteKernelILi0EEvP6uchar4iPKS0_iPKhiii");
        ApplyMatteKernelMode1 = ctx.LoadKernelPTX(kernel, "_Z16ApplyMatteKernelILi1EEvP6uchar4iPKS0_iPKhiii");
        ApplyMatteKernelMode2 = ctx.LoadKernelPTX(kernel, "_Z16ApplyMatteKernelILi2EEvP6uchar4iPKS0_iPKhiii");
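If you want to double-check which template instance a mangled name from the PTX file belongs to, you can decode it on the host. The following is only a minimal sketch, assuming a GCC or Clang toolchain that provides `abi::__cxa_demangle` in `<cxxabi.h>` (MSVC does not). The helper name `demangle` is made up for illustration and is not part of managedCuda or the CUDA toolkit.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (GCC/Clang, Itanium C++ ABI)
#include <cstdlib>    // std::free
#include <string>

// Hypothetical helper (not a managedCuda or CUDA API): turns a mangled
// kernel name copied from a PTX file into a readable C++ signature.
// If the name cannot be demangled, the input is returned unchanged.
std::string demangle(const std::string& mangled)
{
    int status = 0;
    // With a null output buffer, __cxa_demangle allocates the result with malloc.
    char* readable = abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
    if (status != 0 || readable == nullptr) {
        return mangled;  // not a valid mangled name, or allocation failed
    }
    std::string result(readable);
    std::free(readable);  // the caller owns the malloc'ed buffer
    return result;
}
```

Calling `demangle("_Z16ApplyMatteKernelILi0EEvP6uchar4iPKS0_iPKhiii")` should give a signature for `ApplyMatteKernel<0>`, which makes it easy to see that the `Li0E`, `Li1E` and `Li2E` parts of the three names above are just the integer template argument.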
Jun 27, 2013 at 11:58 AM
Thank you so much! Just like everything else, it's simple once you understand it.