
Posts from May, 2017

So that static array is actually really cool... but not super useful in CUDA

     So that static array I mentioned last post just got a lot more interesting.  At first, I thought it was an array of values containing information about the Equidistant Surface, used to calculate how much flux the observer actually sees.  It does do this, but in a very different way than I expected.  My intuition was that each array element was a single value, and that it was static for ease of access by multiple routines.  What it actually is, is an array of a struct that contains several doubles.

     What this means is that you specify values for each struct member at each array point, creating a collection of parameters for every point in discrete space.  It is very cool indeed, but I am now even more unsure of how to copy this static object to the GPU.  I don't want to dismantle it, as it is such an elegant solution to the problem of defining all these variables for each point, but I have...
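     For reference, here is roughly the shape of the thing and one candidate for the copy.  The names here (point_params, EDS_RES, h_surface, d_surface) are mine, not boxfit's: a statically sized host array of structs can be mirrored by a __device__ array and moved over in a single cudaMemcpyToSymbol call.

    // A minimal sketch, not boxfit's actual code: one struct of doubles
    // per grid point, mirrored on the device and copied in one call.
    #include <cuda_runtime.h>

    #define EDS_RES 1024                  // hypothetical grid resolution

    struct point_params {                 // hypothetical members, stand-ins
      double flux;                        // for the real per-point values
      double radius;
      double theta;
    };

    static point_params h_surface[EDS_RES];      // host-side static array
    __device__ point_params d_surface[EDS_RES];  // device-side mirror

    int main() {
      // ... fill h_surface on the host ...
      // copy the whole static array to the GPU in one call
      cudaMemcpyToSymbol(d_surface, h_surface, sizeof(h_surface));
      return 0;
    }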

Update: A D'Oh moment, and the curious case of static member classes.

     So last night, I was lying in bed thinking about the myriad of issues I had been having with functions pointing to member objects and multiple threads trying to assign values to those objects, when I realized something.  The part I was attempting to parallelize was a waste of effort, because there were multiple serial functions embedded in it.  What actually made sense was to move further into the code: instead of trying to parallelize the flux calculation in time, it made more sense to parallelize the spatial calculations, because those do not have the same dependencies on shared objects.

     This morning has consisted of implementing that change: tagging the required functions as device-runnable code, going back and catching typos, and then relocating the kernel, because __global__ functions cannot be class member functions.  I was feeling great about this and was quite confident as I keyed in the make arguments, when... Whoops...
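     To make the restructuring concrete, here is a rough sketch with made-up names (flux_at_point, flux_kernel, n_points), not the real boxfit functions: the kernel lives at file scope because of the member-function restriction, and the per-point physics is tagged __device__ so the kernel can call it.

    // Sketch only: parallel over space, serial in time.
    #include <cuda_runtime.h>

    __device__ double flux_at_point(int i, double t) {
      // hypothetical per-point spatial calculation; no shared
      // member state, so each thread works independently
      return (double) i * t;
    }

    __global__ void flux_kernel(double* out, double t, int n_points) {
      int i = blockIdx.x * blockDim.x + threadIdx.x;
      if (i < n_points)
        out[i] = flux_at_point(i, t);
    }

    // launched once per time step from the (former) member function
    void compute_flux(double* d_out, double t, int n_points) {
      int threads = 256;
      int blocks = (n_points + threads - 1) / threads;
      flux_kernel<<<blocks, threads>>>(d_out, t, n_points);
    }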

Here things are.

     Okay, this is the inaugural post of the Afterglow project development blog.  I have been pushing hard the last couple of weeks to get the CUDA-accelerated version of boxfit into a state where it can be alpha-tested.  The main issue is the way variables are allocated in classes.  There are a lot of functions that use class member objects, which are just doubles or ints, the idea being that values can move between functions without having to be passed explicitly.

     This works quite well when each object is allocated and read sequentially within a single thread, or when each thread has an independent instance of that object.  The problem with porting the code to CUDA is that the objects become shared among the threads, so each thread attempts to assign its own value to the same variable, and everything would go up in flames were it not for the compiler catching the code as incompatible.  ...
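     A stripped-down sketch of the pattern I mean, with names invented for illustration: the serial code communicates through a member double, which is fine when one thread owns the instance, while the CUDA version has to pass the value explicitly so each thread keeps its own copy in a register.

    // Sketch only, not the actual boxfit classes.
    struct eds_calc {
      double p;  // implicit scratch member shared by the functions below

      // serial style: store() then flux() communicate through this->p
      void store(double v) { p = v; }
      double flux() const { return 2.0 * p; }
    };

    // CUDA-friendly style: the value travels explicitly, so there is
    // no shared state for the threads to fight over
    __device__ double flux_explicit(double v) { return 2.0 * v; }

    __global__ void eds_kernel(const double* in, double* out, int n) {
      int i = blockIdx.x * blockDim.x + threadIdx.x;
      if (i < n)
        out[i] = flux_explicit(in[i]);
    }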