Okay, this is the inaugural post of the Afterglow project development blog. I have been pushing hard for the last couple of weeks to get the CUDA-accelerated version of boxfit into a state where it can be alpha-tested. The main issue is dealing with the way variables are allocated in classes. A lot of functions rely on class objects that are just doubles or ints, the idea being that values can be shared between functions without having to be passed explicitly as arguments.
This works quite well when each object is allocated and requested sequentially for a thread, or when each thread has an independent instance of that object. The problem with porting the code to CUDA is that the objects become shared among the threads, so each thread attempts to assign its own value to the same variable. Everything would go up in flames were it not for the compiler catching the code as incompatible.
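A minimal C++ sketch of the pattern may help here. It is not taken from boxfit: `SharedState`, `flux_at`, `worker`, `run_all`, and the placeholder formula are all hypothetical names invented for illustration. It contrasts a value kept in an object (safe only while one thread at a time uses it) with a helper that takes everything it needs as arguments, and a plain loop stands in for a CUDA launch with one thread per element.

```cpp
#include <cstddef>
#include <vector>

// Original style (hypothetical): a value lives in the object so helpers can
// read it without taking it as an argument. Fine for one thread at a time;
// if many threads share one instance, they overwrite each other's t_obs.
struct SharedState {
    double t_obs = 0.0;
    double flux() const { return 2.0 * t_obs; }  // placeholder formula
};

// Refactored style: everything the helper needs arrives as an argument,
// so each call touches only its own local data.
double flux_at(double t_obs) {
    return 2.0 * t_obs;  // same placeholder formula
}

// Body of one "thread": reads input i, writes output i, and shares no
// mutable state with any other index.
void worker(const std::vector<double>& times, std::vector<double>& out,
            std::size_t i) {
    out[i] = flux_at(times[i]);
}

// Serial loop standing in for a launch with one thread per element.
std::vector<double> run_all(const std::vector<double>& times) {
    std::vector<double> out(times.size());
    for (std::size_t i = 0; i < times.size(); ++i) {
        worker(times, out, i);
    }
    return out;
}
```

Because `worker` only writes to its own slot in `out`, the calls could in principle run in any order or concurrently, which is the property the shared-member version lacks.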
I made a video talking about this, which I may or may not post because it rambles quite a bit. I am hoping to have working code tomorrow, along with an in-depth discussion of how I got around this and a few other issues. But for now, it is back to building the websites and scratching my head as I stare at terminals.
Cheers,
TEJ