Skip to main content

Here things are.

     Okay, this is the inaugural post of the Afterglow project development blog.  I have been pushing hard the last couple of weeks to get the CUDA-accelerated version of boxfit to into a state where it can be alpha-tested.  The main issue is in trying to deal with the way variables are allocated in classes.  There are a lot of functions that call class objects which are just doubles or ints, with the idea being that they can be passed between functions without actually having to be explicitly passed.

     This works quite well when each object is allocated and requested sequentially for a thread, or when each thread has an independent instance of that object.  The problem with porting the code to CUDA is that the objects become shared among the threads, so each thread attempts to assign its own value to the variable and everything would go up in flames were it not for the compiler catching the code as incompatible.

     I made a video talking about this which I may or may not post because it rambles quite a bit, but I am hoping to have a working code tomorrow, as well as an in-depth discussion of how I got around this and a few other issues.  But for now, it is back to building the websites and scratching my head as I stare at terminals.

Cheers,

TEJ

Comments

Popular posts from this blog

Update: A D'Oh moment, and the curious case of static member classes.

     So last night, I was laying in bed thinking about the myriad of issues I had been having with functions pointing to member objects and multiple threads trying to assign values to those objects, when I realized something.  The part I was attempting to parallelize was a waste because there were multiple serial functions embedded in them.  What actually made sense was to move further into the code and instead of trying to parallelize the flux calculation in time, it made more sense to parallelize the spatial calculations because those do not have the same dependencies on objects.      This morning has consisted of implementing that code, and tagging the required functions as device runnable code, going back and catching typos, and then relocating the kernel because __global__ functions cannot be called as class types.  I was feeling great about this and was quite confident as I keyed in the make arguments, when... Whoops......

Synchotron Self-Compton Radiation and some multithreading

  Progress has been made since the last post.  My modifications to boxfit now allow for basic Inverse-Compton radiation.  Here is a reference spectrum generated using the shipped settings.  The current method uses the definition of the inverse compton parameter (Y) laid out in Nakar et. al. Apj, 703, 675, and functions for the slow cooling regime mainly, with placeholders in the other regimes. The orange is the SSC enabled spectrum, and it is behaving exactly as expected above the cooling break.   The Next step is to get the proper parameterization for Y based on Nakar et. al. as well as Beniamini et al. MNRAS, 454, 1073B.  This includes the Klein-Nishina effect at higher frequencies.  I do worry a bit about how computationally expensive this will be, but I can't really speak to optimizations until I have a better idea of what the algorithm is going to look like.   I am still working on the CUDA port, but I haven't had much time to think abo...

More adventures in Inverse-Compton

Had to do some back pedaling on implementing Inverse-Compton.  The Klein-Nishina effects are not as important for most observations, and there was some strangeness in the fast cooling regime.   The main issue was that the cooling frequency was not suppressed as one would expect and this was due to stupidity in the way I originally implemented it.  Here are some new results for physical parameters E = 10^52 ergs, p=2.5, theta_0= 0.2 rad, theta_0bs=0, n= 5, e_e=1, e_b = 0.01, and ksi_n=1.  We tested in the fast cooling regime using the standard expression for Y (Full derivation to follow) and in the slow cooling regime using an approximation of the formula derived in Beniamini et. al. 2015 (arxiv:1504.04833v2). The red line is the minimum accelerated electron emitted frequency, the dashed line is the IC suppressed cooling frequency, and the blue line is the un-suppressed cooling frequency.  We originally had a debate about whether the fast cooling spect...