“It is more efficient to compute the child index of the current node inside the parent node and write the bounds when available. The previous code could load up to 16 AABBs to compute the new ones. The new code also only needs 1/7 of the previously used scratch memory. The new code seems to be around 30% faster (0.5ms) in GOTG on a 6700XT.”
Shitty, misleading clickbait headline though. A 0.5ms improvement probably won't translate to 30% more fps with raytracing on RDNA2 cards, as the “30% faster on RDNA2” in the headline suggests.
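To put rough numbers on that, here is a quick back-of-the-envelope sketch. The baseline frame rates are my own assumptions (the commit message only gives the 0.5ms figure, not the fps GOTG was running at), and `fps_after_saving` is just a helper name made up for this example:

```python
def fps_after_saving(baseline_fps: float, saved_ms: float) -> float:
    """Frame rate after shaving `saved_ms` off every frame."""
    frame_ms = 1000.0 / baseline_fps        # time per frame before the change
    return 1000.0 / (frame_ms - saved_ms)   # time per frame after, back to fps


# Assumed baseline frame rates; 0.5 ms is the saving quoted in the commit message.
for fps in (30, 60, 120):
    new_fps = fps_after_saving(fps, 0.5)
    gain = (new_fps / fps - 1) * 100
    print(f"{fps} fps -> {new_fps:.1f} fps (+{gain:.1f}%)")
```

Under those assumptions the whole-frame gain is only a few percent (roughly 3% at 60 fps), so the 30% would apply to that one step of the acceleration structure build, not to the frame rate.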