Compute-Intensive Kernel Hits NaN #71
Comments
@elliottslaughter Do you think this problem can be linked to #69?
@hyviquel It shouldn't have anything to do with that issue since the results of the compute bound kernel are thrown away, and the "result" that is actually written into the output region is instead a tuple containing the timestep and point (column) of the task.
@frobnitzem Thanks for pointing this out. Overall this looks good to me. I don't think it will change the practical results on any platforms we tested (since we already verified that we hit peak FLOPS), but it is true that a future system might hypothetically add an early-out for NaNs (which would then cause Task Bench to over-report its achieved FLOPS).
I'm happy to take a PR on this or may get back to it myself in a week or so. (Currently digging myself out of things that have been piling up since the break.)
I agree; my timings were the same after changing the code. I didn't make a
PR because one of the AVX cases doesn't have a clear fix.
On Thu, Jan 7, 2021, 3:41 PM, Elliott Slaughter wrote the comment shown above.
This kernel (task-bench/core/core_kernel.cc, line 234 in e241152) repeatedly applies
A = A*A + A (i.e. fmadd(A, A, A))
which quickly escalates to A=NaN.
To avoid this, the kernel can be changed to
A = -A*A + A (i.e. fnmsub(A, A, A)).
This makes the iteration equivalent to the logistic map A → r·A·(1 − A) with r = 1.
Adjusting the initial condition to A = 0.7 (or anything between 0 and 1) makes the iterations converge slowly to 0 over time.
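To make the difference between the two updates concrete, here is a minimal scalar sketch. It is not the Task Bench kernel: the function names `iterate_fmadd` and `iterate_logistic` are hypothetical, and `std::fma` stands in for whatever vector FMA instruction the real kernel uses. One naming caveat: with x86 FMA intrinsics the operation -(a*b) + c is called fnmadd (e.g. `_mm256_fnmadd_pd`), while fnmsub there computes -(a*b) - c. Instruction sets differ on this, so the sign convention is worth checking for each target.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper, not part of Task Bench.
// Applies the current update A = A*A + A `iterations` times.
double iterate_fmadd(double a, int iterations) {
  assert(iterations >= 0);
  for (int i = 0; i < iterations; ++i) {
    a = std::fma(a, a, a);  // a*a + a, grows without bound for a > 0
  }
  return a;
}

// Hypothetical helper, not part of Task Bench.
// Applies the proposed update A = -A*A + A (logistic map with r = 1).
double iterate_logistic(double a, int iterations) {
  assert(iterations >= 0);
  for (int i = 0; i < iterations; ++i) {
    a = std::fma(-a, a, a);  // -(a*a) + a == a * (1 - a)
  }
  return a;
}
```

Starting from 0.7, `iterate_fmadd(0.7, 64)` ends up non-finite, whereas `iterate_logistic(0.7, 1000)` stays finite, positive, and small, so the FMA units keep operating on ordinary values for the whole run.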