This repository has been archived by the owner on Jun 4, 2018. It is now read-only.
Blocking thread accesses can be achieved with the directive schedule(static,1) on the outermost loop. Because consecutive z planes are then handed to different threads, a thread processing one z plane can reuse some of the y and x planes that are already in cache.
Wave Equation Based Stencil Optimizations on Multi-core CPU - Muhong Zhou and William W. Symes, Rice University – Section: Reducing L3 Cache Misses – Blocking thread accesses
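As a rough illustration of the scheduling clause described above, here is a minimal sketch of a 7-point stencil sweep with schedule(static,1) on the outermost (z) loop. The function name `stencil_sweep`, the `IDX` macro, the flat x-fastest array layout, and the unit coefficients are all assumptions made for this example; they are not taken from the paper or from this repository.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flat index into an nx*ny*nz grid, x varying fastest. */
#define IDX(x, y, z, nx, ny) \
    ((size_t)(z) * (size_t)(ny) * (size_t)(nx) + (size_t)(y) * (size_t)(nx) + (size_t)(x))

/* Sums each interior point with its six face neighbours.
   Boundary points of `out` are left untouched. */
void stencil_sweep(const float *in, float *out, int nx, int ny, int nz)
{
    assert(nx >= 3 && ny >= 3 && nz >= 3);

    /* schedule(static,1) deals z iterations out round-robin, one at a time,
       so neighbouring z planes are processed by different threads at about
       the same time and the planes they share may still be in cache.
       Without OpenMP enabled the pragma is ignored and the loop runs serially. */
    #pragma omp parallel for schedule(static, 1)
    for (int z = 1; z < nz - 1; ++z) {
        for (int y = 1; y < ny - 1; ++y) {
            for (int x = 1; x < nx - 1; ++x) {
                out[IDX(x, y, z, nx, ny)] =
                    in[IDX(x, y, z, nx, ny)] +
                    in[IDX(x - 1, y, z, nx, ny)] + in[IDX(x + 1, y, z, nx, ny)] +
                    in[IDX(x, y - 1, z, nx, ny)] + in[IDX(x, y + 1, z, nx, ny)] +
                    in[IDX(x, y, z - 1, nx, ny)] + in[IDX(x, y, z + 1, nx, ny)];
            }
        }
    }
}
```

Whether this actually reduces L3 misses depends on the plane size, the thread count, and how the cache is shared, so it is worth measuring against the default schedule.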
This modifies the array access pattern by applying loop fission to the innermost loop and regrouping the accesses by stride. Beyond that, the change helps reduce register pressure in the vectorized code.
Borges, L., 2011, 3d finite differences on multi-core processors. (available online at [https://software.intel.com/en-us/articles/3d-finite-differences-on-multi-core-processors](https://software.intel.com/en-us/articles/3d-finite-differences-on-multi-core-processors)).
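One way to read the fission idea is sketched below: the single innermost loop of a 7-point stencil is split into three passes, each touching neighbours at one stride only (unit stride in x, then the y stride, then the z stride). This is an interpretation for illustration, not code from the cited article; the function name `stencil_sweep_fissioned`, the unit coefficients, and the flat x-fastest layout are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* 7-point stencil with the innermost loop fissioned by access stride.
   Boundary points of `out` are left untouched. */
void stencil_sweep_fissioned(const float *in, float *out, int nx, int ny, int nz)
{
    assert(nx >= 3 && ny >= 3 && nz >= 3);

    const ptrdiff_t sy = nx;                  /* distance between y neighbours */
    const ptrdiff_t sz = (ptrdiff_t)nx * ny;  /* distance between z neighbours */

    for (int z = 1; z < nz - 1; ++z) {
        for (int y = 1; y < ny - 1; ++y) {
            const float *row = in + z * sz + y * sy;
            float *dst = out + z * sz + y * sy;

            /* Pass 1: unit-stride neighbours along x. */
            for (int x = 1; x < nx - 1; ++x)
                dst[x] = row[x - 1] + row[x] + row[x + 1];

            /* Pass 2: neighbours one row away (stride sy). */
            for (int x = 1; x < nx - 1; ++x)
                dst[x] += row[x - sy] + row[x + sy];

            /* Pass 3: neighbours one plane away (stride sz). */
            for (int x = 1; x < nx - 1; ++x)
                dst[x] += row[x - sz] + row[x + sz];
        }
    }
}
```

Each pass reads only a few input streams, so a vectorizing compiler has fewer live values to keep in registers at once; the cost is that the output row is written and re-read several times, which is usually cheap while the row stays in L1.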