How to speed up OpenCL compilation times? #6628

knzivid · 2022-02-22T15:23:55Z

Hi :) In our Halide applications, we see that OpenCL compile times can be quite high, especially on Intel GPUs. After startup, it takes 6 seconds before our application is able to use the Halide pipelines. We use almost all of the kernels in one go, so lazy-loading kernels is not an option. Ultimately, our goal is to reduce the time between application start and the first run of a full Halide pipeline. Do you have any tips for improving this? Is it possible to pre-compile the OpenCL kernels to multiple IRs, such as SPIR or PTX, and load them at runtime if the driver supports them? If this is not currently supported, where would be the right place to implement it?

Background: Our Halide pipeline is written in Python and is consumed by our applications using the output of compile_to_static_library. Our moderate-sized Halide pipeline boils down to 28 clBuildProgram calls at runtime. The plain-text kernel sources total 1.3 MiB. All clBuildProgram calls together take about 6 seconds, with the largest kernels taking ~0.75 seconds each. We measured this on Intel GPUs on Linux using Halide's OpenCL target. Other GPU vendors may have faster clBuildProgram times, but we do not restrict which GPU or OpenCL driver can be used with the Halide OpenCL backend.
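To make the pre-compilation idea more concrete, here is a rough sketch of the kind of on-disk binary cache we have in mind. It is only an illustration and not something Halide provides as far as we know. The names cache_key, get_binary and build_from_source are hypothetical. In a real integration, build_from_source would stand in for clBuildProgram followed by clGetProgramInfo with CL_PROGRAM_BINARIES, and the returned bytes would be handed to clCreateProgramWithBinary on later runs. Because such binaries are specific to the device and driver, the sketch assumes the device name and driver version are part of the cache key.

```python
import hashlib
import os
import tempfile


def cache_key(source: str, device_name: str, driver_version: str) -> str:
    """Hash everything that can invalidate a compiled program binary."""
    h = hashlib.sha256()
    for part in (source, device_name, driver_version):
        data = part.encode("utf-8")
        # Length-prefix each part so ("ab", "c") and ("a", "bc") hash differently.
        h.update(len(data).to_bytes(8, "little"))
        h.update(data)
    return h.hexdigest()


def get_binary(source, device_name, driver_version, cache_dir, build_from_source):
    """Return (binary, was_cached) for one kernel source.

    build_from_source is a hypothetical callable (source text -> bytes).
    It stands in for the slow path: compile the source with the driver
    and read the device binary back.
    """
    os.makedirs(cache_dir, exist_ok=True)
    key = cache_key(source, device_name, driver_version)
    path = os.path.join(cache_dir, key + ".bin")

    # Fast path: a binary for this exact source/device/driver already exists.
    try:
        with open(path, "rb") as f:
            return f.read(), True
    except FileNotFoundError:
        pass

    # Slow path: compile once, then publish the result atomically so a
    # concurrent reader never sees a half-written file.
    binary = build_from_source(source)
    fd, tmp_path = tempfile.mkstemp(dir=cache_dir)
    with os.fdopen(fd, "wb") as f:
        f.write(binary)
    os.replace(tmp_path, path)
    return binary, False
```

With something like this, only the first start after an install or driver update would pay the full compile cost. A driver may still reject a cached binary, so a real implementation would need to fall back to building from source when loading the binary fails.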

Replies: 0 comments
