OptiX testrender overhaul #1829

tgrant-nv · 2024-06-10T20:58:24Z

Description

This PR adds support for full path tracing in the OptiX mode of testrender, including full BSDF sampling and evaluation. The benefit of this change is that we're able to render most of the scenes from the testsuite in OptiX mode with results that closely match the host output. This comes at the cost of increased coupling between the host and OptiX renderers, and therefore an increased maintenance burden.

ID-based dispatch

The main difference between the host and OptiX paths is how the individual BSDFs are evaluated in the CompositeBSDF class. Virtual function calls aren't well supported in OptiX, so rather than using regular C++ polymorphism to invoke the sample(), eval(), and get_albedo() functions for each of the BSDF sub-types, we manually invoke the correct function based on the closure ID (which we have added as a member of the BSDF class).

    // from shading_cuda.cpp

    #define BSDF_CAST(BSDF_TYPE, bsdf) reinterpret_cast<const BSDF_TYPE*>(bsdf)

    OSL_HOSTDEVICE Color3
    CompositeBSDF::get_bsdf_albedo(const BSDF* bsdf, const Vec3& wo) const
    {
        ...
        switch (bsdf->id) {
        case DIFFUSE_ID:
            albedo = BSDF_CAST(Diffuse<0>, bsdf)->get_albedo(wo);
            break;
        case TRANSPARENT_ID:
        case MX_TRANSPARENT_ID:
            albedo = BSDF_CAST(Transparent, bsdf)->get_albedo(wo);
            break;
        ...
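
For readers who want to try the pattern in isolation, here is a minimal, self-contained sketch of ID-based dispatch. It is not the code from this PR: the `Bsdf`, `DiffuseLobe`, and `MirrorLobe` types, the two-entry enum, and the scalar albedo are hypothetical stand-ins, and it uses `static_cast` on a derived type where the PR's `BSDF_CAST` macro uses `reinterpret_cast`.

```cpp
#include <cassert>

// Hypothetical closure IDs for this sketch; the real IDs live in testrender.
enum ClosureID { DIFFUSE_ID, MIRROR_ID };

// Base record: no virtual functions, only a tag naming the concrete type.
struct Bsdf {
    ClosureID id;
    explicit Bsdf(ClosureID i) : id(i) {}
};

// Hypothetical lobe with a constant reflectance.
struct DiffuseLobe : Bsdf {
    float reflectance;
    explicit DiffuseLobe(float r) : Bsdf(DIFFUSE_ID), reflectance(r) {}
    float get_albedo() const { return reflectance; }
};

// Hypothetical perfect mirror.
struct MirrorLobe : Bsdf {
    MirrorLobe() : Bsdf(MIRROR_ID) {}
    float get_albedo() const { return 1.0f; }
};

// Dispatch on the stored ID instead of going through a vtable.
inline float get_bsdf_albedo(const Bsdf* bsdf)
{
    switch (bsdf->id) {
    case DIFFUSE_ID:
        return static_cast<const DiffuseLobe*>(bsdf)->get_albedo();
    case MIRROR_ID:
        return static_cast<const MirrorLobe*>(bsdf)->get_albedo();
    }
    return 0.0f;  // unknown ID: contribute nothing
}
```

The cost of this approach is that every new BSDF type needs a new `case` in each dispatch function, which is the maintenance trade-off mentioned in the description.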

Iterative closure evaluation

Another key difference from the host path is the non-recursive closure evaluation. We retain the same style of iterative tree traversal used in the previous OptiX version of process_closure(). This PR also adds evaluate_layer_opacity(), process_medium_closure(), and process_background_closure(), which follow the same evaluation pattern.
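
As a rough illustration of what an iterative traversal looks like, the sketch below walks a small closure-like tree with an explicit fixed-size stack instead of recursion. The `Node` layout, the `sum_weights()` name, and the stack depth of 16 are assumptions made for this example only; the real closure types and traversal functions in testrender differ.

```cpp
#include <cassert>

// Hypothetical node kinds: a leaf component, a scaled child, or a sum.
enum NodeType { COMPONENT, MUL, ADD };

struct Node {
    NodeType type;
    float weight;   // COMPONENT: lobe weight; MUL: scale factor; ADD: unused
    const Node* a;  // MUL: the child; ADD: first child
    const Node* b;  // ADD: second child
};

// Sum the weighted leaves without recursion, using an explicit stack.
inline float sum_weights(const Node* root)
{
    struct Item {
        const Node* node;
        float weight;
    };
    const int STACK_SIZE = 16;  // assumed upper bound on pending ADD branches
    Item stack[STACK_SIZE];
    int top = 0;

    float total     = 0.0f;
    const Node* cur = root;
    float w         = 1.0f;  // weight accumulated along the current path
    while (cur) {
        switch (cur->type) {
        case MUL:
            w *= cur->weight;
            cur = cur->a;
            break;
        case ADD:
            // Defer the second branch, remembering the weight at this point.
            assert(top < STACK_SIZE);
            stack[top++] = { cur->b, w };
            cur          = cur->a;
            break;
        case COMPONENT:
            total += w * cur->weight;
            cur = nullptr;
            break;
        }
        // When a path ends, resume with the most recently deferred branch.
        if (!cur && top > 0) {
            --top;
            cur = stack[top].node;
            w   = stack[top].weight;
        }
    }
    return total;
}
```

A fixed-size stack keeps memory use predictable on the device, at the price of a hard limit on how many branches can be pending at once.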

`subpixel_radiance()`

The raytracing pipeline mirrors the host code very closely, including camera ray generation and the spawning of secondary rays. This allows a close visual match between the host and OptiX modes.

We've implemented a CUDA version of subpixel_radiance() (in optix_raytracer.cu) that closely mirrors the host version, with the main difference being in how rays are traced and how the shaders are executed. It might be possible to unify the implementations if it would ease the maintenance burden, but for now it seemed cleaner to leave them separate.

Background sampling

We've included support for background closures. This includes a CUDA implementation of the Background::prepare() function. We've broken that function into three phases, where phases 1 and 3 are parallelized across a warp and phase 2 is executed on a single thread. This offers a decent speedup over a single-threaded implementation without the complexity of a more sophisticated implementation.

    // from background.h
    
    template<typename F>
    OSL_HOSTDEVICE void prepare_cuda(int stride, int idx, F cb)
    {
        prepare_cuda_01(stride, idx, cb);
        if (idx == 0)
            prepare_cuda_02();
        prepare_cuda_03(stride, idx);
    }
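
To make the stride/idx pattern concrete, here is a host-only sketch of a three-phase table build. What each phase does here (fill values, running sum, normalize) is an assumption chosen for illustration, not a description of what prepare_cuda_01/02/03 actually compute; `Table`, `phase1`..`phase3`, and `prepare_serially` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Table {
    std::vector<float> values;  // filled in phase 1
    std::vector<float> cdf;     // running sum from phase 2, normalized in phase 3
    float total;

    explicit Table(int n) : values(n), cdf(n), total(0.0f) {}

    // Phase 1 (parallel): participant idx fills every stride-th entry.
    template<typename F> void phase1(int stride, int idx, F cb)
    {
        for (int i = idx; i < static_cast<int>(values.size()); i += stride)
            values[i] = cb(i);
    }

    // Phase 2 (serial): each entry depends on all earlier ones.
    void phase2()
    {
        total = 0.0f;
        for (std::size_t i = 0; i < values.size(); ++i) {
            total += values[i];
            cdf[i] = total;
        }
    }

    // Phase 3 (parallel): entries are independent again.
    void phase3(int stride, int idx)
    {
        for (int i = idx; i < static_cast<int>(cdf.size()); i += stride)
            cdf[i] /= total;
    }
};

// Host stand-in for a group of `stride` cooperating threads: run each phase
// for every idx before moving on to the next phase.
template<typename F> void prepare_serially(Table& t, int stride, F cb)
{
    for (int idx = 0; idx < stride; ++idx)
        t.phase1(stride, idx, cb);
    t.phase2();
    for (int idx = 0; idx < stride; ++idx)
        t.phase3(stride, idx);
}
```

Running the phases in order on the host gives the ordering between them for free. On a GPU, each phase reads data written by other threads in the previous phase, so some form of synchronization between phases would be needed.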

Tests

Checklist:

I have read the contribution guidelines.
I have updated the documentation, if applicable.
I have ensured that the change is tested somewhere in the testsuite (adding new test cases if necessary).
My code follows the prevailing code style of this project. If I haven't already run clang-format v17 before submitting, I definitely will look at the CI test that runs clang-format and fix anything that it highlights as being nonconforming.

aconty · 2024-07-16T11:07:08Z

src/testrender/cuda/optix_raytracer.cu

+// Adapted from Sphere::sample in ../raytracer.h
+static __device__ float3
+sample_sphere(const Vec3& x, const SphereParams& sphere, float xi, float yi,
+              float& pdf)


Any chance we can call the original code to avoid duplication? Even if creating a temporary Sphere, the compiler should optimize out cruft.

That is a possibility I had considered. I'll see how well it works in practice.

aconty · 2024-07-16T11:22:43Z

src/testrender/shading.cpp

-    Sample sample(const Vec3& /*wo*/, float rx, float ry,
-                  float /*rz*/) const override
+    OSL_HOSTDEVICE Sample sample(const Vec3& /*wo*/, float rx, float ry,
+                  float /*rz*/) const OSL_HOSTDEVICE_OVERRIDE


Does this OSL_HOSTDEVICE_OVERRIDE mean you are keeping the virtual calls on CPU?

Hmm, good catch. I'll get rid of the virtual calls.

aconty · 2024-07-16T11:31:35Z

src/testrender/shading.cpp

+        sample = BSDF_CAST(Diffuse<1>, bsdf)->eval(wo, wi);
+        break;
+    case PHONG_ID: sample = BSDF_CAST(Phong, bsdf)->eval(wo, wi); break;
+    case WARD_ID: sample = BSDF_CAST(Ward, bsdf)->eval(wo, wi); break;


Trying to wrap my mind around this, is this switch/case particular to CPU?

No, the CompositeBSDF::get_albedo, CompositeBSDF::eval, and CompositeBSDF::sample functions are shared between the CPU and CUDA paths.

tgrant-nv · 2024-07-24T17:44:46Z

I've converted this PR to a draft, pending Chris's mesh support.

@fpsunflower, I should point out that testrender does already run in OptiX mode, but it only supports simple diffuse shading. This PR is to enable full path tracing. I can definitely help out with the OptiX side of mesh support, so don't hesitate to reach out.

fpsunflower · 2024-07-24T18:41:54Z

Will do! I've pushed my branch so far to https://github.com/fpsunflower/OpenShadingLanguage/tree/testrender-triangles in case you want to take a look. I was working on this on my Mac, so the existing OptiX path is most likely broken.

We should definitely circle back on this after Siggraph. There are a few more loose ends to fix up around derivatives (which are technically not set up right in the current implementation either).

fpsunflower · 2024-07-24T18:43:41Z

Also, that branch adds a few scenes to the testsuite, but GitHub complained about the files being too large -- we'll probably want to remove those from the history before merging and find smaller models to use instead.

tgrant-nv · 2024-07-24T18:59:51Z

Sounds good. I'll check out your branch and see what needs to be done to get OptiX up and running again. This PR includes a refactor of the OptixRaytracer class that we can fold into the mesh effort.

lgritz · 2024-07-24T22:06:56Z

> Also, that branch adds a few scenes to the testsuite, but GitHub complained about the files being too large -- we'll probably want to remove those from the history before merging and find smaller models to use instead.

Much like how OIIO has oiio-images and OpenEXR has openexr-images, OSL could also have a second repo just for "big things only used for tests" so it doesn't clutter the main repo or make a big download for people who just want to inspect or build the code. The testing scripts can enable certain tests only if that other repo is found at build time in a sibling directory.

tgrant-nv · 2024-08-20T17:23:22Z

@fpsunflower, please check out the testrender-triangles-optix branch from my fork to see what I've done to get the OptiX path working. It includes most of the host-side refactoring from this PR. Having only one primitive type is a nice simplification.

fpsunflower · 2024-08-21T21:46:06Z

I haven't gone through your branch in great detail, but it sounds like you got everything working already, which is great :) What would you say is the best path forward in terms of sequencing the changes into the main repo?

I was thinking I should probably put forward a first PR to move testrender over to triangles. It would temporarily break the OptiX side, but it looks like we could probably follow up quickly with your changes.

@lgritz would that be ok? Or would you prefer if Tim and I coordinate to make a single PR that submits both things in one go?

lgritz · 2024-08-21T23:36:15Z

I'm ok doing it in stages. It's ok to temporarily break things as long as we are confident that part 2 is coming right on its heels.

fpsunflower · 2024-08-22T17:31:25Z

Sounds good. When I get some cycles I will polish off my end of the branch and submit a PR.

Do you want me to prepare splitting out the shaderball test into its own repo (in preparation for maybe having more big obj files and textures)?

With maps/geo, the current render-shaderball folder is 61 MB. For comparison, the rest of the testsuite folder is ~42 MB, so it's more than doubling the size for just one test.

At the same time, I'm not sure it's really worth the hassle to create a second repo + extra cmake hoops to jump through (same amount of data to download in the end, and a slightly clunkier workflow). We'll just have to be somewhat disciplined in what we add to the testsuite.

I can also punt and skip adding that particular test in the initial PR.

lgritz · 2024-08-22T19:43:27Z

> Do you want me to prepare splitting out the shaderball test into its own repo (in preparation for maybe having more big obj files and textures)?
>
> With maps/geo, the current render-shaderball folder is 61 MB. For comparison, the rest of the testsuite folder is ~42 MB, so it's more than doubling the size for just one test.
>
> At the same time, I'm not sure it's really worth the hassle to create a second repo + extra cmake hoops to jump through (same amount of data to download in the end, and a slightly clunkier workflow). We'll just have to be somewhat disciplined in what we add to the testsuite.

I feel like with the addition of meshes and surely other features to come, it's just a matter of time before we're going to want to add a number of bigger tests. So I'm inclined to bite the bullet and just set it up now. Most tests will continue to live in the main repo, but large data files referenced by tests -- say, more than a couple MB -- should live in the test data repo.

This is analogous to the OpenEXR-images or OpenImageIO-images repos, which are used to store large things used only for tests, to keep them from cluttering the main repo or forcing people to download and store them merely to build the main project.

Maybe ask John to create a new repo for this purpose. OpenShadingLanguage-tests? OpenShadingLanguage-data? -testdata? -models? -scenes? I'm open to suggestions.

tgrant-nv added 30 commits June 10, 2024 13:29

Crudely refactor make_optix_materials.

8deb471

Continue refactoring the OptiX pipeline setup.

4a85d18

Tweaks to allow including shading.h/shading.cpp in wrapper.cu.

5475f08

FIX: Use the correct OptiX call to retrieve t_hit.

79d39d6

Basic pathtracing working on the GPU.

e7adbe5

Add PDF calculation for self-emission.

c97b0c0

Make the sampling match the CPU path more closely.

418e916

Add some vector casting macros.

4cff2d8

Trace light rays to both quad and sphere prims.

20d452b

Separate out subpixel_radiance function to make anti-aliasing easier.

4b16643

Enable anti-aliasing.

c6a2240

Enable show_albedo_scale. Misc. cleanup.

8f4bb85

Add "precise" versions of the quad and sphere intersection programs that use the higher-precision intrinsics.

4c63724

Don't create hitgroups for geometry types that aren't present in the scene.

9e54203

Tweak the UV calculation for the sphere.

a34ccc9

More BSDFs sort of working (Ward yes, Phong no).

a3f7e85

Make the surface area computation for the GPU sphere match the CPU.

0c92f24

Record the backfacing property for quad hits.

b0229ed

Rework the light ray tracing a bit.

4f4ccce

Pass max bounces through the render params.

7dcc2b2

Add cases from FRESNEL_REFLECTION_ID.

e17b590

FIX: Adjust the size needed calculation in closure_component_allot.

fd0d6cc

Use the correct case labels for REFRACTION_ID. Use the correct get_albedo functions for the BSDF sub-types.

d976332

Changes to make sphere refraction work. Pass in the last hit ID to allow the self-intersection test.

d391c73

No need to redefine the OSL_HOSTDEVICE macro.

77485d3

Hack in the dPdx/dPdy calculation, to allow things like calculatenormal to work.

14445e4

Add basic support for the Microfacet BSDF.

ccba6f7

Add initial support for Background sampling on the GPU. Add support for CUDA mipmapped textures in testrender. Update the signature of osl_tex2DLookup in testshade.

48adb13

Change license notice for upper_bound.

a0b9477

Use the full-precision intrinsics for some calculations in Background.

ddb401d

tgrant-nv added 11 commits June 17, 2024 16:12

Make process_medium_closure iterative.

88cd20c

Make evaluate_layer_opacity iterative.

f329ba1

Make process_bsdf_closure iterative.

2c60ac9

Make process_background_closure iterative.

f6d65d5

Make the CUDA path use the host closure evaluation functions.

9b7c31c

"Fix" the closure id in the MX_CONDUCTOR_ID case in process_bsdf_closure.

7958fe1

Get rid of the closure evaluation code in optix_raytracer.cu.

1a71f24

Switch the host path to basic ID-based dispatch.

33d3576

Simplify 'tracedata' handling by using explicit object IDs. Encapsulate the tracedata into a TraceData type.

e73c4c3

Tuck TraceData into CudaScene::intersect.

6bb82fe

Share SimpleRaytracer::eval_background between host and CUDA.

41ffaf9

aconty reviewed Jul 16, 2024


tgrant-nv added 4 commits July 16, 2024 16:29

Get rid of the vestigial virtual function calls.

ddcfd22

Use the Sphere and Quad intersect functions on the GPU.

02bb427

Use the Sphere and Quad uv functions on the GPU.

db46f09

Use the Sphere and Quad sample and shapepdf functions on the GPU.

eb5a865

tgrant-nv marked this pull request as draft July 24, 2024 17:39

tgrant-nv mentioned this pull request Sep 21, 2024

Fix testrender OptiX build #1869

Merged
