.look_at() in step 9 of the inverse caustics example failing with mitsuba 3.6 #1440

Closed
tstigen opened this issue Dec 23, 2024 · 5 comments

Comments

tstigen commented Dec 23, 2024

Summary

I saw that the v3.6.0 update could yield 10-20x speedups, so I wanted to try it. I installed a new environment, but the inverse caustics optimizer example fails when I try to run it.

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[18], line 2
      1 # Looking at the receiving plane, not looking through the lens
----> 2 sensor_to_world = mi.ScalarTransform4f.look_at(
      3     target=mi.ScalarPoint3f([0, -20, 0]),
      4     origin=mi.ScalarPoint3f([0, -4.65, 0]),
      5     up=mi.ScalarPoint3f([0, 0, 1])
      6 )
      7 resx, resy = config['render_resolution']
      8 sensor = {
      9     'type': 'perspective',
     10     'near_clip': 1,
   (...)
     28     },
     29 }

TypeError: look_at(): incompatible function arguments. The following argument types are supported:
    1. look_at(self, origin: mitsuba.ScalarPoint3f, target: mitsuba.ScalarPoint3f, up: mitsuba.ScalarPoint3f) -> mitsuba.ScalarTransform4f

Invoked with types: kwargs = { target: mitsuba.ScalarPoint3f, origin: mitsuba.ScalarPoint3f, up: mitsuba.ScalarPoint3f }

System configuration

System information:

OS: windows 10 64 bit
CPU: Intel 7700k
GPU: nVidia 1080
Python version: 3.12.8
LLVM version: ...
CUDA version: 12.6.65
NVidia driver: 561.17

Dr.Jit version: 1.01
Mitsuba version: 3.60
Compiled with: from pip
Variants compiled: ...
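
For readers filling in a similar report, the exact installed version strings can be read from package metadata without importing the renderer. The sketch below assumes the pip distribution names are `mitsuba` and `drjit`; the helper names `installed_version` and `version_report` are hypothetical and not part of either library.

```python
from importlib import metadata


def installed_version(dist_name: str) -> str:
    """Return the installed version of a pip distribution, or a placeholder."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return "not installed"


def version_report(dist_names=("mitsuba", "drjit")) -> dict:
    """Map each distribution name (assumed names) to its version string."""
    return {name: installed_version(name) for name in dist_names}


if __name__ == "__main__":
    for name, version in version_report().items():
        print(f"{name}: {version}")
```

Reading the metadata rather than importing the packages means the report still works when the import itself is what fails.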

Description

Steps to reproduce

  1. Run the inverse caustics optimization tutorial up to the 'Assembling the scene' step; the .look_at() call there raises the TypeError shown above.

lnuic (Contributor) commented Dec 23, 2024

Dear @tstigen,

It appears that you are using an older version of the mitsuba-tutorials.

In the current master version of mitsuba-tutorials, the relevant section has been updated to:

sensor_to_world = mi.ScalarTransform4f().look_at(
    target=[0, -20, 0],
    origin=[0, -4.65, 0],
    up=[0, 0, 1]
)
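
The signature printed in the error lists `self` as the first parameter, which suggests that `look_at` is being treated as an instance method. The sketch below illustrates that mechanism in plain Python only. The `Transform` class is a hypothetical stand-in, not the Mitsuba type, and the exact exception text from the real bindings differs.

```python
class Transform:
    """Hypothetical stand-in for a transform type; NOT the Mitsuba class."""

    def __init__(self, description="identity"):
        self.description = description

    def look_at(self, origin, target, up):
        # Instance method: `self` must be bound before the keywords are matched.
        return Transform(f"look_at(origin={origin}, target={target}, up={up})")


kwargs = dict(origin=[0, -4.65, 0], target=[0, -20, 0], up=[0, 0, 1])

# Calling through the class with only keyword arguments leaves `self` unfilled.
try:
    Transform.look_at(**kwargs)
    class_call_failed = False
except TypeError:
    class_call_failed = True

# Calling through an instance binds `self`, so the same keywords are accepted.
result = Transform().look_at(**kwargs)
```

In this stand-in, the only difference between the failing and the working call is the pair of parentheses that constructs an instance first, which mirrors the difference between the two snippets in this thread.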

tstigen (Author) commented Dec 23, 2024

Sorry, I think I had changed a few things while trying to get it to work, to see whether it needed a different type. Even with the code taken directly from the website, I get this error.

TypeError                                 Traceback (most recent call last)
Cell In[9], line 2
      1 # Looking at the receiving plane, not looking through the lens
----> 2 sensor_to_world = mi.ScalarTransform4f.look_at(
      3     target=[0, -20, 0],
      4     origin=[0, -4.65, 0],
      5     up=[0, 0, 1]
      6 )
      7 resx, resy = config['render_resolution']
      8 sensor = {
      9     'type': 'perspective',
     10     'near_clip': 1,
   (...)
     28     },
     29 }

TypeError: look_at(): incompatible function arguments. The following argument types are supported:
    1. look_at(self, origin: mitsuba.ScalarPoint3f, target: mitsuba.ScalarPoint3f, up: mitsuba.ScalarPoint3f) -> mitsuba.ScalarTransform4f

Invoked with types: kwargs = { target: list, origin: list, up: list }

tstigen (Author) commented Dec 23, 2024

I fully rebooted and that section runs now. Not sure what was going on. However, it now fails at the optimization step. This could be related to the driver version, though. I had success running on driver 531 and was hoping that 561 plus the new versions of Mitsuba and Dr.Jit might play nice together, but it appears the 1080 + driver version issue may still exist. I'll revert to 531, see if that makes it work, and report back.

import time
start_time = time.time()
mi.set_log_level(mi.LogLevel.Warn)
iterations = config['max_iterations']
loss_values = []
spp = config['spp']

for it in range(iterations):
    t0 = time.time()
    
    # Apply displacement and update the scene BVH accordingly
    apply_displacement()

    # Perform a differentiable rendering of the scene
    image = mi.render(scene, params, seed=it, spp=2 * spp, spp_grad=spp)

    # Scale-independent L2 function
    loss = scale_independent_loss(image, image_ref)

    # Back-propagate errors to input parameters and take an optimizer step
    dr.backward(loss)
    
    # Take a gradient step
    opt.step()
    
    # Increase resolution of the heightmap
    if it in upsampling_steps:
        opt['data'] = dr.upsample(opt['data'], scale_factor=(2, 2, 1))
    
    # Carry over the update to our "latent variable" (the heightmap values)
    params.update(opt)

    # Log progress
    elapsed_ms = 1000. * (time.time() - t0)
    current_loss = loss.array[0]
    loss_values.append(current_loss)
    mi.Thread.thread().logger().log_progress(
        it / (iterations-1),
        f'Iteration {it:03d}: loss={current_loss:g} (took {elapsed_ms:.0f}ms)',
        'Caustic Optimization', '')
    
    
    # Increase rendering quality toward the end of the optimization
    if it in (int(0.7 * iterations), int(0.9 * iterations)):
        spp *= 2
        opt.set_learning_rate(0.5 * opt.lr['data'])
        

end_time = time.time()
print(((end_time - start_time) * 1000) / iterations, ' ms per iteration on average')
mi.set_log_level(mi.LogLevel.Info)

Logs

jit_optix_compile(): optixPipelineCreate() failed. Please see the PTX assembly listing and error message below:

.version 6.5
.target sm_50
.address_size 64

.const .align 8 .b8 params[1232];

.entry __raygen__d3119078d4d969c89891ae73b74b715c() {
    .reg.b8  %b <1058>; .reg.b16  %w<1058>; .reg.b32 %r<1058>;
    .reg.b64 %rd<1058>; .reg.f16  %h<1058>; .reg.f32 %f<1058>;
    .reg.f64 %d <1058>; .reg.pred %p<1058>;

    call (%r0), _optix_get_launch_index_x, ();
    call (%r2), _optix_get_launch_dimension_x, ();
    ld.const.u32 %r1, [params + 4];
    add.u32 %r0, %r0, %r1;

body:
    ld.const.u64 %rd5, [params+8];
    ld.const.u64 %rd6, [params+16];
    ld.const.u64 %rd7, [params+24];
    ld.const.u64 %rd8, [params+32];
    ld.const.u64 %rd9, [params+40];
    ld.const.u64 %rd10, [params+48];
    ld.const.u64 %rd11, [params+56];
    ld.const.u64 %rd12, [params+64];
    ld.const.u64 %rd0, [params+72];
    ldu.global.b32 %f13, [%rd0];
    ld.const.u64 %rd0, [params+80];
    ldu.global.b32 %f14, [%rd0];
    ld.const.u64 %rd0, [params+88];
    ldu.global.b32 %f15, [%rd0];
    ld.const.u64 %rd0, [params+96];
    ldu.global.b32 %f16, [%rd0];
    ld.const.u64 %rd0, [params+104];
    ldu.global.b32 %f17, [%rd0];
    ld.const.u64 %rd0, [params+112];
    ldu.global.b32 %f18, [%rd0];
    ld.const.u64 %rd0, [params+120];
    ldu.global.b32 %f19, [%rd0];
    ld.const.u64 %rd0, [params+128];
    ldu.global.b32 %f20, [%rd0];
    ld.const.u64 %rd0, [params+136];
    ldu.global.b32 %f21, [%rd0];
    ld.const.u64 %rd0, [params+144];
    ldu.global.b32 %f22, [%rd0];
    ld.const.u64 %rd0, [params+152];
    ldu.global.b32 %f23, [%rd0];
    ld.const.u64 %rd0, [params+160];
    ldu.global.b32 %f24, [%rd0];
    mov.b32 %f25, 0xbed413cd;
    mov.b32 %f26, 0x3ed413cd;
    ld.const.u64 %rd0, [params+168];
    ldu.global.b32 %f27, [%rd0];
    ld.const.u64 %rd0, [params+176];
    ldu.global.b32 %f28, [%rd0];
    ld.const.u64 %rd0, [params+184];
    ldu.global.b32 %f29, [%rd0];
    ld.const.u64 %rd0, [params+192];
    ldu.global.b32 %f30, [%rd0];
    ld.const.u64 %rd0, [params+200];
    ldu.global.b32 %f31, [%rd0];
    ld.const.u64 %rd0, [params+208];
    ldu.global.b32 %f32, [%rd0];
    ld.const.u64 %rd0, [params+216];
    ldu.global.b32 %f33, [%rd0];
    ld.const.u64 %rd0, [params+224];
    ldu.global.b32 %f34, [%rd0];
    ld.const.u64 %rd0, [params+232];
    ldu.global.b32 %f35, [%rd0];
    ld.const.u64 %rd0, [params+240];
    ldu.global.b32 %f36, [%rd0];
    ld.const.u64 %rd0, [params+248];
    ldu.global.b32 %f37, [%rd0];
    ld.const.u64 %rd0, [params+256];
    ldu.global.b32 %f38, [%rd0];
    ld.const.u64 %rd0, [params+264];
    ldu.global.b32 %f39, [%rd0];
    ld.const.u64 %rd0, [params+272];
    ldu.global.b32 %f40, [%rd0];
    ld.const.u64 %rd0, [params+280];
    ldu.global.b32 %f41, [%rd0];
    ld.const.u64 %rd0, [params+288];
    ldu.global.b32 %f42, [%rd0];
    ld.const.u64 %rd0, [params+296];
    ldu.global.b32 %f43, [%rd0];
    ld.const.u64 %rd0, [params+304];
    ldu.global.b32 %f44, [%rd0];
    ld.const.u64 %rd0, [params+312];
    ldu.global.b64 %rd46, [%rd0];
    ld.const.u64 %rd0, [params+320];
    mad.wide.u32 %rd0, %r0, 8, %rd0;
    ld.global.cs.b64 %rd47, [%rd0];
    ld.const.u64 %rd0, [params+328];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %f48, [%rd0];
    ld.const.u64 %rd0, [params+336];
    mad.wide.u32 %rd0, %r0, 1, %rd0;
    ld.global.cs.b8 %w0, [%rd0];
    setp.ne.u16 %p49, %w0, 0;
    ld.const.u64 %rd0, [params+344];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %r50, [%rd0];
    ld.const.u64 %rd0, [params+352];
    mad.wide.u32 %rd0, %r0, 1, %rd0;
    ld.global.cs.b8 %w0, [%rd0];
    setp.ne.u16 %p51, %w0, 0;
    ld.const.u64 %rd0, [params+360];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %f52, [%rd0];
    ld.const.u64 %rd0, [params+368];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %f53, [%rd0];
    ld.const.u64 %rd0, [params+376];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %f54, [%rd0];
    ld.const.u64 %rd0, [params+384];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %f55, [%rd0];
    ld.const.u64 %rd0, [params+392];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %f56, [%rd0];
    ld.const.u64 %rd0, [params+400];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %f57, [%rd0];
    ld.const.u64 %rd0, [params+408];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %f58, [%rd0];
    ld.const.u64 %rd0, [params+416];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %f59, [%rd0];
    ld.const.u64 %rd0, [params+424];
    mad.wide.u32 %rd0, %r0, 8, %rd0;
    ld.global.cs.b64 %rd60, [%rd0];
    ld.const.u64 %rd0, [params+432];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %r61, [%rd0];
    mov.b32 %r62, 0x0;
    setp.ne.u32 %p63, %r61, %r62;
    ld.const.u64 %rd0, [params+440];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %f64, [%rd0];
    ld.const.u64 %rd0, [params+448];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %f65, [%rd0];
    ld.const.u64 %rd0, [params+456];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %f66, [%rd0];
    ld.const.u64 %rd0, [params+464];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %f67, [%rd0];
    ld.const.u64 %rd0, [params+472];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %f68, [%rd0];
    ld.const.u64 %rd0, [params+480];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %f69, [%rd0];
    ld.const.u64 %rd0, [params+488];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %f70, [%rd0];
    ld.const.u64 %rd0, [params+496];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %f71, [%rd0];
    ld.const.u64 %rd0, [params+504];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %f72, [%rd0];
    ld.const.u64 %rd0, [params+512];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %f73, [%rd0];
    ld.const.u64 %rd0, [params+520];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %f74, [%rd0];
    ld.const.u64 %rd0, [params+528];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %f75, [%rd0];
    ld.const.u64 %rd0, [params+536];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %f76, [%rd0];
    ld.const.u64 %rd0, [params+544];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %f77, [%rd0];
    ld.const.u64 %rd0, [params+552];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %f78, [%rd0];
    ld.const.u64 %rd0, [params+560];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %f79, [%rd0];
    ld.const.u64 %rd0, [params+568];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %f80, [%rd0];
    ld.const.u64 %rd0, [params+576];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %f81, [%rd0];
    ld.const.u64 %rd0, [params+584];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %f82, [%rd0];
    ld.const.u64 %rd0, [params+592];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %f83, [%rd0];
    ld.const.u64 %rd0, [params+600];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %f84, [%rd0];
    ld.const.u64 %rd0, [params+608];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %f85, [%rd0];
    ld.const.u64 %rd0, [params+616];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %f86, [%rd0];
    ld.const.u64 %rd0, [params+624];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %f87, [%rd0];
    ld.const.u64 %rd0, [params+632];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %f88, [%rd0];
    ld.const.u64 %rd0, [params+640];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %f89, [%rd0];
    ld.const.u64 %rd0, [params+648];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %f90, [%rd0];
    ld.const.u64 %rd0, [params+656];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %f91, [%rd0];
    ld.const.u64 %rd0, [params+664];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %f92, [%rd0];
    ld.const.u64 %rd0, [params+672];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %f93, [%rd0];
    ld.const.u64 %rd0, [params+680];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %f94, [%rd0];
    ld.const.u64 %rd0, [params+688];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %f95, [%rd0];
    ld.const.u64 %rd0, [params+696];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %f96, [%rd0];
    ld.const.u64 %rd0, [params+704];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %f97, [%rd0];
    ld.const.u64 %rd0, [params+712];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %f98, [%rd0];
    ld.const.u64 %rd0, [params+720];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %f99, [%rd0];
    ld.const.u64 %rd0, [params+728];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %f100, [%rd0];
    ld.const.u64 %rd0, [params+736];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %f101, [%rd0];
    ld.const.u64 %rd0, [params+744];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %f102, [%rd0];
    ld.const.u64 %rd0, [params+752];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %f103, [%rd0];
    ld.const.u64 %rd0, [params+760];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %f104, [%rd0];
    ld.const.u64 %rd0, [params+768];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %r105, [%rd0];
    ld.const.u64 %rd0, [params+776];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    ld.global.cs.b32 %r106, [%rd0];
    mov.b64 %rd107, 0x5851f42d4c957f2d;
    mad.lo.u64 %rd108, %rd60, %rd107, %rd47;
    mad.lo.u64 %rd109, %rd108, %rd107, %rd47;
    and.pred %p110, %p49, %p63;
    mad.wide.u32 %rd3, %r61, 4, %rd5;
    mov.b32 %r111, 0;
    @%p110 ld.global.nc.b32 %r111, [%rd3];
    fma.rn.ftz.f32 %f112, %f15, %f77, %f16;
    fma.rn.ftz.f32 %f113, %f14, %f78, %f112;
    fma.rn.ftz.f32 %f114, %f13, %f79, %f113;
    fma.rn.ftz.f32 %f115, %f19, %f77, %f20;
    fma.rn.ftz.f32 %f116, %f18, %f78, %f115;
    fma.rn.ftz.f32 %f117, %f17, %f79, %f116;
    fma.rn.ftz.f32 %f118, %f23, %f77, %f24;
    fma.rn.ftz.f32 %f119, %f22, %f78, %f118;
    fma.rn.ftz.f32 %f120, %f21, %f79, %f119;
    mul.ftz.f32 %f121, %f120, %f120;
    fma.rn.ftz.f32 %f122, %f117, %f117, %f121;
    fma.rn.ftz.f32 %f123, %f114, %f114, %f122;
    sqrt.approx.ftz.f32 %f124, %f123;
    rcp.approx.ftz.f32 %f125, %f124;
    mul.ftz.f32 %f126, %f114, %f125;
    mov.b32 %f127, 0x0;
    setp.gt.f32 %p128, %f126, %f127;
    mul.ftz.f32 %f129, %f120, %f125;
    rcp.approx.ftz.f32 %f130, %f126;
    mul.ftz.f32 %f131, %f129, %f130;
    setp.ge.f32 %p132, %f131, %f25;
    setp.le.f32 %p133, %f131, %f26;
    and.pred %p134, %p132, %p133;
    mul.ftz.f32 %f135, %f117, %f125;
    mul.ftz.f32 %f136, %f135, %f130;
    setp.ge.f32 %p137, %f136, %f25;
    setp.le.f32 %p138, %f136, %f26;
    and.pred %p139, %p137, %p138;
    and.pred %p140, %p134, %p139;
    and.pred %p141, %p128, %p140;
    mul.ftz.f32 %f142, %f27, %f130;
    mul.ftz.f32 %f143, %f142, %f130;
    mul.ftz.f32 %f144, %f143, %f130;
    selp.b32 %f145, %f144, %f127, %p141;
    mul.ftz.f32 %f146, %f145, %f125;
    mul.ftz.f32 %f147, %f146, %f125;
    mul.ftz.f32 %f148, %f74, %f147;
    mov.b32 %f149, 0x3f800000;
    setp.ge.f32 %p150, %f114, %f149;
    mov.b32 %f151, 0x447a0000;
    setp.le.f32 %p152, %f114, %f151;
    and.pred %p153, %p150, %p152;
    and.pred %p154, %p51, %p153;
    fma.rn.ftz.f32 %f155, %f30, %f120, %f31;
    fma.rn.ftz.f32 %f156, %f29, %f117, %f155;
    fma.rn.ftz.f32 %f157, %f28, %f114, %f156;
    fma.rn.ftz.f32 %f158, %f34, %f120, %f35;
    fma.rn.ftz.f32 %f159, %f33, %f117, %f158;
    fma.rn.ftz.f32 %f160, %f32, %f114, %f159;
    rcp.approx.ftz.f32 %f161, %f160;
    mul.ftz.f32 %f162, %f157, %f161;
    mov.b32 %f163, 0x43000000;
    mul.ftz.f32 %f164, %f163, %f36;
    mov.b32 %f165, 0x3c000000;
    mul.ftz.f32 %f166, %f164, %f165;
    sub.ftz.f32 %f167, %f162, %f166;
    setp.ge.f32 %p168, %f167, %f127;
    setp.le.f32 %p169, %f167, %f149;
    and.pred %p170, %p168, %p169;
    fma.rn.ftz.f32 %f171, %f39, %f120, %f40;
    fma.rn.ftz.f32 %f172, %f38, %f117, %f171;
    fma.rn.ftz.f32 %f173, %f37, %f114, %f172;
    mul.ftz.f32 %f174, %f173, %f161;
    mul.ftz.f32 %f175, %f163, %f41;
    mul.ftz.f32 %f176, %f175, %f165;
    sub.ftz.f32 %f177, %f174, %f176;
    setp.ge.f32 %p178, %f177, %f127;
    and.pred %p179, %p170, %p178;
    setp.le.f32 %p180, %f177, %f149;
    and.pred %p181, %p179, %p180;
    and.pred %p182, %p154, %p181;
    selp.b32 %f183, %f149, %f127, %p182;
    setp.gt.f32 %p184, %f183, %f127;
    setp.ne.f32 %p185, %f148, %f127;
    mul.ftz.f32 %f186, %f75, %f147;
    setp.ne.f32 %p187, %f186, %f127;
    or.pred %p188, %p185, %p187;
    mul.ftz.f32 %f189, %f76, %f147;
    setp.ne.f32 %p190, %f189, %f127;
    or.pred %p191, %p188, %p190;
    and.pred %p192, %p184, %p191;
    and.pred %p193, %p51, %p192;
    sub.ftz.f32 %f194, %f42, %f79;
    sub.ftz.f32 %f195, %f43, %f78;
    sub.ftz.f32 %f196, %f44, %f77;
    mul.ftz.f32 %f197, %f56, %f196;
    fma.rn.ftz.f32 %f198, %f54, %f195, %f197;
    fma.rn.ftz.f32 %f199, %f52, %f194, %f198;
    mov.b32 %r200, %f199;
    mov.b32 %r201, 0x0;
    setp.lt.s32 %p202, %r200, %r201;
    abs.f32 %f203, %f77;
    abs.f32 %f204, %f78;
    max.ftz.f32 %f205, %f203, %f204;
    abs.f32 %f206, %f79;
    max.ftz.f32 %f207, %f205, %f206;
    add.ftz.f32 %f208, %f149, %f207;
    mov.b32 %f209, 0x38bb8000;
    mul.ftz.f32 %f210, %f208, %f209;
    neg.ftz.f32 %f211, %f210;
    selp.b32 %f212, %f211, %f210, %p202;
    fma.rn.ftz.f32 %f213, %f212, %f56, %f77;
    fma.rn.ftz.f32 %f214, %f212, %f54, %f78;
    fma.rn.ftz.f32 %f215, %f212, %f52, %f79;
    sub.ftz.f32 %f216, %f44, %f213;
    sub.ftz.f32 %f217, %f42, %f215;
    sub.ftz.f32 %f218, %f43, %f214;
    mul.ftz.f32 %f219, %f216, %f216;
    fma.rn.ftz.f32 %f220, %f218, %f218, %f219;
    fma.rn.ftz.f32 %f221, %f217, %f217, %f220;
    sqrt.approx.ftz.f32 %f222, %f221;
    rcp.approx.ftz.f32 %f223, %f222;
    mul.ftz.f32 %f224, %f216, %f223;
    mul.ftz.f32 %f225, %f218, %f223;
    mul.ftz.f32 %f226, %f217, %f223;
    mov.b32 %f227, 0x3f7fc568;
    mul.ftz.f32 %f228, %f222, %f227;
    mov.b32 %r229, 0xff;
    mov.b32 %r230, 0xc;
    mov.b32 %r231, 0x0;
    mov.b32 %r232, 0x1;
    and.pred %p233, %p49, %p193;
    .reg.u32 %u234_out_<32>;
    @!%p233 bra l_masked_234;
    .reg.u32 %u234_z, %u234_count;
    mov.u32 %u234_z, 0;
    mov.u32 %u234_count, 1;
    call (%u234_out_0, %u234_out_1, %u234_out_2, %u234_out_3, %u234_out_4, %u234_out_5, %u234_out_6, %u234_out_7, %u234_out_8, %u234_out_9, %u234_out_10, %u234_out_11, %u234_out_12, %u234_out_13, %u234_out_14, %u234_out_15, %u234_out_16, %u234_out_17, %u234_out_18, %u234_out_19, %u234_out_20, %u234_out_21, %u234_out_22, %u234_out_23, %u234_out_24, %u234_out_25, %u234_out_26, %u234_out_27, %u234_out_28, %u234_out_29, %u234_out_30, %u234_out_31), _optix_trace_typed_32, (%u234_z, %rd46, %f213, %f214, %f215, %f224, %f225, %f226, %f127, %f228, %f48, %r229, %r230, %r231, %r232, %r231, %u234_count, %r232, %u234_z, %u234_z, %u234_z, %u234_z, %u234_z, %u234_z, %u234_z, %u234_z, %u234_z, %u234_z, %u234_z, %u234_z, %u234_z, %u234_z, %u234_z, %u234_z, %u234_z, %u234_z, %u234_z, %u234_z, %u234_z, %u234_z, %u234_z, %u234_z, %u234_z, %u234_z, %u234_z, %u234_z, %u234_z, %u234_z, %u234_z);

l_masked_234:
    mov.b32 %r235, %u234_out_0;
    setp.eq.u32 %p236, %r235, %r232;
    and.pred %p237, %p193, %p236;
    not.pred %p238, %p237;
    and.pred %p239, %p193, %p238;
    mul.ftz.f32 %f240, %f224, %f71;
    fma.rn.ftz.f32 %f241, %f225, %f68, %f240;
    fma.rn.ftz.f32 %f242, %f226, %f59, %f241;
    setp.ne.u32 %p243, %r61, %r231;
    and.pred %p244, %p239, %p243;
    setp.ne.u32 %p245, %r111, %r231;
    and.pred %p246, %p244, %p245;
    setp.eq.u32 %p247, %r111, %r231;
    and.pred %p248, %p244, %p247;
    max.ftz.f32 %f249, %f127, %f242;
    selp.b32 %f250, %f249, %f149, %p248;
    mul.ftz.f32 %f251, %f66, %f67;
    fma.rn.ftz.f32 %f252, %f64, %f65, %f251;
    fma.rn.ftz.f32 %f253, %f59, %f58, %f252;
    mul.ftz.f32 %f254, %f70, %f67;
    fma.rn.ftz.f32 %f255, %f69, %f65, %f254;
    fma.rn.ftz.f32 %f256, %f68, %f58, %f255;
    mul.ftz.f32 %f257, %f73, %f67;
    fma.rn.ftz.f32 %f258, %f72, %f65, %f257;
    fma.rn.ftz.f32 %f259, %f71, %f58, %f258;
    mul.ftz.f32 %f260, %f56, %f259;
    fma.rn.ftz.f32 %f261, %f54, %f256, %f260;
    fma.rn.ftz.f32 %f262, %f52, %f253, %f261;
    mul.ftz.f32 %f263, %f262, %f58;
    setp.gt.f32 %p264, %f263, %f127;
    mul.ftz.f32 %f265, %f56, %f224;
    fma.rn.ftz.f32 %f266, %f54, %f225, %f265;
    fma.rn.ftz.f32 %f267, %f52, %f226, %f266;
    mul.ftz.f32 %f268, %f267, %f242;
    setp.gt.f32 %p269, %f268, %f127;
    and.pred %p270, %p264, %p269;
    mul.ftz.f32 %f271, %f58, %f267;
    mul.ftz.f32 %f272, %f242, %f262;
    div.approx.ftz.f32 %f273, %f271, %f272;
    abs.f32 %f274, %f273;
    selp.b32 %f275, %f274, %f127, %p270;
    mul.ftz.f32 %f276, %f167, %f163;
    mul.ftz.f32 %f277, %f177, %f163;
    mov.b32 %r278, 0x0;
    setp.ne.u32 %p279, %r111, %r278;
    and.pred %p280, %p246, %p279;
    and.pred %p281, %p49, %p280;

    @!%p281 bra l_masked_282;

    { // Call: mitsuba::BSDF::eval()

        mad.wide.u32 %rd3, %r111, 8, %rd11;
        ld.global.u64 %rd3, [%rd3];
        cvt.u32.u64 %r3, %rd3;
        call (%rd2), _optix_call_direct_callable, (%r3);
        shr.u64 %rd3, %rd3, 32;
        add.u64 %rd3, %rd3, %rd12;

        {
            proto: .callprototype (.param .align 4 .b8 result[12]) _(.reg .u64 data, .param .align 4 .b8 params[168]);
            .param .align 4 .b8 out[12];
            .param .align 4 .b8 in[168];
            st.param.b32 [in+0], %f86;
            st.param.b32 [in+4], %f48;
            st.param.b32 [in+8], %f77;
            st.param.b32 [in+12], %f78;
            st.param.b32 [in+16], %f79;
            st.param.b32 [in+20], %f56;
            st.param.b32 [in+24], %f54;
            st.param.b32 [in+28], %f52;
            st.param.b32 [in+32], %r61;
            st.param.b32 [in+36], %f87;
            st.param.b32 [in+40], %f88;
            st.param.b32 [in+44], %f73;
            st.param.b32 [in+48], %f70;
            st.param.b32 [in+52], %f66;
            st.param.b32 [in+56], %f72;
            st.param.b32 [in+60], %f69;
            st.param.b32 [in+64], %f64;
            st.param.b32 [in+68], %f71;
            st.param.b32 [in+72], %f68;
            st.param.b32 [in+76], %f59;
            st.param.b32 [in+80], %f89;
            st.param.b32 [in+84], %f90;
            st.param.b32 [in+88], %f91;
            st.param.b32 [in+92], %f92;
            st.param.b32 [in+96], %f93;
            st.param.b32 [in+100], %f94;
            st.param.b32 [in+104], %f95;
            st.param.b32 [in+108], %f96;
            st.param.b32 [in+112], %f97;
            st.param.b32 [in+116], %f98;
            st.param.b32 [in+120], %f99;
            st.param.b32 [in+124], %f100;
            st.param.b32 [in+128], %f101;
            st.param.b32 [in+132], %f102;
            st.param.b32 [in+136], %f103;
            st.param.b32 [in+140], %f104;
            st.param.b32 [in+144], %f67;
            st.param.b32 [in+148], %f65;
            st.param.b32 [in+152], %f58;
            st.param.b32 [in+156], %r105;
            st.param.b32 [in+160], %r106;
            st.param.b32 [in+164], %f242;
            call (out), %rd2, (%rd3, in), proto;
            ld.param.b32 %f283, [out+0];
            ld.param.b32 %f284, [out+4];
            ld.param.b32 %f285, [out+8];
        }

        bra.uni l_done_282;
    }

l_masked_282:
    mov.b32 %f283, 0;
    mov.b32 %f284, 0;
    mov.b32 %f285, 0;

l_done_282:
    mov.b64 %rd286, 0x5851f42d4c957f2d;
    mad.lo.u64 %rd287, %rd109, %rd286, %rd47;
    selp.b64 %rd288, %rd287, %rd109, %p51;
    mad.lo.u64 %rd289, %rd288, %rd286, %rd47;
    selp.b64 %rd290, %rd289, %rd288, %p51;
    mov.b64 %rd291, 0x12;
    cvt.u32.u64 %r3, %rd291;
    shr.u64 %rd292, %rd290, %r3;
    xor.b64 %rd293, %rd292, %rd290;
    mov.b64 %rd294, 0x1b;
    cvt.u32.u64 %r3, %rd294;
    shr.u64 %rd295, %rd293, %r3;
    cvt.u32.u64 %r296, %rd295;
    mov.b64 %rd297, 0x3b;
    cvt.u32.u64 %r3, %rd297;
    shr.u64 %rd298, %rd290, %r3;
    cvt.u32.u64 %r299, %rd298;
    shr.u32 %r300, %r296, %r299;
    mov.b32 %r301, %r299;
    neg.s32 %r302, %r301;
    mov.b32 %r303, 0x1f;
    and.b32 %r304, %r302, %r303;
    mov.b32 %r305, %r304;
    shl.b32 %r306, %r296, %r305;
    or.b32 %r307, %r300, %r306;
    mov.b32 %r308, 0x9;
    shr.u32 %r309, %r307, %r308;
    mov.b32 %r310, 0x3f800000;
    or.b32 %r311, %r309, %r310;
    mov.b32 %f312, %r311;
    mov.b32 %f313, 0x3f800000;
    sub.ftz.f32 %f314, %f312, %f313;
    cvt.u32.u64 %r3, %rd291;
    shr.u64 %rd315, %rd109, %r3;
    xor.b64 %rd316, %rd315, %rd109;
    cvt.u32.u64 %r3, %rd294;
    shr.u64 %rd317, %rd316, %r3;
    cvt.u32.u64 %r318, %rd317;
    cvt.u32.u64 %r3, %rd297;
    shr.u64 %rd319, %rd109, %r3;
    cvt.u32.u64 %r320, %rd319;
    shr.u32 %r321, %r318, %r320;
    mov.b32 %r322, %r320;
    neg.s32 %r323, %r322;
    and.b32 %r324, %r323, %r303;
    mov.b32 %r325, %r324;
    shl.b32 %r326, %r318, %r325;
    or.b32 %r327, %r321, %r326;
    shr.u32 %r328, %r327, %r308;
    or.b32 %r329, %r328, %r310;
    mov.b32 %f330, %r329;
    sub.ftz.f32 %f331, %f330, %f313;
    cvt.u32.u64 %r3, %rd291;
    shr.u64 %rd332, %rd288, %r3;
    xor.b64 %rd333, %rd332, %rd288;
    cvt.u32.u64 %r3, %rd294;
    shr.u64 %rd334, %rd333, %r3;
    cvt.u32.u64 %r335, %rd334;
    cvt.u32.u64 %r3, %rd297;
    shr.u64 %rd336, %rd288, %r3;
    cvt.u32.u64 %r337, %rd336;
    shr.u32 %r338, %r335, %r337;
    mov.b32 %r339, %r337;
    neg.s32 %r340, %r339;
    and.b32 %r341, %r340, %r303;
    mov.b32 %r342, %r341;
    shl.b32 %r343, %r335, %r342;
    or.b32 %r344, %r338, %r343;
    shr.u32 %r345, %r344, %r308;
    or.b32 %r346, %r345, %r310;
    mov.b32 %f347, %r346;
    sub.ftz.f32 %f348, %f347, %f313;
    mad.lo.u64 %rd349, %rd290, %rd286, %rd47;
    selp.b64 %rd350, %rd349, %rd290, %p51;
    mov.b32 %r351, 0x0;
    setp.eq.u32 %p352, %r61, %r351;
    and.pred %p353, %p239, %p352;
    setp.eq.u32 %p354, %r111, %r351;
    and.pred %p355, %p353, %p354;
    mov.b32 %f356, 0x0;
    setp.le.f32 %p357, %f242, %f356;
    and.pred %p358, %p355, %p357;
    mul.ftz.f32 %f359, %f275, %f283;
    mul.ftz.f32 %f360, %f250, %f359;
    selp.b32 %f361, %f360, %f250, %p246;
    selp.b32 %f362, %f356, %f361, %p358;
    mul.ftz.f32 %f363, %f148, %f362;
    mov.b32 %f364, 0x3d000000;
    mul.ftz.f32 %f365, %f363, %f364;
    mov.b32 %f366, 0xc038aa3b;
    mov.b32 %f367, 0xbf000000;
    add.ftz.f32 %f368, %f276, %f367;
    mov.b32 %f369, 0x40000000;
    sub.ftz.f32 %f370, %f368, %f369;
    cvt.rpi.f32.f32 %f371, %f370;
    cvt.rzi.s32.f32 %r372, %f371;
    mov.b32 %r373, 0x0;
    max.s32 %r374, %r372, %r373;
    mov.b32 %r375, %r374;
    cvt.rn.f32.u32 %f376, %r375;
    sub.ftz.f32 %f377, %f376, %f368;
    mul.ftz.f32 %f378, %f377, %f377;
    mul.ftz.f32 %f379, %f366, %f378;
    ex2.approx.ftz.f32 %f380, %f379;
    mov.b32 %f381, 0x39afe108;
    sub.ftz.f32 %f382, %f380, %f381;
    max.ftz.f32 %f383, %f356, %f382;
    sub.ftz.f32 %f384, %f371, %f368;
    mul.ftz.f32 %f385, %f384, %f384;
    mul.ftz.f32 %f386, %f366, %f385;
    ex2.approx.ftz.f32 %f387, %f386;
    sub.ftz.f32 %f388, %f387, %f381;
    max.ftz.f32 %f389, %f356, %f388;
    add.ftz.f32 %f390, %f384, %f313;
    mul.ftz.f32 %f391, %f390, %f390;
    mul.ftz.f32 %f392, %f366, %f391;
    ex2.approx.ftz.f32 %f393, %f392;
    sub.ftz.f32 %f394, %f393, %f381;
    max.ftz.f32 %f395, %f356, %f394;
    add.ftz.f32 %f396, %f389, %f395;
    add.ftz.f32 %f397, %f390, %f313;
    mul.ftz.f32 %f398, %f397, %f397;
    mul.ftz.f32 %f399, %f366, %f398;
    ex2.approx.ftz.f32 %f400, %f399;
    sub.ftz.f32 %f401, %f400, %f381;
    max.ftz.f32 %f402, %f356, %f401;
    add.ftz.f32 %f403, %f396, %f402;
    add.ftz.f32 %f404, %f397, %f313;
    mul.ftz.f32 %f405, %f404, %f404;
    mul.ftz.f32 %f406, %f366, %f405;
    ex2.approx.ftz.f32 %f407, %f406;
    sub.ftz.f32 %f408, %f407, %f381;
    max.ftz.f32 %f409, %f356, %f408;
    add.ftz.f32 %f410, %f403, %f409;
    add.ftz.f32 %f411, %f277, %f367;
    sub.ftz.f32 %f412, %f411, %f369;
    cvt.rpi.f32.f32 %f413, %f412;
    sub.ftz.f32 %f414, %f413, %f411;
    mul.ftz.f32 %f415, %f414, %f414;
    mul.ftz.f32 %f416, %f366, %f415;
    ex2.approx.ftz.f32 %f417, %f416;
    sub.ftz.f32 %f418, %f417, %f381;
    max.ftz.f32 %f419, %f356, %f418;
    add.ftz.f32 %f420, %f414, %f313;
    mul.ftz.f32 %f421, %f420, %f420;
    mul.ftz.f32 %f422, %f366, %f421;
    ex2.approx.ftz.f32 %f423, %f422;
    sub.ftz.f32 %f424, %f423, %f381;
    max.ftz.f32 %f425, %f356, %f424;
    add.ftz.f32 %f426, %f419, %f425;
    add.ftz.f32 %f427, %f420, %f313;
    mul.ftz.f32 %f428, %f427, %f427;
    mul.ftz.f32 %f429, %f366, %f428;
    ex2.approx.ftz.f32 %f430, %f429;
    sub.ftz.f32 %f431, %f430, %f381;
    max.ftz.f32 %f432, %f356, %f431;
    add.ftz.f32 %f433, %f426, %f432;
    add.ftz.f32 %f434, %f427, %f313;
    mul.ftz.f32 %f435, %f434, %f434;
    mul.ftz.f32 %f436, %f366, %f435;
    ex2.approx.ftz.f32 %f437, %f436;
    sub.ftz.f32 %f438, %f437, %f381;
    max.ftz.f32 %f439, %f356, %f438;
    add.ftz.f32 %f440, %f433, %f439;
    mul.ftz.f32 %f441, %f410, %f440;
    setp.ne.f32 %p442, %f441, %f356;
    rcp.approx.ftz.f32 %f443, %f441;
    selp.b32 %f444, %f443, %f356, %p442;
    mul.ftz.f32 %f445, %f383, %f444;
    cvt.rzi.s32.f32 %r446, %f413;
    max.s32 %r447, %r446, %r373;
    mov.b32 %r448, %r447;
    cvt.rn.f32.u32 %f449, %r448;
    sub.ftz.f32 %f450, %f449, %f411;
    mul.ftz.f32 %f451, %f450, %f450;
    mul.ftz.f32 %f452, %f366, %f451;
    ex2.approx.ftz.f32 %f453, %f452;
    sub.ftz.f32 %f454, %f453, %f381;
    max.ftz.f32 %f455, %f356, %f454;
    mul.ftz.f32 %f456, %f445, %f455;
    mul.ftz.f32 %f457, %f365, %f456;
    mov.b32 %r458, 0x80;
    mad.lo.u32 %r459, %r448, %r458, %r375;
    mov.b32 %r460, 0x2;
    shl.b32 %r461, %r459, %r460;
    add.ftz.f32 %f462, %f368, %f369;
    cvt.rmi.f32.f32 %f463, %f462;
    cvt.rzi.s32.f32 %r464, %f463;
    mov.b32 %r465, 0x7f;
    min.s32 %r466, %r464, %r465;
    mov.b32 %r467, %r466;
    setp.ls.u32 %p468, %r375, %r467;
    add.ftz.f32 %f469, %f411, %f369;
    cvt.rmi.f32.f32 %f470, %f469;
    cvt.rzi.s32.f32 %r471, %f470;
    min.s32 %r472, %r471, %r465;
    mov.b32 %r473, %r472;
    setp.ls.u32 %p474, %r448, %r473;
    and.pred %p475, %p468, %p474;
    and.pred %p476, %p239, %p475;
    sub.u32 %r477, %r473, %r448;
    mov.b32 %r478, 0x1;
    add.u32 %r479, %r477, %r478;
    setp.lo.u32 %p480, %r351, %r479;
    and.pred %p481, %p476, %p480;
    sub.u32 %r482, %r467, %r375;
    add.u32 %r483, %r482, %r478;
    setp.lo.u32 %p484, %r351, %r483;
    and.pred %p485, %p481, %p484;
    and.pred %p486, %p49, %p485;
    @!%p486 bra l_487_done;
    mad.wide.u32 %rd3, %r461, 4, %rd10;
    red.global.add.f32 [%rd3], %f457;

l_487_done:
    mul.ftz.f32 %f488, %f275, %f284;
    mul.ftz.f32 %f489, %f250, %f488;
    selp.b32 %f490, %f489, %f250, %p246;
    selp.b32 %f491, %f356, %f490, %p358;
    mul.ftz.f32 %f492, %f186, %f491;
    mul.ftz.f32 %f493, %f492, %f364;
    mul.ftz.f32 %f494, %f493, %f456;
    add.u32 %r495, %r461, %r478;
    @!%p486 bra l_496_done;
    mad.wide.u32 %rd3, %r495, 4, %rd10;
    red.global.add.f32 [%rd3], %f494;

l_496_done:
    mul.ftz.f32 %f497, %f275, %f285;
    mul.ftz.f32 %f498, %f250, %f497;
    selp.b32 %f499, %f498, %f250, %p246;
    selp.b32 %f500, %f356, %f499, %p358;
    mul.ftz.f32 %f501, %f189, %f500;
    mul.ftz.f32 %f502, %f501, %f364;
    mul.ftz.f32 %f503, %f502, %f456;
    add.u32 %r504, %r495, %r478;
    @!%p486 bra l_505_done;
    mad.wide.u32 %rd3, %r504, 4, %rd10;
    red.global.add.f32 [%rd3], %f503;

l_505_done:
    add.ftz.f32 %f506, %f377, %f313;
    mul.ftz.f32 %f507, %f506, %f506;
    mul.ftz.f32 %f508, %f366, %f507;
    ex2.approx.ftz.f32 %f509, %f508;
    sub.ftz.f32 %f510, %f509, %f381;
    max.ftz.f32 %f511, %f356, %f510;
    mul.ftz.f32 %f512, %f511, %f444;
    mul.ftz.f32 %f513, %f512, %f455;
    mul.ftz.f32 %f514, %f365, %f513;
    add.u32 %r515, %r504, %r478;
    add.u32 %r516, %r515, %r478;
    setp.lo.u32 %p517, %r478, %r483;
    and.pred %p518, %p481, %p517;
    and.pred %p519, %p49, %p518;
    @!%p519 bra l_520_done;
    mad.wide.u32 %rd3, %r516, 4, %rd10;
    red.global.add.f32 [%rd3], %f514;

l_520_done:
    mul.ftz.f32 %f521, %f493, %f513;
    add.u32 %r522, %r516, %r478;
    @!%p519 bra l_523_done;
    mad.wide.u32 %rd3, %r522, 4, %rd10;
    red.global.add.f32 [%rd3], %f521;

l_523_done:
    mul.ftz.f32 %f524, %f502, %f513;
    add.u32 %r525, %r522, %r478;
    @!%p519 bra l_526_done;
    mad.wide.u32 %rd3, %r525, 4, %rd10;
    red.global.add.f32 [%rd3], %f524;

l_526_done:
    add.ftz.f32 %f527, %f506, %f313;
    mul.ftz.f32 %f528, %f527, %f527;
    mul.ftz.f32 %f529, %f366, %f528;
    ex2.approx.ftz.f32 %f530, %f529;
    sub.ftz.f32 %f531, %f530, %f381;
    max.ftz.f32 %f532, %f356, %f531;
    mul.ftz.f32 %f533, %f532, %f444;
    mul.ftz.f32 %f534, %f533, %f455;
    mul.ftz.f32 %f535, %f365, %f534;
    add.u32 %r536, %r525, %r478;
    add.u32 %r537, %r536, %r478;
    mov.b32 %r538, 0x2;
    setp.lo.u32 %p539, %r538, %r483;
    and.pred %p540, %p481, %p539;
    and.pred %p541, %p49, %p540;
    @!%p541 bra l_542_done;
    mad.wide.u32 %rd3, %r537, 4, %rd10;
    red.global.add.f32 [%rd3], %f535;

l_542_done:
    mul.ftz.f32 %f543, %f493, %f534;
    add.u32 %r544, %r537, %r478;
    @!%p541 bra l_545_done;
    mad.wide.u32 %rd3, %r544, 4, %rd10;
    red.global.add.f32 [%rd3], %f543;

l_545_done:
    mul.ftz.f32 %f546, %f502, %f534;
    add.u32 %r547, %r544, %r478;
    @!%p541 bra l_548_done;
    mad.wide.u32 %rd3, %r547, 4, %rd10;
    red.global.add.f32 [%rd3], %f546;

l_548_done:
    add.ftz.f32 %f549, %f527, %f313;
    mul.ftz.f32 %f550, %f549, %f549;
    mul.ftz.f32 %f551, %f366, %f550;
    ex2.approx.ftz.f32 %f552, %f551;
    sub.ftz.f32 %f553, %f552, %f381;
    max.ftz.f32 %f554, %f356, %f553;
    mul.ftz.f32 %f555, %f554, %f444;
    mul.ftz.f32 %f556, %f555, %f455;
    mul.ftz.f32 %f557, %f365, %f556;
    add.u32 %r558, %r547, %r478;
    add.u32 %r559, %r558, %r478;
    mov.b32 %r560, 0x3;
    setp.lo.u32 %p561, %r560, %r483;
    and.pred %p562, %p481, %p561;
    and.pred %p563, %p49, %p562;
    @!%p563 bra l_564_done;
    mad.wide.u32 %rd3, %r559, 4, %rd10;
    red.global.add.f32 [%rd3], %f557;

l_564_done:
    mul.ftz.f32 %f565, %f493, %f556;
    add.u32 %r566, %r559, %r478;
    @!%p563 bra l_567_done;
    mad.wide.u32 %rd3, %r566, 4, %rd10;
    red.global.add.f32 [%rd3], %f565;

l_567_done:
    mul.ftz.f32 %f568, %f502, %f556;
    add.u32 %r569, %r566, %r478;
    @!%p563 bra l_570_done;
    mad.wide.u32 %rd3, %r569, 4, %rd10;
    red.global.add.f32 [%rd3], %f568;

l_570_done:
    add.ftz.f32 %f571, %f450, %f313;
    mul.ftz.f32 %f572, %f571, %f571;
    mul.ftz.f32 %f573, %f366, %f572;
    ex2.approx.ftz.f32 %f574, %f573;
    sub.ftz.f32 %f575, %f574, %f381;
    max.ftz.f32 %f576, %f356, %f575;
    mul.ftz.f32 %f577, %f445, %f576;
    mul.ftz.f32 %f578, %f365, %f577;
    add.u32 %r579, %r569, %r478;
    add.u32 %r580, %r579, %r478;
    mov.b32 %r581, 0x1f0;
    add.u32 %r582, %r580, %r581;
    setp.lo.u32 %p583, %r478, %r479;
    and.pred %p584, %p476, %p583;
    and.pred %p585, %p584, %p484;
    and.pred %p586, %p49, %p585;
    @!%p586 bra l_587_done;
    mad.wide.u32 %rd3, %r582, 4, %rd10;
    red.global.add.f32 [%rd3], %f578;

l_587_done:
    mul.ftz.f32 %f588, %f493, %f577;
    add.u32 %r589, %r582, %r478;
    @!%p586 bra l_590_done;
    mad.wide.u32 %rd3, %r589, 4, %rd10;
    red.global.add.f32 [%rd3], %f588;

l_590_done:
    mul.ftz.f32 %f591, %f502, %f577;
    add.u32 %r592, %r589, %r478;
    @!%p586 bra l_593_done;
    mad.wide.u32 %rd3, %r592, 4, %rd10;
    red.global.add.f32 [%rd3], %f591;

l_593_done:
    mul.ftz.f32 %f594, %f512, %f576;
    mul.ftz.f32 %f595, %f365, %f594;
    add.u32 %r596, %r592, %r478;
    add.u32 %r597, %r596, %r478;
    and.pred %p598, %p584, %p517;
    and.pred %p599, %p49, %p598;
    @!%p599 bra l_600_done;
    mad.wide.u32 %rd3, %r597, 4, %rd10;
    red.global.add.f32 [%rd3], %f595;

l_600_done:
    mul.ftz.f32 %f601, %f493, %f594;
    add.u32 %r602, %r597, %r478;
    @!%p599 bra l_603_done;
    mad.wide.u32 %rd3, %r602, 4, %rd10;
    red.global.add.f32 [%rd3], %f601;

l_603_done:
    mul.ftz.f32 %f604, %f502, %f594;
    add.u32 %r605, %r602, %r478;
    @!%p599 bra l_606_done;
    mad.wide.u32 %rd3, %r605, 4, %rd10;
    red.global.add.f32 [%rd3], %f604;

l_606_done:
    mul.ftz.f32 %f607, %f533, %f576;
    mul.ftz.f32 %f608, %f365, %f607;
    add.u32 %r609, %r605, %r478;
    add.u32 %r610, %r609, %r478;
    and.pred %p611, %p584, %p539;
    and.pred %p612, %p49, %p611;
    @!%p612 bra l_613_done;
    mad.wide.u32 %rd3, %r610, 4, %rd10;
    red.global.add.f32 [%rd3], %f608;

l_613_done:
    mul.ftz.f32 %f614, %f493, %f607;
    add.u32 %r615, %r610, %r478;
    @!%p612 bra l_616_done;
    mad.wide.u32 %rd3, %r615, 4, %rd10;
    red.global.add.f32 [%rd3], %f614;

l_616_done:
    mul.ftz.f32 %f617, %f502, %f607;
    add.u32 %r618, %r615, %r478;
    @!%p612 bra l_619_done;
    mad.wide.u32 %rd3, %r618, 4, %rd10;
    red.global.add.f32 [%rd3], %f617;

l_619_done:
    mul.ftz.f32 %f620, %f555, %f576;
    mul.ftz.f32 %f621, %f365, %f620;
    add.u32 %r622, %r618, %r478;
    add.u32 %r623, %r622, %r478;
    and.pred %p624, %p584, %p561;
    and.pred %p625, %p49, %p624;
    @!%p625 bra l_626_done;
    mad.wide.u32 %rd3, %r623, 4, %rd10;
    red.global.add.f32 [%rd3], %f621;

l_626_done:
    mul.ftz.f32 %f627, %f493, %f620;
    add.u32 %r628, %r623, %r478;
    @!%p625 bra l_629_done;
    mad.wide.u32 %rd3, %r628, 4, %rd10;
    red.global.add.f32 [%rd3], %f627;

l_629_done:
    mul.ftz.f32 %f630, %f502, %f620;
    add.u32 %r631, %r628, %r478;
    @!%p625 bra l_632_done;
    mad.wide.u32 %rd3, %r631, 4, %rd10;
    red.global.add.f32 [%rd3], %f630;

l_632_done:
    add.ftz.f32 %f633, %f571, %f313;
    mul.ftz.f32 %f634, %f633, %f633;
    mul.ftz.f32 %f635, %f366, %f634;
    ex2.approx.ftz.f32 %f636, %f635;
    sub.ftz.f32 %f637, %f636, %f381;
    max.ftz.f32 %f638, %f356, %f637;
    mul.ftz.f32 %f639, %f445, %f638;
    mul.ftz.f32 %f640, %f365, %f639;
    add.u32 %r641, %r631, %r478;
    add.u32 %r642, %r641, %r478;
    add.u32 %r643, %r642, %r581;
    setp.lo.u32 %p644, %r538, %r479;
    and.pred %p645, %p476, %p644;
    and.pred %p646, %p645, %p484;
    and.pred %p647, %p49, %p646;
    @!%p647 bra l_648_done;
    mad.wide.u32 %rd3, %r643, 4, %rd10;
    red.global.add.f32 [%rd3], %f640;

l_648_done:
    mul.ftz.f32 %f649, %f493, %f639;
    add.u32 %r650, %r643, %r478;
    @!%p647 bra l_651_done;
    mad.wide.u32 %rd3, %r650, 4, %rd10;
    red.global.add.f32 [%rd3], %f649;

l_651_done:
    mul.ftz.f32 %f652, %f502, %f639;
    add.u32 %r653, %r650, %r478;
    @!%p647 bra l_654_done;
    mad.wide.u32 %rd3, %r653, 4, %rd10;
    red.global.add.f32 [%rd3], %f652;

l_654_done:
    mul.ftz.f32 %f655, %f512, %f638;
    mul.ftz.f32 %f656, %f365, %f655;
    add.u32 %r657, %r653, %r478;
    add.u32 %r658, %r657, %r478;
    and.pred %p659, %p645, %p517;
    and.pred %p660, %p49, %p659;
    @!%p660 bra l_661_done;
    mad.wide.u32 %rd3, %r658, 4, %rd10;
    red.global.add.f32 [%rd3], %f656;

l_661_done:
    mul.ftz.f32 %f662, %f493, %f655;
    add.u32 %r663, %r658, %r478;
    @!%p660 bra l_664_done;
    mad.wide.u32 %rd3, %r663, 4, %rd10;
    red.global.add.f32 [%rd3], %f662;

l_664_done:
    mul.ftz.f32 %f665, %f502, %f655;
    add.u32 %r666, %r663, %r478;
    @!%p660 bra l_667_done;
    mad.wide.u32 %rd3, %r666, 4, %rd10;
    red.global.add.f32 [%rd3], %f665;

l_667_done:
    mul.ftz.f32 %f668, %f533, %f638;
    mul.ftz.f32 %f669, %f365, %f668;
    add.u32 %r670, %r666, %r478;
    add.u32 %r671, %r670, %r478;
    and.pred %p672, %p645, %p539;
    and.pred %p673, %p49, %p672;
    @!%p673 bra l_674_done;
    mad.wide.u32 %rd3, %r671, 4, %rd10;
    red.global.add.f32 [%rd3], %f669;

l_674_done:
    mul.ftz.f32 %f675, %f493, %f668;
    add.u32 %r676, %r671, %r478;
    @!%p673 bra l_677_done;
    mad.wide.u32 %rd3, %r676, 4, %rd10;
    red.global.add.f32 [%rd3], %f675;

l_677_done:
    mul.ftz.f32 %f678, %f502, %f668;
    add.u32 %r679, %r676, %r478;
    @!%p673 bra l_680_done;
    mad.wide.u32 %rd3, %r679, 4, %rd10;
    red.global.add.f32 [%rd3], %f678;

l_680_done:
    mul.ftz.f32 %f681, %f555, %f638;
    mul.ftz.f32 %f682, %f365, %f681;
    add.u32 %r683, %r679, %r478;
    add.u32 %r684, %r683, %r478;
    and.pred %p685, %p645, %p561;
    and.pred %p686, %p49, %p685;
    @!%p686 bra l_687_done;
    mad.wide.u32 %rd3, %r684, 4, %rd10;
    red.global.add.f32 [%rd3], %f682;

l_687_done:
    mul.ftz.f32 %f688, %f493, %f681;
    add.u32 %r689, %r684, %r478;
    @!%p686 bra l_690_done;
    mad.wide.u32 %rd3, %r689, 4, %rd10;
    red.global.add.f32 [%rd3], %f688;

l_690_done:
    mul.ftz.f32 %f691, %f502, %f681;
    add.u32 %r692, %r689, %r478;
    @!%p686 bra l_693_done;
    mad.wide.u32 %rd3, %r692, 4, %rd10;
    red.global.add.f32 [%rd3], %f691;

l_693_done:
    add.ftz.f32 %f694, %f633, %f313;
    mul.ftz.f32 %f695, %f694, %f694;
    mul.ftz.f32 %f696, %f366, %f695;
    ex2.approx.ftz.f32 %f697, %f696;
    sub.ftz.f32 %f698, %f697, %f381;
    max.ftz.f32 %f699, %f356, %f698;
    mul.ftz.f32 %f700, %f445, %f699;
    mul.ftz.f32 %f701, %f365, %f700;
    add.u32 %r702, %r692, %r478;
    add.u32 %r703, %r702, %r478;
    add.u32 %r704, %r703, %r581;
    setp.lo.u32 %p705, %r560, %r479;
    and.pred %p706, %p476, %p705;
    and.pred %p707, %p706, %p484;
    and.pred %p708, %p49, %p707;
    @!%p708 bra l_709_done;
    mad.wide.u32 %rd3, %r704, 4, %rd10;
    red.global.add.f32 [%rd3], %f701;

l_709_done:
    mul.ftz.f32 %f710, %f493, %f700;
    add.u32 %r711, %r704, %r478;
    @!%p708 bra l_712_done;
    mad.wide.u32 %rd3, %r711, 4, %rd10;
    red.global.add.f32 [%rd3], %f710;

l_712_done:
    mul.ftz.f32 %f713, %f502, %f700;
    add.u32 %r714, %r711, %r478;
    @!%p708 bra l_715_done;
    mad.wide.u32 %rd3, %r714, 4, %rd10;
    red.global.add.f32 [%rd3], %f713;

l_715_done:
    mul.ftz.f32 %f716, %f512, %f699;
    mul.ftz.f32 %f717, %f365, %f716;
    add.u32 %r718, %r714, %r478;
    add.u32 %r719, %r718, %r478;
    and.pred %p720, %p706, %p517;
    and.pred %p721, %p49, %p720;
    @!%p721 bra l_722_done;
    mad.wide.u32 %rd3, %r719, 4, %rd10;
    red.global.add.f32 [%rd3], %f717;

l_722_done:
    mul.ftz.f32 %f723, %f493, %f716;
    add.u32 %r724, %r719, %r478;
    @!%p721 bra l_725_done;
    mad.wide.u32 %rd3, %r724, 4, %rd10;
    red.global.add.f32 [%rd3], %f723;

l_725_done:
    mul.ftz.f32 %f726, %f502, %f716;
    add.u32 %r727, %r724, %r478;
    @!%p721 bra l_728_done;
    mad.wide.u32 %rd3, %r727, 4, %rd10;
    red.global.add.f32 [%rd3], %f726;

l_728_done:
    mul.ftz.f32 %f729, %f533, %f699;
    mul.ftz.f32 %f730, %f365, %f729;
    add.u32 %r731, %r727, %r478;
    add.u32 %r732, %r731, %r478;
    and.pred %p733, %p706, %p539;
    and.pred %p734, %p49, %p733;
    @!%p734 bra l_735_done;
    mad.wide.u32 %rd3, %r732, 4, %rd10;
    red.global.add.f32 [%rd3], %f730;

l_735_done:
    mul.ftz.f32 %f736, %f493, %f729;
    add.u32 %r737, %r732, %r478;
    @!%p734 bra l_738_done;
    mad.wide.u32 %rd3, %r737, 4, %rd10;
    red.global.add.f32 [%rd3], %f736;

l_738_done:
    mul.ftz.f32 %f739, %f502, %f729;
    add.u32 %r740, %r737, %r478;
    @!%p734 bra l_741_done;
    mad.wide.u32 %rd3, %r740, 4, %rd10;
    red.global.add.f32 [%rd3], %f739;

l_741_done:
    mul.ftz.f32 %f742, %f555, %f699;
    mul.ftz.f32 %f743, %f365, %f742;
    add.u32 %r744, %r740, %r478;
    add.u32 %r745, %r744, %r478;
    and.pred %p746, %p706, %p561;
    and.pred %p747, %p49, %p746;
    @!%p747 bra l_748_done;
    mad.wide.u32 %rd3, %r745, 4, %rd10;
    red.global.add.f32 [%rd3], %f743;

l_748_done:
    mul.ftz.f32 %f749, %f493, %f742;
    add.u32 %r750, %r745, %r478;
    @!%p747 bra l_751_done;
    mad.wide.u32 %rd3, %r750, 4, %rd10;
    red.global.add.f32 [%rd3], %f749;

l_751_done:
    mul.ftz.f32 %f752, %f502, %f742;
    add.u32 %r753, %r750, %r478;
    @!%p747 bra l_754_done;
    mad.wide.u32 %rd3, %r753, 4, %rd10;
    red.global.add.f32 [%rd3], %f752;

l_754_done:
    mov.b32 %r755, 0x0;
    setp.ne.u32 %p756, %r111, %r755;
    and.pred %p757, %p51, %p756;
    and.pred %p758, %p49, %p757;

    @!%p758 bra l_masked_759;

    { // Call: mitsuba::BSDF::sample()

        mad.wide.u32 %rd3, %r111, 8, %rd6;
        ld.global.u64 %rd3, [%rd3];
        cvt.u32.u64 %r3, %rd3;
        call (%rd2), _optix_call_direct_callable, (%r3);
        shr.u64 %rd3, %rd3, 32;
        add.u64 %rd3, %rd3, %rd7;

        {
            proto: .callprototype (.param .align 4 .b8 result[28]) _(.reg .u64 data, .param .align 4 .b8 params[176]);
            .param .align 4 .b8 out[28];
            .param .align 4 .b8 in[176];
            st.param.b32 [in+0], %f86;
            st.param.b32 [in+4], %f48;
            st.param.b32 [in+8], %f77;
            st.param.b32 [in+12], %f78;
            st.param.b32 [in+16], %f79;
            st.param.b32 [in+20], %f56;
            st.param.b32 [in+24], %f54;
            st.param.b32 [in+28], %f52;
            st.param.b32 [in+32], %r61;
            st.param.b32 [in+36], %f87;
            st.param.b32 [in+40], %f88;
            st.param.b32 [in+44], %f73;
            st.param.b32 [in+48], %f70;
            st.param.b32 [in+52], %f66;
            st.param.b32 [in+56], %f72;
            st.param.b32 [in+60], %f69;
            st.param.b32 [in+64], %f64;
            st.param.b32 [in+68], %f71;
            st.param.b32 [in+72], %f68;
            st.param.b32 [in+76], %f59;
            st.param.b32 [in+80], %f89;
            st.param.b32 [in+84], %f90;
            st.param.b32 [in+88], %f91;
            st.param.b32 [in+92], %f92;
            st.param.b32 [in+96], %f93;
            st.param.b32 [in+100], %f94;
            st.param.b32 [in+104], %f95;
            st.param.b32 [in+108], %f96;
            st.param.b32 [in+112], %f97;
            st.param.b32 [in+116], %f98;
            st.param.b32 [in+120], %f99;
            st.param.b32 [in+124], %f100;
            st.param.b32 [in+128], %f101;
            st.param.b32 [in+132], %f102;
            st.param.b32 [in+136], %f103;
            st.param.b32 [in+140], %f104;
            st.param.b32 [in+144], %f67;
            st.param.b32 [in+148], %f65;
            st.param.b32 [in+152], %f58;
            st.param.b32 [in+156], %r105;
            st.param.b32 [in+160], %r106;
            st.param.b32 [in+164], %f314;
            st.param.b32 [in+168], %f331;
            st.param.b32 [in+172], %f348;
            call (out), %rd2, (%rd3, in), proto;
            ld.param.b32 %f762, [out+0];
            ld.param.b32 %f761, [out+4];
            ld.param.b32 %f760, [out+8];
            ld.param.b32 %f766, [out+12];
            ld.param.b32 %f763, [out+16];
            ld.param.b32 %f764, [out+20];
            ld.param.b32 %f765, [out+24];
        }

        bra.uni l_done_759;
    }

l_masked_759:
    mov.b32 %f762, 0;
    mov.b32 %f761, 0;
    mov.b32 %f760, 0;
    mov.b32 %f766, 0;
    mov.b32 %f763, 0;
    mov.b32 %f764, 0;
    mov.b32 %f765, 0;

l_done_759:
    neg.ftz.f32 %f767, %f53;
    neg.ftz.f32 %f768, %f55;
    neg.ftz.f32 %f769, %f57;
    mul.ftz.f32 %f770, %f56, %f769;
    fma.rn.ftz.f32 %f771, %f54, %f768, %f770;
    fma.rn.ftz.f32 %f772, %f52, %f767, %f771;
    mul.ftz.f32 %f773, %f772, %f58;
    mov.b32 %f774, 0x0;
    setp.gt.f32 %p775, %f773, %f774;
    mul.ftz.f32 %f776, %f66, %f762;
    fma.rn.ftz.f32 %f777, %f64, %f761, %f776;
    fma.rn.ftz.f32 %f778, %f59, %f760, %f777;
    mul.ftz.f32 %f779, %f70, %f762;
    fma.rn.ftz.f32 %f780, %f69, %f761, %f779;
    fma.rn.ftz.f32 %f781, %f68, %f760, %f780;
    mul.ftz.f32 %f782, %f73, %f762;
    fma.rn.ftz.f32 %f783, %f72, %f761, %f782;
    fma.rn.ftz.f32 %f784, %f71, %f760, %f783;
    mul.ftz.f32 %f785, %f56, %f784;
    fma.rn.ftz.f32 %f786, %f54, %f781, %f785;
    fma.rn.ftz.f32 %f787, %f52, %f778, %f786;
    mul.ftz.f32 %f788, %f787, %f760;
    setp.gt.f32 %p789, %f788, %f774;
    and.pred %p790, %p775, %p789;
    and.pred %p791, %p51, %p790;
    mul.ftz.f32 %f792, %f58, %f787;
    mul.ftz.f32 %f793, %f760, %f772;
    div.approx.ftz.f32 %f794, %f792, %f793;
    abs.f32 %f795, %f794;
    mul.ftz.f32 %f796, %f763, %f795;
    mul.ftz.f32 %f797, %f74, %f796;
    setp.ne.f32 %p798, %f797, %f774;
    mul.ftz.f32 %f799, %f764, %f795;
    mul.ftz.f32 %f800, %f75, %f799;
    setp.ne.f32 %p801, %f800, %f774;
    or.pred %p802, %p798, %p801;
    mul.ftz.f32 %f803, %f765, %f795;
    mul.ftz.f32 %f804, %f76, %f803;
    setp.ne.f32 %p805, %f804, %f774;
    or.pred %p806, %p802, %p805;
    and.pred %p807, %p791, %p806;
    not.pred %p808, %p807;
    mov.b32 %f809, 0x7f800000;
    mov.b32 %r810, %f787;
    mov.b32 %r811, 0x0;
    setp.lt.s32 %p812, %r810, %r811;
    mov.b32 %f813, 0x3f800000;
    abs.f32 %f814, %f77;
    abs.f32 %f815, %f78;
    max.ftz.f32 %f816, %f814, %f815;
    abs.f32 %f817, %f79;
    max.ftz.f32 %f818, %f816, %f817;
    add.ftz.f32 %f819, %f813, %f818;
    mov.b32 %f820, 0x38bb8000;
    mul.ftz.f32 %f821, %f819, %f820;
    neg.ftz.f32 %f822, %f821;
    selp.b32 %f823, %f822, %f821, %p812;
    fma.rn.ftz.f32 %f824, %f823, %f56, %f77;
    fma.rn.ftz.f32 %f825, %f823, %f54, %f78;
    fma.rn.ftz.f32 %f826, %f823, %f52, %f79;
    mov.b32 %f827, 0x7f7fffff;
    mov.b32 %r828, 0xff;
    mov.b32 %r829, 0x0;
    mov.b32 %r830, 0x1;
    and.pred %p831, %p49, %p807;
    .reg.u32 %u832_out_<32>;
    @!%p831 bra l_masked_832;
    .reg.u32 %u832_z, %u832_count;
    mov.u32 %u832_z, 0;
    mov.u32 %u832_count, 6;
    call (%u832_out_0, %u832_out_1, %u832_out_2, %u832_out_3, %u832_out_4, %u832_out_5, %u832_out_6, %u832_out_7, %u832_out_8, %u832_out_9, %u832_out_10, %u832_out_11, %u832_out_12, %u832_out_13, %u832_out_14, %u832_out_15, %u832_out_16, %u832_out_17, %u832_out_18, %u832_out_19, %u832_out_20, %u832_out_21, %u832_out_22, %u832_out_23, %u832_out_24, %u832_out_25, %u832_out_26, %u832_out_27, %u832_out_28, %u832_out_29, %u832_out_30, %u832_out_31), _optix_trace_typed_32, (%u832_z, %rd46, %f824, %f825, %f826, %f784, %f781, %f778, %f774, %f827, %f48, %r828, %r829, %r829, %r830, %r829, %u832_count, %r829, %r829, %r829, %r829, %r829, %r829, %u832_z, %u832_z, %u832_z, %u832_z, %u832_z, %u832_z, %u832_z, %u832_z, %u832_z, %u832_z, %u832_z, %u832_z, %u832_z, %u832_z, %u832_z, %u832_z, %u832_z, %u832_z, %u832_z, %u832_z, %u832_z, %u832_z, %u832_z, %u832_z, %u832_z, %u832_z);

l_masked_832:
    mov.b32 %r833, %u832_out_0;
    mov.b32 %f834, %r833;
    selp.b32 %f835, %f809, %f834, %p808;
    setp.ne.f32 %p836, %f835, %f809;
    and.pred %p837, %p807, %p836;
    not.pred %p838, %p837;
    mov.b32 %r839, %u832_out_5;
    selp.b32 %r840, %r829, %r839, %p838;
    setp.eq.u32 %p841, %r840, %r829;
    mov.b32 %r842, %u832_out_4;
    selp.b32 %r843, %r829, %r842, %p838;
    selp.b32 %r844, %r843, %r840, %p841;
    mul.ftz.f32 %f845, %f80, %f766;
    mov.b32 %r846, %u832_out_3;
    mov.b32 %r847, %u832_out_1;
    mov.b32 %f848, %r847;
    mov.b32 %r849, %u832_out_2;
    mov.b32 %f850, %r849;
    mov.b32 %r851, 0x0;
    setp.ne.u32 %p852, %r844, %r851;
    and.pred %p853, %p837, %p852;
    and.pred %p854, %p49, %p853;

    @!%p854 bra l_masked_855;

    { // Call: mitsuba::Shape::compute_surface_interaction()

        mad.wide.u32 %rd3, %r844, 8, %rd8;
        ld.global.u64 %rd3, [%rd3];
        cvt.u32.u64 %r3, %rd3;
        call (%rd2), _optix_call_direct_callable, (%r3);
        shr.u64 %rd3, %rd3, 32;
        add.u64 %rd3, %rd3, %rd9;

        {
            proto: .callprototype (.param .align 4 .b8 result[76]) _(.reg .u32 self, .reg .u64 data, .param .align 4 .b8 params[52]);
            .param .align 4 .b8 out[76];
            .param .align 4 .b8 in[52];
            st.param.b32 [in+0], %f824;
            st.param.b32 [in+4], %f825;
            st.param.b32 [in+8], %f826;
            st.param.b32 [in+12], %f784;
            st.param.b32 [in+16], %f781;
            st.param.b32 [in+20], %f778;
            st.param.b32 [in+24], %f48;
            st.param.b32 [in+28], %f835;
            st.param.b32 [in+32], %f848;
            st.param.b32 [in+36], %f850;
            st.param.b32 [in+40], %r846;
            st.param.b32 [in+44], %r843;
            st.param.b32 [in+48], %r840;
            call (out), %rd2, (%r844, %rd3, in), proto;
            ld.param.b32 %f856, [out+0];
            ld.param.b32 %f857, [out+4];
            ld.param.b32 %f858, [out+8];
            ld.param.b32 %f859, [out+12];
            ld.param.b32 %f860, [out+16];
            ld.param.b32 %f861, [out+20];
            ld.param.b32 %f862, [out+24];
            ld.param.b32 %r863, [out+28];
            ld.param.b32 %f864, [out+32];
            ld.param.b32 %f865, [out+36];
            ld.param.b32 %f870, [out+40];
            ld.param.b32 %f871, [out+44];
            ld.param.b32 %f869, [out+48];
            ld.param.b32 %f866, [out+52];
            ld.param.b32 %f867, [out+56];
            ld.param.b32 %f868, [out+60];
            ld.param.b32 %f872, [out+64];
            ld.param.b32 %f873, [out+68];
            ld.param.b32 %f874, [out+72];
        }

        bra.uni l_done_855;
    }

l_masked_855:
    mov.b32 %f856, 0;
    mov.b32 %f857, 0;
    mov.b32 %f858, 0;
    mov.b32 %f859, 0;
    mov.b32 %f860, 0;
    mov.b32 %f861, 0;
    mov.b32 %f862, 0;
    mov.b32 %r863, 0;
    mov.b32 %f864, 0;
    mov.b32 %f865, 0;
    mov.b32 %f870, 0;
    mov.b32 %f871, 0;
    mov.b32 %f869, 0;
    mov.b32 %f866, 0;
    mov.b32 %f867, 0;
    mov.b32 %f868, 0;
    mov.b32 %f872, 0;
    mov.b32 %f873, 0;
    mov.b32 %f874, 0;

l_done_855:
    mov.b32 %f875, 0x0;
    mov.b32 %r876, 0x1;
    add.s32 %r877, %r50, %r876;
    mov.b32 %r878, 0x5;
    setp.gt.s32 %p879, %r877, %r878;
    mov.b32 %r880, 0x4;
    setp.lt.s32 %p881, %r877, %r880;
    and.pred %p882, %p807, %p881;
    not.pred %p883, %p837;
    mov.b32 %f884, 0x7f800000;
    selp.b32 %f885, %f884, %f856, %p883;
    setp.ne.f32 %p886, %f885, %f884;
    and.pred %p887, %p882, %p886;
    mov.b64 %rd888, 0x12;
    cvt.u32.u64 %r3, %rd888;
    shr.u64 %rd889, %rd350, %r3;
    xor.b64 %rd890, %rd889, %rd350;
    mov.b64 %rd891, 0x1b;
    cvt.u32.u64 %r3, %rd891;
    shr.u64 %rd892, %rd890, %r3;
    cvt.u32.u64 %r893, %rd892;
    mov.b64 %rd894, 0x3b;
    cvt.u32.u64 %r3, %rd894;
    shr.u64 %rd895, %rd350, %r3;
    cvt.u32.u64 %r896, %rd895;
    shr.u32 %r897, %r893, %r896;
    mov.b32 %r898, %r896;
    neg.s32 %r899, %r898;
    mov.b32 %r900, 0x1f;
    and.b32 %r901, %r899, %r900;
    mov.b32 %r902, %r901;
    shl.b32 %r903, %r893, %r902;
    or.b32 %r904, %r897, %r903;
    mov.b32 %r905, 0x9;
    shr.u32 %r906, %r904, %r905;
    mov.b32 %r907, 0x3f800000;
    or.b32 %r908, %r906, %r907;
    mov.b32 %f909, %r908;
    mov.b32 %f910, 0x3f800000;
    sub.ftz.f32 %f911, %f909, %f910;
    max.ftz.f32 %f912, %f797, %f800;
    max.ftz.f32 %f913, %f912, %f804;
    mul.ftz.f32 %f914, %f845, %f845;
    mul.ftz.f32 %f915, %f913, %f914;
    mov.b32 %f916, 0x3f733333;
    min.ftz.f32 %f917, %f915, %f916;
    setp.lt.f32 %p918, %f911, %f917;
    and.pred %p919, %p887, %p918;
    and.pred %p3, %p879, %p919;
    and.pred %p2, !%p879, %p887;
    or.pred %p920, %p2, %p3;
    and.pred %p3, %p49, %p920;
    and.pred %p2, !%p49, %p51;
    or.pred %p921, %p2, %p3;
    ld.const.u64 %rd0, [params+784];
    mad.wide.u32 %rd0, %r0, 1, %rd0;
    selp.u16 %w0, 1, 0, %p921;
    st.global.cs.b8 [%rd0], %w0;
    selp.b32 %r922, %r877, %r50, %p49;
    ld.const.u64 %rd0, [params+792];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %r922;
    selp.b32 %f923, %f824, %f81, %p49;
    ld.const.u64 %rd0, [params+800];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %f923;
    selp.b32 %f924, %f825, %f82, %p49;
    ld.const.u64 %rd0, [params+808];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %f924;
    selp.b32 %f925, %f826, %f83, %p49;
    ld.const.u64 %rd0, [params+816];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %f925;
    selp.b32 %f926, %f784, %f57, %p49;
    ld.const.u64 %rd0, [params+824];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %f926;
    selp.b32 %f927, %f781, %f55, %p49;
    ld.const.u64 %rd0, [params+832];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %f927;
    selp.b32 %f928, %f778, %f53, %p49;
    ld.const.u64 %rd0, [params+840];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %f928;
    selp.b32 %f929, %f827, %f84, %p49;
    ld.const.u64 %rd0, [params+848];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %f929;
    selp.b32 %f930, %f48, %f85, %p49;
    ld.const.u64 %rd0, [params+856];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %f930;
    rcp.approx.ftz.f32 %f931, %f917;
    mul.ftz.f32 %f932, %f797, %f931;
    selp.b32 %f933, %f932, %f797, %p879;
    selp.b32 %f934, %f933, %f74, %p49;
    ld.const.u64 %rd0, [params+864];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %f934;
    mul.ftz.f32 %f935, %f800, %f931;
    selp.b32 %f936, %f935, %f800, %p879;
    selp.b32 %f937, %f936, %f75, %p49;
    ld.const.u64 %rd0, [params+872];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %f937;
    mul.ftz.f32 %f938, %f804, %f931;
    selp.b32 %f939, %f938, %f804, %p879;
    selp.b32 %f940, %f939, %f76, %p49;
    ld.const.u64 %rd0, [params+880];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %f940;
    selp.b32 %f941, %f885, %f86, %p49;
    ld.const.u64 %rd0, [params+888];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %f941;
    selp.b32 %f942, %f857, %f77, %p49;
    ld.const.u64 %rd0, [params+896];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %f942;
    selp.b32 %f943, %f858, %f78, %p49;
    ld.const.u64 %rd0, [params+904];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %f943;
    selp.b32 %f944, %f859, %f79, %p49;
    ld.const.u64 %rd0, [params+912];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %f944;
    selp.b32 %f945, %f860, %f56, %p49;
    ld.const.u64 %rd0, [params+920];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %f945;
    selp.b32 %f946, %f861, %f54, %p49;
    ld.const.u64 %rd0, [params+928];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %f946;
    selp.b32 %f947, %f862, %f52, %p49;
    ld.const.u64 %rd0, [params+936];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %f947;
    and.pred %p948, %p837, %p886;
    not.pred %p949, %p948;
    mov.b32 %r950, 0x0;
    selp.b32 %r951, %r950, %r863, %p949;
    selp.b32 %r952, %r951, %r61, %p49;
    ld.const.u64 %rd0, [params+944];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %r952;
    selp.b32 %f953, %f864, %f87, %p49;
    ld.const.u64 %rd0, [params+952];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %f953;
    selp.b32 %f954, %f865, %f88, %p49;
    ld.const.u64 %rd0, [params+960];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %f954;
    mov.b32 %f955, 0x0;
    setp.eq.f32 %p956, %f866, %f955;
    setp.eq.f32 %p957, %f867, %f955;
    and.pred %p958, %p956, %p957;
    setp.eq.f32 %p959, %f868, %f955;
    and.pred %p960, %p958, %p959;
    mov.b32 %r961, %f869;
    mov.b32 %r962, 0x0;
    setp.lt.s32 %p963, %r961, %r962;
    mul.ftz.f32 %f964, %f870, %f870;
    mov.b32 %f965, 0xbf800000;
    selp.b32 %f966, %f965, %f910, %p963;
    add.ftz.f32 %f967, %f966, %f869;
    rcp.approx.ftz.f32 %f968, %f967;
    neg.ftz.f32 %f969, %f968;
    mul.ftz.f32 %f970, %f964, %f969;
    neg.ftz.f32 %f971, %f970;
    selp.b32 %f972, %f971, %f970, %p963;
    add.ftz.f32 %f973, %f972, %f910;
    mul.ftz.f32 %f974, %f870, %f866;
    fma.rn.ftz.f32 %f975, %f871, %f867, %f974;
    fma.rn.ftz.f32 %f976, %f869, %f868, %f975;
    neg.ftz.f32 %f977, %f976;
    fma.rn.ftz.f32 %f978, %f870, %f977, %f866;
    fma.rn.ftz.f32 %f979, %f869, %f977, %f868;
    fma.rn.ftz.f32 %f980, %f871, %f977, %f867;
    mul.ftz.f32 %f981, %f978, %f978;
    fma.rn.ftz.f32 %f982, %f980, %f980, %f981;
    fma.rn.ftz.f32 %f983, %f979, %f979, %f982;
    rsqrt.approx.ftz.f32 %f984, %f983;
    mul.ftz.f32 %f985, %f978, %f984;
    selp.b32 %f986, %f973, %f985, %p960;
    selp.b32 %f987, %f986, %f73, %p49;
    ld.const.u64 %rd0, [params+968];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %f987;
    mul.ftz.f32 %f988, %f870, %f871;
    mul.ftz.f32 %f989, %f988, %f969;
    neg.ftz.f32 %f990, %f989;
    selp.b32 %f991, %f990, %f989, %p963;
    mul.ftz.f32 %f992, %f980, %f984;
    selp.b32 %f993, %f991, %f992, %p960;
    selp.b32 %f994, %f993, %f70, %p49;
    ld.const.u64 %rd0, [params+976];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %f994;
    neg.ftz.f32 %f995, %f870;
    selp.b32 %f996, %f870, %f995, %p963;
    mul.ftz.f32 %f997, %f979, %f984;
    selp.b32 %f998, %f996, %f997, %p960;
    selp.b32 %f999, %f998, %f66, %p49;
    ld.const.u64 %rd0, [params+984];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %f999;
    mul.ftz.f32 %f1000, %f869, %f993;
    neg.ftz.f32 %f1001, %f1000;
    fma.rn.ftz.f32 %f1002, %f871, %f998, %f1001;
    selp.b32 %f1003, %f1002, %f72, %p49;
    ld.const.u64 %rd0, [params+992];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %f1003;
    mul.ftz.f32 %f1004, %f870, %f998;
    neg.ftz.f32 %f1005, %f1004;
    fma.rn.ftz.f32 %f1006, %f869, %f986, %f1005;
    selp.b32 %f1007, %f1006, %f69, %p49;
    ld.const.u64 %rd0, [params+1000];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %f1007;
    mul.ftz.f32 %f1008, %f871, %f986;
    neg.ftz.f32 %f1009, %f1008;
    fma.rn.ftz.f32 %f1010, %f870, %f993, %f1009;
    selp.b32 %f1011, %f1010, %f64, %p49;
    ld.const.u64 %rd0, [params+1008];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %f1011;
    selp.b32 %f1012, %f870, %f71, %p49;
    ld.const.u64 %rd0, [params+1016];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %f1012;
    selp.b32 %f1013, %f871, %f68, %p49;
    ld.const.u64 %rd0, [params+1024];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %f1013;
    selp.b32 %f1014, %f869, %f59, %p49;
    ld.const.u64 %rd0, [params+1032];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %f1014;
    selp.b32 %f1015, %f866, %f89, %p49;
    ld.const.u64 %rd0, [params+1040];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %f1015;
    selp.b32 %f1016, %f867, %f90, %p49;
    ld.const.u64 %rd0, [params+1048];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %f1016;
    selp.b32 %f1017, %f868, %f91, %p49;
    ld.const.u64 %rd0, [params+1056];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %f1017;
    selp.b32 %f1018, %f872, %f92, %p49;
    ld.const.u64 %rd0, [params+1064];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %f1018;
    selp.b32 %f1019, %f873, %f93, %p49;
    ld.const.u64 %rd0, [params+1072];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %f1019;
    selp.b32 %f1020, %f874, %f94, %p49;
    ld.const.u64 %rd0, [params+1080];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %f1020;
    selp.b32 %f1021, %f875, %f95, %p49;
    ld.const.u64 %rd0, [params+1088];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %f1021;
    selp.b32 %f1022, %f875, %f96, %p49;
    ld.const.u64 %rd0, [params+1096];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %f1022;
    selp.b32 %f1023, %f875, %f97, %p49;
    ld.const.u64 %rd0, [params+1104];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %f1023;
    selp.b32 %f1024, %f875, %f98, %p49;
    ld.const.u64 %rd0, [params+1112];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %f1024;
    selp.b32 %f1025, %f875, %f99, %p49;
    ld.const.u64 %rd0, [params+1120];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %f1025;
    selp.b32 %f1026, %f875, %f100, %p49;
    ld.const.u64 %rd0, [params+1128];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %f1026;
    selp.b32 %f1027, %f955, %f101, %p49;
    ld.const.u64 %rd0, [params+1136];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %f1027;
    selp.b32 %f1028, %f955, %f102, %p49;
    ld.const.u64 %rd0, [params+1144];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %f1028;
    selp.b32 %f1029, %f955, %f103, %p49;
    ld.const.u64 %rd0, [params+1152];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %f1029;
    selp.b32 %f1030, %f955, %f104, %p49;
    ld.const.u64 %rd0, [params+1160];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %f1030;
    neg.ftz.f32 %f1031, %f778;
    neg.ftz.f32 %f1032, %f781;
    neg.ftz.f32 %f1033, %f784;
    mul.ftz.f32 %f1034, %f1033, %f986;
    fma.rn.ftz.f32 %f1035, %f1032, %f993, %f1034;
    fma.rn.ftz.f32 %f1036, %f1031, %f998, %f1035;
    selp.b32 %f1037, %f1036, %f1033, %p948;
    selp.b32 %f1038, %f1037, %f67, %p49;
    ld.const.u64 %rd0, [params+1168];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %f1038;
    mul.ftz.f32 %f1039, %f1033, %f1002;
    fma.rn.ftz.f32 %f1040, %f1032, %f1006, %f1039;
    fma.rn.ftz.f32 %f1041, %f1031, %f1010, %f1040;
    selp.b32 %f1042, %f1041, %f1032, %p948;
    selp.b32 %f1043, %f1042, %f65, %p49;
    ld.const.u64 %rd0, [params+1176];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %f1043;
    mul.ftz.f32 %f1044, %f1033, %f870;
    fma.rn.ftz.f32 %f1045, %f1032, %f871, %f1044;
    fma.rn.ftz.f32 %f1046, %f1031, %f869, %f1045;
    selp.b32 %f1047, %f1046, %f1031, %p948;
    selp.b32 %f1048, %f1047, %f58, %p49;
    ld.const.u64 %rd0, [params+1184];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %f1048;
    selp.b32 %r1049, %r846, %r105, %p49;
    ld.const.u64 %rd0, [params+1192];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %r1049;
    selp.b32 %r1050, %r950, %r851, %p949;
    selp.b32 %r1051, %r1050, %r106, %p49;
    ld.const.u64 %rd0, [params+1200];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %r1051;
    selp.b32 %f1052, %f845, %f80, %p49;
    ld.const.u64 %rd0, [params+1208];
    mad.wide.u32 %rd0, %r0, 4, %rd0;
    st.global.cs.b32 [%rd0], %f1052;
    mov.b64 %rd1053, 0x5851f42d4c957f2d;
    mad.lo.u64 %rd1054, %rd350, %rd1053, %rd47;
    selp.b64 %rd1055, %rd1054, %rd350, %p887;
    selp.b64 %rd1056, %rd1055, %rd60, %p49;
    ld.const.u64 %rd0, [params+1216];
    mad.wide.u32 %rd0, %r0, 8, %rd0;
    st.global.cs.b64 [%rd0], %rd1056;
    and.pred %p1057, %p49, %p921;
    ld.const.u64 %rd0, [params+1224];
    mad.wide.u32 %rd0, %r0, 1, %rd0;
    selp.u16 %w0, 1, 0, %p1057;
    st.global.cs.b8 [%rd0], %w0;
    ret;
}

.visible .func (.param .align 4 .b8 result[76]) __direct_callable__1df600217dbd334aaa725d4808e7bd22(.reg .u32 self, .reg .u64 data, .param .align 4 .b8 params[52]) {
    // Call: mitsuba::Shape::compute_surface_interaction()
    .reg.b8   %b <121>; .reg.b16  %w<121>; .reg.b32 %r<121>;
    .reg.b64  %rd<121>; .reg.f16  %h<121>; .reg.f32 %f<121>;
    .reg.f64  %d <121>; .reg.pred %p<121>;
    ld.global.b64 %rd4, [data+0];
    ld.global.b64 %rd5, [data+8];
    ld.global.b64 %rd6, [data+16];
    ld.param.b32 %f7, [params+28];
    ld.param.b32 %r8, [params+40];
    mov.b32 %r9, 0x3;
    mul.lo.u32 %r10, %r8, %r9;
    mov.pred %p11, 0x1;
    mad.wide.u32 %rd3, %r10, 4, %rd5;
    ld.global.nc.b32 %r12, [%rd3];
    mul.lo.u32 %r13, %r12, %r9;
    mad.wide.u32 %rd3, %r13, 4, %rd4;
    ld.global.nc.b32 %f14, [%rd3];
    mov.b32 %f15, 0x3f800000;
    ld.param.b32 %f16, [params+32];
    sub.ftz.f32 %f17, %f15, %f16;
    ld.param.b32 %f18, [params+36];
    sub.ftz.f32 %f19, %f17, %f18;
    mov.b32 %r20, 0x1;
    mad.lo.u32 %r21, %r8, %r9, %r20;
    mad.wide.u32 %rd3, %r21, 4, %rd5;
    ld.global.nc.b32 %r22, [%rd3];
    mul.lo.u32 %r23, %r22, %r9;
    mad.wide.u32 %rd3, %r23, 4, %rd4;
    ld.global.nc.b32 %f24, [%rd3];
    mov.b32 %r25, 0x2;
    mad.lo.u32 %r26, %r8, %r9, %r25;
    mad.wide.u32 %rd3, %r26, 4, %rd5;
    ld.global.nc.b32 %r27, [%rd3];
    mul.lo.u32 %r28, %r27, %r9;
    mad.wide.u32 %rd3, %r28, 4, %rd4;
    ld.global.nc.b32 %f29, [%rd3];
    mul.ftz.f32 %f30, %f29, %f18;
    fma.rn.ftz.f32 %f31, %f24, %f16, %f30;
    fma.rn.ftz.f32 %f32, %f14, %f19, %f31;
    mad.lo.u32 %r33, %r12, %r9, %r20;
    mad.wide.u32 %rd3, %r33, 4, %rd4;
    ld.global.nc.b32 %f34, [%rd3];
    mad.lo.u32 %r35, %r22, %r9, %r20;
    mad.wide.u32 %rd3, %r35, 4, %rd4;
    ld.global.nc.b32 %f36, [%rd3];
    mad.lo.u32 %r37, %r27, %r9, %r20;
    mad.wide.u32 %rd3, %r37, 4, %rd4;
    ld.global.nc.b32 %f38, [%rd3];
    mul.ftz.f32 %f39, %f38, %f18;
    fma.rn.ftz.f32 %f40, %f36, %f16, %f39;
    fma.rn.ftz.f32 %f41, %f34, %f19, %f40;
    mad.lo.u32 %r42, %r12, %r9, %r25;
    mad.wide.u32 %rd3, %r42, 4, %rd4;
    ld.global.nc.b32 %f43, [%rd3];
    mad.lo.u32 %r44, %r22, %r9, %r25;
    mad.wide.u32 %rd3, %r44, 4, %rd4;
    ld.global.nc.b32 %f45, [%rd3];
    mad.lo.u32 %r46, %r27, %r9, %r25;
    mad.wide.u32 %rd3, %r46, 4, %rd4;
    ld.global.nc.b32 %f47, [%rd3];
    mul.ftz.f32 %f48, %f47, %f18;
    fma.rn.ftz.f32 %f49, %f45, %f16, %f48;
    fma.rn.ftz.f32 %f50, %f43, %f19, %f49;
    sub.ftz.f32 %f51, %f36, %f34;
    sub.ftz.f32 %f52, %f47, %f43;
    sub.ftz.f32 %f53, %f45, %f43;
    sub.ftz.f32 %f54, %f38, %f34;
    mul.ftz.f32 %f55, %f53, %f54;
    neg.ftz.f32 %f56, %f55;
    fma.rn.ftz.f32 %f57, %f51, %f52, %f56;
    sub.ftz.f32 %f58, %f24, %f14;
    sub.ftz.f32 %f59, %f29, %f14;
    mul.ftz.f32 %f60, %f51, %f59;
    neg.ftz.f32 %f61, %f60;
    fma.rn.ftz.f32 %f62, %f58, %f54, %f61;
    mul.ftz.f32 %f63, %f58, %f52;
    neg.ftz.f32 %f64, %f63;
    fma.rn.ftz.f32 %f65, %f53, %f59, %f64;
    mul.ftz.f32 %f66, %f57, %f57;
    fma.rn.ftz.f32 %f67, %f65, %f65, %f66;
    fma.rn.ftz.f32 %f68, %f62, %f62, %f67;
    rsqrt.approx.ftz.f32 %f69, %f68;
    mul.ftz.f32 %f70, %f57, %f69;
    mul.ftz.f32 %f71, %f65, %f69;
    mul.ftz.f32 %f72, %f62, %f69;
    mov.u32 %r73, self;
    mad.wide.u32 %rd3, %r28, 4, %rd6;
    ld.global.nc.b32 %f74, [%rd3];
    mad.wide.u32 %rd3, %r23, 4, %rd6;
    ld.global.nc.b32 %f75, [%rd3];
    mad.wide.u32 %rd3, %r13, 4, %rd6;
    ld.global.nc.b32 %f76, [%rd3];
    mul.ftz.f32 %f77, %f76, %f19;
    fma.rn.ftz.f32 %f78, %f75, %f16, %f77;
    fma.rn.ftz.f32 %f79, %f74, %f18, %f78;
    mad.wide.u32 %rd3, %r46, 4, %rd6;
    ld.global.nc.b32 %f80, [%rd3];
    mad.wide.u32 %rd3, %r44, 4, %rd6;
    ld.global.nc.b32 %f81, [%rd3];
    mad.wide.u32 %rd3, %r42, 4, %rd6;
    ld.global.nc.b32 %f82, [%rd3];
    mul.ftz.f32 %f83, %f82, %f19;
    fma.rn.ftz.f32 %f84, %f81, %f16, %f83;
    fma.rn.ftz.f32 %f85, %f80, %f18, %f84;
    mad.wide.u32 %rd3, %r37, 4, %rd6;
    ld.global.nc.b32 %f86, [%rd3];
    mad.wide.u32 %rd3, %r35, 4, %rd6;
    ld.global.nc.b32 %f87, [%rd3];
    mad.wide.u32 %rd3, %r33, 4, %rd6;
    ld.global.nc.b32 %f88, [%rd3];
    mul.ftz.f32 %f89, %f88, %f19;
    fma.rn.ftz.f32 %f90, %f87, %f16, %f89;
    fma.rn.ftz.f32 %f91, %f86, %f18, %f90;
    mul.ftz.f32 %f92, %f79, %f79;
    fma.rn.ftz.f32 %f93, %f91, %f91, %f92;
    fma.rn.ftz.f32 %f94, %f85, %f85, %f93;
    rsqrt.approx.ftz.f32 %f95, %f94;
    mul.ftz.f32 %f96, %f79, %f95;
    mul.ftz.f32 %f97, %f91, %f95;
    mul.ftz.f32 %f98, %f85, %f95;
    mov.b32 %r99, %f72;
    mov.b32 %r100, 0x0;
    setp.lt.s32 %p101, %r99, %r100;
    mul.ftz.f32 %f102, %f70, %f70;
    mov.b32 %f103, 0xbf800000;
    selp.b32 %f104, %f103, %f15, %p101;
    add.ftz.f32 %f105, %f104, %f72;
    rcp.approx.ftz.f32 %f106, %f105;
    neg.ftz.f32 %f107, %f106;
    mul.ftz.f32 %f108, %f102, %f107;
    neg.ftz.f32 %f109, %f108;
    selp.b32 %f110, %f109, %f108, %p101;
    add.ftz.f32 %f111, %f110, %f15;
    mul.ftz.f32 %f112, %f70, %f71;
    mul.ftz.f32 %f113, %f112, %f107;
    neg.ftz.f32 %f114, %f113;
    selp.b32 %f115, %f114, %f113, %p101;
    neg.ftz.f32 %f116, %f70;
    selp.b32 %f117, %f70, %f116, %p101;
    mul.ftz.f32 %f118, %f71, %f107;
    fma.rn.ftz.f32 %f119, %f71, %f118, %f104;
    neg.ftz.f32 %f120, %f71;
    st.param.b32 [result+0], %f7;
    st.param.b32 [result+4], %f32;
    st.param.b32 [result+8], %f41;
    st.param.b32 [result+12], %f50;
    st.param.b32 [result+16], %f70;
    st.param.b32 [result+20], %f71;
    st.param.b32 [result+24], %f72;
    st.param.b32 [result+28], %r73;
    st.param.b32 [result+32], %f16;
    st.param.b32 [result+36], %f18;
    st.param.b32 [result+40], %f96;
    st.param.b32 [result+44], %f97;
    st.param.b32 [result+48], %f98;
    st.param.b32 [result+52], %f111;
    st.param.b32 [result+56], %f115;
    st.param.b32 [result+60], %f117;
    st.param.b32 [result+64], %f113;
    st.param.b32 [result+68], %f119;
    st.param.b32 [result+72], %f120;
    ret;
}

.visible .func (.param .align 4 .b8 result[12]) __direct_callable__2c1902982d8a799b2921eed551aff8be(.reg .u64 data, .param .align 4 .b8 params[168]) {
    // Call: mitsuba::BSDF::eval()
    .reg.b8   %b <15>; .reg.b16  %w<15>; .reg.b32 %r<15>;
    .reg.b64  %rd<15>; .reg.f16  %h<15>; .reg.f32 %f<15>;
    .reg.f64  %d <15>; .reg.pred %p<15>;
    ld.global.b32 %f4, [data+0];
    ld.param.b32 %f5, [params+152];
    mov.b32 %f6, 0x0;
    setp.gt.f32 %p7, %f5, %f6;
    ld.param.b32 %f8, [params+164];
    setp.gt.f32 %p9, %f8, %f6;
    and.pred %p10, %p7, %p9;
    mov.b32 %f11, 0x3ea2f983;
    mul.ftz.f32 %f12, %f4, %f11;
    mul.ftz.f32 %f13, %f12, %f8;
    selp.b32 %f14, %f13, %f6, %p10;
    st.param.b32 [result+0], %f14;
    st.param.b32 [result+4], %f14;
    st.param.b32 [result+8], %f14;
    ret;
}

.visible .func (.param .align 4 .b8 result[28]) __direct_callable__2d5a8b9371ce6959b2749cf5fae1a167(.reg .u64 data, .param .align 4 .b8 params[176]) {
    // Call: mitsuba::BSDF::sample()
    .reg.b8   %b <54>; .reg.b16  %w<54>; .reg.b32 %r<54>;
    .reg.b64  %rd<54>; .reg.f16  %h<54>; .reg.f32 %f<54>;
    .reg.f64  %d <54>; .reg.pred %p<54>;
    ld.global.b32 %f4, [data+0];
    ld.param.b32 %f5, [params+164];
    ld.param.b32 %f6, [params+152];
    abs.f32 %f7, %f6;
    mov.b32 %f8, 0x0;
    setp.eq.f32 %p9, %f7, %f8;
    mov.b32 %f10, 0x3f800000;
    mov.b32 %f11, 0x3f000000;
    setp.ge.f32 %p12, %f6, %f8;
    mov.b32 %f13, 0x3fbff263;
    mov.b32 %f14, 0x3f2ab6c5;
    selp.b32 %f15, %f13, %f14, %p12;
    neg.ftz.f32 %f16, %f6;
    fma.rn.ftz.f32 %f17, %f6, %f16, %f10;
    selp.b32 %f18, %f14, %f13, %p12;
    mul.ftz.f32 %f19, %f18, %f18;
    neg.ftz.f32 %f20, %f19;
    fma.rn.ftz.f32 %f21, %f17, %f20, %f10;
    max.ftz.f32 %f22, %f21, %f8;
    sqrt.approx.ftz.f32 %f23, %f22;
    neg.ftz.f32 %f24, %f23;
    fma.rn.ftz.f32 %f25, %f15, %f24, %f7;
    fma.rn.ftz.f32 %f26, %f15, %f23, %f7;
    div.approx.ftz.f32 %f27, %f25, %f26;
    mul.ftz.f32 %f28, %f27, %f27;
    neg.ftz.f32 %f29, %f7;
    fma.rn.ftz.f32 %f30, %f15, %f29, %f23;
    fma.rn.ftz.f32 %f31, %f15, %f7, %f23;
    div.approx.ftz.f32 %f32, %f30, %f31;
    mul.ftz.f32 %f33, %f32, %f32;
    add.ftz.f32 %f34, %f28, %f33;
    mul.ftz.f32 %f35, %f11, %f34;
    selp.b32 %f36, %f10, %f35, %p9;
    setp.le.f32 %p37, %f5, %f36;
    ld.param.b32 %f38, [params+144];
    neg.ftz.f32 %f39, %f38;
    neg.ftz.f32 %f40, %f18;
    mul.ftz.f32 %f41, %f40, %f38;
    selp.b32 %f42, %f39, %f41, %p37;
    ld.param.b32 %f43, [params+148];
    neg.ftz.f32 %f44, %f43;
    mul.ftz.f32 %f45, %f40, %f43;
    selp.b32 %f46, %f44, %f45, %p37;
    mov.b32 %r47, %f6;
    mov.b32 %r48, 0x0;
    setp.lt.s32 %p49, %r47, %r48;
    selp.b32 %f50, %f23, %f24, %p49;
    selp.b32 %f51, %f6, %f50, %p37;
    selp.b32 %f52, %f10, %f15, %p37;
    selp.b32 %f53, %f4, %f10, %p37;
    st.param.b32 [result+0], %f42;
    st.param.b32 [result+4], %f46;
    st.param.b32 [result+8], %f51;
    st.param.b32 [result+12], %f52;
    st.param.b32 [result+16], %f53;
    st.param.b32 [result+20], %f53;
    st.param.b32 [result+24], %f53;
    ret;
}

.visible .func (.param .align 4 .b8 result[28]) __direct_callable__955a00be0b580e05d3ecd87bf59c1848(.reg .u64 data, .param .align 4 .b8 params[176]) {
    // Call: mitsuba::BSDF::sample()
    .reg.b8   %b <100>; .reg.b16  %w<100>; .reg.b32 %r<100>;
    .reg.b64  %rd<100>; .reg.f16  %h<100>; .reg.f32 %f<100>;
    .reg.f64  %d <100>; .reg.pred %p<100>;
    ld.global.b32 %f4, [data+0];
    mov.b32 %f5, 0x40000000;
    ld.param.b32 %f6, [params+168];
    mov.b32 %f7, 0xbf800000;
    fma.rn.ftz.f32 %f8, %f5, %f6, %f7;
    abs.f32 %f9, %f8;
    ld.param.b32 %f10, [params+172];
    fma.rn.ftz.f32 %f11, %f5, %f10, %f7;
    abs.f32 %f12, %f11;
    setp.lt.f32 %p13, %f9, %f12;
    selp.b32 %f14, %f11, %f8, %p13;
    mov.b32 %f15, 0x0;
    setp.eq.f32 %p16, %f8, %f15;
    setp.eq.f32 %p17, %f11, %f15;
    and.pred %p18, %p16, %p17;
    mov.b32 %f19, 0x3fc90fdb;
    mov.b32 %f20, 0x3f490fdb;
    selp.b32 %f21, %f8, %f11, %p13;
    mul.ftz.f32 %f22, %f20, %f21;
    div.approx.ftz.f32 %f23, %f22, %f14;
    sub.ftz.f32 %f24, %f19, %f23;
    selp.b32 %f25, %f24, %f23, %p13;
    selp.b32 %f26, %f15, %f25, %p18;
    abs.f32 %f27, %f26;
    mov.b32 %f28, 0x3fa2f983;
    mul.ftz.f32 %f29, %f27, %f28;
    cvt.rzi.s32.f32 %r30, %f29;
    mov.b32 %r31, 0x1;
    add.s32 %r32, %r30, %r31;
    mov.b32 %r33, 0xfffffffe;
    and.b32 %r34, %r32, %r33;
    mov.b32 %r35, 0x2;
    and.b32 %r36, %r34, %r35;
    mov.b32 %r37, 0x0;
    setp.eq.s32 %p38, %r36, %r37;
    cvt.rn.f32.s32 %f39, %r34;
    mov.b32 %f40, 0x3f490000;
    mul.ftz.f32 %f41, %f39, %f40;
    sub.ftz.f32 %f42, %f27, %f41;
    mov.b32 %f43, 0x397da000;
    mul.ftz.f32 %f44, %f39, %f43;
    sub.ftz.f32 %f45, %f42, %f44;
    mov.b32 %f46, 0x33222169;
    mul.ftz.f32 %f47, %f39, %f46;
    sub.ftz.f32 %f48, %f45, %f47;
    mul.ftz.f32 %f49, %f48, %f48;
    mov.b32 %f50, 0x7f800000;
    setp.eq.f32 %p51, %f27, %f50;
    selp.b32 %f52, -1, %f49, %p51;
    mul.ftz.f32 %f53, %f52, %f52;
    mov.b32 %f54, 0x37ccf5ce;
    mov.b32 %f55, 0xbab6061a;
    mov.b32 %f56, 0x3d2aaaa5;
    fma.rn.ftz.f32 %f57, %f52, %f55, %f56;
    fma.rn.ftz.f32 %f58, %f53, %f54, %f57;
    mul.ftz.f32 %f59, %f58, %f52;
    mov.b32 %f60, 0xbf000000;
    mov.b32 %f61, 0x3f800000;
    fma.rn.ftz.f32 %f62, %f52, %f60, %f61;
    fma.rn.ftz.f32 %f63, %f59, %f52, %f62;
    mov.b32 %f64, 0xb94ca1f9;
    mov.b32 %f65, 0x3c08839e;
    mov.b32 %f66, 0xbe2aaaa3;
    fma.rn.ftz.f32 %f67, %f52, %f65, %f66;
    fma.rn.ftz.f32 %f68, %f53, %f64, %f67;
    mul.ftz.f32 %f69, %f68, %f52;
    fma.rn.ftz.f32 %f70, %f69, %f48, %f48;
    selp.b32 %f71, %f63, %f70, %p38;
    mov.b32 %f72, 0x80000000;
    sub.s32 %r73, %r34, %r35;
    not.b32 %r74, %r73;
    mov.b32 %r75, 0x1d;
    shl.b32 %r76, %r74, %r75;
    mov.b32 %f77, %r76;
    and.b32 %f78, %f72, %f77;
    xor.b32 %f79, %f71, %f78;
    mul.ftz.f32 %f80, %f14, %f79;
    selp.b32 %f81, %f70, %f63, %p38;
    shl.b32 %r82, %r34, %r75;
    mov.b32 %f83, %r82;
    xor.b32 %f84, %f83, %f26;
    and.b32 %f85, %f72, %f84;
    xor.b32 %f86, %f81, %f85;
    mul.ftz.f32 %f87, %f14, %f86;
    mul.ftz.f32 %f88, %f80, %f80;
    fma.rn.ftz.f32 %f89, %f87, %f87, %f88;
    sub.ftz.f32 %f90, %f61, %f89;
    max.ftz.f32 %f91, %f90, %f15;
    sqrt.approx.ftz.f32 %f92, %f91;
    ld.param.b32 %f93, [params+152];
    setp.gt.f32 %p94, %f93, %f15;
    mov.b32 %f95, 0x3ea2f983;
    mul.ftz.f32 %f96, %f95, %f92;
    setp.gt.f32 %p97, %f96, %f15;
    and.pred %p98, %p94, %p97;
    selp.b32 %f99, %f4, %f15, %p98;
    st.param.b32 [result+0], %f80;
    st.param.b32 [result+4], %f87;
    st.param.b32 [result+8], %f92;
    st.param.b32 [result+12], %f61;
    st.param.b32 [result+16], %f99;
    st.param.b32 [result+20], %f99;
    st.param.b32 [result+24], %f99;
    ret;
}

.visible .func (.param .align 4 .b8 result[76]) __direct_callable__9ac33c82a1dfec558e55bfa0d4c18131(.reg .u32 self, .reg .u64 data, .param .align 4 .b8 params[52]) {
    // Call: mitsuba::Shape::compute_surface_interaction()
    .reg.b8   %b <177>; .reg.b16  %w<177>; .reg.b32 %r<177>;
    .reg.b64  %rd<177>; .reg.f16  %h<177>; .reg.f32 %f<177>;
    .reg.f64  %d <177>; .reg.pred %p<177>;
    ld.global.b64 %rd4, [data+0];
    ld.global.b64 %rd5, [data+8];
    ld.global.b64 %rd6, [data+16];
    ld.global.b64 %rd7, [data+24];
    ld.param.b32 %f8, [params+28];
    ld.param.b32 %r9, [params+40];
    mov.b32 %r10, 0x3;
    mul.lo.u32 %r11, %r9, %r10;
    mov.pred %p12, 0x1;
    mad.wide.u32 %rd3, %r11, 4, %rd5;
    ld.global.nc.b32 %r13, [%rd3];
    mul.lo.u32 %r14, %r13, %r10;
    mad.wide.u32 %rd3, %r14, 4, %rd4;
    ld.global.nc.b32 %f15, [%rd3];
    mov.b32 %f16, 0x3f800000;
    ld.param.b32 %f17, [params+32];
    sub.ftz.f32 %f18, %f16, %f17;
    ld.param.b32 %f19, [params+36];
    sub.ftz.f32 %f20, %f18, %f19;
    mov.b32 %r21, 0x1;
    mad.lo.u32 %r22, %r9, %r10, %r21;
    mad.wide.u32 %rd3, %r22, 4, %rd5;
    ld.global.nc.b32 %r23, [%rd3];
    mul.lo.u32 %r24, %r23, %r10;
    mad.wide.u32 %rd3, %r24, 4, %rd4;
    ld.global.nc.b32 %f25, [%rd3];
    mov.b32 %r26, 0x2;
    mad.lo.u32 %r27, %r9, %r10, %r26;
    mad.wide.u32 %rd3, %r27, 4, %rd5;
    ld.global.nc.b32 %r28, [%rd3];
    mul.lo.u32 %r29, %r28, %r10;
    mad.wide.u32 %rd3, %r29, 4, %rd4;
    ld.global.nc.b32 %f30, [%rd3];
    mul.ftz.f32 %f31, %f30, %f19;
    fma.rn.ftz.f32 %f32, %f25, %f17, %f31;
    fma.rn.ftz.f32 %f33, %f15, %f20, %f32;
    mad.lo.u32 %r34, %r13, %r10, %r21;
    mad.wide.u32 %rd3, %r34, 4, %rd4;
    ld.global.nc.b32 %f35, [%rd3];
    mad.lo.u32 %r36, %r23, %r10, %r21;
    mad.wide.u32 %rd3, %r36, 4, %rd4;
    ld.global.nc.b32 %f37, [%rd3];
    mad.lo.u32 %r38, %r28, %r10, %r21;
    mad.wide.u32 %rd3, %r38, 4, %rd4;
    ld.global.nc.b32 %f39, [%rd3];
    mul.ftz.f32 %f40, %f39, %f19;
    fma.rn.ftz.f32 %f41, %f37, %f17, %f40;
    fma.rn.ftz.f32 %f42, %f35, %f20, %f41;
    mad.lo.u32 %r43, %r13, %r10, %r26;
    mad.wide.u32 %rd3, %r43, 4, %rd4;
    ld.global.nc.b32 %f44, [%rd3];
    mad.lo.u32 %r45, %r23, %r10, %r26;
    mad.wide.u32 %rd3, %r45, 4, %rd4;
    ld.global.nc.b32 %f46, [%rd3];
    mad.lo.u32 %r47, %r28, %r10, %r26;
    mad.wide.u32 %rd3, %r47, 4, %rd4;
    ld.global.nc.b32 %f48, [%rd3];
    mul.ftz.f32 %f49, %f48, %f19;
    fma.rn.ftz.f32 %f50, %f46, %f17, %f49;
    fma.rn.ftz.f32 %f51, %f44, %f20, %f50;
    sub.ftz.f32 %f52, %f37, %f35;
    sub.ftz.f32 %f53, %f48, %f44;
    sub.ftz.f32 %f54, %f46, %f44;
    sub.ftz.f32 %f55, %f39, %f35;
    mul.ftz.f32 %f56, %f54, %f55;
    neg.ftz.f32 %f57, %f56;
    fma.rn.ftz.f32 %f58, %f52, %f53, %f57;
    sub.ftz.f32 %f59, %f25, %f15;
    sub.ftz.f32 %f60, %f30, %f15;
    mul.ftz.f32 %f61, %f52, %f60;
    neg.ftz.f32 %f62, %f61;
    fma.rn.ftz.f32 %f63, %f59, %f55, %f62;
    mul.ftz.f32 %f64, %f59, %f53;
    neg.ftz.f32 %f65, %f64;
    fma.rn.ftz.f32 %f66, %f54, %f60, %f65;
    mul.ftz.f32 %f67, %f58, %f58;
    fma.rn.ftz.f32 %f68, %f66, %f66, %f67;
    fma.rn.ftz.f32 %f69, %f63, %f63, %f68;
    rsqrt.approx.ftz.f32 %f70, %f69;
    mul.ftz.f32 %f71, %f58, %f70;
    mul.ftz.f32 %f72, %f66, %f70;
    mul.ftz.f32 %f73, %f63, %f70;
    mov.u32 %r74, self;
    mad.wide.u32 %rd3, %r28, 8, %rd6;
    .reg.f32 %f75_out_<2>;
    ld.global.nc.v2.b32 {%f75_out_0, %f75_out_1}, [%rd3+0];
    mov.b32 %f76, %f75_out_0;
    mad.wide.u32 %rd3, %r23, 8, %rd6;
    .reg.f32 %f77_out_<2>;
    ld.global.nc.v2.b32 {%f77_out_0, %f77_out_1}, [%rd3+0];
    mov.b32 %f78, %f77_out_0;
    mad.wide.u32 %rd3, %r13, 8, %rd6;
    .reg.f32 %f79_out_<2>;
    ld.global.nc.v2.b32 {%f79_out_0, %f79_out_1}, [%rd3+0];
    mov.b32 %f80, %f79_out_0;
    mul.ftz.f32 %f81, %f80, %f20;
    fma.rn.ftz.f32 %f82, %f78, %f17, %f81;
    fma.rn.ftz.f32 %f83, %f76, %f19, %f82;
    mov.b32 %f84, %f75_out_1;
    mov.b32 %f85, %f77_out_1;
    mov.b32 %f86, %f79_out_1;
    mul.ftz.f32 %f87, %f86, %f20;
    fma.rn.ftz.f32 %f88, %f85, %f17, %f87;
    fma.rn.ftz.f32 %f89, %f84, %f19, %f88;
    mad.wide.u32 %rd3, %r29, 4, %rd7;
    ld.global.nc.b32 %f90, [%rd3];
    mad.wide.u32 %rd3, %r24, 4, %rd7;
    ld.global.nc.b32 %f91, [%rd3];
    mad.wide.u32 %rd3, %r14, 4, %rd7;
    ld.global.nc.b32 %f92, [%rd3];
    mul.ftz.f32 %f93, %f92, %f20;
    fma.rn.ftz.f32 %f94, %f91, %f17, %f93;
    fma.rn.ftz.f32 %f95, %f90, %f19, %f94;
    mad.wide.u32 %rd3, %r47, 4, %rd7;
    ld.global.nc.b32 %f96, [%rd3];
    mad.wide.u32 %rd3, %r45, 4, %rd7;
    ld.global.nc.b32 %f97, [%rd3];
    mad.wide.u32 %rd3, %r43, 4, %rd7;
    ld.global.nc.b32 %f98, [%rd3];
    mul.ftz.f32 %f99, %f98, %f20;
    fma.rn.ftz.f32 %f100, %f97, %f17, %f99;
    fma.rn.ftz.f32 %f101, %f96, %f19, %f100;
    mad.wide.u32 %rd3, %r38, 4, %rd7;
    ld.global.nc.b32 %f102, [%rd3];
    mad.wide.u32 %rd3, %r36, 4, %rd7;
    ld.global.nc.b32 %f103, [%rd3];
    mad.wide.u32 %rd3, %r34, 4, %rd7;
    ld.global.nc.b32 %f104, [%rd3];
    mul.ftz.f32 %f105, %f104, %f20;
    fma.rn.ftz.f32 %f106, %f103, %f17, %f105;
    fma.rn.ftz.f32 %f107, %f102, %f19, %f106;
    mul.ftz.f32 %f108, %f95, %f95;
    fma.rn.ftz.f32 %f109, %f107, %f107, %f108;
    fma.rn.ftz.f32 %f110, %f101, %f101, %f109;
    rsqrt.approx.ftz.f32 %f111, %f110;
    mul.ftz.f32 %f112, %f95, %f111;
    mul.ftz.f32 %f113, %f107, %f111;
    mul.ftz.f32 %f114, %f101, %f111;
    sub.ftz.f32 %f115, %f78, %f80;
    sub.ftz.f32 %f116, %f84, %f86;
    sub.ftz.f32 %f117, %f85, %f86;
    sub.ftz.f32 %f118, %f76, %f80;
    mul.ftz.f32 %f119, %f117, %f118;
    neg.ftz.f32 %f120, %f119;
    fma.rn.ftz.f32 %f121, %f115, %f116, %f120;
    mov.b32 %f122, 0x0;
    setp.ne.f32 %p123, %f121, %f122;
    mul.ftz.f32 %f124, %f117, %f60;
    neg.ftz.f32 %f125, %f124;
    fma.rn.ftz.f32 %f126, %f116, %f59, %f125;
    rcp.approx.ftz.f32 %f127, %f121;
    mul.ftz.f32 %f128, %f126, %f127;
    mov.b32 %r129, %f73;
    mov.b32 %r130, 0x0;
    setp.lt.s32 %p131, %r129, %r130;
    mul.ftz.f32 %f132, %f71, %f71;
    mov.b32 %f133, 0xbf800000;
    selp.b32 %f134, %f133, %f16, %p131;
    add.ftz.f32 %f135, %f134, %f73;
    rcp.approx.ftz.f32 %f136, %f135;
    neg.ftz.f32 %f137, %f136;
    mul.ftz.f32 %f138, %f132, %f137;
    neg.ftz.f32 %f139, %f138;
    selp.b32 %f140, %f139, %f138, %p131;
    add.ftz.f32 %f141, %f140, %f16;
    selp.b32 %f142, %f128, %f141, %p123;
    mul.ftz.f32 %f143, %f117, %f55;
    neg.ftz.f32 %f144, %f143;
    fma.rn.ftz.f32 %f145, %f116, %f52, %f144;
    mul.ftz.f32 %f146, %f145, %f127;
    mul.ftz.f32 %f147, %f71, %f72;
    mul.ftz.f32 %f148, %f147, %f137;
    neg.ftz.f32 %f149, %f148;
    selp.b32 %f150, %f149, %f148, %p131;
    selp.b32 %f151, %f146, %f150, %p123;
    mul.ftz.f32 %f152, %f117, %f53;
    neg.ftz.f32 %f153, %f152;
    fma.rn.ftz.f32 %f154, %f116, %f54, %f153;
    mul.ftz.f32 %f155, %f154, %f127;
    neg.ftz.f32 %f156, %f71;
    selp.b32 %f157, %f71, %f156, %p131;
    selp.b32 %f158, %f155, %f157, %p123;
    neg.ftz.f32 %f159, %f59;
    mul.ftz.f32 %f160, %f115, %f60;
    fma.rn.ftz.f32 %f161, %f118, %f159, %f160;
    mul.ftz.f32 %f162, %f161, %f127;
    selp.b32 %f163, %f162, %f148, %p123;
    neg.ftz.f32 %f164, %f52;
    mul.ftz.f32 %f165, %f115, %f55;
    fma.rn.ftz.f32 %f166, %f118, %f164, %f165;
    mul.ftz.f32 %f167, %f166, %f127;
    mul.ftz.f32 %f168, %f72, %f137;
    fma.rn.ftz.f32 %f169, %f72, %f168, %f134;
    selp.b32 %f170, %f167, %f169, %p123;
    neg.ftz.f32 %f171, %f54;
    mul.ftz.f32 %f172, %f115, %f53;
    fma.rn.ftz.f32 %f173, %f118, %f171, %f172;
    mul.ftz.f32 %f174, %f173, %f127;
    neg.ftz.f32 %f175, %f72;
    selp.b32 %f176, %f174, %f175, %p123;
    st.param.b32 [result+0], %f8;
    st.param.b32 [result+4], %f33;
    st.param.b32 [result+8], %f42;
    st.param.b32 [result+12], %f51;
    st.param.b32 [result+16], %f71;
    st.param.b32 [result+20], %f72;
    st.param.b32 [result+24], %f73;
    st.param.b32 [result+28], %r74;
    st.param.b32 [result+32], %f83;
    st.param.b32 [result+36], %f89;
    st.param.b32 [result+40], %f112;
    st.param.b32 [result+44], %f113;
    st.param.b32 [result+48], %f114;
    st.param.b32 [result+52], %f142;
    st.param.b32 [result+56], %f151;
    st.param.b32 [result+60], %f158;
    st.param.b32 [result+64], %f163;
    st.param.b32 [result+68], %f170;
    st.param.b32 [result+72], %f176;
    ret;
}

.visible .func (.param .align 4 .b8 result[12]) __direct_callable__d8de9386f8c8452cd9495835130f3552(.reg .u64 data, .param .align 4 .b8 params[168]) {
    // Call: mitsuba::BSDF::eval()
    .reg.b8   %b <5>; .reg.b16  %w<5>; .reg.b32 %r<5>;
    .reg.b64  %rd<5>; .reg.f16  %h<5>; .reg.f32 %f<5>;
    .reg.f64  %d <5>; .reg.pred %p<5>;
    mov.b32 %f4, 0x0;
    st.param.b32 [result+0], %f4;
    st.param.b32 [result+4], %f4;
    st.param.b32 [result+8], %f4;
    ret;
}

.visible .func (.param .align 4 .b8 result[12]) __direct_callable__e1de368c6a148bbf759709159c693075(.reg .u64 data, .param .align 4 .b8 params[168]) {
    // Call: mitsuba::BSDF::eval()
    .reg.b8   %b <23>; .reg.b16  %w<23>; .reg.b32 %r<23>;
    .reg.b64  %rd<23>; .reg.f16  %h<23>; .reg.f32 %f<23>;
    .reg.f64  %d <23>; .reg.pred %p<23>;
    ld.global.b32 %f4, [data+0];
    ld.global.b32 %f5, [data+4];
    ld.global.b32 %f6, [data+8];
    ld.param.b32 %f7, [params+152];
    mov.b32 %f8, 0x0;
    setp.gt.f32 %p9, %f7, %f8;
    ld.param.b32 %f10, [params+164];
    setp.gt.f32 %p11, %f10, %f8;
    and.pred %p12, %p9, %p11;
    mov.b32 %f13, 0x3ea2f983;
    mul.ftz.f32 %f14, %f4, %f13;
    mul.ftz.f32 %f15, %f14, %f10;
    selp.b32 %f16, %f15, %f8, %p12;
    mul.ftz.f32 %f17, %f5, %f13;
    mul.ftz.f32 %f18, %f17, %f10;
    selp.b32 %f19, %f18, %f8, %p12;
    mul.ftz.f32 %f20, %f6, %f13;
    mul.ftz.f32 %f21, %f20, %f10;
    selp.b32 %f22, %f21, %f8, %p12;
    st.param.b32 [result+0], %f16;
    st.param.b32 [result+4], %f19;
    st.param.b32 [result+8], %f22;
    ret;
}

.visible .func (.param .align 4 .b8 result[28]) __direct_callable__fffbdc22dd6064333b5d977b88dd29af(.reg .u64 data, .param .align 4 .b8 params[176]) {
    // Call: mitsuba::BSDF::sample()
    .reg.b8   %b <104>; .reg.b16  %w<104>; .reg.b32 %r<104>;
    .reg.b64  %rd<104>; .reg.f16  %h<104>; .reg.f32 %f<104>;
    .reg.f64  %d <104>; .reg.pred %p<104>;
    ld.global.b32 %f4, [data+0];
    ld.global.b32 %f5, [data+4];
    ld.global.b32 %f6, [data+8];
    mov.b32 %f7, 0x40000000;
    ld.param.b32 %f8, [params+168];
    mov.b32 %f9, 0xbf800000;
    fma.rn.ftz.f32 %f10, %f7, %f8, %f9;
    abs.f32 %f11, %f10;
    ld.param.b32 %f12, [params+172];
    fma.rn.ftz.f32 %f13, %f7, %f12, %f9;
    abs.f32 %f14, %f13;
    setp.lt.f32 %p15, %f11, %f14;
    selp.b32 %f16, %f13, %f10, %p15;
    mov.b32 %f17, 0x0;
    setp.eq.f32 %p18, %f10, %f17;
    setp.eq.f32 %p19, %f13, %f17;
    and.pred %p20, %p18, %p19;
    mov.b32 %f21, 0x3fc90fdb;
    mov.b32 %f22, 0x3f490fdb;
    selp.b32 %f23, %f10, %f13, %p15;
    mul.ftz.f32 %f24, %f22, %f23;
    div.approx.ftz.f32 %f25, %f24, %f16;
    sub.ftz.f32 %f26, %f21, %f25;
    selp.b32 %f27, %f26, %f25, %p15;
    selp.b32 %f28, %f17, %f27, %p20;
    abs.f32 %f29, %f28;
    mov.b32 %f30, 0x3fa2f983;
    mul.ftz.f32 %f31, %f29, %f30;
    cvt.rzi.s32.f32 %r32, %f31;
    mov.b32 %r33, 0x1;
    add.s32 %r34, %r32, %r33;
    mov.b32 %r35, 0xfffffffe;
    and.b32 %r36, %r34, %r35;
    mov.b32 %r37, 0x2;
    and.b32 %r38, %r36, %r37;
    mov.b32 %r39, 0x0;
    setp.eq.s32 %p40, %r38, %r39;
    cvt.rn.f32.s32 %f41, %r36;
    mov.b32 %f42, 0x3f490000;
    mul.ftz.f32 %f43, %f41, %f42;
    sub.ftz.f32 %f44, %f29, %f43;
    mov.b32 %f45, 0x397da000;
    mul.ftz.f32 %f46, %f41, %f45;
    sub.ftz.f32 %f47, %f44, %f46;
    mov.b32 %f48, 0x33222169;
    mul.ftz.f32 %f49, %f41, %f48;
    sub.ftz.f32 %f50, %f47, %f49;
    mul.ftz.f32 %f51, %f50, %f50;
    mov.b32 %f52, 0x7f800000;
    setp.eq.f32 %p53, %f29, %f52;
    selp.b32 %f54, -1, %f51, %p53;
    mul.ftz.f32 %f55, %f54, %f54;
    mov.b32 %f56, 0x37ccf5ce;
    mov.b32 %f57, 0xbab6061a;
    mov.b32 %f58, 0x3d2aaaa5;
    fma.rn.ftz.f32 %f59, %f54, %f57, %f58;
    fma.rn.ftz.f32 %f60, %f55, %f56, %f59;
    mul.ftz.f32 %f61, %f60, %f54;
    mov.b32 %f62, 0xbf000000;
    mov.b32 %f63, 0x3f800000;
    fma.rn.ftz.f32 %f64, %f54, %f62, %f63;
    fma.rn.ftz.f32 %f65, %f61, %f54, %f64;
    mov.b32 %f66, 0xb94ca1f9;
    mov.b32 %f67, 0x3c08839e;
    mov.b32 %f68, 0xbe2aaaa3;
    fma.rn.ftz.f32 %f69, %f54, %f67, %f68;
    fma.rn.ftz.f32 %f70, %f55, %f66, %f69;
    mul.ftz.f32 %f71, %f70, %f54;
    fma.rn.ftz.f32 %f72, %f71, %f50, %f50;
    selp.b32 %f73, %f65, %f72, %p40;
    mov.b32 %f74, 0x80000000;
    sub.s32 %r75, %r36, %r37;
    not.b32 %r76, %r75;
    mov.b32 %r77, 0x1d;
    shl.b32 %r78, %r76, %r77;
    mov.b32 %f79, %r78;
    and.b32 %f80, %f74, %f79;
    xor.b32 %f81, %f73, %f80;
    mul.ftz.f32 %f82, %f16, %f81;
    selp.b32 %f83, %f72, %f65, %p40;
    shl.b32 %r84, %r36, %r77;
    mov.b32 %f85, %r84;
    xor.b32 %f86, %f85, %f28;
    and.b32 %f87, %f74, %f86;
    xor.b32 %f88, %f83, %f87;
    mul.ftz.f32 %f89, %f16, %f88;
    mul.ftz.f32 %f90, %f82, %f82;
    fma.rn.ftz.f32 %f91, %f89, %f89, %f90;
    sub.ftz.f32 %f92, %f63, %f91;
    max.ftz.f32 %f93, %f92, %f17;
    sqrt.approx.ftz.f32 %f94, %f93;
    ld.param.b32 %f95, [params+152];
    setp.gt.f32 %p96, %f95, %f17;
    mov.b32 %f97, 0x3ea2f983;
    mul.ftz.f32 %f98, %f97, %f94;
    setp.gt.f32 %p99, %f98, %f17;
    and.pred %p100, %p96, %p99;
    selp.b32 %f101, %f4, %f17, %p100;
    selp.b32 %f102, %f5, %f17, %p100;
    selp.b32 %f103, %f6, %f17, %p100;
    st.param.b32 [result+0], %f82;
    st.param.b32 [result+4], %f89;
    st.param.b32 [result+8], %f94;
    st.param.b32 [result+12], %f63;
    st.param.b32 [result+16], %f101;
    st.param.b32 [result+20], %f102;
    st.param.b32 [result+24], %f103;
    ret;
}


COMPILE ERROR: failed to create pipeline
Info: Pipeline statistics
	module(s)                            :     2
	entry function(s)                    :    23
	trace call(s)                        :     2
	continuation callable call(s)        :     0
	direct callable call(s)              :     3
	basic block(s) in entry functions    :   479
	instruction(s) in entry functions    : 13012
	non-entry function(s)                :     0
	basic block(s) in non-entry functions:     0
	instruction(s) in non-entry functions:     0
	debug information                    :    no


Dr.Jit encountered an unrecoverable error and will now shut
down. Please re-run your program in debug mode to check for
out-of-bounds reads, writes, and other sources of undefined
behavior. You can do so by calling

   dr.set_flag(dr.JitFlag.Debug, True)

at the beginning of the program. If these additional checks
fail to pinpoint the problem, then you have likely found a
bug. We are happy to help investigate and fix the problem if
you can you create a self-contained reproducer and submit it
at https://github.com/mitsuba-renderer/drjit.

The error message of this specific failure is as follows:
>>> jit_optix_check(): API error 7251 (OPTIX_ERROR_PIPELINE_LINK_ERROR): "Pipeline link error" in D:\a\drjit\drjit\ext\drjit-core\src\optix_core.cpp:391.

@tstigen
Author

tstigen commented Dec 23, 2024

Can confirm that everything works normally on the 531 driver. This is similar to a prior bug, #967, so it must be related to how the driver itself interacts with OptiX (and maybe only on the 1080 series).
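
In case it helps others compare setups, here is a minimal sketch for reading the installed driver version. It assumes `nvidia-smi` is on the PATH. The helper names (`parse_driver_version`, `installed_driver_version`, `looks_affected`) are hypothetical, and the cutoff only reflects the two data points in this issue (561.17 failing, 531 working), not a confirmed boundary.

```python
import shutil
import subprocess


def parse_driver_version(text):
    """Turn a string such as '561.17' into a tuple of ints, e.g. (561, 17)."""
    first_line = text.strip().splitlines()[0]
    return tuple(int(part) for part in first_line.strip().split("."))


def installed_driver_version():
    """Return the driver version reported by nvidia-smi, or None if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True, text=True, check=False,
    )
    if out.returncode != 0 or not out.stdout.strip():
        return None
    return parse_driver_version(out.stdout)


# Assumption: cutoff taken from this thread only (561.17 fails, 531 works).
SUSPECT_FROM = (561,)


def looks_affected(version):
    """True if the version is at or above the assumed cutoff."""
    return version is not None and version >= SUSPECT_FROM


if __name__ == "__main__":
    v = installed_driver_version()
    print("driver:", v, "possibly affected:", looks_affected(v))
```

The comparison uses plain tuple ordering, so the cutoff can be tightened to a specific minor version by changing `SUSPECT_FROM`.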

@merlinND
Member

merlinND commented Jan 3, 2025

Hello @tstigen,

It sounds like the tutorial works as expected when using the latest version of the tutorials repo.
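
For reference, the traceback in the report lists the supported signature as `look_at(self, origin, target, up)`, which suggests the method is now called on a transform instance rather than on the class. Below is a minimal sketch of the adjusted call, assuming that reading is right. `build_sensor_to_world` and `LOOK_AT_ARGS` are hypothetical names, the `scalar_rgb` variant is an arbitrary choice, and the values are copied from the failing cell.

```python
# Arguments copied from the failing cell in the traceback.
LOOK_AT_ARGS = {
    "origin": [0, -4.65, 0],
    "target": [0, -20, 0],
    "up": [0, 0, 1],
}


def build_sensor_to_world(mi):
    """Build the sensor transform by calling look_at() on an instance."""
    points = {name: mi.ScalarPoint3f(value) for name, value in LOOK_AT_ARGS.items()}
    # Note the parentheses: ScalarTransform4f() creates an instance first,
    # and look_at() is then invoked on that instance.
    return mi.ScalarTransform4f().look_at(**points)


if __name__ == "__main__":
    try:
        import mitsuba as mi
    except ImportError:
        print("mitsuba is not installed; nothing to run")
    else:
        mi.set_variant("scalar_rgb")  # assumed variant, any variant should do
        print(build_sensor_to_world(mi))
```

Compared with the original cell, the only change is `mi.ScalarTransform4f().look_at(...)` in place of `mi.ScalarTransform4f.look_at(...)`.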

Regarding the issue with driver v561+, please see this post: mitsuba-renderer/drjit#296 (comment)

@merlinND merlinND closed this as completed Jan 3, 2025