OpenCL deferred lighting demo

Member
Posts: 142
Joined: 2002.11
Post: #1
I've written up a demo that uses OpenCL to compute lighting. It uses a deferred lighting approach, but saves bandwidth versus a traditional rasterization-based approach because all the lighting is done in a single pass rather than requiring texture reads for every light. Also, unlike a traditional rasterization-based approach, lighting is accumulated at full floating-point precision but may be stored at lower precision, resulting in higher-quality results. Using this approach you can have thousands of dynamic lights -- the demo has 1024. I am not the first to put compute kernels/shaders to this task (by about a year), but I think probably the first to do it in OpenCL and on the Mac LOL

Lights are first binned into tiles, each tile covering a small fraction of the screen (say 64x64 pixels). Lights are culled against each tile's frustum and also against the minimum and maximum z coordinates of the geometry in the tile.
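The tile test above can be sketched on the CPU like this. This is a minimal illustration under my own assumptions, not the demo's actual code: the tile is described by its four inward-facing side planes plus a view-space depth range, and a point light is treated as a bounding sphere. All names and the signature are made up for the example.

```c
#include <math.h>

typedef struct { float x, y, z; } vec3;
typedef struct { vec3 n; float d; } plane;   /* n.p + d >= 0 means inside */

static float plane_dist(plane pl, vec3 p) {
    return pl.n.x * p.x + pl.n.y * p.y + pl.n.z * p.z + pl.d;
}

/* Returns 1 if the light's bounding sphere may affect the tile. */
int light_hits_tile(vec3 pos, float radius,
                    const plane side[4], float zmin, float zmax) {
    /* Reject against the tile's depth extent first: cheap, and effective
     * because many lights sit well in front of or behind the geometry. */
    if (pos.z + radius < zmin || pos.z - radius > zmax) return 0;
    /* Then reject against the four side planes of the tile frustum. */
    for (int i = 0; i < 4; i++)
        if (plane_dist(side[i], pos) < -radius) return 0;
    return 1;
}
```

A sphere-vs-frustum test like this is conservative (it can pass lights near frustum corners that don't actually touch the tile), but false positives only cost a little extra shading work, never artifacts.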

Next, the lighting is done on a per-block basis, a block being 8x8 pixels so that one block fits inside an OpenCL work group. Lights are culled per block in parallel, up to 64 lights at a time. Block culling tests a world-space bounding box around the block, and also rejects back-facing lights using a bounding cone around the normal vectors of all the pixels in the block.

I wrote the demo on a machine that has a Geforce 9400M, so it should work well on Nvidia based Macs that support OpenCL. I don't think it will work well on ATI/AMD cards, however, because the architecture is radically different (VLIW versus scalar) and I have no ATI/AMD card to test with. Also, some ATI cards do not have OpenCL Image support, which this demo requires. If the demo does run successfully on your machine, let me know how well it does!

I'm posting this here because I think it's a cool demo ... however, run it at your own risk: the implementation is experimental and OpenCL is great at causing kernel panics (i.e., please save your work before running this).

[Image: gdemo.png]

Demo controls:
z -- previous view mode
x -- next view mode

Try it out!
Download Link
Sage
Posts: 1,482
Joined: 2002.09
Post: #2
Pretty cool. Runs at about 45fps on my MacBook Pro's 8600(M? GT? can never remember).

What are the different view modes showing?

How much of this is OpenCL doing? The light accumulation and final shading? Presumably you didn't write an OpenCL based rasterizer and are still drawing the polygons and particles with GL?

Scott Lembcke - Howling Moon Software
Author of Chipmunk Physics - A fast and simple rigid body physics library in C.
Member
Posts: 142
Joined: 2002.11
Post: #3
(Feb 17, 2011 07:38 PM)Skorche Wrote:  Pretty cool. Runs at about 45fps on my MacBook Pro's 8600(M? GT? can never remember).

What are the different view modes showing?

How much of this is OpenCL doing? The light accumulation and final shading? Presumably you didn't write an OpenCL based rasterizer and are still drawing the polygons and particles with GL?

The modes show normals, depth (probably all white, oops, I should change the zNear and zFar values), the diffuse and specular lighting components, the number of lights computed per tile, and the number of lights computed per pixel. I guess I really should label these LOL

There are 3 main steps:
1. OpenGL generates depth and normal buffers (needed to perform lighting)
2. OpenCL generates diffuse and specular lighting components for all lights in a single pass
3. OpenGL renders the scene again, this time running the normal material shaders. The material shaders use the specular and diffuse components computed by OpenCL instead of doing any lighting calculations.

The 3rd step could actually be folded into step 2 if step 1 were modified so that OpenGL also output albedo (color) information. This would be faster, but less flexible in supporting multiple materials. Many commercial games use a similar approach.

Definitely still rasterizing polygons with OpenGL. When I mentioned rasterization, I meant that in a typical deferred lighting setup, geometry bounding the region each light affects is rasterized by OpenGL. This usually results in the same pixel being processed many times (one or more times for each light that touches it), which I've replaced with a single OpenCL pass. That saves a lot of bandwidth, and reduces artifacts, since in OpenGL the lighting would typically be blended into a low-precision buffer.
⌘-R in Chief
Posts: 1,254
Joined: 2002.05
Post: #4
I admit, I don't fully understand it because all I know about OpenGL lighting is just from reading short bits of what the rest of you write, but it's obvious this is pretty cool and nifty. Smile

60 fps with the window filling my entire screen. 8800GT
Member
Posts: 142
Joined: 2002.11
Post: #5
(Feb 18, 2011 12:43 PM)SethWillits Wrote:  I admit, I don't fully understand it because all I know about OpenGL lighting is just from reading short bits of what the rest of you write, but it's obvious this is pretty cool and nifty. Smile

60 fps with the window filling my entire screen. 8800GT

Awesome! The 8800GT has 7x as many shading units as my 9400M, so I was hoping this would be the case. I used to have this card in my Mac Pro until it overheated and failed. Now I have a Radeon 4870, which doesn't even fully support OpenCL.
DoG
Moderator
Posts: 869
Joined: 2003.01
Post: #6
Maybe this is a stupid question, but why is the inside of the crater not lit?
Member
Posts: 142
Joined: 2002.11
Post: #7
(Feb 20, 2011 07:50 AM)DoG Wrote:  Maybe this is a stupid question, but why is the inside of the crater not lit?

Not a stupid question at all.

The answer is that it would need to be lit by almost every single light, which ends up reducing the frame rate by about 30%. As a hack, I disabled it. It looks pretty cool when it's lit, though:

[Image: opencl-deferred.png]

A better solution would probably be to keep it from being lit by all the individual particles, but add a single larger light source at the center of the crater to compensate.
⌘-R in Chief
Posts: 1,254
Joined: 2002.05
Post: #8
That's more like it. I did wonder myself why there were 1024 lights, but relatively few hotspots on the surface.
Member
Posts: 86
Joined: 2008.04
Post: #9
Crashed on my latest iMac running 10.6.6, Radeon HD 5750, 16GB Ram....

:-( I was looking forward to it

Exception Type: EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_INVALID_ADDRESS at 0x0000000000000000
Crashed Thread: 0 Dispatch queue: com.apple.main-thread

Thread 0 Crashed: Dispatch queue: com.apple.main-thread
0 libSystem.B.dylib 0x00007fff8516c380 strcmp + 80
1 ...pple.ATIRadeonX3000GLDriver 0x0000000116777d9d glrCompCreateProgram + 9133
2 ...pple.ATIRadeonX3000GLDriver 0x000000011677a27c glrCompLoadBinary + 3884
3 ...pple.ATIRadeonX3000GLDriver 0x000000011677a3f2 glrCompBuildProgram + 322
4 com.apple.opencl 0x00007fff87249175 clBuildProgram + 2301
5 ...ourcompany.DeferredLighting 0x000000010000da50 -[ComputeProgram
Member
Posts: 142
Joined: 2002.11
Post: #10
(Feb 22, 2011 06:00 PM)OptimisticMonkey Wrote:  Crashed on my latest iMac running 10.6.6, Radeon HD 5750, 16GB Ram....

:-( I was looking forward to it

Exception Type: EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_INVALID_ADDRESS at 0x0000000000000000
Crashed Thread: 0 Dispatch queue: com.apple.main-thread

Thread 0 Crashed: Dispatch queue: com.apple.main-thread
0 libSystem.B.dylib 0x00007fff8516c380 strcmp + 80
1 ...pple.ATIRadeonX3000GLDriver 0x0000000116777d9d glrCompCreateProgram + 9133
2 ...pple.ATIRadeonX3000GLDriver 0x000000011677a27c glrCompLoadBinary + 3884
3 ...pple.ATIRadeonX3000GLDriver 0x000000011677a3f2 glrCompBuildProgram + 322
4 com.apple.opencl 0x00007fff87249175 clBuildProgram + 2301
5 ...ourcompany.DeferredLighting 0x000000010000da50 -[ComputeProgram

Dang, that's too bad. Like I said though, I have no AMD/ATI card to test with. Weird though that it crashed on string compare of all things.

Anything in the console? I think the console output could at least let me know which OpenCL program compilation caused the error.

Update: it appears that AMD/ATI cards don't support OpenCL Images on Mac OS X 10.6.6, even though the newer ones do on Windows and Linux. I have some anecdotal evidence that driver support is coming in 10.7. With any luck (crossing fingers) we'll see it come in an update to 10.6 as well.