View Full Version : Software Rendering
AnotherJake
2003.11.26, 05:49 PM
I've been reading up on some more of the fundamentals of 3D rendering on and off for the last year or two. I haven't actually put anything into practice. It's just an interesting curiosity to me really. OpenGL is what we all use in reality. I have some time off right now and I was thinking about playing around with doing some software rendering for the fun of it. Feeling a bit adventurous and impractical I suppose.
Here's the question: I was thinking about setting up a couple of buffers as textures in OpenGL and just using glTexSubImage2D as the swap mechanism. It just seems to me there might be another way. Could I just rasterize straight into the color buffer? Any ideas?
Fenris
2003.11.26, 07:18 PM
No, you would have to rasterize into a texture. You would get decent results by setting up a DMA transfer directly to the texture memory buffer, though.
AnotherJake
2003.11.26, 08:23 PM
I guess that's the conclusion to stick to then. Not that I'm complaining. Just checking to see if I was missing some other option. DMA is certainly fast enough for a quick texture swap. Thanks, Fenris.
It's not quite clear if you're actually using the textures as textures, or just as a means of getting pixels on the screen. If you're just getting pixels on the screen you can write directly into video memory, which would probably be at least an order of magnitude faster than writing to a buffer, loading into a texture, and drawing on the screen. I think.
OneSadCookie
2003.11.27, 06:33 PM
order of magnitude slower you mean :p
You can get DMA uploading a texture, which you can't get writing directly to VRAM...
AnotherJake
2003.11.28, 05:40 PM
Originally posted by inio
It's not quite clear if you're actually using the textures as textures, or just a means of getting pixels on the screen.
Getting pixels on the screen is what I'm doing with the textures. I don't even *have* textures in my rendering engine yet (or lighting for that matter). The only thing I am using OpenGL for is to get hardware acceleration for the swap at the end of the rendering pipeline, right after rasterization. That's it. DMA texture uploading is really fast. It's just weird to render into a texture and call it a screen buffer.
With that out of the way, I should point out again that what I'm doing is an excellent example of sheer impracticality. How would I go about writing directly to video memory? I'd like to try that too just for the fun of it.
BTW, in case anyone is interested or, more importantly, has some other reference suggestions for me: one of the books that I'm reading through is "Tricks of the 3D Game Programming Gurus" by LaMothe. This is a really great book on software rendering with lots of info in plain language. It's like 1700 pages though -sheesh! Sucks that he's such a DirectX fan but whatever... "Real-Time Rendering" by Moller and Haines is also very good but it's usually waaaay over my head. http://pages.infinit.net/jstlouis/3dbhole/ is another nice, simple reference. None of them have anything to do with the Mac so I'm having to kind of make stuff up as I go along, but it's still a fun diversion.
OneSadCookie
2003.11.28, 07:00 PM
CGDisplayCapture(kCGDirectMainDisplay); /* take exclusive control of the main display */
void *vram = CGDisplayBaseAddress(kCGDirectMainDisplay); /* start of its framebuffer memory */
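For anyone following along, here is a minimal sketch of what storing a pixel through a pointer like that might look like. It is not from the thread: put_pixel32 and its parameters are hypothetical names, it assumes a linear framebuffer at 32 bits per pixel, and bytes_per_row stands for whatever row pitch the display reports (which can be larger than width * 4). The channel order inside the 32-bit value depends on the display and is not addressed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: store one 32-bit pixel into a linear framebuffer.
   base          - start of the buffer (for example, the vram pointer above)
   bytes_per_row - row pitch in bytes; may exceed width * 4
   Assumption: 32 bits per pixel. */
void put_pixel32(void *base, size_t bytes_per_row,
                 size_t x, size_t y, uint32_t color)
{
    assert(base != NULL);
    assert((x + 1) * sizeof color <= bytes_per_row);

    /* Step down y rows by the pitch, then across x pixels. */
    uint8_t *row = (uint8_t *)base + y * bytes_per_row;

    /* memcpy of 4 bytes is a plain store, without alignment or aliasing worries. */
    memcpy(row + x * sizeof color, &color, sizeof color);
}
```

Because it only needs a pointer and a pitch, the same function can be tried on an ordinary malloc'd buffer before capturing the display.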
AnotherJake
2003.11.28, 09:11 PM
Thanks OSC. Speed isn't nearly as bad as I suspected it might be: about 23 fps at 1024x768x32 vs. about 82 fps for the same thing using DMA textures in OpenGL. The DMA textures got to run in a window, though, so I guess that's not really a fair comparison, but it's interesting anyway. I can't figure out how to sync to VBL when drawing directly to VRAM, but that's not an issue anyway since I just wanted to check it out for kicks.
Edit: I should add that my method of blitting is slightly different between the two and they could both be improved.
arekkusu
2003.11.28, 11:40 PM
Originally posted by AnotherJake
DMA texture uploading is really fast. It's just weird to render into a texture and call it a screen buffer.
Off topic: It's not weird, it's a very common practice for special effects.
Fenris
2003.11.29, 02:58 AM
DMA texture uploading is really fast. It's just weird to render into a texture and call it a screen buffer.
I'm not 100% sure on this, but I want to seem well-educated, so I'm going to make a run for it:
This isn't really what happens. DMA access means that the video board can access regular RAM extremely quickly. You still render into a buffer that resides in main RAM, and then call that a texture.
but it's still a fun diversion.
I suppose you're implying that you have free time on your hands. Now, saying this in proximity to my extremely stressed-out, time-is-short-as-hell life can prove to be lethal. I'm letting you off the hook this time. ;)
OneSadCookie
2003.11.29, 07:25 AM
* OpenGL is much faster fullscreen than windowed.
* The only way to sync to VBL on Mac OS X is via OpenGL or the window manager.
Originally posted by Fenris
I'm not 100% sure on this, but I want to seem well-educated, so I'm going to make a run for it:
This isn't really what happens. DMA access means that the video board can access regular RAM extremely quickly. You still render into a buffer that resides in main RAM, and then call that a texture.
DMA is short for "Direct Memory Access", which basically means that the video card can access the RAM without having to shove the data through the CPU. You will still be limited by AGP bandwidth (though this limitation in itself probably won't matter for you), but at least no CPU cycles are used for the RAM->VRAM transfer.
AnotherJake
2003.11.29, 03:03 PM
* OpenGL is much faster fullscreen than windowed.
That's what I meant by it not being a fair comparison. The OpenGL version would have been even faster running at fullscreen like the VRAM version was. I should have said that differently.
* The only way to sync to VBL on Mac OS X is via OpenGL or the window manager.
That would seem to be another disadvantage of writing directly to VRAM.
DMA access means that the video board can access regular RAM extremely quickly. You still render into a buffer that resides in main RAM, and then call that a texture.
Exactly. From the standpoint of a software rendering engine that "texture" is being used as a "screen buffer" to put the final rasterized image into. OpenGL isn't being used for anything else. At all. The software rendering engine does all the work. I'm simply using OpenGL as the mechanism to get the pixels pushed to the screen in the end in the form of an OpenGL texture just like you're saying. So I'm calling that texture a "screen buffer" in this instance.
I'm letting you off the hook this time.
Heeeheeee! Vacation rocks!
AnotherJake
2003.11.29, 03:31 PM
Originally posted by DoG
You will still be limited by AGP bandwidth (though this limitation in itself probably won't matter for you), but at least no CPU cycles are used for the RAM->VRAM transfer.
Or PCI bandwidth. You're right: since software rendering is so processor-intensive, it's a major plus to get the processor out of the loop here.
AnotherJake
2003.11.29, 03:41 PM
Hey, BTW, I have a Radeon on the PCI bus and a factory GeForce on the AGP bus. They both work really well when the window is mostly on one or the other, but you should see how the texture upload slows to a crawl when I drag the window in between the two. Pathetic.
arekkusu
2003.11.30, 02:10 AM
Originally posted by DoG
You will still be limited by AGP bandwidth (though this limitation in itself probably won't matter for you)
Depending on your application it is still a big bottleneck.
On an old iBook for example, you have AGP2x which is about 512megs/sec. Let's say you want to upload 800x600x32bpp textures, at 60fps. How many can you upload?
Each texture = 1875k.
* 60fps = ~110 megs / sec.
= 4.6 textures uploadable per frame, theoretical max ignoring all other bandwidth.
This might equal, for example, 4 scrolling layers. Or 4 sides of a cube map. Or 4 shadow maps. Or 4 video streams. Or etc.
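The arithmetic above can be checked with a few lines of C. This is only a sketch: frame_bytes and frames_per_refresh are hypothetical names, the 512 megs/sec figure is the nominal number quoted in the post rather than a measured one, and "meg" is taken to mean 1024 * 1024 bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes in one uncompressed frame or texture. */
size_t frame_bytes(size_t width, size_t height, size_t bits_per_pixel)
{
    assert(bits_per_pixel % 8 == 0);
    return width * height * (bits_per_pixel / 8);
}

/* How many such frames fit through the bus per displayed frame.
   bus_bytes_per_sec is a nominal peak; real transfers share the bus
   with everything else, so treat the result as an upper bound. */
double frames_per_refresh(double bus_bytes_per_sec,
                          size_t bytes_per_frame, double fps)
{
    assert(bytes_per_frame > 0 && fps > 0.0);
    return bus_bytes_per_sec / ((double)bytes_per_frame * fps);
}
```

With 800x600x32bpp, frame_bytes gives 1,920,000 bytes, which is the 1875k in the post. Feeding that into frames_per_refresh with 512 * 1024 * 1024 bytes/sec and 60 fps lands between 4.6 and 4.7 textures per frame, in line with the estimate.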
Originally posted by arekkusu
Depending on your application it is still a big bottleneck.
On an old iBook for example, you have AGP2x which is about 512megs/sec. Let's say you want to upload 800x600x32bpp textures, at 60fps. How many can you upload?
Each texture = 1875k.
* 60fps = ~110 megs / sec.
= 4.6 textures uploadable per frame, theoretical max ignoring all other bandwidth.
This might equal, for example, 4 scrolling layers. Or 4 sides of a cube map. Or 4 shadow maps. Or 4 video streams. Or etc.
BUT, we are only talking about a single framebuffer to be uploaded.
Besides, you'd be a fool if you composed multiple framebuffers in software and then transferred them to VRAM. I see no application for what you mentioned.
arekkusu
2003.11.30, 01:32 PM
Stick six HDTV streams on a rotating cube. AGP2x is going to be a bottleneck.
AnotherJake
2003.11.30, 01:44 PM
True, but that's not what I'm doing. For pushing a frame at a time, DoG is right, it just isn't an issue.
Originally posted by arekkusu
Stick six HDTV streams on a rotating cube. AGP2x is going to be a bottleneck.
Hehe, I wonder how you are going to decode six (or even one) HDTV streams in software on a computer which only supports 2x AGP in the first place. :p
Programmer
2003.12.03, 11:58 AM
Just make sure you don't read across the bus. The frame buffers aren't cache-able, plus the read latencies are horrific. Draw locally and then blit the result across the bus with the widest writes you can manage (AltiVec or doubles).
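A rough sketch of the "draw locally, then blit" pattern described above, under some assumptions: blit_frame is a hypothetical name, the renderer is assumed to have already finished the frame in an ordinary RAM buffer, and plain memcpy stands in for the hand-written AltiVec or double-width stores mentioned in the post. The destination is only ever written, never read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy a finished frame from a local (cacheable) buffer to a destination
   that has its own row pitch. Pitches are in bytes and may exceed row_bytes. */
void blit_frame(void *dst, size_t dst_pitch,
                const void *src, size_t src_pitch,
                size_t row_bytes, size_t rows)
{
    assert(dst != NULL && src != NULL);
    assert(row_bytes <= dst_pitch && row_bytes <= src_pitch);

    uint8_t *d = dst;
    const uint8_t *s = src;

    for (size_t y = 0; y < rows; ++y) {
        /* One sequential write per row; padding past row_bytes is untouched. */
        memcpy(d, s, row_bytes);
        d += dst_pitch;
        s += src_pitch;
    }
}
```

Copying row by row rather than in one big memcpy matters because the destination pitch is often larger than the visible row. All blending and other read-modify-write work stays in the local buffer, so nothing has to be read back across the bus.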
I've often thought that writing an AltiVec renderer would be fun. And then sanity returns and I go back to thinking about how to use the GPU. Doing CPU-based texture generation is really useful, but these days there isn't much reason to rasterize polygons on the CPU. Using the CPU for post-processing the frame buffer could be compelling, but it just takes way too long to get the frame buffer back to where the CPU can deal with it at a reasonable speed. Besides, this is what programmable GPUs are for. :)