View Full Version : Optimization
I was wondering: if all my textures are mipmapped, and some are much larger than they need to be, will this hurt my performance much, or will it just take up a lot of VRAM? There are a bunch of textures I probably shouldn't mipmap, and some whose resolution I could lower, but if it won't even help I will just leave them the same. Also, is there a way to speed up my transparencies? When I render my trees it slows the game down SO MUCH (and they are only 6 triangles each, using vertex arrays)!
Mark Levin
2004.01.01, 12:18 PM
They won't hurt performance until you have so many textures they don't all fit in VRAM at once. Then the system will have to swap them in and out during render.
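To put a rough number on the VRAM side of this: a full mipmap chain costs about one third more than the base level alone. Here is a minimal sketch, assuming uncompressed textures with a fixed number of bytes per texel, where each level halves both dimensions down to 1x1. The function name is hypothetical, and real drivers may pad or align levels differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the thread): bytes used by an
   uncompressed texture, optionally including its full mipmap
   chain down to 1x1. Ignores any driver padding or alignment. */
size_t texture_bytes(size_t width, size_t height,
                     size_t bytes_per_texel, int with_mipmaps)
{
    assert(width > 0 && height > 0 && bytes_per_texel > 0);

    size_t total = width * height * bytes_per_texel;  /* base level */
    if (!with_mipmaps)
        return total;

    /* Each further level halves each dimension, never below 1. */
    while (width > 1 || height > 1) {
        if (width > 1)  width /= 2;
        if (height > 1) height /= 2;
        total += width * height * bytes_per_texel;
    }
    return total;
}
```

For a 1024x1024 texture at 4 bytes per texel this gives 4 MB without mipmaps and about 5.3 MB with them, so resizing oversized textures saves far more memory than dropping their mipmaps.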
You should use Shark or the OpenGL Profiler on the trees; it's a bad idea to just guess where the main load is (especially since I don't know anything about your code). Of course, if you're trying to draw a couple thousand trees, those 6 triangles do add up :)
arekkusu
2004.01.01, 12:32 PM
If you're drawing trees as intersecting billboards using alpha-masked textures, your speed hit is probably due to the blending. The GPU needs to read each pixel from the frame buffer before blending/writing a texel of the billboard. You can lessen the hit somewhat (depending on your textures) by enabling alpha test, so the read is skipped for fully transparent texels (or texels with an alpha lower than whatever you set the alpha test limit to.)
Fillrate eaten by blending is still a pretty big limitation even on modern (e.g. Radeon 9600) cards. I can only reliably get about 21x896x600x60 pixels/second out of my PowerBook. So for some applications you will need to redesign things to reduce the amount of overdraw.
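The 21x896x600x60 figure reads as 21 full-screen blended layers at 896x600 and 60 frames per second. A small sketch of that arithmetic, assuming the budget scales linearly with pixel count (function names are hypothetical):

```c
#include <assert.h>

/* Blended pixels per second for a given overdraw depth,
   resolution, and frame rate. */
double blended_pixels_per_second(double layers, double width,
                                 double height, double fps)
{
    return layers * width * height * fps;
}

/* How many full-screen blended layers fit into a fill budget
   at a given resolution and frame rate. */
double layers_for_budget(double pixels_per_second, double width,
                         double height, double fps)
{
    assert(width > 0 && height > 0 && fps > 0);
    return pixels_per_second / (width * height * fps);
}
```

The quoted figure works out to 677,376,000 blended pixels per second. Under the same linear assumption, that budget covers only about 14 full-screen layers at 1024x768 and 60 fps, which is why reducing overdraw matters.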
If you're drawing trees as intersecting billboards using alpha-masked textures, your speed hit is probably due to the blending. The GPU needs to read each pixel from the frame buffer before blending/writing a texel of the billboard. You can lessen the hit somewhat (depending on your textures) by enabling alpha test, so the read is skipped for fully transparent texels (or texels with an alpha lower than whatever you set the alpha test limit to.)
Here is my code for the trees
glEnable(GL_ALPHA_TEST);
glAlphaFunc(GL_GREATER, 0.05);
glAlphaFunc(GL_SRC_ALPHA,GL_ONE_MINUS_SRC_ALPHA);
glColor4f(1.0,1.0,1.0,1.0);
glDisable(GL_LIGHTING);
for (i=0; i < treeCount ; i++)
{
[lowTree1 render: 0 x: trees[i].x y: trees[i].y z: trees[i].z xr: 0.0 yr: trees[i].spin zr: 0.0 sender:sender];
}
glEnable(GL_LIGHTING);
glDisable(GL_ALPHA_TEST);
So I think I am using the alpha test correctly. Also, I have a glColor3f(1,1,1) in my code before the trees; will that slow things down? (I think I had to put it there because I used a different color before that and it was screwing things up.)
GL Profiler is being weird. I can get EVERYTHING to work except Show Stats, which is what I need the most.
OK, OSC gave me some good tips, like drawing from front to back, and then getting rid of undrawn trees. I will try that and report back today.
arekkusu
2004.01.01, 08:24 PM
Your second glAlphaFunc call (the one passing GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA) is going to produce GL_INVALID_ENUM; you meant to type glBlendFunc.
Sorting by draw order will save you some blending if foreground trees come out mostly opaque. But you pay some CPU for sorting tree submission by Z.
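A minimal sketch of that sort, assuming a tree record with the x, y, z, and spin fields seen in the render loop above. The dist2 scratch field, the camera parameters, and the function names are hypothetical additions.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    float x, y, z;   /* position, as in trees[i].x / .y / .z */
    float spin;      /* rotation about y, as in trees[i].spin */
    float dist2;     /* scratch: squared distance to the camera */
} Tree;

/* qsort comparator: nearest tree first. */
static int compare_near_first(const void *a, const void *b)
{
    float da = ((const Tree *)a)->dist2;
    float db = ((const Tree *)b)->dist2;
    return (da > db) - (da < db);
}

/* Reorders trees so the ones closest to the camera come first. */
void sort_trees_front_to_back(Tree *trees, size_t count,
                              float cam_x, float cam_y, float cam_z)
{
    assert(trees != NULL || count == 0);

    for (size_t i = 0; i < count; i++) {
        float dx = trees[i].x - cam_x;
        float dy = trees[i].y - cam_y;
        float dz = trees[i].z - cam_z;
        trees[i].dist2 = dx * dx + dy * dy + dz * dz;
    }
    if (count > 1)
        qsort(trees, count, sizeof(Tree), compare_near_first);
}
```

Squared distance is enough for ordering, so no square root is needed. Front-to-back order pays off when most texels are either fully opaque or rejected by the alpha test and depth writes are on; soft blended edges generally need back-to-front order to look right.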
How big is treeCount? Ten? A billion?
How are trees submitted to GL inside render:? You say you are using vertex arrays, are you using VAR? How many vertices per array submission? Submitting six triangles at a time is no good...
OK, I will fix that blend function thing asap.
The tree count right now is anywhere from 30 to 200, but I would like that number to increase without dropping FPS
Wow.. I just realized that's my problem, sending 6 triangles at a time. That will be REALLY easy to fix though. What's VAR?
VAR is Apple's sorry attempt at making vertex arrays faster. It is really hard to use and really frustrating to try to use it...
arekkusu
2004.01.02, 08:52 PM
First off, it sounds like you still need to figure out where your bottleneck is. CPU? Vertex submission? Transform? Fillrate? Use Shark, GLProfiler, and play with your code increasing the number of polygons / size of the polygons until you have a better feeling for what is "slow".
If you see that it is fillrate, do what OSC said about changing draw order to reduce blending cost.
If you see that it is vertex submission/transform, first optimize your regular vertex arrays. You can ask the GPU for a hint about its maximum element size; just query GL_MAX_ELEMENTS_VERTICES. It's usually something like 150,000. I'm getting OK results submitting around 32k vertices at a time.
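One way to get from one submission per tree to a few large ones is to copy a template mesh into a shared array once per tree and offset the indices. A rough sketch under stated assumptions: the Vec3 type, the function name, and all parameters are hypothetical, only translation is applied (the per-tree spin is left out), and 16-bit indices are assumed.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { float x, y, z; } Vec3;

/* Appends one translated copy of a template mesh per position to
   the output arrays and returns the number of indices written.
   out_verts must hold tmpl_vert_count * instance_count entries,
   out_indices must hold tmpl_index_count * instance_count entries. */
size_t build_batch(const Vec3 *tmpl_verts, size_t tmpl_vert_count,
                   const unsigned short *tmpl_indices,
                   size_t tmpl_index_count,
                   const Vec3 *positions, size_t instance_count,
                   Vec3 *out_verts, unsigned short *out_indices)
{
    /* 16-bit indices can address at most 65536 vertices per batch. */
    assert(tmpl_vert_count * instance_count <= 65536);

    size_t v = 0;  /* vertices written so far */
    size_t n = 0;  /* indices written so far */
    for (size_t i = 0; i < instance_count; i++) {
        size_t base = v;  /* first vertex of this instance */
        for (size_t j = 0; j < tmpl_vert_count; j++, v++) {
            out_verts[v].x = tmpl_verts[j].x + positions[i].x;
            out_verts[v].y = tmpl_verts[j].y + positions[i].y;
            out_verts[v].z = tmpl_verts[j].z + positions[i].z;
        }
        for (size_t j = 0; j < tmpl_index_count; j++)
            out_indices[n++] = (unsigned short)(base + tmpl_indices[j]);
    }
    return n;
}
```

The returned count and the index array can then go to a single glDrawElements(GL_TRIANGLES, count, GL_UNSIGNED_SHORT, out_indices) call, after pointing glVertexPointer at out_verts. Splitting very large scenes into several batches keeps each one under the size the driver reports as comfortable.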
Once vertex arrays work, you'll still be wasting some time during vertex submission, so then look at VAR. VAR is Vertex Array Range, it is a simple extension to regular vertex arrays that maps your array into AGP space so the GPU can DMA copy the data instead of the CPU pushing it all. See Apple's sample code (http://developer.apple.com/samplecode/Sample_Code/Graphics_3D/Vertex_Optimization.htm). There was also a thread on this board where I showed exactly how to set up double buffered VAR, but Carlos seems to have nuked it in the big forum shuffle.
Also, you might want to test your code on some different machines. The bottleneck will be different on different GPUs. Plus! You'll discover all sorts of bugs! Because! The ATI/nvidia drivers! Don't! Work! The! Same! >:(
All right, I fixed the problem with calling too many drawElements (I had 1 per tree before) by merging them into 1 big drawElements. I was using GL Profiler, it's pretty cool. I am going to do some more optimization first (like drawing trees in order from front to back). I read about VAR in the NeHe tutorials (well, the PC equivalent). Do most video cards support it? Because if it's only the new ones, it is probably a useless optimization.
I can't wait to get into driver problems, as if my own project builder problems aren't enough :(
arekkusu
2004.01.02, 10:42 PM
VAR (http://developer.apple.com/opengl/extensions.html#GL_APPLE_vertex_array_range) is supported on all Quartz Extreme-capable GPUs, so no Rage128 or software renderer support.
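If you want to check for the extension at runtime rather than going by the GPU model, the usual approach is to look for the name as a whole token in the extension string. A minimal sketch; the function name is hypothetical, and the list is assumed to be the space-separated string returned by glGetString(GL_EXTENSIONS).

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if `name` appears as a whole space-separated token
   in `ext_list`, 0 otherwise. A plain strstr is not enough,
   because one extension name can be a prefix of another. */
int has_extension(const char *ext_list, const char *name)
{
    assert(ext_list != NULL && name != NULL);

    size_t len = strlen(name);
    if (len == 0)
        return 0;

    const char *p = ext_list;
    while ((p = strstr(p, name)) != NULL) {
        int starts_token = (p == ext_list) || (p[-1] == ' ');
        int ends_token   = (p[len] == ' ') || (p[len] == '\0');
        if (starts_token && ends_token)
            return 1;
        p++;  /* keep searching after this partial match */
    }
    return 0;
}
```

Typical use would be has_extension((const char *)glGetString(GL_EXTENSIONS), "GL_APPLE_vertex_array_range") once a context is current. An advertised extension is still no guarantee that it behaves correctly on every card, so a fallback path to plain vertex arrays is worth keeping.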
btw I'm putting together a better at-a-glance reference page (http://homepage.mac.com/arekkusu/bugs/GLInfo.html) trying to mirror e.g. delphi3d (http://www.delphi3d.net/hardware/index.php) but I need another trip to the lab to test a few more cards and dump some more implementation limits...
OneSadCookie
2004.01.03, 12:42 AM
Interestingly, despite it not being on that list, I've had success using VAR on the Rage 128, and a decent performance improvement from it (better than CVA).
I wonder what's stopping it being officially supported?
wadesworld
2004.01.03, 01:43 AM
VAR is Apple's sorry attempt at making vertex arrays faster. It is really hard to use and really frustrating to try to use it...
VAR certainly isn't an Apple-specific thing. It's used on the PC as well.
VAR is pretty ugly, but it's one of the fastest ways to get things drawn until the new ARB replacement for VAR comes out (the name of which escapes me at the moment).
Wade
VAR certainly isn't an Apple-specific thing. It's used on the PC as well.
That may be, but PC users have VBOs which are (arguably) easier and faster so they don't have to use VAR.
Mars_999
2004.01.05, 01:54 PM
That may be, but PC users have VBOs which are (arguably) easier and faster so they don't have to use VAR.
Ditto, and it's a GL1.5 requirement. Whoo hoo! Bring on 1.5 Apple.
JeroMiya
2004.01.05, 05:35 PM
Macworld this week. Hrm... maybe? Probably not, since they just released new versions of Xcode and OS X. Maybe WWDC, maybe. I'm hoping for GLSLANG support in Shader Builder, but that's quite a jump.
Jeremy Bell
WolverineSoft Project Coordinator
www.umich.edu/~wsoft
OneSadCookie
2004.01.05, 05:36 PM
GLSLang support in shader builder seems a bit premature when there's no GLSLang support in the drivers yet...
arekkusu
2004.01.05, 09:03 PM
Ditto and its a GL1.5 requirement. Whoo hoo! Bring on 1.5 Apple.
Whoa, whoa. Let's get 1.4 first, huh? ARB_point_parameters, and working AA points?
Actually, how about just getting the 1.0/1.1/1.2/1.3 features working first? I'm sick of the driver crashing.
arekkusu
2004.01.13, 11:19 AM
Interestingly, despite it not being on that list, I've had success using VAR on the Rage 128, and a decent performance improvement from it (better than CVA).
I wonder what's stopping it being officially supported?
Ok, after spending another 8 hours in the lab, I also found that VAR works fine on Rage128 and Rage128Pro.
HOWEVER! It does not work on the original Radeon! The vertices come out fine but texture coordinates are wrong. This chipset shows up in a bunch of iBooks as a Radeon 7000.
So...... there are Quartz-Extreme capable machines (rectangle tex etc) which don't support VAR!
>>ARGGGGGGH<< (have to go write another whole rendering path now...)
arekkusu
2004.01.13, 12:54 PM
It's also worth noting that GL_MAX_VERTEX_ARRAY_RANGE_ELEMENT_APPLE changed from 65535 to 0 on the Rage128/Rage128 Pro, between Jaguar and Panther. I guess they are indicating that it is not supposed to be supported.
However, it is still 65535 on the Radeon 7000 in both Jaguar and Panther, even though it doesn't work. Hrrrrm.
BTW my at-a-glance OS X GLInfo page (http://homepage.mac.com/arekkusu/bugs/GLInfo.html) is now updated for 10.3.2 and more cards.
arekkusu
2004.01.14, 10:40 AM
so I think I am using the alpha test correctly.
Back on topic, in case alpha test is still unclear, there is some good documentation about it here:
http://developer.nvidia.com/attach/1436
arekkusu
2004.01.14, 03:02 PM
And just to reply to myself again,
There was also a thread on this board where I showed exactly how to set up double buffered VAR, but Carlos seems to have nuked it in the big forum shuffle.
Looks like the thread is back, see here (http://www.idevgames.com/forum/showthread.php?t=5674)