Drawing strategies for many small models

Oldtimer
Posts: 834
Joined: 2002.09
Post: #1
Hi all,
Just wanted some help to reason about how I should refactor my rendering code.
[Image: gal.png]
The game in question is Galder, where I have exactly 144 small (typically < 20 triangles) models onscreen at all times. Right now, due to the layout of code and game, I do the following for each and every one of those 144 models:

- Push a matrix
- Translate
- Rotate
- Render
- Pop matrix

(Because of how the data structures look, I can't just translate to a given radius and then incrementally rotate them. I have to rotate first, translate after that, and then pop back to the origin and redo it all for the next stone.)

Besides being an overhead nightmare, it's not very efficient. Most of the models aren't moving at a given time. It also messes with my lighting setup, since the light doesn't interact properly when I translate the view. I'd like to just throw them all into a display list or VBO or something and just push the entire thing to the card. However, it's not apparent how I should do this...

One approach would be to just transform all the stones by hand on the CPU, put the world-space vertices in a VBO and render that. This would allow me to sort the rendering by material instead of position, which would save state changes.

I know I should test it first and profile, but I'm just asking for a ballpark guess here: would it be totally insane to have one dynamic VBO with all the stones (say 4000 polys), and possibly changing it every frame?
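A minimal sketch of the CPU-side batching idea above, assuming 2D transforms and a rotate-then-translate order as described. The names `Vec2`, `Stone` and `batch_stones` are hypothetical, not taken from the game. Each stone caches the cosine and sine of its angle (an assumption, chosen because most stones are not moving at a given time), and the untransformed model vertices are only ever read.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types for illustration only. */
typedef struct { float x, y; } Vec2;

typedef struct {
    const Vec2 *local;   /* untransformed model vertices, never modified */
    size_t      count;   /* number of vertices in the model */
    float       cos_a;   /* cosine of the rotation, cached when the angle changes */
    float       sin_a;   /* sine of the rotation, cached when the angle changes */
    Vec2        pos;     /* world-space position of the stone */
} Stone;

/* Rotate each local vertex, then translate it, and append the result to
   `out`. Returns the number of vertices written. */
size_t batch_stones(const Stone *stones, size_t n_stones,
                    Vec2 *out, size_t out_capacity)
{
    size_t written = 0;
    for (size_t i = 0; i < n_stones; ++i) {
        const Stone *s = &stones[i];
        assert(written + s->count <= out_capacity);
        for (size_t v = 0; v < s->count; ++v) {
            const Vec2 p = s->local[v];
            out[written].x = s->cos_a * p.x - s->sin_a * p.y + s->pos.x;
            out[written].y = s->sin_a * p.x + s->cos_a * p.y + s->pos.y;
            ++written;
        }
    }
    return written;
}
```

The filled `out` array is what would then be handed to the card, for example with glBufferData and a dynamic usage hint. Whether that beats 144 push/pop pairs is something only a profile can answer.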
Sage
Posts: 1,199
Joined: 2004.10
Post: #2
Is it really that bad right now? I mean, have you profiled it to see if it's really the rendering of the stones that's causing whatever performance hit you're perceiving?

I was rendering at least 100 models with hundreds of vertices each, plus terrain and atmospheric effects on my g4 with a 5200 and got about 30 to 45 fps. Now, I kept each model's geometry on the card via display lists, but otherwise my approach was basically the same. Push the model's transform, render, pop matrix.

Regarding incremental update in a VBO -- I'd stay away from that. Unless you keep an untransformed copy of each model somewhere else in RAM, you'll get progressive geometric deformation due to float errors. If I were going to take the one-big-VBO approach, I'd see if I could use a vertex shader to perform the transformation. I'd use a vertex attribute to provide a transformation matrix for each vertex, or a vec4 representing a quaternion.
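As a rough sketch of the per-vertex math a quaternion-based vertex shader would perform, here it is in C rather than GLSL so it can be checked on the CPU. The names `Vec3`, `Quat`, `quat_rotate` and `transform_model` are hypothetical, and the quaternion is assumed to be unit length. The function always reads the untransformed copy and writes to a separate output, which is what avoids the accumulated float error mentioned above.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { float x, y, z; } Vec3;
typedef struct { float x, y, z, w; } Quat;  /* unit quaternion, w is the scalar part */

static Vec3 cross3(Vec3 a, Vec3 b)
{
    Vec3 r = { a.y * b.z - a.z * b.y,
               a.z * b.x - a.x * b.z,
               a.x * b.y - a.y * b.x };
    return r;
}

/* Rotate v by the unit quaternion q using
   v' = v + 2 * u x (u x v + w * v), where u = (q.x, q.y, q.z). */
Vec3 quat_rotate(Quat q, Vec3 v)
{
    const Vec3 u = { q.x, q.y, q.z };
    Vec3 t = cross3(u, v);
    t.x += q.w * v.x;
    t.y += q.w * v.y;
    t.z += q.w * v.z;
    const Vec3 c = cross3(u, t);
    Vec3 r = { v.x + 2.0f * c.x, v.y + 2.0f * c.y, v.z + 2.0f * c.z };
    return r;
}

/* Rebuild world-space vertices from the pristine local copy. `local` and
   `out` must be different arrays so rounding error never feeds back in. */
void transform_model(const Vec3 *local, size_t count,
                     Quat q, Vec3 offset, Vec3 *out)
{
    assert(local != out);
    for (size_t i = 0; i < count; ++i) {
        const Vec3 r = quat_rotate(q, local[i]);
        out[i].x = r.x + offset.x;
        out[i].y = r.y + offset.y;
        out[i].z = r.z + offset.z;
    }
}
```

In a shader version, `q` and `offset` would arrive as per-vertex attributes and the body of `transform_model` would run once per vertex on the GPU.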
Oldtimer
Posts: 834
Joined: 2002.09
Post: #3
Yeah, I ran it through the GL profiler and the top four calls are CGLFlushDrawable (no big surprise), glBegin, glVertex3f and glPopAttrib, with all of the other immediate mode calls lining up below.
[Image: immediatemode.png]
Now that's a case for ditching immediate mode if I ever saw one... ;)

I tried putting the stones in display lists, but I just dropped a frame per second or two, and glCallLists appeared high up in the profile instead.

Regarding the one-VBO approach, I had intended to manually perform the local-to-world transform on the CPU every frame...
Oldtimer
Posts: 834
Joined: 2002.09
Post: #4
Hm, I did some more in-depth profiling, and while the stones are a big part of the performance problems, the GUI and grid are as well. Perhaps I should focus my efforts there for the time being, and refactor the stone renderer without regard for performance in that particular section. Thanks for the ideas, Shamyl! :)
Member
Posts: 446
Joined: 2002.09
Post: #5
Display lists should always be faster than immediate mode, but if they only contain a few vertices and you're fill-limited you might not see a big improvement.

BTW: glPush/PopAttrib is evil. Try sorting your objects by texture/state as much as possible and just change states as needed (e.g. shadow the GL values in your own variables).
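One way to read the suggestion above, as a minimal sketch: sort the draw items by texture, and keep a shadow copy of the bound texture so a real bind is only issued when the value changes. `DrawItem`, `StateCache`, `cache_set_texture` and `submit_sorted` are hypothetical names, and the actual GL calls are left as comments so the logic can be tested without a GL context.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    unsigned texture;  /* hypothetical texture or material id */
    int      model;    /* hypothetical index of the model to draw */
} DrawItem;

/* Our own copy of the state we last asked GL to use. */
typedef struct {
    unsigned bound_texture;
    int      valid;          /* 0 until the first bind has been recorded */
} StateCache;

/* qsort comparator: ascending texture id. */
static int by_texture(const void *a, const void *b)
{
    const unsigned ta = ((const DrawItem *)a)->texture;
    const unsigned tb = ((const DrawItem *)b)->texture;
    return (ta > tb) - (ta < tb);
}

/* Returns 1 if the caller must issue a real bind, 0 if it would be redundant. */
static int cache_set_texture(StateCache *c, unsigned id)
{
    if (c->valid && c->bound_texture == id)
        return 0;
    c->bound_texture = id;
    c->valid = 1;
    return 1;
}

/* Sorts the items by texture and walks them, returning how many real
   texture binds were needed. */
size_t submit_sorted(DrawItem *items, size_t n, StateCache *cache)
{
    size_t binds = 0;
    assert(cache != NULL);
    assert(items != NULL || n == 0);
    if (n > 0)
        qsort(items, n, sizeof items[0], by_texture);
    for (size_t i = 0; i < n; ++i) {
        if (cache_set_texture(cache, items[i].texture)) {
            /* real code: glBindTexture(GL_TEXTURE_2D, items[i].texture); */
            ++binds;
        }
        /* real code: draw items[i].model here */
    }
    return binds;
}
```

The same pattern extends to any other state that would otherwise be saved and restored with glPushAttrib/glPopAttrib, at the cost of routing every change to that state through the cache.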
Sage
Posts: 1,199
Joined: 2004.10
Post: #6
Frank C. wrote: BTW: glPush/PopAttrib is evil. Try sorting your objects by texture/state as much as possible and just change states as needed (e.g. shadow the GL values in your own variables).

I'm of the opinion that glPush/PopAttrib is so harmful it should be removed from OpenGL. It's such a temptation to use, and it always slows things down unacceptably.

You'll want to just get rid of it and carefully manage your state on your own.
Moderator
Posts: 1,140
Joined: 2005.07
Post: #7
What do you use to draw the fonts? I've used FTGL with polygon fonts before, and that killed performance. Right now I use FTGL with texture fonts (with a custom build that removes glPushAttrib/glPopAttrib, of course :P) and it runs much faster. Regardless of what you use, it should use textures. And at least some part of the font should reside in a display list.
Oldtimer
Posts: 834
Joined: 2002.09
Post: #8
Wow, thanks for all the replies.
Yeah, I guess that push/popattrib are ev0l. I'll give it a pass when I actually get to optimization. I fixed the bug in my shader, which kind of destroyed my initial motivation to refactor the rendering code. Still, I think it would be a good thing to batch the vertices together, and I've decided to do the objectspace->worldspace by hand on the CPU, partially because I'm only doing 2D transforms.

That said, I am using FTGL, and yes, turning off font rendering doubled my framerate, so that is getting a pass as well. akb825, what kind of customizations did you do on FTGL?
Moderator
Posts: 1,140
Joined: 2005.07
Post: #9
All I really did was take out the glPushAttrib and glPopAttrib calls. I have a framework version here with FreeType embedded (it also includes that change).
Luminary
Posts: 5,143
Joined: 2002.04
Post: #10
I haven't read this thread, so maybe this is irrelevant, but I investigated how to draw many small models for Outnumbered, and the test program is here: http://onesadcookie.com/trac/browser/Kiloplane/ -- Press the number keys to switch drawing modes, and be aware that only modes 1 & 2 are guaranteed to work on Mac OS X (the others will render incorrectly or crash, depending on your video card and Mac OS X version). On NVidia/Linux, all work correctly, but none provide a measurable speed improvement over mode 2.

Coming soon to a $600 video card near you: EXT_draw_instanced
Member
Posts: 749
Joined: 2003.01
Post: #11
Wait, are you people saying that glPushMatrix/glPopMatrix are evil too, or just glPushAttrib?

Sage
Posts: 1,403
Joined: 2005.07
Post: #12
No one mentioned matrices; it's fine to push and pop them, btw.
