2D engine done. Advice on optimisation, please?
warmi Wrote:Sounds good to me .... I can do about 400 32x32 alpha-blended sprites in addition to a full-screen dual-texture quad serving as the backdrop.
If I switch blending off, I can display about 1500 32x32 sprites while running at 30 fps.
That's on 2G devices ... on the latest iPhone (3GS) I can bump up my sprite count by about 5 times.
[Blinks] How?!
I know it's rude to ask, but having reduced the size of my sprites on-screen (they're now about 32-ish pixels square, randomly distributed in a huddle around a central point in world space), I can only get about 120 running at 30fps in my tests.
My sprites are sorted according to which texture atlas they use.
I have no render-state changes of any sort in my inner loop.
My colours now use ubytes.
I have a 3G iPod touch.
Madrayken Wrote:[Blinks] How?! I know it's rude to ask, but having reduced the size of my sprites on-screen (they're now about 32-ish pixels square, randomly distributed in a huddle around a central point in world space), I can only get about 120 running at 30fps in my tests. My sprites are sorted according to which texture atlas they use. I have no render-state changes of any sort in my inner loop. My colours now use ubytes. I have a 3G iPod touch.
I don’t know, because I don’t know exactly what you are doing :-)
Maxing out at 120 32x32 sprites seems way too low … you should really check whether you have any other bottlenecks.
Run your code with Instruments / CPU Sampler and see what comes up on top.
If it is something like presentRenderbuffer(..) then you are fillrate limited, which is fine … you should be fillrate limited, but not at this low a number of sprites.
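To put the fillrate point in numbers, here is a minimal sketch. It assumes a 320x480 framebuffer and counts only the pixels the sprites themselves cover (no backdrop, no scaling). The names kScreenPixels, spritePixels and percentOfScreen are made up for illustration.

```cpp
// Assumption: a 320x480 framebuffer.
constexpr long kScreenPixels = 320L * 480L;  // 153,600 pixels

// Pixels written per frame by `count` square sprites, each `side` pixels wide.
constexpr long spritePixels(long count, long side) {
    return count * side * side;
}

// Sprite coverage as a whole-number percentage of one full screen (rounded down).
constexpr long percentOfScreen(long count, long side) {
    return spritePixels(count, side) * 100 / kScreenPixels;
}

// spritePixels(120, 32) is 122,880 pixels: 80% of one screen.
// spritePixels(400, 32) is 409,600 pixels: about 266% of one screen.
```

By this count, 120 sprites at 32x32 cover less than a single screen's worth of pixels, while the 400 blended sprites quoted earlier cover more than two and a half screens. That is consistent with the suspicion that 120 is too low to be a pure fillrate ceiling, though only a profile can confirm it.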
In my code I am using shorts for vert coords (the second row is the size of each field in bytes):
VertX, VertY, VertZ, PAD, R, G, B, A, UVX, UVY
2,     2,     2,     2,   1, 1, 1, 1, 4,   4
But it doesn’t really make any difference with 2D rendering … I am considering going back to floats for 2D stuff, because all the casting going on slows things down (and shorts are a complete waste on the latest iPhone).
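As a rough illustration of that layout, the sketch below declares a struct with the fields in the order listed above and checks its size at compile time. The struct names are hypothetical, and the draw code in this same post appears to write the UVs before the colour, so the real field order may differ.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical name; field order follows the list above.
struct PackedSpriteVertex {
    std::int16_t x, y, z;     // position, 2 bytes each
    std::int16_t pad;         // padding so the next fields stay 4-byte aligned
    std::uint8_t r, g, b, a;  // colour, 1 byte each
    float u, v;               // texture coordinates, 4 bytes each
};

static_assert(sizeof(PackedSpriteVertex) == 20, "20 bytes per vertex");
static_assert(offsetof(PackedSpriteVertex, r) == 8, "colour starts at byte 8");
static_assert(offsetof(PackedSpriteVertex, u) == 12, "UVs start at byte 12");

// For comparison: the same vertex with float positions and no padding.
struct FloatSpriteVertex {
    float x, y, z;
    std::uint8_t r, g, b, a;
    float u, v;
};

static_assert(sizeof(FloatSpriteVertex) == 24, "24 bytes per vertex");
```

With this layout the stride passed to the gl*Pointer calls would be sizeof(PackedSpriteVertex), and the position type would be GL_SHORT. The saving over the float version is 4 bytes per vertex, which fits the remark that it makes little difference for 2D.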
Here is how I handle sprite drawing.
Code:
SpriteHandle Renderer2dSprite::draw(SpriteHandle handle, const RectangleI &screenRectangle, const RectangleI &textureRectangle,
                                    float scale, float rotate, const ColorQuad &tint)
{
    SpriteHandle encodedOffset = handle;
    DynamicRenderablePrimitive *sCollection;
    Material *sMaterial;
    size_t spriteOffset;

    if (encodedOffset == InvalidSpriteHandle)
    {
        // No handle supplied: append a new sprite to the current batch.
        assert(mCurrentSprites != 0);
        sCollection = mCurrentSprites;
        sMaterial = mCurrentMaterial;
        spriteOffset = mCurrentSprites->getItemCount();
        encodedOffset = encodeOffset(mCurrentMaterialIndex, spriteOffset);
        if (spriteOffset >= mCurrentSprites->getItemCapacityCount())
        {
            // Batch is full: double its capacity.
            mCurrentSprites->setCapacity(mCurrentSprites->getItemCapacityCount() * 2);
        }
        mCurrentSprites->increaseItemCount();
    }
    else
    {
        // Existing handle: look up its material and batch, then reuse its slot.
        getMaterialSpriteCollection(mSprites, handle, &sMaterial, &sCollection);
        spriteOffset = decodeOffset(handle);
    }

    // Texture rectangle in pixels -> normalised UVs, with V flipped.
    assert(sMaterial->getTextureUnit(0).getTexture() != 0);
    const float texWidth = (float)sMaterial->getTextureUnit(0).getTexture()->getWidth();
    const float texHeight = (float)sMaterial->getTextureUnit(0).getTexture()->getHeight();
    Vector4 uvs(((float)textureRectangle.mX1) / texWidth, 1.0f - ((float)textureRectangle.mY1) / texHeight,
                ((float)textureRectangle.mX2 + 1) / texWidth, 1.0f - ((float)textureRectangle.mY2 + 1) / texHeight);

    float cos_rot = 1.0f;
    float sin_rot = 0.0f;
    if (rotate != 0)
    {
        rotate = -rotate;
        MathUtils::SinCos(rotate, sin_rot, cos_rot);
    }

    // width and height are half extents; (mid_u, mid_v) is the sprite centre.
    float width = ((float)(screenRectangle.mX2 + 1 - screenRectangle.mX1)) / 2;
    float height = ((float)(screenRectangle.mY2 + 1 - screenRectangle.mY1)) / 2;
    float mid_u = ((float)screenRectangle.mX1) + width;
    float mid_v = ((float)screenRectangle.mY1) + height;
    width *= scale;
    height *= scale;
    const float cos_rot_w = cos_rot * width;
    const float cos_rot_h = cos_rot * height;
    const float sin_rot_w = sin_rot * width;
    const float sin_rot_h = sin_rot * height;

    // Each vertex is written as 4 shorts (x, y, depth, pad), 2 floats (u, v) and 1 packed colour.
    float *vertexFData = (float*)sCollection->getItemData(spriteOffset);
    unsigned int *vertexIData;
    short *vertexSData = (short*)sCollection->getItemData(spriteOffset);
    assert(vertexFData != 0);

    // Vertex 0. Note: (short) applies to mid_u / mid_v only; the float result is truncated on assignment.
    *vertexSData++ = (short)mid_u - cos_rot_w - sin_rot_h;
    *vertexSData++ = (short)mid_v - sin_rot_w + cos_rot_h;
    *vertexSData++ = (short)mCurrDepth;
    vertexSData++;
    vertexFData = (float*)(vertexSData);
    *vertexFData++ = uvs.x;
    *vertexFData++ = uvs.w;
    vertexIData = (unsigned int*)(vertexFData);
    *vertexIData++ = tint.mColors[0];
    vertexSData = (short*)(vertexIData);

    // Vertex 1
    *vertexSData++ = (short)mid_u + cos_rot_w - sin_rot_h;
    *vertexSData++ = (short)mid_v + sin_rot_w + cos_rot_h;
    *vertexSData++ = (short)mCurrDepth;
    vertexSData++;
    vertexFData = (float*)(vertexSData);
    *vertexFData++ = uvs.z;
    *vertexFData++ = uvs.w;
    vertexIData = (unsigned int*)(vertexFData);
    *vertexIData++ = tint.mColors[1];
    vertexSData = (short*)(vertexIData);

    // Vertex 2
    *vertexSData++ = (short)mid_u + cos_rot_w + sin_rot_h;
    *vertexSData++ = (short)mid_v + sin_rot_w - cos_rot_h;
    *vertexSData++ = (short)mCurrDepth;
    vertexSData++;
    vertexFData = (float*)(vertexSData);
    *vertexFData++ = uvs.z;
    *vertexFData++ = uvs.y;
    vertexIData = (unsigned int*)(vertexFData);
    *vertexIData++ = tint.mColors[2];
    vertexSData = (short*)(vertexIData);

    // Vertex 3
    *vertexSData++ = (short)mid_u - cos_rot_w + sin_rot_h;
    *vertexSData++ = (short)mid_v - sin_rot_w - cos_rot_h;
    *vertexSData++ = (short)mCurrDepth;
    vertexSData++;
    vertexFData = (float*)(vertexSData);
    *vertexFData++ = uvs.x;
    *vertexFData++ = uvs.y;
    vertexIData = (unsigned int*)(vertexFData);
    *vertexIData++ = tint.mColors[3];

    return encodedOffset;
}
Thanks for that snippet - very generous of you.
Annoyingly, as far as I can see I'm doing everything 'right'. I'm having issues getting the OpenGL performance tool to work at the moment, which makes life harder (but that's another thread entirely).
I think there's something fundamental which I'm missing somewhere. Something to do with the format of my texture atlas, or the way I've used the EAGLView as the basis of my engine, or... something. Just not seeing it yet.
Madrayken Wrote:Thanks for that snippet - very generous of you.
Annoyingly, as far as I can see I'm doing everything 'right'. I'm having issues getting the OpenGL performance tool to work at the moment, which makes life harder (but that's another thread entirely).
I think there's something fundamental which I'm missing somewhere. Something to do with the format of my texture atlas, or the way I've used the EAGLView as the basis of my engine, or... something. Just not seeing it yet.
Don't bother with the OpenGL performance tool .... just run it with Instruments / CPU Sampler. It will tell you what your CPU is doing, which is what you want to rule out first before you attempt to optimize the OpenGL-related parts.
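If a profiler is not to hand, a crude way to get a first look at the CPU side is to time the vertex-building step on its own. This is a different technique from sampling with Instruments, not a replacement for it. The sketch below uses std::chrono; elapsedMicros and kFrameBudgetMicros are made-up names.

```cpp
#include <chrono>

// Runs `work` once and returns the elapsed wall-clock time in microseconds.
template <typename Work>
long long elapsedMicros(Work&& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}

// At 30 fps the whole frame has roughly 33,333 microseconds available.
constexpr long long kFrameBudgetMicros = 1000000 / 30;
```

The idea would be to wrap only the code that fills the vertex array (not the GL calls) in the callable passed to elapsedMicros and compare the result with kFrameBudgetMicros. If building the vertices already eats most of the budget, the bottleneck is on the CPU rather than in fillrate.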
Here's mine...
Code:
- (void) copyDrawDataWithCamera: (CCamera *) camera ToVertexDataArray: (GLfloat *) vertex_data_array
{
    if(atlasRect != nil)
    {
        // Reset the local vertex array. Each object is centred at this point.
        memcpy(&vertex_data_array[TOTAL_VERTEX_DATA_ELEMENTS * 0], &atlasRect.vertexCoords[ENTRIES_PER_VERTEX_COORD * 0], sizeof(GLfloat) * ENTRIES_PER_VERTEX_COORD);
        memcpy(&vertex_data_array[TOTAL_VERTEX_DATA_ELEMENTS * 1], &atlasRect.vertexCoords[ENTRIES_PER_VERTEX_COORD * 1], sizeof(GLfloat) * ENTRIES_PER_VERTEX_COORD);
        memcpy(&vertex_data_array[TOTAL_VERTEX_DATA_ELEMENTS * 2], &atlasRect.vertexCoords[ENTRIES_PER_VERTEX_COORD * 2], sizeof(GLfloat) * ENTRIES_PER_VERTEX_COORD);
        memcpy(&vertex_data_array[TOTAL_VERTEX_DATA_ELEMENTS * 3], &atlasRect.vertexCoords[ENTRIES_PER_VERTEX_COORD * 3], sizeof(GLfloat) * ENTRIES_PER_VERTEX_COORD);

        matrixInit(&spriteManipulationMatrix);
        matrixTranslate(&spriteManipulationMatrix, self.worldCoord.X, self.worldCoord.Y, self.worldCoord.Z);
        if([axis angleZ]) matrixRotateZ(&spriteManipulationMatrix, [axis angleZ]); //[self printMatrix];

        for(int i = 0; i < VERTICES_PER_SPRITE; i++)
        {
            GLfloat *array = &vertex_data_array[TOTAL_VERTEX_DATA_ELEMENTS * i];
            [self applyMatrix: &spriteManipulationMatrix ToGLVertexArrayElement: array];
        }

        // Convert the GLfloat vertex_data_array pointer into a GLubyte pointer
        GLubyte *colour_array = ((GLubyte*)vertex_data_array) + COLOUR_INDEX_FLOAT_OFFSET * sizeof(GLfloat);
        for(int i = 0; i < VERTICES_PER_SPRITE; i++)
        {
            int index = i * STRIDE_SIZE;
            // Note - openGL for some reason expects all the colour information to be pre-multiplied.
            // Therefore, we're multiplying all of the RGB values by the Alpha value.
            colour_array[index + 0] = (GLubyte) (255.0f * red * alpha);   // R
            colour_array[index + 1] = (GLubyte) (255.0f * green * alpha); // G
            colour_array[index + 2] = (GLubyte) (255.0f * blue * alpha);  // B
            colour_array[index + 3] = (GLubyte) (255.0f * alpha);         // A
        }
    }
}
And here's where it's called from:
Code:
- (void) drawWithCamera: (CCamera *)camera
{
    NSArray *drawset = [spriteDictionary allValues]; // create a nice, sortable array
    NSArray *sorted_objects = [drawset sortedArrayUsingSelector:@selector(compareByTextureID:)]; // Now sort based on the textureID of each sprite

    // Format per point is:
    // x,y,z, r,g,b,a, u,v
    glEnable(GL_TEXTURE_2D);
    glEnableClientState(GL_COLOR_ARRAY);
    glEnableClientState(GL_TEXTURE_COORD_ARRAY);
    glVertexPointer(ENTRIES_PER_VERTEX_COORD, GL_FLOAT, STRIDE_SIZE, &vertexDataArray[VERTEX_INDEX_FLOAT_OFFSET]);
    glTexCoordPointer(ENTRIES_PER_TEXTURE_COORD, GL_FLOAT, STRIDE_SIZE, &vertexDataArray[TEXTURE_INDEX_FLOAT_OFFSET]);
    glColorPointer(ENTRIES_PER_COLOUR_ELEMENT, GL_UNSIGNED_BYTE, STRIDE_SIZE, &vertexDataArray[COLOUR_INDEX_FLOAT_OFFSET]);

    //------------------------------------------
    // Go through the sorted objects and find out when the texture changes occur
    // For each texture, store an index of which sprite it changes on,
    // and how many sprites in the list after that start point use this texture
    for(int i = 0; i < parentView.numberOfAtlasTextures; i++)
    {
        atlasToSortedSpriteIndexStart[i] = -1; // But first, kill the index list completely
        atlasToSortedSpriteIndexCount[i] = -1;
        int start = -1;
        int count = 0;
        int total_sprite_count = [sorted_objects count];
        for (int j = 0; j < total_sprite_count; j++)
        {
            CSprite *sprite = [sorted_objects objectAtIndex: j];
            if([sprite getTextureIndex] == i) // Check if the atlas's id matches the sprite's id
            {
                if(start == -1)
                    start = j;
                count++;
            }
        }
        atlasToSortedSpriteIndexStart[i] = start;
        atlasToSortedSpriteIndexCount[i] = count;
    }

    //------------------------------------------
    // Now draw all the sprites, sorting by texture-type
    // We use a single, long vertex buffer and use clever indexing, as opposed to constantly
    // creating and deleting new vertex buffers for each
    for(int c_atlas_index = 0; c_atlas_index < parentView.numberOfAtlasTextures; c_atlas_index++)
    {
        int vertex_data_array_index = 0; // Keeps track of where to write next. Reset for each texture
        int c_current_sprite_index = atlasToSortedSpriteIndexStart[c_atlas_index];
        if(c_current_sprite_index < 0) // Skip this whole thing if there are no sprites using this atlas
            continue;
        glBindTexture(GL_TEXTURE_2D, [self getTextureIDWithIndex: c_atlas_index]);

        // For every sprite using this atlas...
        for (int h = 0; h < atlasToSortedSpriteIndexCount[c_atlas_index]; h++)
        {
            CSprite *sprite = [sorted_objects objectAtIndex: (c_current_sprite_index + h)];
            // Check if the texture page has changed. Only then change the texture bindings
            [sprite copyDrawDataWithCamera: camera ToVertexDataArray: &vertexDataArray[vertex_data_array_index]];

            // Now use this information to build the Vertex buffer
            for(int i = 0; i < VERTICES_PER_SPRITE; i++)
            {
                vertex_data_array_index += ENTRIES_PER_VERTEX_COORD;  // skip the already copied vertex information
                vertex_data_array_index += FLOATS_PER_COLOUR_ELEMENT; // skip the already copied colour information
                // Create uv texture info
                for(int j = 0; j < ENTRIES_PER_TEXTURE_COORD; j++)
                {
                    vertexDataArray[vertex_data_array_index] = sprite.atlasRect.textureCoords[i * ENTRIES_PER_TEXTURE_COORD + j];
                    vertex_data_array_index++;
                }
            }
        }

        // Here's the magic.
        int size = atlasToSortedSpriteIndexCount[c_atlas_index] * VERTEX_INDICES_PER_SPRITE;
        glDrawElements(GL_TRIANGLES, size, GL_UNSIGNED_SHORT, indexArray);
    }

    // And now turn it all off again
    glDisableClientState(GL_COLOR_ARRAY);
    glDisableClientState(GL_TEXTURE_COORD_ARRAY);
    glDisable(GL_TEXTURE_2D);
}
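One thing that stands out in the draw method above is that the start/count scan walks every sprite once per atlas, on top of rebuilding and re-sorting the array each frame. Because the list is already sorted by texture, a single pass is enough to find each atlas's run. Below is a minimal sketch of that idea in plain C++; AtlasRun and findAtlasRuns are hypothetical names, and whether this is the actual bottleneck is something only a profile can show.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct AtlasRun {
    int start = -1;  // index of the first sprite using this atlas, or -1 if none
    int count = 0;   // number of consecutive sprites using it
};

// sortedTextureIndices[j] is the atlas index of sprite j, already sorted ascending.
std::vector<AtlasRun> findAtlasRuns(const std::vector<int>& sortedTextureIndices, int atlasCount) {
    std::vector<AtlasRun> runs(static_cast<std::size_t>(atlasCount));
    for (std::size_t j = 0; j < sortedTextureIndices.size(); ++j) {
        const int atlas = sortedTextureIndices[j];
        assert(atlas >= 0 && atlas < atlasCount);
        AtlasRun& run = runs[static_cast<std::size_t>(atlas)];
        if (run.count == 0) {
            run.start = static_cast<int>(j);  // first sprite seen for this atlas
        }
        ++run.count;
    }
    return runs;
}
```

The result holds the same start and count values as atlasToSortedSpriteIndexStart and atlasToSortedSpriteIndexCount, but the cost is one pass over the sprites instead of one pass per atlas.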