2D engine done. Advice on optimisation, please?

Member
Posts: 110
Joined: 2009.07
Post: #16
warmi Wrote:Sounds good to me .... I can do about 400 32x32 alpha blended sprites in addition to a full screen dual-texture quad serving as the backdrop.

If I switch blending off , I can display about 1500 32x32 sprites while running at 30 fps.

That's on 2g devices ... on the latest iPhone ( 3gs) I can bump up my sprite count by about 5 times.

[Blinks] How?!

I know it's rude to ask, but having reduced the size of my sprites on-screen (they're now about 32-ish pixels square, randomly distributed in a huddle around a central point in world space), I can only get about 120 running at 30fps in my tests.

My sprites are sorted according to which texture atlas they use.
I have no render-state changes of any sort in my inner loop.
My colours now use ubytes.
I have a 3g ipod touch.
Member
Posts: 166
Joined: 2009.04
Post: #17
Madrayken Wrote:[Blinks] How?! I know it's rude to ask, but having reduced the size of my sprites on-screen (they're now about 32-ish pixels square, randomly distributed in a huddle around a central point in world space), I can only get about 120 running at 30fps in my tests. My sprites are sorted according to which texture atlas they use. I have no render-state changes of any sort in my inner loop. My colours now use ubytes. I have a 3g ipod touch.

I don’t know, because I don’t know exactly what you are doing :-)

Maxing out at 120 32x32 sprites seems way too low … you should really check whether you have any other bottlenecks.
Run your code with Instruments (CPU Sampler) and see what comes up on top.
If it is something like presentRenderbuffer(..) then you are fillrate limited, which is fine … you should be fillrate limited, but not at this low a number of sprites.
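
To put rough numbers on "fillrate limited", here is a back-of-the-envelope sketch that counts pixels written per second for a given sprite load. It assumes a 320x480 screen, treats the backdrop as a single full-screen layer, and ignores any cost difference between blended and opaque pixels. `pixelsPerSecond` is a made-up helper name, not part of anyone's engine.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: pixels written per second for `spriteCount` square
// sprites of `spriteSize` pixels per side, plus one full-screen backdrop quad.
// Assumes every sprite is fully on screen (worst case for fill).
std::int64_t pixelsPerSecond(int spriteCount, int spriteSize,
                             int screenW, int screenH, int fps)
{
    assert(spriteCount >= 0 && spriteSize >= 0 && fps > 0);
    const std::int64_t spritePixels =
        static_cast<std::int64_t>(spriteCount) * spriteSize * spriteSize;
    const std::int64_t backdropPixels =
        static_cast<std::int64_t>(screenW) * screenH;
    return (spritePixels + backdropPixels) * fps;
}
```

With 400 sprites of 32x32 over a backdrop at 30 fps, `pixelsPerSecond(400, 32, 320, 480, 30)` gives 16,896,000 pixels per second. The same call with 120 sprites gives 8,294,400, about half of that, so fill alone probably does not explain stalling at 120.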

In my code I am using shorts for vert coords (sizes in bytes):
VertX, VertY, VertZ, PAD, R, G, B, A, UVX, UVY
2,     2,     2,     2,   1, 1, 1, 1, 4,   4

But it doesn’t really make any difference for 2D rendering … I am considering going back to floats for 2D stuff, because all the casting going on slows things down (and it is a complete waste on the latest iPhone).
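
As a sketch only, the layout listed above could be written as a struct so the compiler checks the sizes and offsets. The struct and field names are illustrative and not taken from the engine; the field order follows the list above.

```cpp
#include <cstddef>
#include <cstdint>

// Sketch of the 20-byte per-vertex layout: shorts for position, unsigned
// bytes for colour, floats for texture coordinates. Names are hypothetical.
struct SpriteVertex
{
    std::int16_t x, y, z;     // position, 2 bytes each
    std::int16_t pad;         // keeps the following attributes 4-byte aligned
    std::uint8_t r, g, b, a;  // colour, 1 byte per channel
    float u, v;               // texture coordinates, 4 bytes each
};

static_assert(sizeof(SpriteVertex) == 20, "unexpected padding in SpriteVertex");
static_assert(offsetof(SpriteVertex, r) == 8, "colour should start at byte 8");
static_assert(offsetof(SpriteVertex, u) == 12, "UVs should start at byte 12");
```

With a struct like this, the stride passed to glVertexPointer, glColorPointer and glTexCoordPointer would be `sizeof(SpriteVertex)`, and each pointer would be the buffer base plus the matching `offsetof`.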
Member
Posts: 166
Joined: 2009.04
Post: #18
Here is how I handle sprite drawing.


Code:
SpriteHandle Renderer2dSprite::draw(SpriteHandle handle,const RectangleI &screenRectangle, const RectangleI &textureRectangle,
                                    float scale,float rotate,const ColorQuad &tint)
{
    SpriteHandle encodedOffset=handle;
    DynamicRenderablePrimitive *sCollection;
    Material    *sMaterial;
    size_t spriteOffset;

    if(encodedOffset==InvalidSpriteHandle)
    {
        assert(mCurrentSprites!=0);
        sCollection=mCurrentSprites;
        sMaterial=mCurrentMaterial;
        spriteOffset=mCurrentSprites->getItemCount();
        encodedOffset=encodeOffset(mCurrentMaterialIndex,spriteOffset);
        if(spriteOffset>=mCurrentSprites->getItemCapacityCount())
        {    
            mCurrentSprites->setCapacity(mCurrentSprites->getItemCapacityCount()*2);
        }        
        mCurrentSprites->increaseItemCount();
    }
    else
    {
        getMaterialSpriteCollection(mSprites,handle,&sMaterial,&sCollection);
        spriteOffset=decodeOffset(handle);
    }

    assert(sMaterial->getTextureUnit(0).getTexture()!=0);

    const float texWidth=(float)sMaterial->getTextureUnit(0).getTexture()->getWidth();
    const float texHeight=(float)sMaterial->getTextureUnit(0).getTexture()->getHeight();


    Vector4 uvs(((float)textureRectangle.mX1)/texWidth,1.0f-((float)textureRectangle.mY1)/texHeight,
        ((float)textureRectangle.mX2+1)/texWidth,1.0f-((float)textureRectangle.mY2+1)/texHeight);

    float      cos_rot=1.0f;
    float      sin_rot=0.0f;
    if(rotate!=0)
    {
        rotate=-rotate;            
        MathUtils::SinCos(rotate,sin_rot,cos_rot);
    }


    float width = ((float)(screenRectangle.mX2+1-screenRectangle.mX1))/2;
    float height =((float)(screenRectangle.mY2+1-screenRectangle.mY1))/2;
    float mid_u = ((float)screenRectangle.mX1)+width;
    float mid_v = ((float)screenRectangle.mY1)+height;
    width*=scale;
    height*=scale;

    const float cos_rot_w = cos_rot * width;
    const float cos_rot_h = cos_rot * height;
    const float sin_rot_w = sin_rot * width;
    const float sin_rot_h = sin_rot * height;


    float *vertexFData=(float*)sCollection->getItemData(spriteOffset);
    unsigned int *vertexIData;
    short *vertexSData=(short*)sCollection->getItemData(spriteOffset);

    assert(vertexFData!=0);
    
        
    *vertexSData++=(short)(mid_u - cos_rot_w - sin_rot_h);    // cast the whole expression, not just mid_u
    *vertexSData++=(short)(mid_v - sin_rot_w + cos_rot_h);
    *vertexSData++=(short)mCurrDepth;
    vertexSData++;
    
    vertexFData=(float*)(vertexSData);
    *vertexFData++ = uvs.x;
    *vertexFData++ = uvs.w;
    
    vertexIData=(unsigned int*)(vertexFData);
    *vertexIData++=tint.mColors[0];    
    
    vertexSData=(short*)(vertexIData);
    *vertexSData++=(short)(mid_u + cos_rot_w - sin_rot_h);
    *vertexSData++=(short)(mid_v + sin_rot_w + cos_rot_h);
    *vertexSData++=(short)mCurrDepth;
    vertexSData++;

    vertexFData=(float*)(vertexSData);    
    *vertexFData++ = uvs.z;
    *vertexFData++ = uvs.w;
    
    vertexIData=(unsigned int*)(vertexFData);
    *vertexIData++=tint.mColors[1];    
    
    vertexSData=(short*)(vertexIData);
    *vertexSData++=(short)(mid_u + cos_rot_w + sin_rot_h);
    *vertexSData++=(short)(mid_v + sin_rot_w - cos_rot_h);
    *vertexSData++=(short)mCurrDepth;
    vertexSData++;
    
    vertexFData=(float*)(vertexSData);
    *vertexFData++ = uvs.z;
    *vertexFData++ = uvs.y;
    
    vertexIData=(unsigned int*)(vertexFData);
    *vertexIData++=tint.mColors[2];    
    
    vertexSData=(short*)(vertexIData);
    *vertexSData++=(short)(mid_u - cos_rot_w + sin_rot_h);
    *vertexSData++=(short)(mid_v - sin_rot_w - cos_rot_h);
    *vertexSData++=(short)mCurrDepth;
    vertexSData++;
    
    vertexFData=(float*)(vertexSData);
    *vertexFData++ = uvs.x;
    *vertexFData++ = uvs.y;
    
    vertexIData=(unsigned int*)(vertexFData);
    *vertexIData++=tint.mColors[3];            
    
    return encodedOffset;
}
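
The corner maths in draw() above can be read on its own as rotating the four half-extent vectors about the sprite centre. A stripped-down sketch, with illustrative names (`Corner`, `rotatedCorners`) that are not part of the engine; unlike draw(), it does not negate the angle first:

```cpp
#include <cassert>
#include <cmath>

struct Corner { float x, y; };

// Rotate the half-extent vectors (-w,+h), (+w,+h), (+w,-h), (-w,-h) by
// `radians` about (midX, midY). Corner order matches the order in which
// draw() writes its four vertices.
void rotatedCorners(float midX, float midY, float halfW, float halfH,
                    float radians, Corner out[4])
{
    assert(halfW >= 0.0f && halfH >= 0.0f);
    const float c = std::cos(radians);
    const float s = std::sin(radians);
    const float cw = c * halfW, ch = c * halfH;
    const float sw = s * halfW, sh = s * halfH;

    out[0] = { midX - cw - sh, midY - sw + ch };
    out[1] = { midX + cw - sh, midY + sw + ch };
    out[2] = { midX + cw + sh, midY + sw - ch };
    out[3] = { midX - cw + sh, midY - sw - ch };
}
```

Only one sine and one cosine are needed per sprite, and with no rotation the corners fall back to the plain axis-aligned rectangle.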
Member
Posts: 110
Joined: 2009.07
Post: #19
Thanks for that snippet - very generous of you.

Annoyingly, as far as I can see I'm doing everything 'right'. I'm having issues getting the openGL performance tool to work at the moment, which makes life harder (but that's another thread entirely).

I think there's something fundamental which I'm missing somewhere. Something to do with the format of my texture atlas, or the way I've used the EAGLview as the basis of my engine, or... something. Just not seeing it yet.
Member
Posts: 166
Joined: 2009.04
Post: #20
Madrayken Wrote:Thanks for that snippet - very generous of you.

Annoyingly, as far as I can see I'm doing everything 'right'. I'm having issues getting the openGL performance tool to work at the moment, which makes life harder (but that's another thread entirely).

I think there's something fundamental which I'm missing somewhere. Something to do with the format of my texture atlas, or the way I've used the EAGLview as the basis of my engine, or... something. Just not seeing it yet.

Don't bother with the OpenGL perf tool ... just run it with Instruments (CPU Sampler). It will tell you what your CPU is doing, which is what you want to rule out first, before you attempt to optimize the OpenGL-related parts.
Member
Posts: 110
Joined: 2009.07
Post: #21
Here's mine...

Code:
- (void) copyDrawDataWithCamera: (CCamera *) camera ToVertexDataArray: (GLfloat *) vertex_data_array
{
    if(atlasRect != nil)
    {        
        // Reset the local vertex array. Each object is centred at this point.
        memcpy(&vertex_data_array[TOTAL_VERTEX_DATA_ELEMENTS * 0], &atlasRect.vertexCoords[ENTRIES_PER_VERTEX_COORD * 0], sizeof(GLfloat) * ENTRIES_PER_VERTEX_COORD);
        memcpy(&vertex_data_array[TOTAL_VERTEX_DATA_ELEMENTS * 1], &atlasRect.vertexCoords[ENTRIES_PER_VERTEX_COORD * 1], sizeof(GLfloat) * ENTRIES_PER_VERTEX_COORD);
        memcpy(&vertex_data_array[TOTAL_VERTEX_DATA_ELEMENTS * 2], &atlasRect.vertexCoords[ENTRIES_PER_VERTEX_COORD * 2], sizeof(GLfloat) * ENTRIES_PER_VERTEX_COORD);            
        memcpy(&vertex_data_array[TOTAL_VERTEX_DATA_ELEMENTS * 3], &atlasRect.vertexCoords[ENTRIES_PER_VERTEX_COORD * 3], sizeof(GLfloat) * ENTRIES_PER_VERTEX_COORD);            
                    
        matrixInit(&spriteManipulationMatrix);
        matrixTranslate(&spriteManipulationMatrix, self.worldCoord.X, self.worldCoord.Y, self.worldCoord.Z);                

        if([axis angleZ])    matrixRotateZ(&spriteManipulationMatrix, [axis angleZ]);    //[self printMatrix];

        for(int i = 0; i < VERTICES_PER_SPRITE; i++)
        {    
            GLfloat *array = &vertex_data_array[TOTAL_VERTEX_DATA_ELEMENTS * i];
            [self applyMatrix: &spriteManipulationMatrix ToGLVertexArrayElement: array];
        }
        
        // Convert the GLfloat vertex_data_array pointer into a GLubyte pointer
        GLubyte *colour_array = ((GLubyte*)vertex_data_array) + COLOUR_INDEX_FLOAT_OFFSET * sizeof(GLfloat);

        for(int i = 0; i <  VERTICES_PER_SPRITE; i++)
        {        
            int index = i * STRIDE_SIZE;

            // Note - openGL for some reason expects all the colour information to be pre-multiplied.
            // Therefore, we're multiplying all of the RGB values by the Alpha value.
            colour_array[index + 0] = (GLubyte) (255.0f * red * alpha);        // R
            colour_array[index + 1] = (GLubyte) (255.0f * green * alpha);    // G
            colour_array[index + 2] = (GLubyte) (255.0f * blue * alpha);    // B
            colour_array[index + 3] = (GLubyte) (255.0f * alpha);            // A
        }        
    }    
}
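
A note on the pre-multiplication in the colour loop above: it only matches what GL does if blending is set up for premultiplied colour, typically glBlendFunc(GL_ONE, GL_ONE_MINUS_SRC_ALPHA). That setup is not shown in the post, so treat it as an assumption. A minimal sketch of the conversion on its own, with hypothetical names (`Rgba8`, `premultiply`), which rounds to nearest where the loop above truncates:

```cpp
#include <cassert>
#include <cstdint>

struct Rgba8 { std::uint8_t r, g, b, a; };

// Convert a straight (non-premultiplied) colour with channels in [0,1]
// to premultiplied 8-bit channels: RGB are scaled by alpha, alpha is kept.
Rgba8 premultiply(float red, float green, float blue, float alpha)
{
    assert(alpha >= 0.0f && alpha <= 1.0f);
    // Round to nearest rather than truncating.
    auto toByte = [](float v) {
        return static_cast<std::uint8_t>(v * 255.0f + 0.5f);
    };
    return { toByte(red * alpha), toByte(green * alpha),
             toByte(blue * alpha), toByte(alpha) };
}
```

Since all four vertices of a sprite share one colour here, the conversion could be done once per sprite and the resulting four bytes copied to each vertex.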


And here's where it's called from:

Code:
- (void) drawWithCamera: (CCamera *)camera
{    
    NSArray *drawset        = [spriteDictionary allValues]; // create a nice, sortable array
    NSArray *sorted_objects = [drawset sortedArrayUsingSelector:@selector(compareByTextureID:)];    // Now sort based on the textureID of each sprite

    // Format per point is:
    // x,y,z, r,g,b,a, u,v    
    glEnable(GL_TEXTURE_2D);
    glEnableClientState(GL_COLOR_ARRAY);  
    glEnableClientState(GL_TEXTURE_COORD_ARRAY);

    glVertexPointer(ENTRIES_PER_VERTEX_COORD, GL_FLOAT, STRIDE_SIZE, &vertexDataArray[VERTEX_INDEX_FLOAT_OFFSET]);
    glTexCoordPointer(ENTRIES_PER_TEXTURE_COORD, GL_FLOAT, STRIDE_SIZE, &vertexDataArray[TEXTURE_INDEX_FLOAT_OFFSET]);           
    glColorPointer(ENTRIES_PER_COLOUR_ELEMENT, GL_UNSIGNED_BYTE, STRIDE_SIZE, &vertexDataArray[COLOUR_INDEX_FLOAT_OFFSET]);

    
    //------------------------------------------
    // Go through the sorted objects and find out when the texture changes occur
    // For each texture, store an index of which sprite it changes on,
    // and how many sprites in the list after that start point use this texture
    for(int i = 0; i < parentView.numberOfAtlasTextures; i++)
    {
        atlasToSortedSpriteIndexStart[i] = -1;    // But first, kill the index list completely
        atlasToSortedSpriteIndexCount[i] = -1;    
        
        int start = -1;
        int count = 0;
        
        int total_sprite_count = [sorted_objects count];
        for (int j = 0; j < total_sprite_count; j++)
        {
            CSprite *sprite = [sorted_objects objectAtIndex: j];
            
            if([sprite getTextureIndex] == i) // Check if the atlas's id matches the sprite's id
            {
                if(start == -1)
                    start = j;
                count++;
            }
        }
        atlasToSortedSpriteIndexStart[i] = start;
        atlasToSortedSpriteIndexCount[i] = count;        
    }    
    
    //------------------------------------------
    // Now draw all the sprites, sorting by texture-type
    // We use a single, long vertex buffer and use clever indexing, as opposed to constantly
    // creating and deleting new vertex buffers for each
    for(int c_atlas_index = 0; c_atlas_index < parentView.numberOfAtlasTextures; c_atlas_index++)
    {
        int vertex_data_array_index = 0; // Keeps track of where to write next. Reset for each texture
        
        int c_current_sprite_index = atlasToSortedSpriteIndexStart[c_atlas_index];        
        if(c_current_sprite_index < 0) // Skip this whole thing if there are no sprites using this atlas
            continue;
        
        glBindTexture(GL_TEXTURE_2D, [self getTextureIDWithIndex: c_atlas_index]);

        // For every sprite using this atlas...
        for (int h = 0; h < atlasToSortedSpriteIndexCount[c_atlas_index]; h++)
        {
            CSprite *sprite = [sorted_objects objectAtIndex: (c_current_sprite_index + h)];
    
            // Write this sprite's transformed vertices and colours into the shared vertex array
            [sprite copyDrawDataWithCamera: camera ToVertexDataArray: &vertexDataArray[vertex_data_array_index]];
        
            // Now use this information to build the Vertex buffer
            for(int i = 0; i < VERTICES_PER_SPRITE; i++)
            {            
                vertex_data_array_index += ENTRIES_PER_VERTEX_COORD; // skip the already copied vertex information
                vertex_data_array_index += FLOATS_PER_COLOUR_ELEMENT; // skip the already copied colour information

                // Create uv texture info
                for(int j = 0; j < ENTRIES_PER_TEXTURE_COORD; j++)
                {
                    vertexDataArray[vertex_data_array_index] = sprite.atlasRect.textureCoords[i * ENTRIES_PER_TEXTURE_COORD + j];
                    vertex_data_array_index++;
                }            
            }    
        }
        
        // Here's the magic.
        int size = atlasToSortedSpriteIndexCount[c_atlas_index] * VERTEX_INDICES_PER_SPRITE;
        glDrawElements(GL_TRIANGLES, size, GL_UNSIGNED_SHORT,indexArray);
    }
    
    // And now turn it all off again
    glDisableClientState(GL_COLOR_ARRAY);
    glDisableClientState(GL_TEXTURE_COORD_ARRAY);
    glDisable(GL_TEXTURE_2D);
}
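
One thing that stands out in drawWithCamera: the first loop rescans the whole sorted array once per atlas. If the array really is sorted by the same index that getTextureIndex returns (an assumption, since compareByTextureID is not shown), the start/count table can be built in a single pass. A sketch in plain C++ with hypothetical names (`Run`, `findRuns`), operating on a list of texture indices rather than on CSprite objects:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Run { int start = -1; int count = 0; };

// Given sprite texture indices already sorted ascending, record for each
// atlas the index of its first sprite and how many sprites use it.
// Atlases with no sprites keep start == -1 and count == 0.
std::vector<Run> findRuns(const std::vector<int>& sortedTextureIndex,
                          int atlasCount)
{
    std::vector<Run> runs(static_cast<std::size_t>(atlasCount));
    for (std::size_t j = 0; j < sortedTextureIndex.size(); ++j)
    {
        const int atlas = sortedTextureIndex[j];
        assert(atlas >= 0 && atlas < atlasCount);
        assert(j == 0 || sortedTextureIndex[j - 1] <= atlas); // must be sorted
        Run& run = runs[static_cast<std::size_t>(atlas)];
        if (run.start == -1)
            run.start = static_cast<int>(j);
        ++run.count;
    }
    return runs;
}
```

This visits each sprite once instead of once per atlas. Whether it matters at 120 sprites is something a CPU profile would have to confirm.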