Accelerate and working with Integer Arrays SIMD

Member
Posts: 95
Joined: 2009.09
Post: #1
Hey guys,

I recently watched the WWDC session on the Accelerate framework and how to use it to speed up vector arithmetic on the iPhone.
To fill my interleaved vertex arrays I have to do some vector calculations, and I thought I could improve my fill rates that way.

Sadly BLAS only seems to support floating-point operations.

But for my element array, which tells OpenGL which triangles to connect, I need vector operations on integers.
Basically I want to add a single "unsigned int" componentwise to a vector of "unsigned int" values (in MATLAB-like pseudo-code):

bigArray[n:n+83] = smallArray[0:83] + v * ones(84);
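
For reference, the pseudo-code above can be sketched in portable C as a plain loop. This is only a sketch: add_offset_u32 is a hypothetical helper name, not an Accelerate or BLAS call. A loop of this shape is something optimizing compilers can often auto-vectorize. Accelerate's vDSP also appears to offer an integer vector-scalar add (vDSP_vsaddi), but it works on signed int, so check its documentation before relying on it for unsigned indices.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of Accelerate or BLAS):
 * dst[i] = src[i] + offset for every i in [0, count).
 * Unsigned addition wraps around on overflow, which is well defined in C. */
void add_offset_u32(unsigned int *dst, const unsigned int *src,
                    unsigned int offset, size_t count)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < count; ++i) {
        dst[i] = src[i] + offset;
    }
}
```

With the names from the post, the call would look like add_offset_u32(bigArray + n, smallArray, v, 84).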

EDIT: For the general interleaved Vertex Array of type:

Code:
typedef struct {
    float x;
    float y;
    float z;
} Vertex3;


typedef struct _iVertex3D
{
    unsigned int color;
    Vertex3 v;
    Vertex3 n;
    float uv[2];
} iVertex3D;

Using BLAS vector operations with strides, I can overwrite all the float members of this struct, but I'm having a hard time with the integer color component.
If I have to fall back to a for-loop to set the color, I would lose all the benefit of SIMD that I was aiming for in the first place.
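
To make the stride idea concrete for the integer member, here is a minimal sketch of a strided 32-bit fill over the struct from the post. fill_u32_strided is a hypothetical helper, not a BLAS or vDSP routine, and the stride is given in bytes rather than in elements as BLAS does. It is still a scalar loop, so it only illustrates the access pattern and makes no claim about speed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    float x;
    float y;
    float z;
} Vertex3;

typedef struct _iVertex3D
{
    unsigned int color;
    Vertex3 v;
    Vertex3 n;
    float uv[2];
} iVertex3D;

/* Hypothetical helper: writes `value` at `first`, then at every
 * `stride_bytes` bytes after it, `count` times in total.
 * memcpy is used for the store so no alignment assumption is needed. */
void fill_u32_strided(void *first, size_t stride_bytes,
                      unsigned int value, size_t count)
{
    assert(first != NULL && stride_bytes >= sizeof value);
    unsigned char *p = (unsigned char *)first;
    for (size_t i = 0; i < count; ++i) {
        memcpy(p + i * stride_bytes, &value, sizeof value);
    }
}
```

For an array verts of vertexCount elements, the color of every vertex would be set with fill_u32_strided(&verts[0].color, sizeof(iVertex3D), color, vertexCount).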
Member
Posts: 95
Joined: 2009.09
Post: #2
Basically I'm searching for the integer equivalent to the BLAS function:

DAXPY
Sage
Posts: 1,232
Joined: 2002.10
Post: #3
Side note: unsigned int != GLubyte[4]. If you're going to pack colors like that, please be aware of endianness.
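
A small sketch of the endianness point, assuming a 4-byte unsigned int. Both function names are hypothetical. Building the value from a byte array keeps r, g, b, a in memory order on any machine, which is what a GL_UNSIGNED_BYTE color array reads; shift-based packing produces that byte order only on a little-endian CPU.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns the unsigned int whose bytes in memory
 * are r, g, b, a in that order, regardless of CPU endianness. */
unsigned int pack_rgba_bytes(unsigned char r, unsigned char g,
                             unsigned char b, unsigned char a)
{
    unsigned char bytes[4] = { r, g, b, a };
    unsigned int packed;
    assert(sizeof packed == sizeof bytes); /* assumes 32-bit unsigned int */
    memcpy(&packed, bytes, sizeof packed);
    return packed;
}

/* Hypothetical helper: shift-based packing. The value is the same on
 * every machine, but r ends up as the first byte in memory only on
 * little-endian hardware. */
unsigned int pack_rgba_shift(unsigned char r, unsigned char g,
                             unsigned char b, unsigned char a)
{
    return (unsigned int)r
         | ((unsigned int)g << 8)
         | ((unsigned int)b << 16)
         | ((unsigned int)a << 24);
}
```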
Member
Posts: 95
Joined: 2009.09
Post: #4
arekkusu Wrote: Side note: unsigned int != GLubyte[4]. If you're going to pack colors like that, please be aware of endianness.

Well, I'm using "GL_UNSIGNED_BYTE" and that works quite well.
The whole thing has been running for quite a while now, and I just thought I could save some battery power and speed it up a little by using vector operations to fill my interleaved array.

Is anybody here not using for-loops to add vertices to an interleaved array?

It might be possible to save quite a lot of floating-point operations here, if done correctly. I can't be the only one wondering about this!
Member
Posts: 95
Joined: 2009.09
Post: #5
I managed to get this working and posted a new blog entry with a detailed speed comparison of OpenGL interleaved array fill rates:

http://tacticadev.wordpress.com/2010/07/...p-vs-vdsp/
Member
Posts: 227
Joined: 2008.08
Post: #6
I'm not exactly sure whether compilers optimize memcpy, but even so, it would be nice to see the SIMD example compared against both memcpy and a plain for-loop:

Code:
//Assuming you're using triangles here
//dest must be declared before either variant uses it
iVertex3D* dest = _interleavedVerts+_vertexCount;
memcpy(dest, basicScrub, scrubVertexCount*sizeof(iVertex3D));
//versus
for (int k=0; k<scrubVertexCount;k+=3) {
     dest[k+0]=basicScrub[k];
     dest[k+1]=basicScrub[k];
     dest[k+2]=basicScrub[k];
}

It's just that I've only ever seen memcpy implemented as a straight byte-by-byte loop.

(Please prove me wrong here)
Member
Posts: 95
Joined: 2009.09
Post: #7
"iVertex3D" is the data structure for one vertex, that is a point in 3D.
So I tried the following sequential code:
Code:
iVertex3D* dest = _interleavedVerts+_vertexCount;
for (int k=0; k<scrubVertexCount;k++) {
    dest[k]=basicScrub[k];
}

Since I don't have my iPod cable with me, I could only test on the Simulator, where the loop is about 10% slower than memcpy.
But since I time the whole drawing process for one scrub, 10% of the total seems like quite a big chunk for memcpy alone. I'll recheck on the iPod once I get home :)

Update: It's roughly the same on the iPod: the loop takes about 10% longer there as well.
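
For anyone who wants to repeat the comparison in isolation, here is a rough timing sketch. The helper names (copy_with_loop, copy_with_memcpy, time_copy) are hypothetical, and this is not the measurement setup used above, which timed the whole drawing process. Note that an optimizing compiler may turn the loop into a memcpy call or drop repeated copies whose result is never read, so treat the numbers with care.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

typedef struct {
    float x;
    float y;
    float z;
} Vertex3;

typedef struct _iVertex3D
{
    unsigned int color;
    Vertex3 v;
    Vertex3 n;
    float uv[2];
} iVertex3D;

/* Element-wise struct assignment, as in the loop above. */
void copy_with_loop(iVertex3D *dest, const iVertex3D *src, size_t count)
{
    for (size_t k = 0; k < count; ++k) {
        dest[k] = src[k];
    }
}

/* The same copy done as one block. */
void copy_with_memcpy(iVertex3D *dest, const iVertex3D *src, size_t count)
{
    memcpy(dest, src, count * sizeof *src);
}

typedef void (*copy_fn)(iVertex3D *, const iVertex3D *, size_t);

/* Returns the CPU time in seconds spent on `repeats` calls of `fn`. */
double time_copy(copy_fn fn, iVertex3D *dest, const iVertex3D *src,
                 size_t count, int repeats)
{
    assert(fn != NULL && repeats > 0);
    clock_t start = clock();
    for (int r = 0; r < repeats; ++r) {
        fn(dest, src, count);
    }
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling time_copy once with copy_with_loop and once with copy_with_memcpy on the same buffers gives two durations to compare; clock() is coarse, so a large repeat count is needed for a stable figure.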