float FBO on iPhone (invalid operation error)

Oldtimer
Posts: 834
Joined: 2002.09
Post: #1
Long time, no see, as usual. I'm playing around with an effect that works fine on desktop, but throws a GL error on sim/device.

I'm trying to set up a float-backed FBO as a render target:

Code:
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB16F_EXT, size.width, size.height, 0, GL_RGB, GL_HALF_FLOAT_OES, NULL);

On desktop, GL_RGB16F works fine, but in the iOS SDK I have to move to the _EXT version. This call throws GL_INVALID_OPERATION. According to the glTexImage2D docs, that error is only generated when loading image data (which I'm not, hence the NULL data parameter), when using depth textures, or when playing with pixel unpacking. I've looked into the texture_storage and float_texture extension docs, but nope...

I can't find any (correct) docs on setting up float FBOs on iOS, and I don't know how to dig deeper into the error state than this.

The entire texture creation:
Code:
glActiveTexture(GL_TEXTURE0);
glGenTextures(1, &texture);
glBindTexture(GL_TEXTURE_2D, texture);

glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);

GLenum err = glGetError();
if (err != GL_NO_ERROR)
{
    NSLog(@"Error before setting up texture: 0x%x", err);
}

// depth parameter is GL_RGB16F_EXT on iPhone, GL_RGB16F on OSX
#if TARGET_OS_IPHONE
glTexImage2D(GL_TEXTURE_2D, 0, depth, size.width, size.height, 0, GL_RGB, GL_HALF_FLOAT_OES, NULL);
#else
glTexImage2D(GL_TEXTURE_2D, 0, depth, size.width, size.height, 0, GL_RGB, GL_UNSIGNED_BYTE, NULL);
#endif

err = glGetError();
if (err != GL_NO_ERROR)
{
    NSLog(@"Error setting up texture: 0x%x", err);
}

Cheers!
Luminary
Posts: 5,143
Joined: 2002.04
Post: #2
Try RGBA instead of RGB; it doesn't surprise me that 3-channel formats might not work.

Be aware that the SGX535 in earlier iOS devices does not support float rendering. The iPad 2 and later and the iPhone 4S and later have the SGX543, which supports float16. Nothing supports float32.
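If you want to be sure at runtime, something like this will tell you whether half-float texturing and rendering are even on the table (just a sketch; the two extension strings are the ones I'd expect to matter, and hasGLExtension is a helper name I made up):

Code:
// Minimal capability check. Assumes an ES2 context is current and that
// this sits in the same .m file as the rest of the setup (needs <string.h>).
static BOOL hasGLExtension(const char *name)
{
    const char *ext = (const char *)glGetString(GL_EXTENSIONS);
    return ext != NULL && strstr(ext, name) != NULL;
}

// ...then, during setup:
// GL_OES_texture_half_float      -> half-float texture data is accepted
// GL_EXT_color_buffer_half_float -> half-float textures are renderable
if (!hasGLExtension("GL_OES_texture_half_float") ||
    !hasGLExtension("GL_EXT_color_buffer_half_float"))
{
    NSLog(@"No half-float render target support on this device");
}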
Sage
Posts: 1,232
Joined: 2002.10
Post: #3
(Jan 6, 2013 04:39 AM)Fenris Wrote:  On desktop, GL_RGB16F works fine, but in the iOS SDK I have to move to the _EXT version.

ES2 TexImage does not allow a sized <internalformat>, so you must create the texture like this:

Code:
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB, size.width, size.height, 0, GL_RGB, GL_HALF_FLOAT_OES, NULL);

ES2 will infer the sized internalformat from the <format> and <type> (but won't let you query what it chose via glGetTexLevelParameter...)

And RGB works; you don't need RGBA. But don't expect any memory savings.
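Putting it together with the FBO, roughly (a sketch; texture and fbo are whatever names you already generated, and the completeness check matters because a NULL-data allocation can otherwise fail quietly):

Code:
// Unsized internalformat; ES2 derives the actual storage format
// from <format> + <type> (here GL_RGB + GL_HALF_FLOAT_OES).
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB, size.width, size.height,
             0, GL_RGB, GL_HALF_FLOAT_OES, NULL);

// Attach it as the color target and verify the FBO is actually renderable.
glBindFramebuffer(GL_FRAMEBUFFER, fbo);
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                       GL_TEXTURE_2D, texture, 0);
GLenum status = glCheckFramebufferStatus(GL_FRAMEBUFFER);
if (status != GL_FRAMEBUFFER_COMPLETE)
{
    NSLog(@"FBO incomplete: 0x%x", status);
}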
Oldtimer
Posts: 834
Joined: 2002.09
Post: #4
Ah, that's it. Just to clarify something I've been wondering about and can't seem to get my head around: how do type and format factor into the storage? Before this, I was under the impression that those described the incoming data (the data parameter), but I'm mistaken, right?

I really just need a single signed half-float component for this effect, but if there are no memory savings, and from what I hear no performance savings either, is there any point in pursuing GL_R/GL_RED/GL_LUMINANCE?

Thanks, both of you.
Oldtimer
Posts: 834
Joined: 2002.09
Post: #5
Hm, GL_RED_EXT does seem to bring me from 15 FPS up to around 22 FPS, but I'm seeing badly botched rendering. I'm rendering from a GL_RGB FBO into a GL_RED_EXT FBO (a vector-to-value operation, think particle sim).

EDIT: I just found out about glBindFragDataLocation, but it doesn't seem to be available in ES. It does suggest that writing four values into gl_FragColor is causing the repetition, and the pattern is broken by the tiler?

I'm drawing a single splat in the center of the screen, but I also get a smattering of smaller splats all over.

[attached screenshot]

There seems to be some sort of pattern, but I can't really make it out. Could this be caused by mixing 3-channel and 1-channel texture samplers in a single shader? Am I doing something wrong by writing four values into gl_FragColor on a 1-channel FBO?

I haven't done my homework on this one, it's just a braindump, so please point me to docs or similar if it works as intended.

Thanks again.
Oldtimer
Posts: 834
Joined: 2002.09
Post: #6
Sorry for the triple-posting here, but I ran it through the Capture Frame magic thing on my iPhone 4S, and got a bit confused.

My call to glTexImage2D looks like this now:
Code:
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB, size.width, size.height, 0, GL_RGB, GL_HALF_FLOAT_OES, 0);

In the GPU trace, I get this warning:
Code:
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB, 240, 160, 0, GL_RGB, GL_HALF_FLOAT_OES, 3e2eb3c2)
: (invalid enum=0x8d61): Invalid enum for argument 'type'

Yes, 0x8d61 is GL_HALF_FLOAT_OES, and why does my 0 for data turn up as 0x3e2eb3c2 in this error? I use the exact same code path for all texture setup. No other GL errors (through glGetError at least).

Any guesses?
Sage
Posts: 1,232
Joined: 2002.10
Post: #7
(Jan 9, 2013 03:15 PM)Fenris Wrote:  how do type and format factor into the storage? Before this, I was under the impression that those described the incoming data (the data parameter), but I'm mistaken, right?

<format> and <type> describe <data>. The data that your app is passing to the GL. <internalformat> is your request for how you want the GL to store the data.

In desktop GL, this transfer of data during TexImage can be pretty complicated. There is a state machine of glPixelStore and glPixelTransfer state describing how to unpack each pixel, optionally process it (scale, bias, map, convolve, histogram, etc.), and store it (in, presumably, "VRAM"). There are zillions of possible conversions here, both in format (e.g. RGB -> LUMINANCE) and type (e.g. FLOAT -> UNSIGNED_BYTE).

Desktop GL supports this stuff because, as a "graphics library", it provides common image-processing utilities. For example, taking RGBA, UNSIGNED_BYTE data from your jpg/png/tiff loader and converting it to RGB 565 for a texture. Or runtime-compressing it to DXT, RGTC, etc. to save even more memory. Another reason is that <internalformat> is just a request; there are a bunch of internalformats like GL_LUMINANCE6_ALPHA2 which are unlikely to really be supported in hardware (perhaps they were on SGI workstations, 20 years ago). The driver is free to choose a similar format (like GL_LUMINANCE8_ALPHA8) instead, and conversions might be required.
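For example, on the desktop a call like this is perfectly legal, and the driver quietly converts the loader's output into the requested storage (a sketch, desktop GL only; width, height and pixels stand in for whatever your image loader produced):

Code:
// Desktop GL only: app data is RGBA / UNSIGNED_BYTE straight from an
// image loader, but storage is requested as a 16-bit RGB format;
// the driver converts both format and type during TexImage.
glPixelStorei(GL_UNPACK_ALIGNMENT, 1);   // part of the unpack state machine
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB5, width, height,
             0, GL_RGBA, GL_UNSIGNED_BYTE, pixels);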


However, in many applications there are no conversions, and in GLES this functionality was deleted along with a bunch of the rest of the desktop API. So the meaning of the arguments changes somewhat. ES1 changed the specification so that <internalformat> has to "match" <format>: no sized <internalformat>s are allowed, and the valid combinations of <format> and <type> are strictly limited. So the effective internalformat chosen by the driver is supposed to be derived from <format> and <type>.

Reading between the lines, the intent of this was to eliminate all format conversions, serving two purposes: 1) all potentially "slow" conversion is shifted from run time to your app's build time (you are expected to pre-process data and ship it in a "fast" format), and 2) the complexity (engineering and testing) of the image state machine can be deleted from the driver, reducing bugs etc.

Of course that intent falls short in practice: 1) there are usage patterns where you need run-time conversion (like streaming PNGs from a URL, where you might really want 565 or 4444 textures for memory savings), which means every app gets to re-implement this functionality instead of the driver implementing it once (with multithreading and vector optimizations appropriate for every device; the OS/driver provider can do a better job at this than the average developer), and 2) ES1 left a big gaping hole in the spec: CopyTexImage still requires format conversions to be supported in the driver, so that complexity is still there.


Then ES2 added sized <internalformat>s for renderbuffers, but left TexImage the same as ES1, so it is self-inconsistent and confusing.


The recently announced ES3 spec re-adds sized <internalformat>s for TexImage, and the pile of new texture formats (integer, RGB9_E5, etc.) are only available that way. This is a pretty clear recanting of the broken ES1/ES2 changes. ES3 also allows a limited set of <type> conversions during TexImage, catering to cases like streaming images over the web. But the ES3 spec is still self-inconsistent, as BlitFramebuffer (and to a lesser degree CopyTexImage) allows any <format>,<type> conversion while TexImage is artificially constrained.
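(For comparison, the ES3 spelling of the allocation in this thread would be roughly the line below; sized internalformat, core half-float type. width and height stand in for your target size.)

Code:
// ES3-style allocation (not valid in ES2): a sized internalformat plus
// the core GL_HALF_FLOAT type instead of GL_HALF_FLOAT_OES.
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB16F, width, height,
             0, GL_RGB, GL_HALF_FLOAT, NULL);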


Quote:I really just need a single signed half-float component for this effect, but if there are no memory savings, and from what I hear no performance savings either, is there any point in pursuing GL_R/GL_RED/GL_LUMINANCE?

On iOS devices, RED and RG really will save memory compared to RGB or RGBA.
(Hardware likes to have power-of-two byte sizes for texels, so RGB is typically stored internally as RGBX.) Performance wins from RED or RG will simply be due to bandwidth reductions during sampling or framebuffer writes.
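So for a single half-float channel, the allocation would be roughly this (a sketch; it assumes GL_EXT_texture_rg and GL_OES_texture_half_float are exposed, plus GL_EXT_color_buffer_half_float if you want to render into it):

Code:
// One half-float component per texel. <internalformat> and <format>
// are both the unsized GL_RED_EXT, following the same ES2 rule as above.
glTexImage2D(GL_TEXTURE_2D, 0, GL_RED_EXT, size.width, size.height,
             0, GL_RED_EXT, GL_HALF_FLOAT_OES, NULL);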


Quote:glBindFragDataLocation...doesn't seem to be available in ES. It does suggest that writing four values into gl_FragColor is causing the repetition, and the pattern is broken by the tiler?

BindFragDataLocation is only meaningful for MRT, where you need to direct multiple fragment outputs to multiple FBO attachments. ES2 does not support MRT (there is no DrawBuffers API), and all fragment outputs default to FragDataLocation zero at program link time, i.e. the single COLOR_ATTACHMENT0.

In ES2, gl_FragColor is a vec4. If your COLOR_ATTACHMENT0 has fewer than four components, the rules in "Conversion to Framebuffer-Attachable Image Components" apply. In your case, gl_FragColor.r will be written to a RED attachment. (gl_FragColor.a is still meaningful for raster operations like blending or alpha-to-coverage. gl_FragColor.gb will be ignored.)


(Jan 9, 2013 03:41 PM)Fenris Wrote:  I'm seeing badly botched rendering.

Lots of possible reasons, like forgetting to glClear the FBO (remember that the contents of a NULL allocation are undefined), or having blending enabled but not writing gl_FragColor.a.
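Something as dumb as this at the top of the offscreen pass rules both of those out (a sketch; fbo is whatever the RED texture is attached to):

Code:
// Give the freshly allocated attachment defined contents before drawing,
// and make sure blending isn't reading back undefined alpha.
glBindFramebuffer(GL_FRAMEBUFFER, fbo);
glViewport(0, 0, size.width, size.height);
glClearColor(0.0f, 0.0f, 0.0f, 0.0f);
glClear(GL_COLOR_BUFFER_BIT);
glDisable(GL_BLEND);   // or keep it enabled and write a meaningful gl_FragColor.a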


Quote:In the GPU trace, I get this warning:
Code:
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB, 240, 160, 0, GL_RGB, GL_HALF_FLOAT_OES, 3e2eb3c2)
: (invalid enum=0x8d61): Invalid enum for argument 'type'

Is that actually throwing INVALID_ENUM? That seems like a bug in the dev tools. The GL ultimately is mallocing storage for your NULL request, so perhaps the devtool is showing that pointer. But you'd hope it shows your original API arguments, so that also seems like a bug.
Oldtimer
Posts: 834
Joined: 2002.09
Post: #8
Wow, what a reply! :o
Thanks for taking the time. I still can't get it to run properly, but I'm fairly sure it's my fault again.

So now I get why I was confused about TexImage: there's quite a bit of state to track, and it explains why things worked here but not there and vice versa. I think I've got that part under control now, but a bit of history is always illuminating.

Quote:Performance wins from RED or RG will simply be due to bandwidth reductions during sampling or framebuffer writes.
...and yeah, quite a bit of performance: an extra 40%, since I'm hitting those single-channel FBOs quite a few times per frame.

Quote:In your case, gl_FragColor.r will be written to a RED attachment.
...I'm seeing badly botched rendering.
Yeah, I figured that one out. I was doing gl_FragColor = tex2D(...) + <float> when I should have done gl_FragColor = tex2D(...) + vec4.x; Not sure why it caused the splattering, but it works and I can see I was wrong.

Quote:Is that actually throwing INVALID_ENUM?
Turns out it wasn't. I'll see if I can replicate it in a smaller example, and I'll radar it.

Again, thanks. I think this is sorted, and I'll come back if the current problems aren't obvious mistakes.
Luminary
Posts: 5,143
Joined: 2002.04
Post: #9
Quote:Yeah, I figured that one out. I was doing gl_FragColor = tex2D(...) + <float> when I should have done gl_FragColor = tex2D(...) + vec4.x; Not sure why it caused the splattering, but it works and I can see I was wrong.

What do you mean by this? These two should be equivalent:

Code:
float f;

gl_FragColor = texture2D(...) + f;
gl_FragColor = texture2D(...) + vec4(f).x;

(both add f to all 4 components), but this is different:

Code:
gl_FragColor = texture2D(...) + vec4(f, 0.0, 0.0, 0.0);

(adding f only to the x component)

(just trying to make sure there isn't a shader compilation bug!)
Oldtimer
Posts: 834
Joined: 2002.09
Post: #10
OK, now I've spent a couple of days digging. I had an interesting bug where a number of black squares appeared randomly in one of my framebuffers and grew with every render pass. I clamped a value that might have gone NaN (or whatever the shader equivalent is), and that seems to have fixed it. If this seems serious, I can probably put together a fairly clean reproduction case.

OSC: I can't exactly remember my earlier configuration, but there was a difference:

Code:
vec4 outColor = texture2D(...);

outColor.x += f;    // This works
outColor += f;      // This is discussed below

gl_FragColor = outColor;

I accept that the latter version will add f to all channels of the texture. However, if the texture is HALF_FLOAT, GL_RED_EXT, then I got the splattered rendering above. I can sort of see how it happens, but I'm not sure it's kosher.

I wonder if perhaps I've misused renderbuffer targets and texture targets. The ES Programming Guide writes:

Quote:If the framebuffer is used to perform offscreen image processing, attach a renderbuffer. [...] If the framebuffer image is used as an input to a later rendering step, attach a texture.

I'm doing both, I guess. I'm running a multi-stage shader effect that does, well, image processing offscreen. However, each stage is an input to a later render stage (I render quads with the previous stage's texture as input). I should still use texture attachments, right?

As always, I appreciate your time.
Luminary
Posts: 5,143
Joined: 2002.04
Post: #11
If you need the image afterward, you should use a texture attachment. If you don't (e.g. it's a depth or stencil buffer you never need again), you should use a renderbuffer attachment.
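In code terms, roughly (a sketch; colorTex, depthRb, width and height are placeholders for your own objects):

Code:
// Color output that later passes will sample: use a texture attachment.
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                       GL_TEXTURE_2D, colorTex, 0);

// Depth that is only needed during this pass: a renderbuffer is enough.
glBindRenderbuffer(GL_RENDERBUFFER, depthRb);
glRenderbufferStorage(GL_RENDERBUFFER, GL_DEPTH_COMPONENT16, width, height);
glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT,
                          GL_RENDERBUFFER, depthRb);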

The EXT_texture_rg spec seems clear that x will be from the texture, y and z will be 0, and w will be 1, so if you think you're not seeing that, or think that specific instructions in the shader are voiding that guarantee, it is worth filing a bug.
Oldtimer
Posts: 834
Joined: 2002.09
Post: #12
Excellent, that makes sense.
I'll see if I can replicate it in a small example and file a bug.
I think this case is closed now; thanks for your help. Let's keep moving!