Manually antialiased primitives using texturing


arekkusu
2003.11.03, 06:08 PM
I'm looking for info. The question is below, but first comes the background.


My problem:

New hardware (ATI Radeon 9600 family) dropped support for hardware AA polygons. AA points and lines aren't exposed by the current drivers either, so if you want any type of hardware AA, you have to use FSAA which eats VRAM/fillrate and looks bad.

By "looks bad", I mean in maximum quality mode (6 sample buffers) you get 6 intermediate AA shades, whereas hardware AA on a Radeon 7500 gives you 8. 2 or 4 sample buffers look even worse.


My goal:

Replace GL_POINTS, GL_LINES, GL_TRIANGLES, GL_QUADS and derivatives with nicely antialiased primitives, using texture filtering hardware, which is about the only thing still reliable across currently sold video cards. AA quality is higher priority than speed, although speed is also important. Primitives in 2D space are much higher priority than in 3D space. Essentially, I want an OpenGL implementation of CoreGraphics primitives.


Examples:

Points can be replaced by one alpha textured quad, with an appropriately AA'd circle graphic. Texture filtering keeps the AA sharp, and a texture size of around 64x64 is enough for my needs (512x512 would be needed for maximum fidelity, but I think bilinear filtering provides a close enough approximation when the enlargement is limited to <= 4x, and blended 512 px points start to eat fillrate pretty fast. Not to mention native GL points can't get bigger than 10 px on some hardware.) The AA graphic can also have custom falloff to vary between a hard-edged circle and a linear sphere. The point alpha texture is modulated with the current color or, with multitexturing, other textures.
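
A minimal sketch of how such a point texture could be generated, assuming a single alpha channel stored as floats. The function name and the `hardness` parameter are hypothetical, chosen only for illustration; the post does not say how the original falloff was computed.

```python
import math

def point_alpha_texture(size=64, hardness=1.0):
    """Return size x size alpha values in [0, 1] for a round point sprite.

    hardness=1.0 approximates a hard-edged disc whose edge fades over about
    one texel; hardness=0.0 fades linearly from the center to the rim.
    Both parameter names are assumptions made for this sketch.
    """
    radius = size / 2.0
    # Width of the fade band in texels: one texel when hard, the full radius when soft.
    fade = max(1.0, (1.0 - hardness) * radius)
    rows = []
    for y in range(size):
        row = []
        for x in range(size):
            # Distance from the texel center to the texture center.
            d = math.hypot(x + 0.5 - radius, y + 0.5 - radius)
            # 1 inside the disc, 0 outside, linear ramp across the fade band.
            alpha = (radius - d) / fade
            row.append(min(1.0, max(0.0, alpha)))
        rows.append(row)
    return rows
```

The resulting rows could be uploaded as an alpha-only texture and drawn on a quad; intermediate `hardness` values give the in-between falloffs.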

Line segments similarly can be replaced by one alpha textured quad, but now some additional math is required. You have to project the quad corners out from the line endpoints, which involves finding the perpendicular vector and normalizing. Once this is done, a texture of, e.g., 1x64 px can be stretched along the length of the line to smooth edges. You get 256 shades, so it comes out looking better than any native hardware implementation. Endpoints aren't smoothed, so they need extra attention if you use thick lines (again, native GL limitation of 1 or 10 px wide lines goes away.)
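
The corner projection described above can be sketched as follows. This is only an illustration in 2D; the function name and the corner ordering are assumptions, not taken from the original code.

```python
import math

def line_quad(p0, p1, width):
    """Return the four corners of a quad covering the segment p0-p1.

    Corners are ordered p0+n, p0-n, p1-n, p1+n, where n is the unit
    perpendicular scaled to half the width. Returns None for a
    zero-length segment, which has no defined direction.
    """
    dx, dy = p1[0] - p0[0], p1[1] - p0[1]
    length = math.hypot(dx, dy)
    if length == 0.0:
        return None
    # Rotate the direction 90 degrees and normalize, then scale to half width.
    nx = -dy / length * (width / 2.0)
    ny = dx / length * (width / 2.0)
    return [
        (p0[0] + nx, p0[1] + ny),
        (p0[0] - nx, p0[1] - ny),
        (p1[0] - nx, p1[1] - ny),
        (p1[0] + nx, p1[1] + ny),
    ]
```

For example, `line_quad((0, 0), (10, 0), 2)` gives corners at y = +1 and y = -1 on either side of the segment. The edge texture would then be mapped so that it runs from one side of the quad to the other, across the width.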


My remaining problem:

Polygons. I have not yet found a way to exactly duplicate the hardware AA, because it would involve subtracting alpha from some of the filled polygon, which might be doable with some destination alpha tricks but would really suck. What I do have is stroking the polygon edge with the textured line from the above example. This makes the polygon at least 1 px thicker than the aliased version, but I can probably live with that (the seams in natively AA'd tristrips wouldn't occur, at least...) There is also the problem of what to do at polygon vertices. If I stroke the line with a tristrip all around the boundary, GL will fill the vertex corners with one extra triangle, which looks pretty bad if you stroke with wide lines (hard corner, linear blending artifacts.) A better solution would be to fill the corners with the point texture to get rounded edges, but I am still working this out.

There is also a fairly large jump in the number of CPU-side calculations needed to get three normals per triangle, plus vertex normals if rounded corners are desired. And collinear edge cases have to be handled, so there are some extra compares around the actual drawing. And while interpolated color across the polygon looks all right with a wide stroke, I am not sure what to do with an existing texture. Repeat? Extend the border? Hrmmm.
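
As a rough illustration of the per-triangle work mentioned above, here is a sketch that computes one outward normal per edge and rejects collinear input. The function name and the absolute `eps` tolerance are assumptions for the example; a real implementation would likely scale the tolerance to the coordinate range.

```python
import math

def triangle_edge_normals(a, b, c, eps=1e-9):
    """Return three outward unit normals, one per edge (ab, bc, ca).

    Returns None when the vertices are collinear or coincident (within eps),
    since such a triangle has no interior to stroke around.
    """
    # Twice the signed area; the sign gives the winding order.
    area2 = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    if abs(area2) <= eps:
        return None
    sign = 1.0 if area2 > 0 else -1.0
    normals = []
    for p, q in ((a, b), (b, c), (c, a)):
        dx, dy = q[0] - p[0], q[1] - p[1]
        length = math.hypot(dx, dy)
        # For counter-clockwise winding the outward normal is (dy, -dx).
        normals.append((sign * dy / length, -sign * dx / length))
    return normals
```

Each normal can then be used to push a stroke quad outward from its edge, in the same way as the line case.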

So,


My question:

Can somebody point me towards existing implementations of manually antialiased GL primitives? I'm not interested in a full software rasterizer. I want to leverage the existing texturing hardware.

If you have ideas re: the polygon problems, I'm all ears. I'm also open to ATI vertex shader ideas, since the only hardware where I have to go through all this is guaranteed to have shader support. I don't know anything about writing shaders yet-- I imagine the normal vector part would be easy, but I think you cannot generate any new vertices? E.g. two line endpoints -> four AA quad points, three polygon points -> up to 12 AA polygon points.

kelvin
2003.11.04, 12:59 AM
radeon 9800 fsaa is what you want :)

arekkusu
2003.11.04, 01:04 AM
Re-read my post. FSAA looks like crap.

Mark Levin
2003.11.04, 02:31 AM
Perhaps you could store polygons in the stencil buffer and then run a blurring pixel shader that only affects the boundaries?

kelvin
2003.11.04, 01:34 PM
Originally posted by arekkusu
Re-read my post. FSAA looks like crap.

9800 not 9600 :p

They made an orders-of-magnitude improvement between the two chipsets. 9800 FSAA actually runs at full speed and doesn't look like crap, as it actually outlines the polygons via depth buffer comparison.

arekkusu
2003.12.04, 01:32 AM
Originally posted by kelvin
9800 FSAA actually runs at fullspeed and doesn't look like crap as it actually outlines the polygons via depth buffer comparison.

After spending a day testing 12 different video cards, I can tell you that you're wrong on both counts. The FSAA on the 9800 pro looks identical to the 9600 (x2 = 1 shade, x4 = 3 shades, x6 = 5 shades...)
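
The shade counts quoted above are what you would expect if every sample in a pixel carries equal weight, which is an assumption here rather than something the drivers document. A small sketch of that arithmetic, with a hypothetical function name:

```python
def fsaa_intermediate_shades(samples):
    """Distinct partial-coverage levels when each pixel has `samples` samples.

    A pixel can have 0..samples covered samples; excluding fully empty and
    fully covered leaves samples - 1 intermediate levels.
    """
    return [k / samples for k in range(1, samples)]
```

Under that assumption, 2, 4 and 6 samples give 1, 3 and 5 intermediate shades respectively.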

It still eats fillrate too. Here are some meaningless numbers applicable only to my limited test app (thousands of points/lines/triangles drawn per second):


               ALIASED     HARDWARE AA  FSAA X2     FSAA X4     FSAA X6     SOFTWARE AA
Radeon9800Pro  420/540/75  420/539/75   420/298/60  420/150/38  420/121/33  55/11/0


You can see the steady drop in line/triangle throughput as the FSAA buffers increase.

That said, the 9800 Pro was the fastest card of the 12...

Check some comparison pictures. (http://homepage.mac.com/arekkusu/bugs/invariance/) (warning, graphics-heavy)

skyhawk
2003.12.04, 01:42 AM
1) how much easier is FSAA to implement than other methods?
2) can I get that program you used? it looks nifty.

arekkusu
2003.12.04, 01:56 AM
Activating FSAA is as easy as adding two lines to your pixel format creation, and enabling MULTISAMPLE.

Activating regular hardware AA is as easy as enabling SMOOTH for each primitive (3 lines of code.)

Now, getting it to look correct on all hardware is very hard. I've got points 99% right. Still working on lines and triangles.

I'll post a third page ("so, what other options are there") when I get things working well, along with the program source.



Of course... the real advantage of FSAA is that the alpha blended edges work with any geometry without sorting (i.e. the multisampling happens at the scene level, instead of at the primitive level... hence, full-scene AA) which makes it a win for 3D. For apps where sorting is not an issue (2D: pdf, flash, paint apps...) there's no advantage whatsoever.

arekkusu
2004.01.14, 03:14 PM
Originally posted by arekkusu
Now, getting it to look correct on all hardware is very hard.

Update, points and lines 100% working (but not line strips...)
Triangles still bogus.

More concise discussion, new pictures and benchmarks here (http://homepage.mac.com/arekkusu/bugs/invariance/).

OneSadCookie
2004.01.14, 03:32 PM
How close are you to a complete OpenGL implementation of Quartz? That would be very cool :)

skyhawk
2004.01.14, 04:01 PM
Originally posted by arekkusu
Update, points and lines 100% working (but not line strips...)
Triangles still bogus.

More concise discussion, new pictures and benchmarks here (http://homepage.mac.com/arekkusu/bugs/invariance/).

I am absolutely stunned at your research. Has Apple hired you to fix it all yet?

DoG
2004.01.14, 04:34 PM
This is indeed something interesting to do. I have been thinking that creating solid polygons from lines, etc., and drawing them with multiple passes would be the easiest. It would at least get the shape of points and lines right, even if its anti-aliasing is probably not quite up to Quartz levels unless lots of passes are used.

arekkusu
2004.01.14, 05:21 PM
Originally posted by OneSadCookie
How close are you to a complete OpenGL implementation of Quartz? That would be very cool :)

That's not my intent, and beyond my ability (probably anyone's ability outside of Apple.) For starters, Quartz allows a coord system much bigger than any GPU's max viewport size, so you'd have to tile chunks in multiple passes when, e.g., printing. Or even stretching a window across two Cinema displays.

And speaking of printing of course you have to have portability across devices, which GL can't do. I know there are already GL->Postscript converters, so it's possible in theory... but I don't want to even think about it.

The biggest block is that Quartz has to be bit for bit identical across machines. My texturing approach isn't, because in GL, texturing (like everything else) is subject to the implementation details of the GPU, so there are tiny differences.

BUT, I'm sure there are plenty of developers who could live with that if it meant a 1000x speedup :) Just make it an optional Quartz rendering mode...

All I really want is hardware accelerated NSBezierPath. See? That's not too much to ask for, is it?

arekkusu
2004.01.14, 05:26 PM
Originally posted by skyhawk
I am absolutely stunned at your research. Has Apple hired you to fix it all yet?

Hrmmm. Not yet. But, errr, hmmmm. No comment.


Originally posted by DoG
I have been thinking that creating solid polygons from lines, etc, and drawing them with multiple passes would be the easiest.

That's the accumulation buffer method, and it's been around forever. It works fine, but it takes a lot of passes and is very slow, especially given that consumer GPUs don't have hardware acceleration for the accumulation buffer.
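
To show why the pass count matters, here is a software-only sketch of the idea: each pass is an aliased hit test at a different subpixel offset, and the passes are averaged. In GL the summing would be done with `glAccum`; the helper names and the regular grid of offsets below are assumptions for illustration, not a real jitter table.

```python
def grid_offsets(n):
    """n x n regular subpixel offsets within a pixel (a stand-in for a jitter table)."""
    return [((i + 0.5) / n, (j + 0.5) / n) for j in range(n) for i in range(n)]

def accumulate_coverage(inside, px, py, offsets):
    """Average aliased hit tests at jittered subpixel offsets for one pixel.

    `inside(x, y)` is any point-in-shape test; each offset corresponds to
    one full rendering pass in the accumulation buffer method.
    """
    hits = sum(1 for ox, oy in offsets if inside(px + ox, py + oy))
    return hits / len(offsets)
```

With 16 passes a pixel can take at most 17 distinct coverage values, so matching the 256 shades of the texture approach would take far more passes.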

The texturing approach is quite fast. It is just tricky getting every little detail right.

DoG
2004.01.14, 07:25 PM
You said you use mipmapping, but why? I don't see the point of doing that. I would assume you shrink the outline of all polygons by 1/2 px, then add a 1px border with a full color->transparent gradient. The most intensive part of this is getting the shrinking and border right, but it is quite simple, algorithmically.

arekkusu
2004.01.14, 09:44 PM
Originally posted by DoG
You said you use mipmapping, but why?

For triangles, you're right. You inset the border 0.5 px and draw strips around it which outset 0.5 px as well (currently that's not what I'm doing, but I ought to be.) That gets you a 1 px strip fading from full to zero alpha.

But, what if you don't want a 1 px strip? What if you want 17 px? or 59.034 px? How big is your texture then? What if you don't want a linear fade on the strip?

Just look at the point case. How do you draw a 3 px point? I do it with trilinear mipmapping, so it blends the 4 px and 2 px mipmaps. There is some blur as I mention on the page but this is cheaper and easier algorithmically than keeping textures for sizes 1..N. (I did try it that way too, at first.)
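
The 3 px case can be worked through with an idealized level-of-detail calculation. This sketch assumes a power-of-two base texture drawn at a uniform scale; real GPUs derive the level per pixel and may round differently, and the function name is hypothetical.

```python
import math

def mip_blend(base_size, drawn_size):
    """Return (level, next_level, weight_of_next) for an idealized trilinear lookup.

    Level k of a base_size texture is base_size / 2**k texels wide.
    Assumes base_size is a power of two; the result is clamped to the
    available levels, so magnification always uses level 0.
    """
    max_level = int(math.log2(base_size))
    lod = math.log2(base_size / drawn_size)
    lod = min(max(lod, 0.0), float(max_level))
    level = int(math.floor(lod))
    next_level = min(level + 1, max_level)
    return level, next_level, lod - level
```

For a 64 px base texture drawn at 3 px, `mip_blend(64, 3)` selects levels 4 and 5 (the 4 px and 2 px mipmaps) with a weight of about 0.415 on the smaller one.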

The implementation details will be up in a while. The details are actually not as simple as you'd think-- there are a lot of edge cases to consider. Think about what happens to the 0.5 px inset when the triangle is only 0.9 px wide.
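
One way to detect the thin-triangle case is to compare the inset distance with the triangle's inradius: if the largest inscribed circle is smaller than the inset, the three inset edges cross and no interior is left. This is a sketch of that check only, with hypothetical function names; how to draw the triangle when the check fails is a separate question.

```python
import math

def inradius(a, b, c):
    """Radius of the largest circle that fits inside triangle abc."""
    ab = math.hypot(b[0] - a[0], b[1] - a[1])
    bc = math.hypot(c[0] - b[0], c[1] - b[1])
    ca = math.hypot(a[0] - c[0], a[1] - c[1])
    perimeter = ab + bc + ca
    if perimeter == 0.0:
        return 0.0
    # Twice the unsigned area of the triangle.
    area2 = abs((b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0]))
    # inradius = area / semiperimeter = (area2 / 2) / (perimeter / 2)
    return area2 / perimeter

def can_inset(a, b, c, inset=0.5):
    """True when all three edges can move inward by `inset` and leave an interior."""
    return inradius(a, b, c) > inset
```

A triangle 10 px long and 0.9 px tall, such as (0, 0), (10, 0), (5, 0.9), has an inradius of roughly 0.45 px, so a 0.5 px inset is not possible for it.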

DoG
2004.01.15, 06:28 AM
Originally posted by arekkusu
For triangles, you're right. You inset the border 0.5 px and draw strips around it which outset 0.5 px as well (currently that's not what I'm doing, but I ought to be.) That gets you a 1 px strip fading from full to zero alpha.

But, what if you don't want a 1 px strip? What if you want 17 px? or 59.034 px? How big is your texture then? What if you don't want a linear fade on the strip?


I assume that, since anti-aliasing is considered, the border strip is always 1px, and you always want a linear fade. I don't see where you would use other widths, because it would just blur. And besides, if you can generate a 1px border, you can also do 17px or whatever.


Originally posted by arekkusu
Just look at the point case. How do you draw a 3 px point? I do it with trilinear mipmapping, so it blends the 4 px and 2 px mipmaps. There is some blur as I mention on the page but this is cheaper and easier algorithmically than keeping textures for sizes 1..N. (I did try it that way too, at first.)


I would personally draw a point by drawing a polygon disk with the mentioned 1px border around the disk. Maybe not as fast as a textured quad, but guaranteed more accurate.

In fact, I would draw lines the same way, by making them polygons, not by using textures.


Originally posted by arekkusu
The implementation details will be up in a while. The details are actually not as simple as you'd think-- there are a lot of edge cases to consider. Think about what happens to the 0.5 px inset when the triangle is only 0.9 px wide.

Of course, you could use a general bordering algorithm which is basically independent of border and inset width (but you set it to 1px anyway), and can deal with degenerate cases, where the border has to be changed as in the above case. All I can say is Voronoi spaces. I am fairly certain it would not be hard even to come up with such an algorithm and less so to implement it.

arekkusu
2004.01.15, 12:32 PM
Originally posted by DoG
I would personally draw a point by drawing a polygon disk with the mentioned 1px border around the disk. In fact, I would draw lines the same way, by making them polygons, not by using textures.

That works, just using a 1 px texture everywhere. But my original intention is to use these antialiased shapes in a paint program. So I decided to turn the workaround for lack of antialiasing into another feature... using variable sized textures allows fuzzy shapes with spherical falloff:

http://homepage.mac.com/arekkusu/bugs/t/AA_fuzzy.png

(note triangle bogosity)

kelvin
2004.01.16, 03:39 AM
Originally posted by arekkusu
After spending a day testing 12 different video cards, I can tell you that you're wrong on both counts. The FSAA on the 9800 pro looks identical to the 9600 (x2 = 1 shade, x4 = 3 shades, x6 = 5 shades...)

It still eats fillrate too. Here are some meaningless numbers applicable only to my limited test app (thousands of points/lines/triangles drawn per second):


               ALIASED     HARDWARE AA  FSAA X2     FSAA X4     FSAA X6     SOFTWARE AA
Radeon9800Pro  420/540/75  420/539/75   420/298/60  420/150/38  420/121/33  55/11/0


You can see the steady drop in line/triangle throughput as the FSAA buffers increase.

That said, the 9800 Pro was the fastest card of the 12...

Check some comparison pictures. (http://homepage.mac.com/arekkusu/bugs/invariance/) (warning, graphics-heavy)

You neglect to mention SmoothVision 2.1, which the 9800 has and the 9600 does not. SV2.1 makes a helluva lot of performance difference, and if you didn't test with it your results are skewed. Go figure out how to turn it on before making broad statements about 9800 and 9600 Radeons.