Random Optimization Questions
To expand slightly, there are 3 functions for measuring algorithmic complexity. O(f) is an upper bound on the complexity; Î©(f) is a lower bound on the complexity, and Î˜(f) is a tight bound on the complexity (which only exists where O(f) = Î©(f)).
For example, "quicksort" is Î©(n log n) but O(n^2), where "insertion sort" is Î©(n) but O(n^2) and "merge sort" is Î˜(n log n).
Notice that these say nothing about constant factors, how often the best or worst cases actually get hit or anything, they're just an indication of proportionally how much longer the algorithm will take to run as its input size increases.
As a kind of throw-away line, it's usually best to keep ones algorithms inside O(n log n) if at all possible
For example, "quicksort" is Î©(n log n) but O(n^2), where "insertion sort" is Î©(n) but O(n^2) and "merge sort" is Î˜(n log n).
Notice that these say nothing about constant factors, how often the best or worst cases actually get hit or anything, they're just an indication of proportionally how much longer the algorithm will take to run as its input size increases.
As a kind of throw-away line, it's usually best to keep ones algorithms inside O(n log n) if at all possible
I think I might actually get that! Maybe...
So in my case I'm searching for duplicates for each vertex, so the first operation is O(n). Then I'm doing a hash lookup, but I don't know what the efficiency of that is. Perhaps in the best case scenario it's O(1) (O(Î©) right?) and the worst case it's O(n) (O(f) right?) as Josh suggested. Could you then say that an unknown hash algorithm could be expected to yield O(n/2) as an average of Î© and f? If so, then I guess the vertex removal would work out to O(n * n/2) or O((n^2)/2) right? But according to my results with the NSDictionary it sounds more like I got similar benefits to akb's tree route, but perhaps not quite as good. Would it then be okay to say I'd estimate that it's perhaps more like O(n 2log(n)) ?
So in my case I'm searching for duplicates for each vertex, so the first operation is O(n). Then I'm doing a hash lookup, but I don't know what the efficiency of that is. Perhaps in the best case scenario it's O(1) (O(Î©) right?) and the worst case it's O(n) (O(f) right?) as Josh suggested. Could you then say that an unknown hash algorithm could be expected to yield O(n/2) as an average of Î© and f? If so, then I guess the vertex removal would work out to O(n * n/2) or O((n^2)/2) right? But according to my results with the NSDictionary it sounds more like I got similar benefits to akb's tree route, but perhaps not quite as good. Would it then be okay to say I'd estimate that it's perhaps more like O(n 2log(n)) ?
Hmm... Well after doing a short Google on it, it appears that since the probability of O(n) happening in a hash look up is so small, both the best case and average case are considered O(1). That's pretty incredible, because according to how I understand this now, the vertex removal would work out to O(n * 1), or simply O(n). Wow! The only thing limiting speed is the hashing function, which in my case is a combination of formatting a string for the key (fairly slow I suppose) and whatever NSDictionary uses to hash that key behind the scenes. Oh, plus obj-c messaging overhead, but that appears to be an awful lot lower than I used to think it was!
[edit]I just realized that I don't think O(n) can't happen with an NSMutableDictionary because if an equivalent object is found in the dictionary it will be replaced with the one being added. And as I read somewhere in the Apple docs, an object is considered equal if its hash is equal, so I guess there won't be any collisions. I could be all wrong about this though...
[edit]I just realized that I don't think O(n) can't happen with an NSMutableDictionary because if an equivalent object is found in the dictionary it will be replaced with the one being added. And as I read somewhere in the Apple docs, an object is considered equal if its hash is equal, so I guess there won't be any collisions. I could be all wrong about this though...
Very interesting stuff...thanks for all of the answers and tips guys! ^_^
I tried to avoid the problem of fast loading times for meshes by running all of my files (which were VRML 2.0 (.wrl)) through a program that I made that 'unrolled' the format and put it into a .mesh custom format that I made. This reduces the load time considerably, and puts the waiting time on the artist/programmer when they're making the game, instead of the gamer waiting for the game to load.
Again, thanks for all of the tips
-wyrmmage
I tried to avoid the problem of fast loading times for meshes by running all of my files (which were VRML 2.0 (.wrl)) through a program that I made that 'unrolled' the format and put it into a .mesh custom format that I made. This reduces the load time considerably, and puts the waiting time on the artist/programmer when they're making the game, instead of the gamer waiting for the game to load.
Again, thanks for all of the tips
-wyrmmage
Worlds at War (Current Project) - http://www.awkward-games.com/forum/
wyrmmage Wrote:I tried to avoid the problem of fast loading times for meshes by running all of my files (which were VRML 2.0 (.wrl)) through a program that I made that 'unrolled' the format and put it into a .mesh custom format that I made. This reduces the load time considerably, and puts the waiting time on the artist/programmer when they're making the game, instead of the gamer waiting for the game to load.I would most certainly advocate loading pre-optimized formats for any final release of a game.
In my case, I wanted to be able to load obj files from various sources, and specifically from my own 3D editors *during development*. I can't stand having to filter output into my own format every single time I want to try out a new object. A person could make the argument that you could write an export plugin for your 3D editor, and that is totally true, and something I have done as well. But the converse to that is: what about downloading models off the internet to act as stand-ins or to try them out? In that case, obj is pretty widely accepted, so I went with it for that purpose. So in this case (development) it makes lots of sense to do filtering on load. Heck I even added automatic tessellation a few days ago, so now I don't even have to bother triangulating before I export from whatever 3D editor I'm in at the time. A lot of models out there aren't made for games, and so contain un-optimized geometry such as n-sided polygons, and even complex polygons -- specifically, I have run across concave polygons in an obj file, even though I think the obj spec says there shouldn't be any concave polygons. So the bottom line is that during development, load times aren't about the gamer but me! I need to retain my sanity too.
[adding] I would much rather put up with seven seconds on load for an unknown 75,000 polygon model, rather than having to import that into a 3D editor, select options to triangulate it, export it back to disk, filter it through my own optimizer, and *then* load it. If I only did that once in a while it wouldn't bother me, but if you do that more than a dozen times in a day it gets old real fast!
AnotherJake Wrote:[edit]I just realized that I don't think O(n) can't happen with an NSMutableDictionary because if an equivalent object is found in the dictionary it will be replaced with the one being added. And as I read somewhere in the Apple docs, an object is considered equal if its hash is equal, so I guess there won't be any collisions. I could be all wrong about this though...
You are
If two objects are equal (as determined by isEqual, then they have to have the same hash. However, this doesn't work in reverse. Two objects can have the same hash yet be not equal. As an example, NSString considers the first eight characters, the last eight characters and the length when creating a hash value. It's easy to construct two strings that are not equal but give equal hash values. An example can be found here: http://www.omnigroup.com/mailman/archive...45655.html (first thing that popped up in Google)
The hash table would be Î©(1), but would be O(n) for a lookup. (remember: Î© is the lower bound, O is the upper bound) So your overall algorithm is Î©(n), but O(n^2). However, there are also other things you must consider, mainly when you're storing the vertices: inserting a single item is actually Î©(1) and O(n^2). This is because for every item, you can insert it directly to where the hash function says to, but it's possible that you will need to visit every item to insert it. That by itself isn't that bad, but when the hash table gets full, it must then rehash the items, or allocate a new array and insert each item individually using the hash function. That's where you get the O(n^2) from: you have n items, each of which is O(n) to insert since it's possible that it would need to visit every position in the table. Therefore, to insert each of your vertices into the hash table in the beginning, it's Î©(n), but it's O(n^3). Its average case isn't nearly as bad as n^3, probably much closer to between n and n^2 since you will need to rehash throughout your insert, but this is why I chose not to use a hash table.
I generally use a hash table when I have relatively few inserts, but lots of lookups, but a tree-based map when I have around the same number of inserts and lookups. (maybe somebody with more experience, such as OSC, can input their advice about this tactic?) When I do vertex elimination, I actually let the insert do all the work (using collisions to determine what index to use), then use an iterator to copy the vertices to an array when I'm done. In that case, I have n inserts but 0 lookups. (the time complexity is still the same as your method just with a tree instead of a hash table, though: Î˜(n*lg(n)); with what I said above, your overall algorithm with a hash table would be Î©(n), but O(n^3), though the average case isn't nearly as bad as the worst case)
I generally use a hash table when I have relatively few inserts, but lots of lookups, but a tree-based map when I have around the same number of inserts and lookups. (maybe somebody with more experience, such as OSC, can input their advice about this tactic?) When I do vertex elimination, I actually let the insert do all the work (using collisions to determine what index to use), then use an iterator to copy the vertices to an array when I'm done. In that case, I have n inserts but 0 lookups. (the time complexity is still the same as your method just with a tree instead of a hash table, though: Î˜(n*lg(n)); with what I said above, your overall algorithm with a hash table would be Î©(n), but O(n^3), though the average case isn't nearly as bad as the worst case)
Cochrane Wrote:You areThanks for pointing that out! I was wondering what the hash for strings would be. Unfortunately, that example you linked to must be out of date because I tried it and it did return two different hashes. Now I gotta go find out what the current rules are on string hashing and find out when they changed, because there could be a significant number of digits in the keys I'm generating, depending on the model, and that'd really screw things up. My confidence in the solidity of my method of using NSDictionary for duplicate vertex removal just took a big hit.
If two objects are equal (as determined by isEqual, then they have to have the same hash. However, this doesn't work in reverse. Two objects can have the same hash yet be not equal. As an example, NSString considers the first eight characters, the last eight characters and the length when creating a hash value. It's easy to construct two strings that are not equal but give equal hash values. An example can be found here: http://www.omnigroup.com/mailman/archive...45655.html (first thing that popped up in Google)
akb825 Wrote:The hash table would be Î©(1), but would be O(n) for a lookup. (remember: Î© is the lower bound, O is the upper bound) So your overall algorithm is Î©(n), but O(n^2). However, there are also other things you must consider, mainly when you're storing the vertices: inserting a single item is actually Î©(1) and O(n^2). This is becau ... blahblahblahblah...
Yeah, okay, I totally can't follow that. I guess I'm stuck in ignorance for now. Blissfully so though.
AnotherJake Wrote:It's good to know. No sense optimizing an implemention of an inefficient algorithm.
Yeah, okay, I totally can't follow that. I guess I'm stuck in ignorance for now. Blissfully so though.
http://en.wikipedia.org/wiki/Polynomial_time
-Jon
Thanks for the link and the little extra push, Jon.
My issue at the moment is not optimization of my method though; it's that according to that link Cochrane put up, it might not work at all! Although it has worked perfectly so far, so I don't know what's up with that yet. The speed is absolutely wonderful as far as I'm concerned, and I'd be pretty hard-pressed to beat the code simplicity. But if it can break, it's pretty much just a quick hack which may or may not work, depending on it's input. Doing a tree may very well be best in this situation after all (as I initially thought it would be), but I still haven't figured out a bullet-proof way to make a search key for that either.
BTW akb, after reading your post another ten times, it does start to make a little more sense to me. Looking at it from the standpoint of a hobbiest, this whole thing on polynomial time and big-O notation is easily one of the tougher things I've ever attempted to grasp. I really appreciate everybody's attempt here to shed a bit of light on my dim understanding of it.
My issue at the moment is not optimization of my method though; it's that according to that link Cochrane put up, it might not work at all! Although it has worked perfectly so far, so I don't know what's up with that yet. The speed is absolutely wonderful as far as I'm concerned, and I'd be pretty hard-pressed to beat the code simplicity. But if it can break, it's pretty much just a quick hack which may or may not work, depending on it's input. Doing a tree may very well be best in this situation after all (as I initially thought it would be), but I still haven't figured out a bullet-proof way to make a search key for that either.
BTW akb, after reading your post another ten times, it does start to make a little more sense to me. Looking at it from the standpoint of a hobbiest, this whole thing on polynomial time and big-O notation is easily one of the tougher things I've ever attempted to grasp. I really appreciate everybody's attempt here to shed a bit of light on my dim understanding of it.
This is the main idea with Big-O notation: Î©(f(n)) gives you the asymptotic lower bounds of an algorithm. That means that your algorithm can be no faster than the cost function f(n). O(g(n)) gives you the asymptotic upper bounds of an algorithm, which means that your algorithm can be no slower than the cost function g(n). If f(n) == g(n), then it is Î˜(f(n)), so your algorithm is directly proportional to f(n). In the case of a hash table, looking up an item could hit it directly each time if you have perfect hashing, so it is Î©(1). However, if you have lots of collisions, it's possible (though highly unlikely) that you need to visit each item, so it is O(n). There are similar cases for insertion, which I mentioned in my last post.
For your algorithm not working, you could make a simple test to see if the hash table checks to see if strings with identical hash values are stored separately. Otherwise, you could use a tree. For the stl map (which is what I've used), you only need to define less-then relations. What I do is I go element by element to see if they are less than the corresponding element. For example, I first check to see if this object's x position is < the other object's x position. If so, I return true (this object < the other object). If this x position > the other x position, then I return false. Otherwise, if they are equal, I move on tot the y direction, then the z direction, then the normal's x component, and so on. This works fine if you already have exact copies of the vertices (such as loading from a file), but if you're doing any kinds of calculations where they might be off by a rounding error, you may need to add an epsilon in your compares to ensure you aren't treating 2 equal vertices as unequal by a 0.0000000001 discrepancy.
For your algorithm not working, you could make a simple test to see if the hash table checks to see if strings with identical hash values are stored separately. Otherwise, you could use a tree. For the stl map (which is what I've used), you only need to define less-then relations. What I do is I go element by element to see if they are less than the corresponding element. For example, I first check to see if this object's x position is < the other object's x position. If so, I return true (this object < the other object). If this x position > the other x position, then I return false. Otherwise, if they are equal, I move on tot the y direction, then the z direction, then the normal's x component, and so on. This works fine if you already have exact copies of the vertices (such as loading from a file), but if you're doing any kinds of calculations where they might be off by a rounding error, you may need to add an epsilon in your compares to ensure you aren't treating 2 equal vertices as unequal by a 0.0000000001 discrepancy.
AnotherJake Wrote:Thanks for the link and the little extra push, Jon.
My issue at the moment is not optimization of my method though; it's that according to that link Cochrane put up, it might not work at all! Although it has worked perfectly so far, so I don't know what's up with that yet. The speed is absolutely wonderful as far as I'm concerned, and I'd be pretty hard-pressed to beat the code simplicity. But if it can break, it's pretty much just a quick hack which may or may not work, depending on it's input. Doing a tree may very well be best in this situation after all (as I initially thought it would be), but I still haven't figured out a bullet-proof way to make a search key for that either.
BTW akb, after reading your post another ten times, it does start to make a little more sense to me. Looking at it from the standpoint of a hobbiest, this whole thing on polynomial time and big-O notation is easily one of the tougher things I've ever attempted to grasp. I really appreciate everybody's attempt here to shed a bit of light on my dim understanding of it.
Don't worry too much about that. In fact, if I understand correctly what you are doing (i.e. using an NSDictionary instead of just using hash directly), don't worry about it at all. NSDictionary will make sure that even if the hash is the same, different objects still get treated as different objects. If you are writing your own code for this purpose, just do an isEqual once you know that the hashes are equal.
oh, another question: anyone know of a profiler that works well will .dylibs (I also need one that works well with .dll's on Windows) that is free?
Thanks,
-wyrmmage
Thanks,
-wyrmmage
Worlds at War (Current Project) - http://www.awkward-games.com/forum/
This thread is starting to seem like it might need a split.
Cochrane: Yeah, I'm just using NSDictionary, as listed exactly in the code snippet I put up above. I'm concerned with the example you linked to because he's saying that it only takes the last eight bytes and the first eight bytes to generate the hash from an NSString. Since I tried out his example, I can say that that does not hold true on my machine. But that doesn't change the fact that I still have been unable to find out whether or not generating a key by formatting a string with floats from my vertex data will screw up. It does say in the NSString documentation that the hash value it returns can not be relied upon across different versions of OS X, but I don't care about the value, only the method. I don't even know if NSDictionary uses that hash or creates it's own. Further, how many characters (or bytes) will it use for calculating a hash? Can I rely on NSDictionary to handle this consistently across different OS versions? All I know is that it appears to work perfectly right now, even with testing up to like 250,000 unique vertices (all loading and tessellation and duplicate removal via NSDictionary done in about seven seconds). But will that break down if I have a model with say, coordinates like 32134345345345.000000? Not that I will, but I wonder how many digits I can get away with? Since it works so far, the only sensible thing I can think to do right now is just keep an eye out for failures, since I don't know what's going on behind the scenes with NSDictionary.
akb825: I'm still trying to grasp the big-O concept. I really appreciate all your help so far! Let me run these thoughts by you, and maybe you can tell me if I'm getting warmer or colder:
I've been seeing O(n) referred to most often, presumably because O is the value we'd be most interested in when evaluating algorithmic efficiency. IOW, it's the worst-case and we're trying to design conservatively. A tree could be Î©(1), a linear search could be Î©(1), a hash lookup could be Î©(1). But (as an example) for a linear search it could just as well be O(n) and that is the value we assume for the worst case. In the case of vertices we're searching n (number of vertices) times through a list of n unique vertices we've already added to our array. So the worst case is O(n * n), or O(n^2). Best case is Î©(n). That's how I understood it earlier for the linear search. Insertion in this case is always O(1) because each vertex would be simply tacked onto the end of the array (according to Jon's link, that's called constant time). You could say it's Î˜(1) as well, since Î©(1) is also the best case, right?
"So your overall algorithm is Î©(n), but O(n^2)", which makes sense because I'm doing n lookups and if everything went perfectly on every lookup using the NSDictionary, it'd only have to iterate over one vertex every search (thus yielding Î©(n)). But if the hashing went poorly on insertion into the dictionary and they collided every time, I could be searching in one line of the table through all the existing vertices, for every lookup (thus yielding O(n*n)). Am I looking at this correctly?
Then you pointed out that this is the same case for insertion in the table as well (which I indeed did not think about). So you aren't just doing a lookup, you're also doing an insertion at some point, and that has a cost too -- specifically, it costs Î©(1) at best and O(n) at worst for each insertion. But again, since this has to be done n times, it actually winds up with a total cost of between Î©(n) and O(n^2) to build the table.
I think I'm following along okay up to here. But after that I'm hazy again.
If I have an insertion cost of between Î©(n) and O(n^2), and I have a lookup cost of Î©(n) and O(n^2), those need to be combined to get my overall cost with my NSDictionary vertex removal method don't they? If they were combined, wouldn't that be Î©(n^2) but O(n^4)? I still don't understand how you came up with Î©(n), but O(n^3) for it.
As to the float compares after calculations, so far I've been careful to structure the loader class so that I don't do calcs until after lookup. At one point in the development iteration I was doing some float compares but I was using something like if (fabs(v1.x - v2.x) < tolerance), etc. for that. Obviously that's not nearly as efficient as simply comparing the floats straight out of the file, so I re-wrote it to avoid that (at least for now).
Cochrane: Yeah, I'm just using NSDictionary, as listed exactly in the code snippet I put up above. I'm concerned with the example you linked to because he's saying that it only takes the last eight bytes and the first eight bytes to generate the hash from an NSString. Since I tried out his example, I can say that that does not hold true on my machine. But that doesn't change the fact that I still have been unable to find out whether or not generating a key by formatting a string with floats from my vertex data will screw up. It does say in the NSString documentation that the hash value it returns can not be relied upon across different versions of OS X, but I don't care about the value, only the method. I don't even know if NSDictionary uses that hash or creates it's own. Further, how many characters (or bytes) will it use for calculating a hash? Can I rely on NSDictionary to handle this consistently across different OS versions? All I know is that it appears to work perfectly right now, even with testing up to like 250,000 unique vertices (all loading and tessellation and duplicate removal via NSDictionary done in about seven seconds). But will that break down if I have a model with say, coordinates like 32134345345345.000000? Not that I will, but I wonder how many digits I can get away with? Since it works so far, the only sensible thing I can think to do right now is just keep an eye out for failures, since I don't know what's going on behind the scenes with NSDictionary.
akb825: I'm still trying to grasp the big-O concept. I really appreciate all your help so far! Let me run these thoughts by you, and maybe you can tell me if I'm getting warmer or colder:
I've been seeing O(n) referred to most often, presumably because O is the value we'd be most interested in when evaluating algorithmic efficiency. IOW, it's the worst-case and we're trying to design conservatively. A tree could be Î©(1), a linear search could be Î©(1), a hash lookup could be Î©(1). But (as an example) for a linear search it could just as well be O(n) and that is the value we assume for the worst case. In the case of vertices we're searching n (number of vertices) times through a list of n unique vertices we've already added to our array. So the worst case is O(n * n), or O(n^2). Best case is Î©(n). That's how I understood it earlier for the linear search. Insertion in this case is always O(1) because each vertex would be simply tacked onto the end of the array (according to Jon's link, that's called constant time). You could say it's Î˜(1) as well, since Î©(1) is also the best case, right?
"So your overall algorithm is Î©(n), but O(n^2)", which makes sense because I'm doing n lookups and if everything went perfectly on every lookup using the NSDictionary, it'd only have to iterate over one vertex every search (thus yielding Î©(n)). But if the hashing went poorly on insertion into the dictionary and they collided every time, I could be searching in one line of the table through all the existing vertices, for every lookup (thus yielding O(n*n)). Am I looking at this correctly?
Then you pointed out that this is the same case for insertion in the table as well (which I indeed did not think about). So you aren't just doing a lookup, you're also doing an insertion at some point, and that has a cost too -- specifically, it costs Î©(1) at best and O(n) at worst for each insertion. But again, since this has to be done n times, it actually winds up with a total cost of between Î©(n) and O(n^2) to build the table.
I think I'm following along okay up to here. But after that I'm hazy again.
If I have an insertion cost of between Î©(n) and O(n^2), and I have a lookup cost of Î©(n) and O(n^2), those need to be combined to get my overall cost with my NSDictionary vertex removal method don't they? If they were combined, wouldn't that be Î©(n^2) but O(n^4)? I still don't understand how you came up with Î©(n), but O(n^3) for it.
As to the float compares after calculations, so far I've been careful to structure the loader class so that I don't do calcs until after lookup. At one point in the development iteration I was doing some float compares but I was using something like if (fabs(v1.x - v2.x) < tolerance), etc. for that. Obviously that's not nearly as efficient as simply comparing the floats straight out of the file, so I re-wrote it to avoid that (at least for now).
Possibly Related Threads...
Thread: | Author | Replies: | Views: | Last Post | |
Obj-C optimization woes | NitroPye | 10 | 5,275 |
Apr 26, 2005 10:20 AM Last Post: NitroPye |