PDA

View Full Version : hash_map (and other c++ extensions) in osx


JeroMiya
2006.11.30, 11:41 PM
I couldn't get gcc to find #include <hash_map> the other day. I know it is an extension to the c++ standard, but I was under the impression that it was included in at least the 10.4 sdk, which I'm using. I've even found the header file existing somewhere in the 10.4 sdk subdirectories. Is there a switch or a #define or a -I<path> I'm missing somewhere?

OneSadCookie
2006.11.30, 11:49 PM
#include <ext/hash_map>

typedef __gnu_cxx::hash_map<int, int> intmap;

JeroMiya
2006.12.08, 11:01 AM
OK, after about 6 hours of debugging, I've discovered that hash_map exhibits pathological behavior in Mac OS X. (Successive calls to hash_map::find with the same key would sometimes work, sometimes not, sometimes work only every 4 out of 5 times, sometimes would find the element, but not modify it, sometimes it would, etc..). I switched to regular stl <map> and everything works beautifully now. Theoretically slower, but for a LISP interpreter, the performance difference is negligible.


Waste...
of..
time


-JeroMiya

akb825
2006.12.08, 02:21 PM
Where you using std::string as the key or char *? If you use char *, it's only going to keep a pointer to the string, so if you modify it the copy will be modified as well. You need to define a hashing functor class that just calls the hash function on the string's C string equivalent, then use std::string and that hashing class.

JeroMiya
2006.12.08, 05:53 PM
I was using const char*, and the eqstr equivalency functor from sgi's stl documentation. And hash<const char*> for the hash function. In truth I don't know if it was actually a bug in hash_map or just the way I was using hash_map, but I do know that once I switched my typedef from using a hash_map to just a map (without the hash function and without changing anything else), it all worked perfectly. I suspect it might have something to do with differences in the way iterators work with hash_map's verses with maps.

DoG
2006.12.08, 06:03 PM
By the way, gcc's manuals speak of unordered_map and unordered_set templates as a soon-to-be standard replacement for the ext/hash_* templates.

EDIT: after our talk on IRC the other day, I replaced lots of map<> types with hash_map<> and it yielded quite an impressive performance improvement.

akb825
2006.12.08, 06:27 PM
Yes, I've heard stories of people switching from map to hash_map and getting huge gains. I suppose that might be nullified if you need to constantly add and remove things, though, since hash tables aren't too happy when you have to re-hash them during a resize...

JeroMiya
2006.12.08, 07:03 PM
Yes, I've heard stories of people switching from map to hash_map and getting huge gains. I suppose that might be nullified if you need to constantly add and remove things, though, since hash tables aren't too happy when you have to re-hash them during a resize...

Well what I was writing was a LISP interpreter. The hash_map was mapping char* to Symbols. Symbols were never removed, although sometimes their value changed (key stays the same). A new entry was added to the symbol table (the hash map) every time the user defined a function or variable, but I didn't think that could be considered "frequent" enough as to break the hash_map.

akb825
2006.12.08, 07:20 PM
Did you use literal strings, or did you have a buffer you would constantly write into? If all you used were literal strings (using "" without any intermediate storage), you'd be fine, but if you used a buffer that you'd change, you're in trouble. Like I said, it copies the pointer for the key, it doesn't use strdup or anything like that to copy the actual data itself. (this would explain why it wouldn't work correctly)

To answer your question, as long as your tests greatly outweigh the number of times you insert something, you should be fine.

JeroMiya
2006.12.08, 08:05 PM
ah, well that was it then. I was passing in string::c_str() results into the hash_map, that's probably what broke it.

akb825
2006.12.08, 08:11 PM
Add this:
namespace __gnu_cxx
{
template<> struct hash< std::string >
{
size_t operator()( const std::string& x ) const
{
return hash< const char* >()( x.c_str() );
}
};
}
and pass that in to the hash_map as the hashing functor, and you'll be able to use normal strings. Note that you can also use regular C strings, as they'll automatically be converted into C++ string objects for you.