Dictionary Checking for word games

Posts: 27
Joined: 2012.05
Post: #1
So I have been working on a scrabble/jumble like word game for a while now, and I was wondering how I should do the dictionary checks for the game.

Should i run through the dictionary I have which has about 180,000 words everytime I wanna check whether the word is legal or not, or should I do some sort anagram testing before, and compare the words made to a smaller list of words?

What ae your suggestions?
Quote this message in a reply
Posts: 714
Joined: 2002.04
Post: #2
I'd suggest you begin by testing whether a linear search through the dictionary is fast enough for your needs – if it is, you need do nothing more Wink

If a linear search isn't fast enough for your needs, it is possible to optimise a linear search[1], or you could look into using eg. a hash table.

[1] Eg. keep the dictionary in order of word length, and keep a list of the indexes where the first word of length n occurs in the dictionary; then to test whether or not a word is legal you need to only search a portion of the dictionary (beginning at i = indexes[ strlen( word ) ], and – if the word isn't legal – ending when strlen( dictionary[ i ] ) != strlen( word ).)

Mark Bishop
Student and freelance OS X & iOS developer
Quote this message in a reply
Posts: 1,487
Joined: 2002.09
Post: #3
A neat hashing algorithm that I can't remember the name of applies here.

Basically you have a gigantic bit vector, say 100KB long. Then you need a hashing function that for each input word spits out say 10 bit positions. To build the hash, run through your dictionary file and set the 10 bit positions for each word to true. To check if a word is *probably* in the dictionary you check if the 10 bit positions are true.

While it does open up the possibility of false positives, even a bitvector that is a fraction of the size of the original dictionary size you can get very close to 100% accuracy. The /usr/share/dict/words file is 2.4MB as a comparison.

Aha! http://en.wikipedia.org/wiki/Bloom_filter

Scott Lembcke - Howling Moon Software
Author of Chipmunk Physics - A fast and simple rigid body physics library in C.
Quote this message in a reply
Posts: 112
Joined: 2002.06
Post: #4
Use a trie to store your word list in RAM and check for valid words. Tries are relatively easy to implement and are efficient in using memory. Another benefit of using tries is the speed of finding a word depends on the word length, not the size of the dictionary, so you can determine if a word is valid in a large dictionary quickly.

A trie is a tree-like data structure. At the top is the root of the trie. For a word game, you would have 26 nodes under the root, one for the letters A-Z. These nodes represent the first letter of words. Under those nodes would be nodes for the second letters of words. Then there would be nodes for the third letters and so on. You can find more information on tries, including links to source code implementations, in the following article:

Using Tries to Store Word Lists in RAM

Mark Szymczyk
Quote this message in a reply
⌘-R in Chief
Posts: 1,277
Joined: 2002.05
Post: #5
NSSet is an extremely simple choice. I threw 180,000 words into an array and a set, then tested both looking for 10,000 random words.

Linear Time: 0.008735 avg 0.024519 max
Set Time: 0.000001 avg 0.000010 max

The answer is on a Mac they're both ridiculously fast. On iOS, some significant factor slower.

However, writing a linear search is more code than a simple [NSSet containsObject:word.lowercaseString] check, so I'd say go with a set. Unless you're purposefully doing this the long way to learn something interesting.
Quote this message in a reply
Posts: 27
Joined: 2012.05
Post: #6
thanks a ton guys!!
Quote this message in a reply
Post Reply 

Possibly Related Threads...
Thread: Author Replies: Views: Last Post
  Word list for a spelling (i.e., Scrabble) type game monkeyfacebag 2 6,694 Jul 28, 2009 12:31 PM
Last Post: szymczyk
  Mathmatical formula for integer checking Jones 12 11,464 Jan 24, 2006 01:58 AM
Last Post: zKing
  how to init and get a dictionary working kensuguro 4 6,908 Oct 2, 2005 02:33 PM
Last Post: blobbo
  Dictionary vs Class Justin Brimm 3 5,210 Feb 5, 2003 04:56 PM
Last Post: Justin Brimm