Building a Word Search

Posted Sat Oct 06 @ 09:43:32 AM PDT 2012

I decided to build a word search generator. Surprisingly, it only took about an hour to code up my initial version in JavaScript. The gist of the algorithm is like this:

  1. Initialize a 2D array, grid, with n rows and m columns
  2. For each word, word, in the list of words you want to put in the word search:
    1. Randomly pick a row and column (r, c) such that (r, c) < (n, m) (i.e. don't pick something outside the bounds of the array)
    2. Randomly pick a direction, d, in the set {North-To-South, South-To-North, West-To-East, East-To-West, NorthWest-To-SouthEast, SouthEast-To-NorthWest, NorthEast-To-SouthWest, SouthWest-To-NorthEast}
    3. Do a test to see if the word can be placed in the direction d, without overflowing the bounds of the array. For example, if your direction was North-To-South, you would want to make sure word.length + row <= n. For NorthWest-To-SouthEast you would need to test word.length + row <= n and word.length + col <= m
    4. Starting at (r1, c1) = (r, c), for each letter, letter in word:
      1. Make sure grid[r1][c1] is empty or grid[r1][c1] == letter (in this latter case, another letter was placed at grid[r1][c1] but because the letters are the same, there is nothing wrong [words are allowed to intersect on a word search])
      2. Update (r1, c1) appropriately based on the direction, d. For example, if the direction is North-To-South, leave c1 alone, but increment r1. For SouthEast-To-NorthWest do (r1, c1) = (r1 - 1, c1 - 1)
    5. If you get through all the previous tests, you can finally place the word on grid. Starting at (r1, c1) = (r, c), for each letter, letter in word:
      1. Set grid[r1][c1] = letter
      2. Update (r1, c1) appropriately based on the direction, d
  3. Now walk through your entire grid, and if grid[r][c] is empty, put a random letter in that cell

Here is a pretty printed example with the greek alphabet:

 01234567890123456789
0PNHVKLEJDNFZMPLOVOSF
1PCBDQRAPXKQLYXKDOMKM
2GZZZGFGUZFVQRNOEMEGR
3GAMMATVBBLHWXKKLIGQI
4TUSNEJHOOLGNWAUTCAXH
5XCEYVGHEZEXATPNARTZP
6ZUMLHYYNTHCRPPPAODJH
7GIMPCBETAASEPASXNJKG
8NWCVLJWNIJOZHZETATOJ
9DFIITRFHHOKEIYNYKLHS
0DGICHISIGMAALPHAKYRI
1IBURMSNHAUPYMBJPIYED
2SWQEPSILONIYKHQJQCTD
3FZGVULYPICXORFGFDPAZ
4ZHVIQQPZUFPMTYFSHSTH
5HSSCUUHZPDZZOALYLGJE
6ZWHBLAMBDASIBNERMXHM
7PDMPUPSILONSVTAUJIIU
8RHVSUZMBSCZPOMPPZFVX
9YANUXCFKAQFXWQIVOWEI

See any problems (besides my row and column numbering scheme, which is cutoff)? How about the word "HELL" starting at (row, col) = (6, 9) going South-To-North, or "CAWK" at (row, col) = (6, 10) going SouthWest-To-NorthEast? Some 5th grader would spend the whole day looking for words like that in his word search.

Now I'm working on a much harder problem: How to detect and eliminate words like that from generated word searches. I already came up with a clever (if I do say so myself), fast way to detect badwords. Unfortunately, I haven't been able to design a good way to eliminate them from the word search. Just randomly changing the letters in the badword to something else doesn't necessarily work, because it might create another badword somewhere else on the grid (or in the exact same spot).

<< Home