Have you ever been stuck while trying to solve a scrambled-word puzzle? You stare and stare at the letters, but no word reveals itself? You are stumped. Stymied.
I hope you didn't get stumped on the word puzzle I posted as an anniversary present for my wife. She breezed through it.
In a previous post, I showed that you can scramble and unscramble words by using permutations. I also showed that you can generate the set of all permutations on N elements. By putting these techniques together, you can write a SAS/IML program that solves scrambled-word puzzles.
Here's what you need:
- The PermuteChars module from my previous post.
- SAS data sets that contains the set of all permutations for N elements. In the scrambled-word game that appears in my local paper, the words have either five or six letters. The corresponding SAS data sets are named perm5 and perm6.
- A SAS data set that contains a list of English words.
A Dictionary of Words
In this internet age, it is easy to obtain a file that contains a dictionary of words. Some word lists are small. For example, the "Unix dictionary" contains just over 25 thousand entries. Others are quite a bit larger. I like the official Scrabble™ player's dictionary (OSPD) because it limits itself to words with eight or fewer letters.
Download your favorite wordlist and name the file "wordlist.txt." I used the following DATA step code to convert the word list to a SAS data set:
filename wordlist "C:Documents and Settings...wordlist.txt"; data Dictionary; /** keep first 8 chars unless you need longer words **/ length word $8.; infile wordlist; input word $; run; |
A SAS/IML Program to Unscramble Words
Suppose you have a list of 5- and 6-letter scrambled words that you want to unscramble. For this example, I will use the following set of scrambled statistical words:
proc iml; target = upcase({NAGER, RRREO, ORRCDE, LROANM}); |
You can read the dictionary and the permutations into SAS/IML matrices:
/** read word list; convert to uppercase **/ use Dictionary; read all var {Word}; close Dictionary; Word = upcase(Word); /** read permutations for 5- an 6-letter words **/ use perm5; read all var _NUM_ into p5; close perm5; use perm6; read all var _NUM_ into p6; close perm6; |
The following algorithm unscrambles the words:
free solution; do i = 1 to nrow(target); /** for each word **/ w = strip(target[i]); if length(w)=5 then p = p5; else p = p6; /** use permutation for N elements **/ idx = loc( length(Word) = length(w) ); list = Word[idx]; /** limit word list **/ foundWord = 0; /** how many permutations result in a word? **/ do n = 1 to nrow(p) while (foundWord < 5); /** for each permutation **/ perm = p[n, ]; s = PermuteChars(w, perm); /** apply permutation **/ idx = loc(list = s); /** check if this is a word **/ if ncol(idx) > 0 then do; /** remember solution **/ foundWord = foundWord + 1; solution = solution // (s + " is a solution for " + w); end; end; end; print solution; |
You can see that people who create scrambled-word puzzles need to be careful that there are not two or more permutations that each lead to valid words. For the letters NAGER, both RANGE and ANGER are common words that solve the puzzle. (I'd exclude REGNA.) Also notice that words with repeated letters, such as RRREO, are easier to solve in the sense that there are multiple permutations that unscramble the letters
So next time you are stuck, just fire up this SAS program to solve scrambled word puzzles.