02.19.09
Sky-High Boggle Scores with Simulated Annealing
Don’t let the sixteen month hiatus fool you. There’s just no end to Boggle posts on danvk.org!
In case you’d forgot, we’ve developed a blazing fast boggle solver capable of scoring 10,000 Boggle boards a second. What to do with this? Other than some interesting analyses, the most interesting question is:
What is the highest-scoring Boggle Board?
In this post, we’ll try to answer that question using Simulated Annealing. Here’s a sneak peak at one of the exceptionally word-rich boards we’ll find:
p | e | r | s |
l | a | t | g |
s | i | n | e |
t | e | r | s |
3625 points |
Follow me past the fold for more…
Here’s the basic idea of Simulated Annealing:
- Generate a random board.
- Score it.
- Change the board somehow.
- Score it.
- Based on the scores, decide whether to keep the old board or the new one.
- Repeat from step 3.
If we make good decisions in step #5, the boards we keep will be progressively higher scoring.
The big decisions come in steps #3 and #5. What does it mean to change a board?
I decided to use two basic permutations to change the board: re-rolling a die:
|
→ |
|
and swapping two dice:
|
→ |
|
Clearly one of these changes was beneficial and the other was not.
Generally speaking, we should accept the changes that increase the score in step #5 and reject the changes that don’t. This is a perfectly reasonable strategy. It’s called Gradient Descent and is widely-used.
With Simulated Annealing, we sometimes go with the lower-scoring board. The idea is to not get stuck in a rut. Maybe our board is pretty good. It’s scores higher than all the boards around it. But there are lots of boards a little more distant that are even higher-scoring. With Simulated Annealing, we can take a short-term loss for a long-term gain.
The exact details aren’t too interesting. If they interest you, take a look at my code.
Now for the fun part! Here are the results of a sample run. I’ve concatenated the letters on each row of the board so that they’ll fit on a single line.
Gen. | Score | Reign | Board |
---|---|---|---|
0 | 19 | 1 | kqiqwhyyjthdsibk |
1 | 4 | 1 | iqiqjhyywthdskbk |
2 | 5 | 1 | ibiqjhyywthdskqk |
3 | 9 | 1 | ibiqjhcywthdskqk |
4 | 10 | 1 | ibiqjychsthdwkqk |
5 | 8 | 1 | ibicjyqhsthdwkqk |
6 | 9 | 1 | ibicjyqdsthywkqk |
7 | 23 | 1 | ibicjygdhtsywkqk |
8 | 16 | 1 | ibncjygdhtsywkqa |
9 | 15 | 1 | ibncyygdhtsymkqa |
10 | 15 | 1 | ibncyygwhtsytkda |
11 | 38 | 1 | ibncyygahtsyhkdw |
12 | 146 | 2 | ibscyogahtnyhkdw |
14 | 140 | 4 | ibscyogahtnbhkdw |
18 | 165 | 2 | ibbcyogahtnshkdw |
20 | 159 | 1 | ibbcdogahynshktw |
21 | 169 | 1 | ibbcdogahynshmtw |
22 | 151 | 4 | ibbcdogahynshbtw |
26 | 150 | 3 | hbbcdogaiynshbtw |
29 | 142 | 2 | hbbcdogagynshbtw |
31 | 148 | 1 | hbbcdogaygnshbtw |
32 | 151 | 1 | hbbqdogaygnshbtw |
33 | 174 | 7 | hbtmdogaygnshbtw |
40 | 175 | 2 | hdtmbogaygnshbtw |
42 | 172 | 3 | hdtmbogaygnstbtw |
45 | 165 | 1 | hdtmbogaygnetbtw |
46 | 274 | 1 | hdtmbogaygnetbts |
47 | 298 | 5 | hdtmbogabgnetyts |
52 | 296 | 8 | hdtmbogabgnswyte |
60 | 387 | 1 | hdtmboeabgnswytg |
61 | 403 | 7 | hdtgboeabmnswytg |
68 | 409 | 3 | hdtpboeabmnswytg |
71 | 413 | 6 | hdtpboeabmnsgytg |
77 | 419 | 3 | hptdboeabmnsgytg |
80 | 416 | 8 | hptdbaeobmnsgytg |
88 | 421 | 9 | phtdbaeobmnsgytg |
97 | 449 | 40 | phtdbaeoymnsgytg |
137 | 454 | 2 | phtsbaeoymnsgytg |
139 | 463 | 11 | phtseaeoymnsgytg |
150 | 581 | 3 | phtsmaeoyensgstg |
153 | 662 | 26 | yhtsmaeopensgstg |
179 | 690 | 1 | yhtsmaespenogstg |
180 | 734 | 2 | lhtsmaespenogstg |
182 | 754 | 13 | lctsoaespenogstg |
195 | 849 | 1 | lctsoasepenogstg |
196 | 912 | 39 | lctseasepenogstg |
235 | 974 | 10 | lctseasepenoistg |
245 | 978 | 10 | lctseasedenoistg |
255 | 1046 | 25 | lctseasedsnoietg |
280 | 1281 | 53 | lctseasedsrointg |
333 | 1511 | 2 | lctseasedsronitg |
335 | 1511 | 8 | lctseasedsronitg |
343 | 1515 | 17 | lctseaiedsronitg |
360 | 1537 | 2 | lctseaiedsronitp |
362 | 1643 | 32 | lctseaiedtronisp |
394 | 1708 | 21 | lstseaiedtronisp |
415 | 1727 | 46 | lstseiaedtronisp |
461 | 1744 | 40 | lstseiaedtrinosp |
501 | 1873 | 36 | lstseiaentrinosp |
537 | 2053 | 9 | lstseiaentrisosp |
546 | 2162 | 13 | lstseiaentrigosp |
559 | 2445 | 42 | sltseiaentrigosp |
601 | 2445 | 21 | sltseiaentrigosp |
622 | 2524 | 58 | sltseiaentrigesp |
680 | 2597 | 37 | stlseiaentrigesp |
717 | 2614 | 232 | stlseiaentragesp |
949 | 3090 | 276 | stlseiaentrpgesa |
1225 | 3151 | 26 | stlseiaentrpdesa |
1251 | 3182 | n/a | stlseiaentrpdeso |
Some fun words on that last board: “desperate”, “entreaties”, “operated” and “serenest”. No shortage of 11-pointers on this board!
That’s all for now. Next time we’ll look at some of the highest-scoring boards that simulated annealing finds.
As usual, you can find all the source code for this project on the performance-boggle project at Google Code.
Craig Fratrik said,
February 19, 2009 at 10:09 am
Very cool dan.
The way I way I learned simulated annealing, as time went on, you were less likely to take the lesser board. Like steel particles as they cool during annealing.
Also, I seem to remember that you considered all boards that were one move away, and typically, you took the best one of those. The annealing part was that sometimes you just took a random choice.
Finally, it’s a good thing I came to comment. The color of the text didn’t come across in google reader.
Greg said,
February 19, 2009 at 10:38 am
Thats a pretty fun little analysis. I could see it being useful too. Imagine someone wanted to create a software-based boggle game for people to play online or whatnot. It would be more fun in general to play on high-scoring boards than low scoring ones, so weighting the algorithm for generating boards to be a little less random could improve the playability of the game.
danvk said,
February 19, 2009 at 12:41 pm
Craig: that’s one of the details that interested readers will find in my code (http://code.google.com/p/performance-boggle/source/browse/trunk/anneal.cc — only 150 lines!). I use the metallurgy analogy. There’s an ambient temperature that decreases over time. At lower temperatures, I’m less likely to choose a lower-scoring board. You can see this in the board progression. At first, it’s pretty random. Then it starts to only improve.
Craig paragraph 2: That’s not an approach I’d considered, but I like it. My code is fast enough that I can easily consider all boards at a distance <= 2. (more on that in a later post)
Craig paragraph 3: Reader must strip my CSS.
Greg: Sounds like an idea for a startup! =) It only took about two seconds to do the full annealing run I included in this post, so it would be completely feasible. You could even imagine requesting a low-, medium- or high-scoring board. Or a high-scoring board with a “qu” on it.
Greg said,
February 19, 2009 at 2:51 pm
I recall a story of two slot machines that had the exact same payout structure and frequency (ie: if you were blind, they were identical machines), but one would earn the casino far more money.
The difference was that one of them would show far more “almost wins” where the user was one click of one of the wheels away from winning alot of spare change. The psychological “near miss” kept people at the machine longer. Debatably, they enjoyed the experience more.
Mike said,
March 15, 2009 at 9:28 pm
Awesome stuff, Dan!
We should definitely get lunch together at some point… haven’t seen you in at least a year…
Also, as a security guy, I present for your amusement: cross-site boggling
danvk.org » Solving Boggle by Taking Option Three said,
August 4, 2009 at 6:18 pm
[...] we spoke about Boggle, we used Simulated Annealing to get at the question “what is the highest-scoring [...]
danvk.org » Breaking 3×3 Boggle said,
August 8, 2009 at 10:51 am
[...] Find a very high-scoring board (maybe this way) [...]
JohnPaul Adamovsky said,
February 22, 2010 at 12:04 am
10,000 Boards/second is impressive.
Question: How does it scale on multi-core shared memory CPU’s?
In November 2009, deepsearch.c was capable of scoring 70,000 random 5×5 boards per second with the TWL06 lexicon on a Core2 Q9450 @ 3GHZ.
How? An immutable finite state automaton operating as a perfect and complete hash table. That’s how.
RSLCS
DEIAE
GNTRP
SMIDR
This board is the best Best 5×5 Boggle board for TWL06. It’s worth 10,769 points. The word list is mind boggling.
search “Boggle deepsearch” on Google. http://www.pathcom.com/~vadco/deep.html
Final question: Why bother changing more than one tile, when the smallest change possible will lead to the best board every time? Does your algorithm guarantee the highest possible board in a period of time shorter than over night?
JohnPaul Adamovsky said,
February 22, 2010 at 12:12 am
Whoops, missed a line in that board.
RSLCS
DEIAE
GNTRP
ATESE
SMIDR
That’s the one, I’d love to be able to edit my comment.
All the very best,
JohnPaul Adamovsky