Substitution

I have really strong childhood memories of substitution ciphers – a cryptic block of letters, where every A was swapped for a T and every B was swapped for an X, all the way through the alphabet. But when I googled them recently, it turned out they are a mostly forgotten diversion.

Of course, I was never any good at them, at least not that I remember. Certainly, as an adult who tried very recently, I struggled. I thought my knowledge of the most common English letters would help, but it wasn’t nearly enough. So I started building a tool to help me.

This is perhaps a good time to point out that, if you can’t do something manually, teaching a computer how to do it can be pretty challenging. My first idea was to look at the short words and find potential matches there, and then use words with only a couple of letters left unsolved to index into these massive dictionaries I’d made. This was a garbage approach. The smallest words all look alike: “XHA” and “KLU” look the same when you don’t know anything about the solution. So I’d never get to the point where I had a bunch of words only one or two letters away from solved. My clever pre-indexing system wasn’t clever enough to ever even be used.

It took quite a bit of floundering before it struck me, just before falling asleep last night. I want to look for long words with repeated letters. I can use the locations of the repeats to index into a standard English dictionary and work down to smaller words with fewer repeats. “SUBSTITUTION” is easy, because it looks like “ABCADEDBDEFG” (all those repeated letters narrow it down a lot), but “CAT” looks like “ABC”, and is almost totally generic until you solve some of the letters.
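To make the repeated-letter idea concrete, here is a minimal Python sketch. It is not the code from my tool: the names word_pattern, build_index and candidates are made up for illustration, and WORDS is a tiny stand-in for a real English dictionary. Each word is reduced to its repeat pattern, and the dictionary is grouped by that pattern.

```python
from collections import defaultdict


def word_pattern(word):
    """Relabel letters as A, B, C... in order of first appearance."""
    seen = {}
    out = []
    for ch in word.upper():
        if ch not in seen:
            seen[ch] = chr(ord("A") + len(seen))
        out.append(seen[ch])
    return "".join(out)


def build_index(words):
    """Group dictionary words by their repeat pattern."""
    index = defaultdict(list)
    for w in words:
        index[word_pattern(w)].append(w.upper())
    return index


# Hypothetical stand-in for a full dictionary file.
WORDS = ["substitution", "cat", "dog", "the", "letter", "better"]
INDEX = build_index(WORDS)


def candidates(cipher_word):
    """Dictionary words whose repeat pattern matches the cipher word."""
    return INDEX.get(word_pattern(cipher_word), [])


print(word_pattern("SUBSTITUTION"))
print(candidates("QWEQRTRWRTYU"))  # an enciphered 12-letter word
print(candidates("XHA"))           # a short word with no repeats
```

With this toy list, the long cipher word matches exactly one dictionary entry, while “XHA” matches every three-letter word with no repeated letters. That is the whole reason for starting with the long words and working down.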

This turns out to work well enough that the multi-stage solutions, and even the old hinting interfaces, are superfluous. It’s not at the point where I think the computer could solve all of them on autopilot quickly (you’d want a pretty exhaustive dictionary for that, so you’d know when to abandon a potential solution), but it’s pretty close. For human solvers, since you can’t see all the possibilities the computer can, it can still take a bit of cleverness to get going. It’s not just data entry; it’s just a much easier puzzle.
