## Archive for August 29, 2017

### My Insecurity Over Security Codes

Every time I attempt to access one of my company’s applications via our single sign-on (SSO) system, I have to request a validation code, wait for it to arrive on my smartphone, and then enter it on the login page.

It’s a minor nuisance that drives me insane.

The purpose of the codes is to provide an additional level of security, but given how non-random the codes seem to be, they don’t feel very secure to me. This screenshot shows some of the codes that I’ve received recently:

Here’s what I’ve observed:

- Every security code contains 6 digits.
- The first 3 digits in the code form either an arithmetic or geometric sequence, or the first 3 digits contain a repeated digit.
- Similarly, the last 3 digits in the code form either an arithmetic or geometric sequence, or the last 3 digits contain a repeated digit.

As an example, one of the codes in the screenshot above is 421774. The first 3 digits form the (descending) geometric sequence 4, 2, 1, and the digit 7 appears twice in the second half of the code.

I believe the reason for these patterns is to make the codes more memorable to those of us who have to transcribe them from our phones to our laptops.

This got me thinking. The likelihood of someone correctly guessing a six-digit code is 1 in 1,000,000. But what is the likelihood that someone could correctly guess a six-digit code if it adheres to the rules above?

If you’d like to answer this question on your own, stop reading here. To put some space between you and my solution, here’s a security-related joke:

“I don’t understand how someone stole my identity,” Lily said. “My PIN is so secure!”

“What’s your PIN?” Millie asked.

“The year of Knut Långe’s death,” Lily replied.

“Who is Knut Långe?”

“A King of Sweden who usurped the throne from Erik Eriksson.”

“And what year did he die?”

“1234.”

(Incidentally, Data Genetics reviewed 3.4 million stolen website passwords, and they found that 1234 was the most popular four-digit code. The researchers claimed that they could use this information to make predictions about ATM PINs, too, but I don’t think so. All this shows is that 1234 is the most commonly *stolen* password, and therefore this inference suffers from survivorship bias. Without having data on all the codes that were *not* stolen, it’s impossible to make a reasonable claim. But, I digress.)

To determine the number of validation codes that adhere to the patterns I observed, I started by counting the number of arithmetic sequences. With only 3 digits, there are 20 possible ascending sequences with a nonzero common difference:

- 012
- 024
- 036
- 048
- 123
- 135
- 147
- 159
- 234
- 246
- 258
- 345
- 357
- 369
- 456
- 468
- 567
- 579
- 678
- 789

But each of those could also appear in reverse (210, 975, etc.), giving a total of 40.
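For anyone who would rather not trust my list, here is a minimal Python sketch that enumerates the arithmetic sequences by brute force. The helper name `is_arithmetic` is my own invention, and I’m assuming that a constant run like 111 should not count here, since it falls under the repeated-digit rule instead.

```python
from itertools import product

def is_arithmetic(a, b, c):
    # Equal, nonzero steps between consecutive digits (so 111, 222, ... are excluded).
    return b - a == c - b and b != a

# Every 3-digit string (leading zeros allowed) whose digits form an arithmetic sequence.
arithmetic = [
    f"{a}{b}{c}"
    for a, b, c in product(range(10), repeat=3)
    if is_arithmetic(a, b, c)
]

# Ascending sequences are the ones whose first digit is smaller than the second.
ascending = [s for s in arithmetic if s[0] < s[1]]

print(len(ascending), len(arithmetic))  # 20 40
```

The ascending list matches the 20 entries above, and the other 20 are their reversals.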

There are far fewer geometric sequences; in fact, counting only those with an integer common ratio, there are just 3 of them:

- 124
- 139
- 248

And again, each of those could appear in reverse, giving a total of 6.
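The geometric count can be checked with a short Python sketch. The names `is_geometric` and `integer_ratio` are hypothetical, and the code makes one assumption explicit: the count of 6 holds when the common ratio (read in ascending order) is an integer. Under the broader definition that allows any rational ratio, 4, 6, 9 (ratio 3/2) and its reversal would also qualify.

```python
from itertools import product

def is_geometric(a, b, c):
    # b/a == c/b, rewritten without division; a != b rules out constant runs like 333.
    return b * b == a * c and a != b

# Digits are restricted to 1-9 because a zero cannot appear in a geometric
# sequence of distinct digits.
geometric = sorted(
    f"{a}{b}{c}"
    for a, b, c in product(range(1, 10), repeat=3)
    if is_geometric(a, b, c)
)

# Integer ratio: the middle digit is a multiple of whichever end digit is smaller.
integer_ratio = [
    s for s in geometric
    if int(s[1]) % min(int(s[0]), int(s[2])) == 0
]

print(integer_ratio)   # ['124', '139', '248', '421', '842', '931']
print(len(geometric))  # 8, once 469 and 964 are included
```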

Finally, there are 10 × 9 × 8 = 720 three-digit numbers with no repeated digits, which means there are 1,000 − 720 = 280 numbers with a repeated digit. (Here, “number” refers to any string of 3 digits, including those that start with a 0, like 007 or 092.)
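The complement argument is easy to confirm by listing all 1,000 strings. This is only a sketch, and the variable names are mine:

```python
from itertools import product

# All 3-digit strings, including ones with leading zeros such as "007".
strings = ["".join(p) for p in product("0123456789", repeat=3)]

# A string has a repeated digit exactly when it has fewer than 3 distinct characters.
distinct = [s for s in strings if len(set(s)) == 3]
repeated = [s for s in strings if len(set(s)) < 3]

print(len(strings), len(distinct), len(repeated))  # 1000 720 280
```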

Consequently, there are 40 + 6 + 280 = 326 possible combinations for the first 3 digits and also 326 combinations for the last 3 digits, which gives a total of **326 × 326 = 106,276 possible validation codes**.
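As a sanity check on the whole calculation, the sketch below tests every one of the 1,000,000 six-digit codes against the three rules. The function `half_ok` is a hypothetical helper, and it assumes the same reading of the rules used in the counts above: the geometric sequences are limited to integer common ratios.

```python
def half_ok(s):
    """Return True if a 3-digit string satisfies one of the observed patterns."""
    a, b, c = (int(ch) for ch in s)
    if len({a, b, c}) < 3:
        return True  # contains a repeated digit
    if b - a == c - b:
        return True  # arithmetic (digits are distinct here, so the step is nonzero)
    # Geometric with an integer ratio: orient the triple so it ascends.
    lo, mid, hi = (a, b, c) if a < c else (c, b, a)
    return lo > 0 and mid % lo == 0 and lo * hi == mid * mid

# Valid halves, stored as integers 0-999 for fast lookup.
ok = {n for n in range(1000) if half_ok(f"{n:03d}")}

# A six-digit code is valid when both its first and last three digits are valid.
total = sum(n // 1000 in ok and n % 1000 in ok for n in range(10**6))

print(len(ok), total)           # 326 106276
print(round(10**6 / total, 2))  # 9.41
```

The last line is the factor by which the pattern shrinks the search space compared with a fully random six-digit code.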

That means a phisher would be roughly 9.4 times as likely (1,000,000 ÷ 106,276 ≈ 9.4) to correctly guess a validation code that follows these rules as to guess a completely random six-digit code. But said another way, the odds are still heavily against a phisher who’s trying to steal my code: about 1 in 106,276 on any single guess. And quite frankly, if someone wants to exert that kind of effort to pirate my access to Microsoft Word online, well, I say, go for it.