Saturday, January 15, 2022

Deckware – A 224-bit entropy extractor (2021)

Introduction

I can't believe that it's been almost 3 years since my last blog post. Interestingly enough, that was on a deterministic card shuffle that I decided to call "Ouroboros". Well, this post is also about a deterministic algorithm with a deck of playing cards, but rather than shuffling the deck, we'll be extracting the entropy out of it.

The algorithm is called Deckware. I would have called it "Pokerware", but it was already taken by Chris Wellons. I could have called it "Solitaireware", and it does have a sort of ring to it, but I didn't want to confuse people with the Solitaire Cipher by Bruce Schneier. I debated calling it "Bridgeware", but I fear that the Bridge card game is a fringe game enjoyed only by old ladies in nursing homes drinking lemonade, and most people wouldn't get it. Ultimately, the randomness extractor is working through the whole deck, so it makes sense to call it "Deckware", even if it does sound a bit like a construction company.

The thing to understand about this algorithm however, is that it is not a generic passphrase generator like Diceware or Pokerware. As such, there is no word list provided with Deckware. Instead, it's a randomness extractor. it's designed such that you use your deck of playing cards as a random number generator, and this algorithm uniformly returns a 224-bit random number from that shuffle. Once you have that 224-bits of entropy, it's yours to do with as you wish:

  • Use it as a <= 224-bit cryptographic symmetric key.
  • Use it as a seed for a CSPRNG, such as reseeding your kernel RNG.
  • Use it for election auditing, lottery drawing, or randomized drug samples.
  • Convert the hexadecimal to a 14-word Niceware passphrase.

When you need a lot of randomness, Deckware might work, although it's not particularly fast.

Lehmer Code

The basis of Deckware is Lehmer code. Lehmer code is a factoradic algorithm for converting any specific permutation in a set to an integer. To understand how this works, let's look first at standard combinations that we're all familiar with.

In decimal, which we use every day, we're all familiar with the "ones" place, the "tens" place, "hundreds" place, "thousands" place, etc. So a number like "3481" is 3*1000 + 4*100 + 8*10 + 1*1, right? Simple enough.

Factoradic systems are a way to represent an integer as the sum of multiples of factorials. Instead of a decimal number system (or binay, octal, hexadecimal, etc), it's a factorial number system. If I wanted to take my previous example of "3481", I know that 6! = 720, so 7! = 5040. Thus, 3481/6! = 4 remainder 601. 601/5! is 5 remainder 1. Thus, 3481 = 4*6! + 5*5! + 1*1!.

Okay, but how do you do that with a permutation? Let's say we have a box with numbered chits 1, 2, & 3. How many permutations (order matters) are there? Well, we know it's 3! = 6. We could list them all quite easily:

  • 1, 2, 3
  • 1, 3, 2
  • 2, 1, 3
  • 2, 3, 1
  • 3, 1, 2
  • 3, 2, 1

Lehmer code converts each unique sequence to an integer. It does this by starting with the left-most value, and counting the values less than it to its right. So, starting with the first permutation of "1, 2, 3", "1" is our left-most value, and no values to its rights that are less than 1. So, for this factorial, it's multiplier would be "0". Next we move to the second value, which is "2". Again, there are no values to its right that are less than 2. So also for this factorial, its multiplier is also "0". Finally, on the last value, there are no values to its right, so it's value is "0". This is always the case for the right-most value in Lehmer code. So for "1, 2, 3", our Lehmer code would be 0*2! + 0*1! + 0*0! = 0. If we look at our second permutation of "1, 3, 2", applying Lehmer code, we get 0*2! + 1*1! + 0*0! = 1.

Let's complete the list:

  • 1, 2, 3 = 0*2! + 0*1! + 0*0! = 0
  • 1, 3, 2 = 0*2! + 1*1! + 0*0! = 1
  • 2, 1, 3 = 1*2! + 0*0! + 0*0! = 2
  • 2, 3, 1 = 1*2! + 1*1! + 0*0! = 3
  • 3, 1, 2 = 2*2! + 0*1! + 0*0! = 4
  • 3, 2, 1 = 2*2! + 1*1! + 0*0! = 5

Deckware uses Lehmer code, but with 52! permutations instead of 3! like our example above.

Playing Card Permutations

Knowing that there are 52 unique cards in a standard Poker or Bridge deck of playing cards, then we know there are 52! order permutations. 52! has 68 decimal digits. Converting to binary bits yields log2(52!) ~= 225.581. In case you forgot, it would take all the energy from a hypernova captured by a Dyson sphere to count from 0 to ~2^227. In all likelihood, a sufficiently shuffled deck has never been discovered before.

But how do you do the math? How do you compare the inequality of the Ace of Spades to the Ten of Diamonds, for example? To do this, we need to make some numerical assignments. We're going to use Bridge order for suits, and treat Ace as low, King as high. As such, we get:

  • Clubs: Ace - King = 1 - 13
  • Diamonds: Ace - King = 14 - 26
  • Hearts: Ace - King = 27 - 39
  • Spades: Ace - King = 40 - 52

Now that we have these numerical assignments, we can trivially do our inequality comparisons to build our Lehmer code. But we have a snag. Because the permutation space is larger than 225 bits but not quite 226 bits, we can't use the full space, or we'll end up with a biased extractor. As such, we need to discard anything larger than 2^225-1 (because we start counting with 0). So, when we compute our Lehmer code, if the value is 2^225 or greater, it's ignored, and the user needs to reshuffle the deck. Otherwise, we return the lower 224 bits of the extracted result to the user.

However, 2^225 is approximately 67% of 2^log2(52!). This means that on average, you will have to reshuffle the deck 33% of the time to prevent getting a biased result, or about 1 out of every 3 shuffles will be discarded. It's really unfortunate that it couldn't be better, but it is what it is.

Deckware versus Pokerware: FIGHT!

I think it's worth mentioning how Deckware compares to Pokerware and when you would want to choose one over the other, seeing as though they are both using a deck of playing cards as a source of randomness.

First off, as already mentioned, Deckware does not ship a word list. Technically speaking, Deckware is not a passphrase generator. It's an entropy extractor. This means that you need to bring your own word list to the Deckware table. By comparison, Pokerware provides both formal and slang word lists as part of the project.

Second, Pokerware can be executed trivially without any computing or calculating device. All you need is a deck of cards and a printed off indexed word list. To be fair, I don't think anyone actually keeps a printed off word list of Pokerware, or Diceware for that matter, with them, except for maybe the inventors themselves. I'm guessing most, if not all, are using a computer to generate the Diceware or Pokerware passphrase.

Deckware on the other hand could be executed 100% with a pencil and paper, but it would be painful and incredibly slow. That's something you would make inmates in prison do when they need something to do. I mean, this is essentially what it would take:

  1. Count inequalities for every card placement in the list.
  2. Find the Lehmer code using the factorial number system.
  3. Convert to base 16.

Yeah, no, I'll pass. I'll stick with the tool. However, Deckware has a couple of advantages over Pokerware though that might be worth considering.

First with Pokerware, after every draw, the deck needs to be reshuffled. As determined, this is at least 7 shuffles. That's 7 full deck shuffles for every passphrase word. At 6 words, that's a total of 42 shuffles you've performed on the deck. Deckware only requires 7 shuffles, and odds are 2 out of 3 you won't need to try again.

Second, for those 42 shuffles with Pokerware, you only returned 74 bits of security. For Deckware's 7 shuffles, you were able to extract 224 bits! That's a significant return for the cost, making it far more efficient.

In summary:

Pokerware:

  • Advantages:
    • Provides two word lists.
    • Simple and clean to execute.
    • Can be executed without a computer.
    • Stands on its own as a unique tool.
  • Disadvantages:
    • Cumbersome shuffling per generated word.
    • More time costly for similar security margins.

Deckware:

  • Advantages:
    • Maximizes deck entropy.
    • Small time commitment.
    • Can be used for security solutions other than passphrases.
  • Disadvantages:
    • Does not provide a word list.
    • Might be difficult to independently audit.
    • Requires a computer.
    • Can be replaced with SHA-224.

I think that last disadvantage actually speaks volumes. In the past, I would shuffle the deck, record the results, and hash it with SHA-224. That's perfectly acceptable, and I won't blame you for that approach. Even though using SHA-224 to hash your deck order is technically biased, the bias isn't significant enough to reduce security in practical terms, and so long as SHA-2 remains secure, you can't identify a biased result from an unbiased one.

Deckware is elegant in that not only is it uniform, it doesn't rely on any cryptographic primitives. It's just factorial math. This means you can trivially audit it for correctness. For example, extract the 224-bit hexadecimal string from an ordered deck, and it should return 0x00000000000000000000000000000000000000000000000000000000. Swap the King of Spades with the Queen of Spades, and it will return 0x00000000000000000000000000000000000000000000000000000001. This sort of vetting isn't accessible for SHA-2, although there is little to no reason to not trust its correctness.

I'm not going to say one is better than the other (Pokerware or Deckware), because as outlined, they have their own strengths and weaknesses. I have personally used Pokerware, and truth be told, I was adding it to my password and passphrase generator (how did I miss it?!), and it got me thinking: "how would I design a playing card algorithm without relying on cryptography?"

Deckware In Action

Here's a couple screenshots of an early release of the tool in action. Here, you can see the unshuffled deck on the "upper table". The suit symbols are emoji provided by OpenMoji. I added the text next to each suit using the DejaVu Serif font in Inkscape.

Here, I've dragged and dropped each card onto the lower table representing the shuffled deck. I'm not very good an JavaScript listening events, so I shamelessly took the code from W3Schools. No doubt it could use some polish, but it works.

Notice that I've clicked the "Calculate unique deck ID" button to extract the entropy (maybe I should change that button text now that I'm thinking about it). I got "b08bd2f0720ade917b842ee1e721fe1c6ad00429e1155f9201b50d82" returned. This gives be a 14-word Niceware passphrase of "random sporran ironclad tare lifeful cromwell trekked wrigglier imprudence amenable thai hajj affectionately barratry".

After extracting the entropy out of the deck, you should thoroughly reshuffle the deck or place it back in order to destroy the key, so the entropy cannot be re-extracted. You should also reload your web browser for the same reason. The tool is not using any persistent storage, but feel free to run the tool in a private browser window if you're paranoid.

Closing Thoughts

In practice, after shuffling the deck, I was able to record every card in the tool in 153 seconds, or around 2 - 3 minutes. That's not bad with drag-and-drop using the mouse, and I'm sure it can be improved with a keyboard listening event to type it in rather than using the mouse. Again though, I'm not proficient with JavaScript listening events, so maybe someone can help me out here. However, this tool or SHA-224, the bulk of the time is taken to record the cards in the shuffled deck, so from my point of view, it's sixes. Pick your poison. You need a tool, one way or the other.

For the time being, I've got this opened in a tab in my browser. When I need a password generator, I give the hexadecimal string to Niceware. 14 words is generally overkill for my usual password needs. Even dividing it in half at two 7 words each, gives me two 112-bit passphrases. Still overkill. But 5 words yields 80 bits of security, which is right on the money. I can get two 80-bit passphrases and one 64-bit passphrase out of a single shuffle. I'll have to see how this goes.



from Hacker News https://ift.tt/3vzDPnn

No comments:

Post a Comment

Note: Only a member of this blog may post a comment.