Order-2 Markov Model using A Tale of Two Cities

This code generates passwords, using a markov model containing Huffman trees. Traversing the Huffman trees allows selection of a letter to consume few bits for common choices and many bits for uncommon ones, allowing "biasing" toward more common sequences without compromising the randomness of the generated passwords.

This particular engine is based on Charles Dickens' A Tale of Two Cities, although the entropy of the generated passwords is independent of the choice of source text.

Another way of looking at this is that we're "decompressing" a random bit string into english text.

How many bits?

Here are some fresh passwords for you. Click "refresh" for another set.

Which column would you rather memorize?

bitspassword
NOT YETNOT YET

Fine print: yes, they're secure. More specifically, there's a bijection between the bitstrings and the passwords, meaning that every password in the pool associated with the corpus and the number of bits is generated with equal likelihood.

Also, note that generating 8 of them and picking the most readable one (as I do here, and also when I'm generating my real passwords) essentially costs you 3 bits of entropy. So, if you really want 56 bits of entropy and some choice, set the entropy to 59 bits and take your favorite.

The passwords you see here are generated on your own machine. They are secure, unless

  1. the bit generator in your browser is weak,
  2. your machine is compromised,
  3. there's a bug in my code, or
  4. there's some other problem I failed to anticipate.

Comments on the source code are also welcome.

A note about the spaces: yes, the passwords contain spaces. This is A-OK for linux login, Apple OS X login, and Windows login. For all those other sites, you should probably be using a password manager anyway! (I personally use 1password.

For more details on the algorithm, the proof of entropy, and comparisons to other related methods, see our preprint at arxiv.

Or read the source code, at GitHub. Naturally, it's all in Racket

Also, in case it wasn't obvious, the tool generated its own name, "Molis Hai". In fact, it generated the string "There twass molis hai", which sounds something like a proclamation.

Finally, many thanks to Zachary Peterson, @znjp, for pointing me to all kinds of related work. Thanks!

Copyright 2015-2023 John Clements <clements@brinckerhoff.org>