Recipe 10.2. Generating Easily Remembered Somewhat-Random Passwords
Credit: Luther Blissett
Problem
You
need to create new passwords randomlyfor example, to assign
them automatically to new user accounts. You want the passwords to be
somewhat feasible to remember for typical users, so they
won't be written down.
Solution
We can use a pastiche approach for this, mimicking letter
n-grams in actual English words. A grander way
to look at the same approach is to call it a Markov Chain Simulation
of English:
import random, string
class password(object):
# Any substantial file of English words will do just as well: we
# just need self.data to be a big string, the text we'll pastiche
data = open("/usr/share/dict/words").read( ).lower( )
def renew(self, n, maxmem=3):
''' accumulate into self.chars `n' random characters, with a
maximum-memory "history" of `maxmem` characters back. '''
self.chars = [ ]
for i in range(n):
# Randomly "rotate" self.data
randspot = random.randrange(len(self.data))
self.data = self.data[randspot:] + self.data[:randspot]
# Get the n-gram
where = -1
# start by trying to locate the last maxmem characters in
# self.chars. If i<maxmem, we actually only get the last
# i, i.e., all of self.chars -- but that's OK: slicing
# is quite tolerant in this way, and it fits the algorithm
locate = ''.join(self.chars[-maxmem:])
while where<0 and locate:
# Locate the n-gram in the data
where = self.data.find(locate)
# Back off to a shorter n-gram if necessary
locate = locate[1:]
# if where==-1 and locate='', we just pick self.data[0] --
# it's a random item within self.data, tx to the rotation
c = self.data[where+len(locate)+1]
# we only want lowercase letters, so, if we picked another
# kind of character, we just choose a random letter instead
if not c.islower( ): c = random.choice(string.lowercase)
# and finally we record the character into self.chars
self.chars.append(c)
def _ _str_ _(self):
return ''.join(self.chars)
if _ _name_ _ == '_ _main_ _':
"Usage: pastiche [passwords [length [memory]]]"
import sys
if len(sys.argv)>1: dopass = int(sys.argv[1])
else: dopass = 8
if len(sys.argv)>2: length = int(sys.argv[2])
else: length = 10
if len(sys.argv)>3: memory = int(sys.argv[3])
else: memory = 3
onepass = password( )
for i in range(dopass):
onepass.renew(length, memory)
print onepass
Discussion
This recipe is useful when creating new user accounts and assigning
each user a different, random password: it uses passwords that a
typical user will find it feasible to remember, hopefully so they
won't get written down. See Recipe 10.1 if you prefer totally random
passwords.The recipe's idea is based on the good old pastiche
concept. Each letter (always lowercase) in the password is chosen
pseudo-randomly from data that is a collection of words in a natural
language familiar to the users. This recipe uses the file that is
/usr/share/dict/words supplied with Linux
systems (on my machine, a file of over 45,000 words), but any large
document in plain text will do just as well. The trick that makes the
passwords sort of memorable, and not fully random, is that each
letter is chosen based on the last few letters already picked for the
password as it stands so far. Thus, letter transitions will tend to
be "repetitive" according to
patterns that are familiar to the user.The code in the recipe takes some care to locate each time a random
occurrence, in the text being pastiched, of the last
maxmem characters picked so far. Since
it's easy to find the first
occurrence of a substring, the code
"rotates" the text string randomly,
to ensure that the first occurrence is a random one from the point of
view of the original text. If the substring made up with the last
maxmem characters picked is not found in the text,
the code "backs down" to search for
just the last maxmem-1, and so on, backing down
until, worst case, it just picks the first character in the rotated
text (which is a random character from the point of view of the
original text).A break in this Markov Chain process occurs when this picking
procedure chooses a character that is not a lowercase letter, in
which case, a random lowercase letter is chosen instead (any
lowercase letter is picked with equal probability).Here are a couple of typical sample runs of this
pastiche.py password-generation script:
[situ@tioni cooker]$ python pastiche.pyAs you can see, some of these are definitely word-like, others less
yjackjaceh
ackjavagef
aldsstordb
dingtonous
stictlyoke
cvaiwandga
lidmanneck
olexnarinl
[situ@tioni cooker]$ python pastiche.py
ptiontingt
punchankin
cypresneyf
sennemedwa
iningrated
fancejacev
sroofcased
nryjackman
so, but for a typical human being, none are more problematic to
remember than a sequence of even fewer totally random, uncorrelated
letters. No doubt some theoretician will complain (justifiably, in a
way) that they aren't as random as all that. Well,
tough. My point is that they had better not be, if some poor fellow
is going to have to remember them! You can compensate for this
limitation by making them a bit longer. If said theoretician
demonstrates how to compute the entropy per character of this method
of password generation (versus the obvious 4.7 bits/character, the
base-2 logarithm of 26, for passwords made up of totally random
lowercase letters), now that would be a useful contribution indeed.
Meanwhile, I'll keep generating passwords this way,
rather than in a totally random way. If nothing else,
it's the closest thing I've found
to a useful application for the lovely pastiche concept.The concept of passwords that are not totally random, but rather a
bit more memorable, goes back a long wayat least to the 1960s
and to works by Morrie Gasser and Daniel Edwards. A Federal
Information Processing Standard (FIPS), FIPS 181, specifies in detail
how "pronounceable" passwords are
to be generated; see http://www.itl.nist.gov/fipspubs/fip181.
See Also
Recipe 10.1; documentation
of the standard library module random in the
Library Reference and Python in a
Nutshell.