Recipe 18.12. Formatting Integers as Strings in Arbitrary Bases
Credit: Moon aka Sun, Raymond Hettinger
Problem
You need to display non-negative integers in arbitrary
bases; that is, you need to turn them into strings made up of
"digit" characters (which may
include letters for bases that are > 10).
Solution
A function is clearly the right way to package the
"Solution" to this task:
import string
def format(number, radix, digits=string.digits+string.ascii_lowercase):
    """ format the given integer `number' in the given `radix' using the
        given `digits' (default: digits and lowercase ascii letters) """
    if not 2 <= radix <= len(digits):
        raise ValueError("radix must be in 2..%r, not %r"
                         % (len(digits), radix))
    # build result as a list of "digit"s in natural order (least-significant
    # digit leftmost), at the end flip it around and join it up into a
    # single string
    result = [ ]
    addon = result.append          # extract bound-method once
    # compute 'sign' (empty for number>0, '0' for number==0) and ensure
    # number >= 0 thereafter
    sign = ''
    if number < 0:
        number = -number
        sign = '-'
    elif number == 0:
        sign = '0'
    _divmod = divmod               # access to locals is faster
    while number:
        # like: rdigit = number % radix; number //= radix
        number, rdigit = _divmod(number, radix)
        # append appropriate string for the digit we just found
        addon(digits[rdigit])
    # append sign (if any), flip things around, and join up into a string
    addon(sign)
    result.reverse( )
    return ''.join(result)
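To see why the loop above produces digits least-significant first, it can help to trace the repeated divmod steps for one small number. The following is only an illustrative sketch: digit_trace is a hypothetical helper, not part of the recipe, and it handles positive integers only.

```python
def digit_trace(number, radix):
    """Return the (quotient, remainder) pairs produced by repeated divmod.

    Hypothetical helper for illustration only; not part of the recipe.
    """
    steps = []
    while number:
        number, rdigit = divmod(number, radix)
        steps.append((number, rdigit))
    return steps

# 315 in base 16: the remainders come out least-significant digit first.
steps = digit_trace(315, 16)
print(steps)                                # [(19, 11), (1, 3), (0, 1)]

digits = '0123456789abcdef'
rdigits = [digits[r] for _, r in steps]     # ['b', '3', '1']
print(''.join(reversed(rdigits)))           # 13b
```

The remainders 11, 3, 1 map to the digits b, 3, 1, which is why the recipe reverses the list before joining it.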
Discussion
Here is a simple usage example, with the usual guard to let us append
the example to the same module where we define function
format. The usage example runs when the module is
run as a main script but not when the module is imported:
if __name__ == '__main__':
    as_str = 'qwertyuioplkjhgfdsazxcvbnm0987654321'
    as_num = 79495849566202193863718934176854772085778985434624775545
    num = int( as_str, 36 )
    assert num == as_num
    res = format( num, 36 )
    assert res == as_str
This usage example is designed to be totally quiet when everything
works fine, emitting messages only in case of problems.
The code in this recipe is designed with careful attention to both
generality and performance. The string of digits
used by default is made up of all decimal digits followed by
lowercase ASCII letters, allowing a radix of up to
36; however, you can pass any sequence of strings (rather than just a
string, to be used as a sequence of characters), for example to
support even larger bases. Performance is vastly enhanced, with
respect to a naive approach to coding, by a few precautions taken in
the code, in decreasing order of importance:
- Building the result as a list and then using ''.join to create a
  string containing all the list items. (The alternative of adding each
  item to a string, one at a time, would be much slower than the
  ''.join approach.)
- Building the result in natural order (least-significant digit
  leftmost) and flipping it around at the end. Inserting each digit at
  the front as it gets computed would be slow.
- Extracting the bound method result.append into a local variable.
- Giving a local name _divmod to the divmod built-in.
The last two precautions avoid paying a small
extra price each time through the loop because lookup of local
variables is measurably faster than lookup of built-ins and quite a
bit faster than compound-name lookups such as
result.append.
Here is an example of how you could use format with
"digits" that are not single
characters, but rather longer strings:
digs = [ d+'-' for d in
         'zero one two three four five six seven eight nine'.split( ) ]
print(format(315, 10, digs).rstrip('-'))
# emits: three-one-five
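Returning to the performance precautions listed earlier in this Discussion, one rough way to check them on your own interpreter is a small timeit comparison. The sketch below is an assumption-laden illustration, not the recipe's code: by_append, by_insert, and by_concat are hypothetical stripped-down variants (positive integers only, no sign or zero handling), and the input size and repeat count are arbitrary choices. Actual timings depend heavily on the Python version and the size of the number, so no expected figures are shown.

```python
import timeit

DIGITS = '0123456789abcdefghijklmnopqrstuvwxyz'

def by_append(number, radix):
    # append each digit, then reverse once at the end (the recipe's approach)
    result = []
    addon = result.append
    while number:
        number, rdigit = divmod(number, radix)
        addon(DIGITS[rdigit])
    result.reverse()
    return ''.join(result)

def by_insert(number, radix):
    # insert each digit at the front of the list as it is computed
    result = []
    while number:
        number, rdigit = divmod(number, radix)
        result.insert(0, DIGITS[rdigit])
    return ''.join(result)

def by_concat(number, radix):
    # grow a string one digit at a time
    result = ''
    while number:
        number, rdigit = divmod(number, radix)
        result = DIGITS[rdigit] + result
    return result

big = 36 ** 2000     # a number with 2001 base-36 digits
for func in (by_append, by_insert, by_concat):
    seconds = timeit.timeit(lambda: func(big, 36), number=20)
    print('%-10s %.4f s' % (func.__name__, seconds))
```

All three variants return the same string; only the way the digits are accumulated differs, so any difference in the printed times reflects the accumulation strategy plus measurement noise.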
See Also
Library Reference and Python in a
Nutshell docs for built-ins oct and
hex; Recipe 18.11 for displaying integers
specifically in binary.