Python Cookbook 2nd Edition Jun 1002005 [Electronic resources]

David Ascher, Alex Martelli, Anna Ravenscroft


Recipe 18.12. Formatting Integers as Strings in Arbitrary Bases

Credit: Moon aka Sun, Raymond Hettinger

Problem

You need to display non-negative integers in arbitrary bases; that is, you need to turn them into strings made up of "digit" characters (which may include letters for bases greater than 10).

Solution

A function is clearly the right way to package the "Solution" to this task:

import string
def format(number, radix, digits=string.digits+string.ascii_lowercase):
    """ format the given integer `number' in the given `radix' using the given
        `digits' (default: digits and lowercase ascii letters) """
    if not 2 <= radix <= len(digits):
        raise ValueError("radix must be in 2..%r, not %r" % (len(digits), radix))
    # build result as a list of "digit"s in natural order (least-significant digit
    # leftmost), at the end flip it around and join it up into a single string
    result = []
    addon = result.append  # extract bound-method once
    # compute 'sign' (empty for number>=0) and ensure number >= 0 thereafter
    sign = ''
    if number < 0:
        number = -number
        sign = '-'
    elif number == 0:
        sign = '0'
    _divmod = divmod  # access to locals is faster
    while number:
        # like: rdigit = number % radix; number //= radix
        number, rdigit = _divmod(number, radix)
        # append appropriate string for the digit we just found
        addon(digits[rdigit])
    # append sign (if any), flip things around, and join up into a string
    addon(sign)
    result.reverse()
    return ''.join(result)

Discussion

Here is a simple usage example, with the usual guard to let us append the example to the same module where we define function format. The usage example runs when the module is run as a main script but not when the module is imported:

if __name__ == '__main__':
    as_str = 'qwertyuioplkjhgfdsazxcvbnm0987654321'
    as_num = 79495849566202193863718934176854772085778985434624775545
    num = int(as_str, 36)
    assert num == as_num
    res = format(num, 36)
    assert res == as_str

This usage example is designed to be totally quiet when everything works fine, emitting messages only in case of problems.
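For the bases that Python converts natively, the results can also be checked against the built-ins. The sketch below assumes Python 3 (where oct returns a '0o' prefix). So that it runs on its own, it repeats a condensed copy of the recipe's function under the hypothetical name to_base; the local-name optimizations are left out for brevity.

```python
import string

def to_base(number, radix, digits=string.digits + string.ascii_lowercase):
    # condensed restatement of the recipe's function (hypothetical name)
    if not 2 <= radix <= len(digits):
        raise ValueError("radix must be in 2..%r, not %r" % (len(digits), radix))
    result = []
    sign = ''
    if number < 0:
        number, sign = -number, '-'
    elif number == 0:
        sign = '0'
    while number:
        number, rdigit = divmod(number, radix)
        result.append(digits[rdigit])
    result.append(sign)
    result.reverse()
    return ''.join(result)

for n in (0, 1, 7, 8, 255, 4096, 10**30):
    # strip the '0b', '0o' and '0x' prefixes that the built-ins add
    assert to_base(n, 2) == bin(n)[2:]
    assert to_base(n, 8) == oct(n)[2:]
    assert to_base(n, 16) == hex(n)[2:]
    assert to_base(n, 10) == str(n)
    # int() parses any radix from 2 to 36, so it can undo every default radix
    for radix in range(2, 37):
        assert int(to_base(n, radix), radix) == n

# a radix beyond the number of available digits is rejected
try:
    to_base(5, 37)
except ValueError as exc:
    error_message = str(exc)
```

Like the recipe's own example, this check is silent when everything works.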

The code in this recipe is designed with careful attention to both generality and performance. The string of digits used by default is made up of all decimal digits followed by lowercase ASCII letters, allowing a radix of up to 36; however, you can pass any sequence of strings (rather than just a string, to be used as a sequence of characters), for example to support even larger bases. Performance is vastly enhanced, with respect to a naive approach to coding, by a few precautions taken in the code, listed here in decreasing order of importance:

  1. Building the result as a list and then using ''.join to create a string containing all the list items. (The alternative of adding each item to a string, one at a time, would be much slower than the ''.join approach.)

  2. Building the result in natural order (least-significant digit leftmost) and flipping it around at the end. Inserting each digit at the front as it gets computed would be slow.

  3. Extracting the bound method result.append into a local variable.

  4. Giving a local name _divmod to the divmod built-in.

Items 3 and 4 speed lookups that otherwise would exact a small extra price each time through the loop, because lookup of local variables is measurably faster than lookup of built-ins and quite a bit faster than compound-name lookups such as result.append.
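To see how much these precautions matter on a given interpreter, the approaches can be timed side by side. The following is a rough sketch, assuming Python 3 and non-negative inputs only; the function names, the DIGITS constant, and the test number are all hypothetical choices made for illustration. No timings are shown because they vary with the Python version and the size of the number; for very long integers, the cost of the divisions themselves may dominate.

```python
import timeit

DIGITS = '0123456789abcdefghijklmnopqrstuvwxyz'

def concat_front(number, radix):
    # naive: prepend each digit to a growing string
    result = ''
    while number:
        number, rdigit = divmod(number, radix)
        result = DIGITS[rdigit] + result
    return result or '0'

def insert_front(number, radix):
    # naive: insert each digit at the front of a list
    result = []
    while number:
        number, rdigit = divmod(number, radix)
        result.insert(0, DIGITS[rdigit])
    return ''.join(result) or '0'

def append_reverse(number, radix):
    # the recipe's approach: append, reverse once, join once,
    # with result.append and divmod bound to local names
    result = []
    addon = result.append
    _divmod = divmod
    while number:
        number, rdigit = _divmod(number, radix)
        addon(DIGITS[rdigit])
    result.reverse()
    return ''.join(result) or '0'

big = 7 ** 2000  # an arbitrary integer with over a thousand base-36 digits
variants = (concat_front, insert_front, append_reverse)
outputs = [f(big, 36) for f in variants]

for f in variants:
    seconds = timeit.timeit(lambda: f(big, 36), number=20)
    print('%-15s %.4f s' % (f.__name__, seconds))
```

All three variants return the same string; only the way the result is assembled differs.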

Here is an example of how you could use format with "digits" that are not single characters, but rather longer strings:

digs = [d + '-' for d in
        'zero one two three four five six seven eight nine'.split()]
print(format(315, 10, digs).rstrip('-'))
# emits: three-one-five
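A longer digits string supports bases beyond 36 in the same way. The sketch below assumes Python 3 and, to stay self-contained, repeats a condensed copy of the recipe's function under the hypothetical name to_base. The BASE62 alphabet and the from_base helper are also assumptions made for illustration; a hand-written decoder is needed because the built-in int accepts a radix only up to 36.

```python
import string

# hypothetical 62-character alphabet: 0-9, a-z, A-Z
BASE62 = string.digits + string.ascii_lowercase + string.ascii_uppercase

def to_base(number, radix, digits=string.digits + string.ascii_lowercase):
    # condensed restatement of the recipe's function (hypothetical name)
    if not 2 <= radix <= len(digits):
        raise ValueError("radix must be in 2..%r, not %r" % (len(digits), radix))
    result = []
    sign = ''
    if number < 0:
        number, sign = -number, '-'
    elif number == 0:
        sign = '0'
    while number:
        number, rdigit = divmod(number, radix)
        result.append(digits[rdigit])
    result.append(sign)
    result.reverse()
    return ''.join(result)

def from_base(text, radix, digits=BASE62):
    # hypothetical inverse for single-character digits and non-negative values
    values = dict((d, i) for i, d in enumerate(digits[:radix]))
    number = 0
    for ch in text:
        number = number * radix + values[ch]  # KeyError on an invalid digit
    return number

print(to_base(61, 62, BASE62))   # emits: Z
print(to_base(62, 62, BASE62))   # emits: 10

encoded = to_base(2**64 - 1, 62, BASE62)
assert from_base(encoded, 62) == 2**64 - 1
```

Because each "digit" here is a single character, decoding can walk the string one character at a time; multi-character digits such as the 'three-' strings above would need a separator-aware parser instead.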

See Also

Library Reference and Python in a Nutshell docs for built-ins oct and hex; Recipe 18.11 for displaying integers specifically in binary.