Python Cookbook 2Nd Edition Jun 1002005 [Electronic resources] نسخه متنی

اینجــــا یک کتابخانه دیجیتالی است

با بیش از 100000 منبع الکترونیکی رایگان به زبان فارسی ، عربی و انگلیسی

Python Cookbook 2Nd Edition Jun 1002005 [Electronic resources] - نسخه متنی

David Ascher, Alex Martelli, Anna Ravenscroft

| نمايش فراداده ، افزودن یک نقد و بررسی
افزودن به کتابخانه شخصی
ارسال به دوستان
جستجو در متن کتاب
بیشتر
تنظیمات قلم

فونت

اندازه قلم

+ - پیش فرض

حالت نمایش

روز نیمروز شب
جستجو در لغت نامه
بیشتر
لیست موضوعات
توضیحات
افزودن یادداشت جدید


Recipe 16.8. Checking Whether a String Has Balanced Parentheses


Credit: Peter Cogolo


Problem


You need to check
whether a certain string has balanced parentheses, but regular
expressions are not powerful enough for this task.


Solution


We want a "true" parser to check a
string for balanced parentheses, since parsing theory proves that a
regular expression is not sufficient. Choosing one out of the many
Python parser generators, we'll use David
Beazley's classic but evergreen PLY:

# define token names, and a regular expression per each token
tokens = 'OPEN_PAREN', 'CLOS_PAREN', 'OTHR_CHARS'
t_OPEN_PAREN = r'\('
t_CLOS_PAREN = r'\)'
t_OTHR_CHARS = r'[^( )]+'# RE meaning: one or more non-parentheses
def t_error(t): t.skip(1)
# make the lexer (AKA tokenizer)
import lex
lexer = lex.lex(optimize=1)
# define syntax action-functions, with syntax rules in docstrings
def p_balanced(p):
''' balanced : balanced OPEN_PAREN balanced CLOS_PAREN balanced
| OTHR_CHARS
| '''
if len(p) == 1:
p[0] = ''
elif len(p) == 2:
p[0] = p[1]
else:
p[0] = p[1]+p[2]+p[3]+p[4]+p[5]
def p_error(p): pass
# make the parser (AKA scanner)
import yacc
parser = yacc.yacc( )
def has_balanced_parentheses(s):
if not s: return True
result = parser.parse(s, lexer=lexer)
return s == result


Discussion


Here's an example of use of this
recipe's code:

>> s = 'ba(be, bi(bo, bu))'
>> print s, is_balanced(s)
ba(be, bi(bo, bu)) True
>> s = 'ba(be, bi(bo), bu))'
>> print s, is_balanced(s)
ba(be, bi(bo), bu)) False

The first string has balanced parentheses, but the second one has an
extra closed parenthesis; therefore, its parentheses are not
balanced.

"How do I check a string for balanced
parentheses?" is a frequently asked question about
regular expressions. Programmers without a computer science
background are often surprised to hear that regular expressions just
aren't powerful enough for this apparently simple
task and a more complete form of grammar is required.
(Perl's regular expressions plus arbitrary
embedded expressions
kitchen sink
does sufficewhich just proves they
aren't anywhere near
"regular" expressions any more!)

For this very simplified parsing problem we're
presenting, any real parser is overkilljust loop over the
string's characters, keeping a running count of the
number of open and yet unclosed parentheses encountered at this
point, and return False if the running count ever
goes negative or doesn't go back down to exactly 0
at the end:

def has_bal_par(s):
op = 0
for c in s:
if c=='(':
op += 1
elif c==')':
if op == 0:
return False
op -= 1
return op == 0

However, using a parser when you need to parse is still a better
idea, in general, than hacking up special-purpose code such as this
has_bal_par function. As soon as the problem gets
extended a bit (and problems invariably do grow,
in real life, in often unpredictable directions), a real parser can
grow gracefully and proportionally with the problem, while ad hoc
code often must be thrown away and completely rewritten.

All over the web, you can find oodles of Python packages that are
suitable for lexing and parsing tasks. My favorite, out of all of
them, is still good old PLY, David Beazley's Python
Lexx and Yacc, which reproduces the familiar structure of Unix
commands lexx and yacc while
taking advantage of Python's extra power when
compared to the C language that those Unix commands support.

You can find PLY at http://systems.cs.uchicago.edu/ply/. PLY is a
pure Python package: download it (as a .tgz
compressed archive file), decompress and unarchive it (all reasonable
archiving tools now support this subtask on all platforms), open a
command shell, cd into the directory into which
you unarchived PLY, and run the usual python setup.py
install
, with the proper privileges to be able to write
into your Python installation's
site-packages directory (which privileges those
are depends on how you installed Python, and on what platform
you're running). Briefly, install it just as you
would install any other pure Python package.

As you can see from this recipe, PLY is quite easy to use, if you
know even just the fundamentals of lexing and parsing. First, you
define your grammar's
tokensmake a tuple or list of all their
names (conventionally uppercase) bound to name
tokens at your module's top
level, define for each token a regular expression bound to name
t_token_name (again at
the module's top level), import
lex, and call lex.lex to build
your tokenizer (lexer). Then, define your grammar's
action functions (each of them carries the relevant syntax
ruleproductionin its docstring
in BNF, Backus-Naur Form), import yacc, and call
yacc.yacc to build your parser (scanner). To parse
any string, call the parse method of your parser
with the string as an argument.

All the action is in your grammar's action
functions, as their name implies. Each action function receives as
its single argument p a list of production elements
corresponding to the production that has been matched to invoke that
function; the action function's job is to put into
p[0] whatever you need as "the
result" of that syntax rule getting matched. In this
recipe, we use as results the very strings we have been matching, so
that function is_balanced just needs to check
whether the whole string is matched by the parse operation.

When you run this script the first time, you will see a warning about
a shift/reduce conflict. Don't worry: as any old
hand at yacc can tell you, that's
the yacc equivalent of a rite of passage. If you
want to understand that message in depth, and maybe (if
you're an ambitious person) even do something about
it, open with your favorite browser the
doc/plyl file in the directory in which you
unpacked PLY. That file contains a rather thorough documentation of
PLY. As that file suggests, continue by studying the contents of the
examples directory and then read a textbook
about compilersI suggest Dick Grune and Ceriel J.H. Jacobs,
"Parsing Techniques, a Practical
Guide." The first edition, at the time of this
writing, is freely available for download as a PDF file from
http://www.cs.vu.nl/~dick/PTAPGl, and a
second edition should be available in technical bookstores around the
middle of 2005.


See Also


PLY web page at http://systems.cs.uchicago.edu/ply/; Dick
Grune and Ceriel J.H. Jacobs, "Parsing Techniques, a
Practical Guide," a PDF, downloadable from
http://www.cs.vu.nl/~dick/PTAPGl.

/ 394