Trying out lex. What happens if a token definition is not a regular expression?

Try changing "t_PLUS = r'\+'" to "t_PLUS = '+'".

#!/usr/bin/env python

import ply.lex as lex

tokens = (
  'CHAR',
  'PLUS',
)

t_CHAR    = r'\w+'
#t_PLUS    = r'\+'   # original, valid pattern
t_PLUS    = '+'      # experiment: a bare '+' is not a valid regular expression
t_ignore  = ' \t'

def t_error(t):
    print "Illegal character '%s'" % t.value[0]
    t.lexer.skip(1)

lex.lex()
lex.input("a + b + abc")

while 1:
    tok = lex.token()
    if not tok: break      # No more input
    print tok

Running it gives:

lex: Invalid regular expression for rule 't_PLUS'. nothing to repeat
Traceback (most recent call last):
  File "./20080229_ply00.py", line 18, in ?
    lex.lex()
  File "/usr/lib/python2.3/site-packages/ply/lex.py", line 759, in lex
    raise SyntaxError,"lex: Unable to build lexer."
SyntaxError: lex: Unable to build lexer.
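  • t_ignore apparently does not have to be a regular expression, though? (PLY treats t_ignore as a plain string of characters to skip, not as a pattern.)
  • Changing "t_PLUS = '+'" to "t_PLUS = '\+'" made it work. (In a non-raw string '\+' is not a recognized escape, so the backslash is kept and the pattern is the same as r'\+'.)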
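
The "nothing to repeat" message can be reproduced without PLY. This is a minimal sketch, assuming PLY passes each t_ string to Python's re module, which the wording of the error suggests. The helper name compiles is made up for illustration.

```python
import re

def compiles(pattern):
    """Return True if `pattern` is a valid regular expression (hypothetical helper)."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# A bare '+' is a quantifier with nothing in front of it to repeat.
bare_ok = compiles('+')

# Escaped, it matches a literal plus sign.
escaped_ok = compiles(r'\+')

# re.escape builds the escaped form for you, which helps for longer literals.
escaped_plus = re.escape('+')
matches_plus = re.match(escaped_plus, '+') is not None

print(bare_ok)
print(escaped_ok)
print(matches_plus)
```

For tokens that are plain literal text, wrapping the literal in re.escape is one way to avoid escaping each special character by hand.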