Version 3.8
---------------------
10/02/15: beazley
    Fixed issues related to Python 3.5. Patch contributed by Barry Warsaw.

Version 3.7
---------------------
08/25/15: beazley
    Fixed problems when reading table files from pickled data.

05/07/15: beazley
    Fixed regression in handling of table modules if specified as module
    objects. See https://github.com/dabeaz/ply/issues/63
Version 3.6
---------------------
04/25/15: beazley
    If PLY is unable to create the 'parser.out' or 'parsetab.py' files due
    to permission issues, it now just issues a warning message and
    continues to operate. This could happen if a module using PLY
    is installed in a funny way where tables have to be regenerated, but
    for whatever reason, the user doesn't have write permission on
    the directory where PLY wants to put them.

04/24/15: beazley
    Fixed some issues related to use of packages and table file
    modules. Just to emphasize, PLY now generates its special
    files such as 'parsetab.py' and 'lextab.py' in the *SAME*
    directory as the source file that uses lex() and yacc().

    If for some reason you want to change the name of the table
    module, use the tabmodule and lextab options:

        lexer = lex.lex(lextab='spamlextab')
        parser = yacc.yacc(tabmodule='spamparsetab')

    If you specify a simple name as shown, the module will still be
    created in the same directory as the file invoking lex() or yacc().
    If you want the table files to be placed into a different package,
    then give a fully qualified package name. For example:

        lexer = lex.lex(lextab='pkgname.files.lextab')
        parser = yacc.yacc(tabmodule='pkgname.files.parsetab')

    For this to work, 'pkgname.files' must already exist as a valid
    Python package (i.e., the directories must already exist and be
    set up with the proper __init__.py files, etc.).
Version 3.5
---------------------
04/21/15: beazley
    Added support for defaulted_states in the parser. A
    defaulted_state is a state where the only legal action is a
    reduction of a single grammar rule across all valid input
    tokens. For such states, the rule is reduced and the
    reading of the next lookahead token is delayed until it is
    actually needed at a later point in time.

    This delay in consuming the next lookahead token is a
    potentially important feature in advanced parsing
    applications that require tight interaction between the
    lexer and the parser. For example, a grammar rule can
    modify the lexer state upon reduction and have such changes
    take effect before the next input token is read.

    *** POTENTIAL INCOMPATIBILITY ***
    One potential danger of defaulted_states is that syntax
    errors might be deferred to a later point of processing
    than where they were detected in past versions of PLY.
    Thus, it's possible that your error handling could change
    slightly on the same inputs. defaulted_states do not change
    the overall parsing of the input (i.e., the same grammar is
    accepted).

    If for some reason you need to disable defaulted states,
    you can do this:

        parser = yacc.yacc()
        parser.defaulted_states = {}

04/21/15: beazley
    Fixed debug logging in the parser. It wasn't properly reporting goto states
    on grammar rule reductions.

04/20/15: beazley
    Added the ability for actions to be defined for character literals
    (Issue #32). For example:

        literals = [ '{', '}' ]

        def t_lbrace(t):
            r'\{'
            # Some action
            t.type = '{'
            return t

        def t_rbrace(t):
            r'\}'
            # Some action
            t.type = '}'
            return t

04/19/15: beazley
    Import of the 'parsetab.py' file is now constrained to only consider the
    directory specified by the outputdir argument to yacc(). If not supplied,
    the import will only consider the directory in which the grammar is defined.
    This should greatly reduce problems with the wrong parsetab.py file being
    imported by mistake (for example, if it's found somewhere else on the path
    by accident).

    *** POTENTIAL INCOMPATIBILITY *** It's possible that this might break some
    packaging/deployment setup if PLY was instructed to place its parsetab.py
    in a different location. You'll have to specify a proper outputdir= argument
    to yacc() to fix this if needed.
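
    For instance, a minimal sketch of pinning table generation and lookup to
    an explicit directory (the directory name 'generated' is hypothetical,
    not part of the original note):

        import ply.yacc as yacc

        # Both table generation and the parsetab import are now confined
        # to this directory; it must already exist.
        parser = yacc.yacc(outputdir='generated')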

04/19/15: beazley
    Changed default output directory to be the same as that in which the
    yacc grammar is defined. If your grammar is in a file 'calc.py',
    then the parsetab.py and parser.out files should be generated in the
    same directory as that file. The destination directory can be changed
    using the outputdir= argument to yacc().

04/19/15: beazley
    Changed the parsetab.py file signature slightly so that the parsetab won't
    regenerate if created on a different major version of Python (i.e., a
    parsetab created on Python 2 will work with Python 3).

04/16/15: beazley
    Fixed Issue #44: call_errorfunc() should return the result of errorfunc().

04/16/15: beazley
    Support for versions of Python <2.7 is officially dropped. PLY may work, but
    the unit tests require Python 2.7 or newer.

04/16/15: beazley
    Fixed bug related to calling yacc(start=...). PLY wasn't regenerating the
    table file correctly for this case.

04/16/15: beazley
    Added skipped tests for PyPy and Java. Related to use of Python's -O option.

05/29/13: beazley
    Added filter to make unit tests pass under 'python -3'.
    Reported by Neil Muller.

05/29/13: beazley
    Fixed CPP_INTEGER regex in ply/cpp.py (Issue 21).
    Reported by @vbraun.

05/29/13: beazley
    Fixed yacc validation bugs when from __future__ import unicode_literals
    is being used. Reported by Kenn Knowles.

05/29/13: beazley
    Added support for Travis-CI. Contributed by Kenn Knowles.

05/29/13: beazley
    Added a .gitignore file. Suggested by Kenn Knowles.

05/29/13: beazley
    Fixed validation problems for source files that include a
    different source code encoding specifier. Fix relies on
    the inspect module. Should work on Python 2.6 and newer.
    Not sure about older versions of Python.
    Contributed by Michael Droettboom.

05/21/13: beazley
    Fixed unit tests for yacc to eliminate random failures due to dict hash value
    randomization in Python 3.3.
    Reported by Arfrever.

10/15/12: beazley
    Fixed comment whitespace processing bugs in ply/cpp.py.
    Reported by Alexei Pososin.

10/15/12: beazley
    Fixed token names in ply/ctokens.py to match rule names.
    Reported by Alexei Pososin.

04/26/12: beazley
    Changes to functions available in panic mode error recovery. In previous versions
    of PLY, the following global functions were available for use in the p_error() rule:

        yacc.errok()       # Reset error state
        yacc.token()       # Get the next token
        yacc.restart()     # Reset the parsing stack

    The use of global variables was problematic for code involving multiple parsers
    and frankly was a poor design overall. These functions have been moved to methods
    of the parser instance created by the yacc() function. You should write code like
    this:

        def p_error(p):
            ...
            parser.errok()

        parser = yacc.yacc()

    *** POTENTIAL INCOMPATIBILITY *** The original global functions now issue a
    DeprecationWarning.

04/19/12: beazley
    Fixed some problems with line and position tracking and the use of error
    symbols. If you have a grammar rule involving an error rule like this:

        def p_assignment_bad(p):
            '''assignment : location EQUALS error SEMI'''
            ...

    You can now do line and position tracking on the error token. For example:

        def p_assignment_bad(p):
            '''assignment : location EQUALS error SEMI'''
            start_line = p.lineno(3)
            start_pos = p.lexpos(3)

    If the tracking=True option is supplied to parse(), you can additionally get
    spans:

        def p_assignment_bad(p):
            '''assignment : location EQUALS error SEMI'''
            start_line, end_line = p.linespan(3)
            start_pos, end_pos = p.lexspan(3)

    Note that error handling is still a hairy thing in PLY. This won't work
    unless your lexer is providing accurate information. Please report bugs.
    Suggested by a bug reported by Davis Herring.

04/18/12: beazley
    Change to doc string handling in the lex module. Regex patterns are now first
    pulled from a function's .regex attribute. If that doesn't exist, then
    .__doc__ is checked as a fallback. The @TOKEN decorator now sets the .regex
    attribute of a function instead of its doc string.
    Change suggested by Kristoffer Ellersgaard Koch.
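
    A minimal sketch of the two equivalent spellings under the new scheme
    (the token names and pattern are illustrative, not from the original note):

        from ply.lex import TOKEN

        identifier = r'[A-Za-z_][A-Za-z0-9_]*'

        @TOKEN(identifier)          # sets t_ID.regex to the pattern
        def t_ID(t):
            return t

        # Equivalent: attach the pattern to the function manually.
        def t_NAME(t):
            return t
        t_NAME.regex = identifier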

04/18/12: beazley
    Fixed issue #1: Fixed _tabversion. It should use __tabversion__ instead of __version__.
    Reported by Daniele Tricoli.

04/18/12: beazley
    Fixed issue #8: Literals empty list causes IndexError.
    Reported by Walter Nissen.

04/18/12: beazley
    Fixed issue #12: Typo in code snippet in documentation.
    Reported by florianschanda.

04/18/12: beazley
    Fixed issue #10: Correctly escape t_XOREQUAL pattern.
    Reported by Andy Kittner.

Version 3.4
---------------------
02/17/11: beazley
    Minor patch to make cpp.py compatible with Python 3. Note: This
    is an experimental file not currently used by the rest of PLY.

02/17/11: beazley
    Fixed setup.py trove classifiers to properly list PLY as
    Python 3 compatible.

01/02/11: beazley
    Migration of repository to github.

Version 3.3
-----------------------------
08/25/09: beazley
    Fixed issue 15 related to the set_lineno() method in yacc. Reported by
    mdsherry.

08/25/09: beazley
    Fixed a bug related to regular expression compilation flags not being
    properly stored in lextab.py files created by the lexer when running
    in optimize mode. Reported by Bruce Frederiksen.

Version 3.2
-----------------------------
03/24/09: beazley
    Added an extra check to not print duplicated warning messages
    about reduce/reduce conflicts.

03/24/09: beazley
    Switched PLY over to a BSD license.

03/23/09: beazley
    Performance optimization. Discovered a few places to make
    speedups in LR table generation.

03/23/09: beazley
    New warning message. PLY now warns about rules never
    reduced due to reduce/reduce conflicts. Suggested by
    Bruce Frederiksen.

03/23/09: beazley
    Some clean-up of warning messages related to reduce/reduce errors.

03/23/09: beazley
    Added a new picklefile option to yacc() to write the parsing
    tables to a filename using the pickle module. Here is how
    it works:

        yacc(picklefile="parsetab.p")

    This option can be used if the normal parsetab.py file is
    extremely large. For example, on jython, it is impossible
    to read parsing tables if the parsetab.py exceeds a certain
    threshold.

    The filename supplied to the picklefile option is opened
    relative to the current working directory of the Python
    interpreter. If you need to refer to the file elsewhere,
    you will need to supply an absolute or relative path.

    For maximum portability, the pickle file is written
    using protocol 0.

03/13/09: beazley
    Fixed a bug in parser.out generation where the rule numbers
    were off by one.

03/13/09: beazley
    Fixed a string formatting bug with one of the error messages.
    Reported by Richard Reitmeyer.

Version 3.1
-----------------------------
02/28/09: beazley
    Fixed broken start argument to yacc(). PLY-3.0 broke this
    feature by accident.

02/28/09: beazley
    Fixed debugging output. yacc() no longer reports shift/reduce
    or reduce/reduce conflicts if debugging is turned off. This
    restores behavior similar to PLY-2.5. Reported by Andrew Waters.

Version 3.0
-----------------------------
02/03/09: beazley
    Fixed missing lexer attribute on certain tokens when
    invoking the parser p_error() function. Reported by
    Bart Whiteley.

02/02/09: beazley
    The lex() command now does all error-reporting and diagnostics
    using the logging module interface. Pass in a Logger object
    using the errorlog parameter to specify a different logger.
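
    For example, a minimal sketch of routing lex's messages to a custom
    logger (the token set and logger name are illustrative):

        import logging
        import ply.lex as lex

        tokens = ('NUMBER',)
        t_NUMBER = r'\d+'
        t_ignore = ' \t'

        def t_error(t):
            t.lexer.skip(1)

        logging.basicConfig(level=logging.ERROR)
        log = logging.getLogger('ply.lex')      # any Logger works

        lexer = lex.lex(errorlog=log)           # warnings/errors now go to 'log'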

02/02/09: beazley
    Refactored ply.lex to use a more object-oriented and organized
    approach to collecting lexer information.

02/01/09: beazley
    Removed the nowarn option from lex(). All output is controlled
    by passing in a logger object. Just pass in a logger with a high
    level setting to suppress output. This argument was never
    documented to begin with so hopefully no one was relying upon it.

02/01/09: beazley
    Discovered and removed a dead if-statement in the lexer. This
    resulted in a 6-7% speedup in lexing when I tested it.

01/13/09: beazley
    Minor change to the procedure for signalling a syntax error in a
    production rule. A normal SyntaxError exception should be raised
    instead of yacc.SyntaxError.
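
    For instance, a sketch of rejecting a semantically bad production
    (the rule and the reserved_names check are hypothetical):

        def p_declaration(p):
            'declaration : TYPE NAME SEMI'
            if p[2] in reserved_names:   # hypothetical validity check
                raise SyntaxError        # drops the parser into error recovery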

01/13/09: beazley
    Added a new method p.set_lineno(n, lineno) that can be used to set the
    line number of symbol n in grammar rules. This simplifies manual
    tracking of line numbers.
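
    For example, a sketch of propagating a line number to the left-hand
    side of a rule (the rule itself is illustrative):

        def p_statement(p):
            'statement : expression SEMI'
            p[0] = p[1]
            p.set_lineno(0, p.lineno(1))   # result inherits expression's line number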

01/11/09: beazley
    Vastly improved debugging support for yacc.parse(). Instead of passing
    debug as an integer, you can supply a Logger object (see the logging
    module). Messages will be generated at the ERROR, INFO, and DEBUG
    logging levels, each level providing progressively more information.
    The debugging trace also shows states, grammar rules, values passed
    into grammar rules, and the result of each reduction.
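
    A minimal sketch of capturing the trace in a file, assuming a parser
    built with yacc.yacc() and input text in data (the filename is arbitrary):

        import logging

        logging.basicConfig(filename='parser.log', level=logging.DEBUG,
                            format='%(message)s')
        log = logging.getLogger()

        result = parser.parse(data, debug=log)   # full trace lands in parser.log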

01/09/09: beazley
    The yacc() command now does all error-reporting and diagnostics using
    the interface of the logging module. Use the errorlog parameter to
    specify a logging object for error messages. Use the debuglog parameter
    to specify a logging object for the 'parser.out' output.

01/09/09: beazley
    *HUGE* refactoring of the ply.yacc() implementation. The high-level
    user interface is backwards compatible, but the internals are completely
    reorganized into classes. No more global variables. The internals
    are also more extensible. For example, you can use the classes to
    construct a LALR(1) parser in an entirely different manner than
    what is currently the case. Documentation is forthcoming.

01/07/09: beazley
    Various cleanup and refactoring of yacc internals.

01/06/09: beazley
    Fixed a bug with precedence assignment. yacc was assigning the precedence
    to each rule based on the left-most token, when in fact, it should have been
    using the right-most token. Reported by Bruce Frederiksen.

11/27/08: beazley
    Numerous changes to support Python 3.0 including removal of deprecated
    statements (e.g., has_key) and the addition of compatibility code
    to emulate features from Python 2 that have been removed, but which
    are needed. Fixed the unit testing suite to work with Python 3.0.
    The code should be backwards compatible with Python 2.

11/26/08: beazley
    Loosened the rules on what kind of objects can be passed in as the
    "module" parameter to lex() and yacc(). Previously, you could only use
    a module or an instance. Now, PLY just uses dir() to get a list of
    symbols on whatever the object is without regard for its type.

11/26/08: beazley
    Changed all except: statements to be compatible with Python 2.x/3.x syntax.

11/26/08: beazley
    Changed all raise Exception, value statements to raise Exception(value) for
    forward compatibility.

11/26/08: beazley
    Removed all print statements from lex and yacc, using sys.stdout and sys.stderr
    directly. Preparation for Python 3.0 support.

11/04/08: beazley
    Fixed a bug with referring to symbols on the parsing stack using negative
    indices.

05/29/08: beazley
    Completely revamped the testing system to use the unittest module for everything.
    Added additional tests to cover new errors/warnings.

Version 2.5
-----------------------------
05/28/08: beazley
    Fixed a bug with writing lex-tables in optimized mode and start states.
    Reported by Kevin Henry.

Version 2.4
-----------------------------
05/04/08: beazley
    A version number is now embedded in the table file signature so that
    yacc can more gracefully accommodate changes to the output format
    in the future.

05/04/08: beazley
    Removed undocumented .pushback() method on grammar productions. I'm
    not sure this ever worked and can't recall ever using it. Might have
    been an abandoned idea that never really got fleshed out. This
    feature was never described or tested so removing it is hopefully
    harmless.

05/04/08: beazley
    Added extra error checking to yacc() to detect precedence rules defined
    for undefined terminal symbols. This allows yacc() to detect a potential
    problem that can be really tricky to debug if no warning message or error
    message is generated about it.
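
    For example, the kind of mistake this check now catches (the token
    names are illustrative):

        tokens = ('PLUS', 'TIMES')

        precedence = (
            ('left', 'PLUS'),
            ('left', 'TIMEZ'),    # typo for 'TIMES': an undefined terminal;
                                  # yacc() now warns instead of silently
                                  # assigning no precedence
        )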

05/04/08: beazley
    lex() now has an outputdir that can specify the output directory for
    tables when running in optimize mode. For example:

        lexer = lex.lex(optimize=True, lextab="ltab", outputdir="foo/bar")

    The behavior of specifying a table module and output directory is now
    more aligned with the behavior of yacc().

05/04/08: beazley
    [Issue 9]
    Fixed filename bug when specifying the modulename in lex() and yacc().
    If you specified options such as the following:

        parser = yacc.yacc(tabmodule="foo.bar.parsetab", outputdir="foo/bar")

    yacc would create a file "foo.bar.parsetab.py" in the given directory.
    Now, it simply generates a file "parsetab.py" in that directory.
    Bug reported by cptbinho.

05/04/08: beazley
    Slight modification to lex() and yacc() to allow their table files
    to be loaded from a previously loaded module. This might make
    it easier to load the parsing tables from a complicated package
    structure. For example:

        import foo.bar.spam.parsetab as parsetab
        parser = yacc.yacc(tabmodule=parsetab)

    Note: lex and yacc will never regenerate the table file if used
    in this form---you will get a warning message instead.
    This idea suggested by Brian Clapper.

04/28/08: beazley
    Fixed a bug with p_error() functions not being picked up correctly
    when running in yacc(optimize=1) mode. Patch contributed by
    Bart Whiteley.

02/28/08: beazley
    Fixed a bug with 'nonassoc' precedence rules. Basically the
    'nonassoc' precedence was being ignored and not producing the correct
    run-time behavior in the parser.

02/16/08: beazley
    Slight relaxation of what the input() method to a lexer will
    accept as a string. Instead of testing to see
    if the input is a string or unicode string, it checks to see
    if the input object looks like it contains string data.
    This change makes it possible to pass string-like objects
    in as input. For example, the object returned by mmap:

        import mmap, os
        data = mmap.mmap(os.open(filename, os.O_RDONLY),
                         os.path.getsize(filename),
                         access=mmap.ACCESS_READ)
        lexer.input(data)

11/29/07: beazley
    Modification of ply.lex to allow token functions to be aliased.
    This is subtle, but it makes it easier to create libraries and
    to reuse token specifications. For example, suppose you defined
    a function like this:

        def number(t):
            r'\d+'
            t.value = int(t.value)
            return t

    This change would allow you to define a token rule as follows:

        t_NUMBER = number

    In this case, the token type will be set to 'NUMBER' and use
    the associated number() function to process tokens.

11/28/07: beazley
    Slight modification to lex and yacc to grab symbols from both
    the local and global dictionaries of the caller. This
    modification allows lexers and parsers to be defined using
    inner functions and closures.
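
    For example, a minimal sketch of a lexer built entirely inside a
    function, which works because lex() now also sees the enclosing local
    names (the token set is illustrative):

        import ply.lex as lex

        def make_lexer():
            tokens = ('NUMBER',)
            t_ignore = ' \t'

            def t_NUMBER(t):
                r'\d+'
                t.value = int(t.value)
                return t

            def t_error(t):
                t.lexer.skip(1)

            return lex.lex()    # rules are picked up from this function's locals

        lexer = make_lexer()
        lexer.input('1 2 3')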

11/28/07: beazley
    Performance optimization: The lexer.lexmatch and t.lexer
    attributes are no longer set for lexer tokens that are not
    defined by functions. The only normal use of these attributes
    would be in lexer rules that need to perform some kind of
    special processing. Thus, it doesn't make any sense to set
    them on every token.

    *** POTENTIAL INCOMPATIBILITY *** This might break code
    that is mucking around with internal lexer state in some
    sort of magical way.

11/27/07: beazley
    Added the ability to put the parser into error-handling mode
    from within a normal production. To do this, simply raise
    a yacc.SyntaxError exception like this:

        def p_some_production(p):
            'some_production : prod1 prod2'
            ...
            raise yacc.SyntaxError      # Signal an error

    A number of things happen after this occurs:

        - The last symbol shifted onto the symbol stack is discarded
          and parser state backed up to what it was before the
          rule reduction.

        - The current lookahead symbol is saved and replaced by
          the 'error' symbol.

        - The parser enters error recovery mode where it tries
          to either reduce the 'error' rule or it starts
          discarding items off of the stack until the parser
          resets.

    When an error is manually set, the parser does *not* call
    the p_error() function (if any is defined).
    *** NEW FEATURE *** Suggested on the mailing list

11/27/07: beazley
    Fixed structure bug in examples/ansic. Reported by Dion Blazakis.

11/27/07: beazley
    Fixed a bug in the lexer related to start conditions and ignored
    token rules. If a rule was defined that changed state, but
    returned no token, the lexer could be left in an inconsistent
    state. Reported by

11/27/07: beazley
    Modified setup.py to support Python Eggs. Patch contributed by
    Simon Cross.

11/09/07: beazley
    Fixed a bug in error handling in yacc. If a syntax error occurred and the
    parser rolled the entire parse stack back, the parser would be left in an
    inconsistent state that would cause it to trigger incorrect actions on
    subsequent input. Reported by Ton Biegstraaten, Justin King, and others.

11/09/07: beazley
    Fixed a bug when passing empty input strings to yacc.parse(). This
    would result in an error message about "No input given". Reported
    by Andrew Dalke.

Version 2.3
-----------------------------
02/20/07: beazley
    Fixed a bug with character literals if the literal '.' appeared as the
    last symbol of a grammar rule. Reported by Ales Smrcka.

02/19/07: beazley
    Warning messages are now redirected to stderr instead of being printed
    to standard output.

02/19/07: beazley
    Added a warning message to lex.py if it detects a literal backslash
    character inside the t_ignore declaration. This is to help avoid
    problems that might occur if someone accidentally defines t_ignore
    as a Python raw string. For example:

        t_ignore = r' \t'

    The idea for this is from an email I received from David Cimimi who
    reported bizarre behavior in lexing as a result of defining t_ignore
    as a raw string by accident.

02/18/07: beazley
    Performance improvements. Made some changes to the internal
    table organization and LR parser to improve parsing performance.

02/18/07: beazley
    Automatic tracking of line number and position information must now be
    enabled by a special flag to parse(). For example:

        yacc.parse(data, tracking=True)

    In many applications, it's just not that important to have the
    parser automatically track all line numbers. By making this an
    optional feature, it allows the parser to run significantly faster
    (more than a 20% speed increase in many cases). Note: positional
    information is always available for raw tokens---this change only
    applies to positional information associated with nonterminal
    grammar symbols.
    *** POTENTIAL INCOMPATIBILITY ***

02/18/07: beazley
    Yacc no longer supports extended slices of grammar productions.
    However, it does support regular slices. For example:

        def p_foo(p):
            '''foo : a b c d e'''
            p[0] = p[1:3]

    This change is a performance improvement to the parser---it streamlines
    normal access to the grammar values since slices are now handled in
    a __getslice__() method as opposed to __getitem__().

02/12/07: beazley
    Fixed a bug in the handling of token names when combined with
    start conditions. Bug reported by Todd O'Bryan.

Version 2.2
------------------------------
11/01/06: beazley
    Added lexpos() and lexspan() methods to grammar symbols. These
    mirror the same functionality of lineno() and linespan(). For
    example:

        def p_expr(p):
            'expr : expr PLUS expr'
            p.lexpos(1)               # Lexing position of left-hand expression
            p.lexpos(2)               # Lexing position of PLUS
            start, end = p.lexspan(3) # Lexing range of right-hand expression

11/01/06: beazley
    Minor change to error handling. The recommended way to skip characters
    in the input is to use t.lexer.skip() as shown here:

        def t_error(t):
            print "Illegal character '%s'" % t.value[0]
            t.lexer.skip(1)

    The old approach of just using t.skip(1) will still work, but won't
    be documented.

10/31/06: beazley
    Discarded tokens can now be specified as simple strings instead of
    functions. To do this, simply include the text "ignore_" in the
    token declaration. For example:

        t_ignore_cppcomment = r'//.*'

    Previously, this had to be done with a function. For example:

        def t_ignore_cppcomment(t):
            r'//.*'
            pass

    If start conditions/states are being used, state names should appear
    before the "ignore_" text.
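
    For example, a sketch of a state-specific ignore rule (the 'comment'
    state is illustrative and would have to be declared in states):

        t_comment_ignore_cppcomment = r'//.*'   # discarded only in the 'comment' state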

10/19/06: beazley
    The Lex module now provides support for flex-style start conditions
    as described at http://www.gnu.org/software/flex/manual/html_chapter/flex_11.html.
    Please refer to this document to understand this change note. Refer to
    the PLY documentation for a PLY-specific explanation of how this works.

    To use start conditions, you first need to declare a set of states in
    your lexer file:

        states = (
            ('foo', 'exclusive'),
            ('bar', 'inclusive')
        )

    This serves the same role as the %s and %x specifiers in flex.

    Once a state has been declared, tokens for that state can be
    declared by defining rules of the form t_state_TOK. For example:

        t_PLUS = r'\+'           # Rule defined in INITIAL state
        t_foo_NUM = r'\d+'       # Rule defined in foo state
        t_bar_NUM = r'\d+'       # Rule defined in bar state

        t_foo_bar_NUM = r'\d+'   # Rule defined in both foo and bar
        t_ANY_NUM = r'\d+'       # Rule defined in all states

    In addition to defining tokens for each state, the t_ignore and t_error
    specifications can be customized for specific states. For example:

        t_foo_ignore = " "       # Ignored characters for foo state

        def t_bar_error(t):
            # Handle errors in bar state
            ...

    Within token rules, the following methods can be used to change states:

        def t_TOKNAME(t):
            t.lexer.begin('foo')        # Begin state 'foo'
            t.lexer.push_state('foo')   # Begin state 'foo', push old state
                                        # onto a stack
            t.lexer.pop_state()         # Restore previous state
            t.lexer.current_state()     # Returns name of current state

    These methods mirror the BEGIN(), yy_push_state(), yy_pop_state(), and
    yy_top_state() functions in flex.

    Start states are one way to write sub-lexers.
    For example, the lexer or parser might instruct the lexer to start
    generating a different set of tokens depending on the context.

    example/yply/ylex.py shows the use of start states to grab C/C++
    code fragments out of traditional yacc specification files.

    *** NEW FEATURE *** Suggested by Daniel Larraz, with whom I also
    discussed various aspects of the design.

10/19/06: beazley
    Minor change to the way in which yacc.py was reporting shift/reduce
    conflicts. Although the underlying LALR(1) algorithm was correct,
    PLY was under-reporting the number of conflicts compared to yacc/bison
    when precedence rules were in effect. This change should make PLY
    report the same number of conflicts as yacc.

10/19/06: beazley
    Modified yacc so that grammar rules could also include the '-'
    character. For example:

        def p_expr_list(p):
            'expression-list : expression-list expression'

    Suggested by Oldrich Jedlicka.

10/18/06: beazley
    Attribute lexer.lexmatch added so that token rules can access the re
    match object that was generated. For example:

        def t_FOO(t):
            r'some regex'
            m = t.lexer.lexmatch
            # Do something with m

    This may be useful if you want to access named groups specified within
    the regex for a specific token. Suggested by Oldrich Jedlicka.

10/16/06: beazley
    Changed the error message that results if an illegal character
    is encountered and no default error function is defined in lex.
    The exception is now more informative about the actual cause of
    the error.

Version 2.1
------------------------------
10/02/06: beazley
    The last Lexer object built by lex() can be found in lex.lexer.
    The last Parser object built by yacc() can be found in yacc.parser.

10/02/06: beazley
    New example added: examples/yply

    This example uses PLY to convert Unix-yacc specification files to
    PLY programs with the same grammar. This may be useful if you
    want to convert a grammar from bison/yacc to use with PLY.

10/02/06: beazley
    Added support for a start symbol to be specified in the yacc
    input file itself. Just do this:

        start = 'name'

    where 'name' matches some grammar rule. For example:

        def p_name(p):
            'name : A B C'
            ...

    This mirrors the functionality of the yacc %start specifier.

09/30/06: beazley
    Some new examples added:

        examples/GardenSnake : A simple indentation-based language similar
                               to Python. Shows how you might handle
                               whitespace. Contributed by Andrew Dalke.

        examples/BASIC       : An implementation of 1964 Dartmouth BASIC.
                               Contributed by Dave against his better
                               judgement.

09/28/06: beazley
    Minor patch to allow named groups to be used in lex regular
    expression rules. For example:

        t_QSTRING = r'''(?P<quote>['"]).*?(?P=quote)'''

    Patch submitted by Adam Ring.

09/28/06: beazley
    LALR(1) is now the default parsing method. To use SLR, use
    yacc.yacc(method="SLR"). Note: there is no performance impact
    on parsing when using LALR(1) instead of SLR. However, constructing
    the parsing tables will take a little longer.

09/26/06: beazley
    Change to line number tracking. To modify line numbers, modify
    the line number of the lexer itself. For example:

        def t_NEWLINE(t):
            r'\n'
            t.lexer.lineno += 1

    This modification is both a cleanup and a performance optimization.
    In past versions, lex was monitoring every token for changes in
    the line number. This extra processing is unnecessary for the vast
    majority of tokens. Thus, this new approach cleans it up a bit.

    *** POTENTIAL INCOMPATIBILITY ***
    You will need to change code in your lexer that updates the line
    number. For example, "t.lineno += 1" becomes "t.lexer.lineno += 1".

09/26/06: beazley
    Added the lexing position to tokens as an attribute lexpos. This
    is the raw index into the input text at which a token appears.
    This information can be used to compute column numbers and other
    details (e.g., scan backwards from lexpos to the first newline
    to get a column position).
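
    For example, a minimal sketch of computing a 1-based column number from
    lexpos (the helper name is illustrative):

        def find_column(text, token):
            # Scan backwards from the token for the most recent newline.
            line_start = text.rfind('\n', 0, token.lexpos) + 1
            return (token.lexpos - line_start) + 1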

09/25/06: beazley
    Changed the name of the __copy__() method on the Lexer class
    to clone(). This is used to clone a Lexer object (e.g., if
    you're running different lexers at the same time).

09/21/06: beazley
    Limitations related to the use of the re module have been eliminated.
    Several users reported problems with regular expressions exceeding
    more than 100 named groups. To solve this, lex.py is now capable
    of automatically splitting its master regular expression into
    smaller expressions as needed. This should, in theory, make it
    possible to specify an arbitrarily large number of tokens.

09/21/06: beazley
    Improved error checking in lex.py. Rules that match the empty string
    are now rejected (otherwise they cause the lexer to enter an infinite
    loop). An extra check for rules containing '#' has also been added.
    Since lex compiles regular expressions in verbose mode, in which '#' is
    interpreted as a regex comment, it is critical to use '\#' instead.
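
    For example (the token name is illustrative):

        t_HASH = r'\#'    # matches a literal '#'; an unescaped '#' would
                          # start a VERBOSE-mode comment and match nothing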

09/18/06: beazley
    Added a @TOKEN decorator function to lex.py that can be used to
    define token rules where the documentation string might be computed
    in some way.

        digit = r'([0-9])'
        nondigit = r'([_A-Za-z])'
        identifier = r'(' + nondigit + r'(' + digit + r'|' + nondigit + r')*)'

        from ply.lex import TOKEN

        @TOKEN(identifier)
        def t_ID(t):
            # Do whatever
            ...

    The @TOKEN decorator merely sets the documentation string of the
    associated token function as needed for lex to work.

    Note: An alternative solution is the following:

        def t_ID(t):
            # Do whatever
            ...

        t_ID.__doc__ = identifier

    Note: Decorators require the use of Python 2.4 or later. If compatibility
    with older versions is needed, use the latter solution.

    The need for this feature was suggested by Cem Karan.

09/14/06: beazley
    Support for single-character literal tokens has been added to yacc.
    These literals must be enclosed in quotes. For example:

        def p_expr(p):
            "expr : expr '+' expr"
            ...

        def p_expr(p):
            'expr : expr "-" expr'
            ...

    In addition to this, it is necessary to tell the lexer module about
    literal characters. This is done by defining the variable 'literals'
    as a list of characters. This should be defined in the module that
    invokes the lex.lex() function. For example:

        literals = ['+', '-', '*', '/', '(', ')', '=']

    or simply

        literals = '+-*/()='

    It is important to note that literals can only be a single character.
    When the lexer fails to match a token using its normal regular expression
    rules, it will check the current character against the literal list.
    If found, it will be returned with a token type set to match the literal
    character. Otherwise, an illegal character will be signalled.

09/14/06: beazley
    Modified PLY to install itself as a proper Python package called 'ply'.
    This will make it a little more friendly to other modules. This
    changes the usage of PLY only slightly. Just do this to import the
    modules:

        import ply.lex as lex
        import ply.yacc as yacc

    Alternatively, you can do this:

        from ply import *

    which imports both the lex and yacc modules.
    Change suggested by Lee June.

09/13/06: beazley
    Changed the handling of negative indices when used in production rules.
    A negative production index now accesses already parsed symbols on the
    parsing stack. For example:

        def p_foo(p):
            "foo : A B C D"
            print p[1]       # Value of 'A' symbol
            print p[2]       # Value of 'B' symbol
            print p[-1]      # Value of whatever symbol appears before A
                             # on the parsing stack.

            p[0] = some_val  # Sets the value of the 'foo' grammar symbol

    This behavior makes it easier to work with embedded actions within the
    parsing rules. For example, in C-yacc, it is possible to write code like
    this:

        bar: A { printf("seen an A = %d\n", $1); } B { do_stuff; }

    In this example, the printf() code executes immediately after A has been
    parsed. Within the embedded action code, $1 refers to the A symbol on
    the stack.

    To perform the equivalent action in PLY, you need to write a pair
    of rules like this:

        def p_bar(p):
            "bar : A seen_A B"
            do_stuff

        def p_seen_A(p):
            "seen_A :"
            print "seen an A =", p[-1]

    The second rule "seen_A" is merely an empty production which should be
    reduced as soon as A is parsed in the "bar" rule above. The
    negative index p[-1] is used to access whatever symbol appeared
    before the seen_A symbol.

    This feature also makes it possible to support inherited attributes.
    For example:

        def p_decl(p):
            "decl : scope name"

        def p_scope(p):
            """scope : GLOBAL
                     | LOCAL"""
            p[0] = p[1]

        def p_name(p):
            "name : ID"
            if p[-1] == "GLOBAL":
                # ...
            elif p[-1] == "LOCAL":
                # ...

    In this case, the name rule is inheriting an attribute from the
    scope declaration that precedes it.

    *** POTENTIAL INCOMPATIBILITY ***
    If you are currently using negative indices within existing grammar rules,
    your code will break. This should be extremely rare, if not non-existent,
    in most cases. The argument to various grammar rules is usually not
    processed in the same way as a list of items.

Version 2.0
------------------------------
09/07/06: beazley
    Major cleanup and refactoring of the LR table generation code. Both SLR
    and LALR(1) table generation is now performed by the same code base with
    only minor extensions for extra LALR(1) processing.

09/07/06: beazley
    Completely reimplemented the entire LALR(1) parsing engine to use the
    DeRemer and Pennello algorithm for calculating lookahead sets. This
    significantly improves the performance of generating LALR(1) tables
    and has the added feature of actually working correctly! If you
    experienced weird behavior with LALR(1) in prior releases, this should
    hopefully resolve all of those problems. Many thanks to
    Andrew Waters and Markus Schoepflin for submitting bug reports
    and helping me test out the revised LALR(1) support.

Version 1.8
------------------------------
08/02/06: beazley
    Fixed a problem related to the handling of default actions in LALR(1)
    parsing. If you experienced subtle and/or bizarre behavior when trying
    to use the LALR(1) engine, this may correct those problems. Patch
    contributed by Russ Cox. Note: This patch has been superseded by
    revisions for LALR(1) parsing in PLY-2.0.

08/02/06: beazley
    Added support for slicing of productions in yacc.
    Patch contributed by Patrick Mezard.

Version 1.7
------------------------------
03/02/06: beazley
    Fixed an infinite recursion problem in the ReduceToTerminals() function
    that would sometimes come up in LALR(1) table generation. Reported by
    Markus Schoepflin.

03/01/06: beazley
    Added "reflags" argument to lex(). For example:

        lex.lex(reflags=re.UNICODE)

    This can be used to specify optional flags to the re.compile() function
    used inside the lexer. This may be necessary for special situations such
    as processing Unicode (e.g., if you want escapes like \w and \b to consult
    the Unicode character property database). The need for this was suggested
    by Andreas Jung.

03/01/06: beazley
    Fixed a bug with an uninitialized variable on repeated instantiations of parser
    objects when the write_tables=0 argument was used. Reported by Michael Brown.

03/01/06: beazley
    Modified lex.py to accept Unicode strings both as the regular expressions for
    tokens and as input. Hopefully this is the only change needed for Unicode support.
    Patch contributed by Johan Dahl.

03/01/06: beazley
    Modified the class-based interface to work with new-style or old-style classes.
    Patch contributed by Michael Brown (although I tweaked it slightly so it would work
    with older versions of Python).

Version 1.6
------------------------------
05/27/05: beazley
    Incorporated patch contributed by Christopher Stawarz to fix an extremely
    devious bug in LALR(1) parser generation. This patch should fix problems
    numerous people reported with LALR parsing.

05/27/05: beazley
    Fixed problem with lex.py copy constructor. Reported by Dave Aitel, Aaron Lav,
    and Thad Austin.

05/27/05: beazley
    Added outputdir option to yacc() to control output directory. Contributed
    by Christopher Stawarz.

05/27/05: beazley
    Added rununit.py test script to run tests using the Python unittest module.
    Contributed by Miki Tebeka.

Version 1.5
------------------------------
05/26/04: beazley
    Major enhancement. LALR(1) parsing support is now working.
    This feature was implemented by Elias Ioup (ezioup@alumni.uchicago.edu)
    and optimized by David Beazley. To use LALR(1) parsing do
    the following:

        yacc.yacc(method="LALR")

    Computing LALR(1) parsing tables takes about twice as long as
    the default SLR method. However, LALR(1) allows you to handle
    more complex grammars. For example, the ANSI C grammar
    (in example/ansic) has 13 shift-reduce conflicts with SLR, but
    only has 1 shift-reduce conflict with LALR(1).

05/20/04: beazley
    Added a __len__ method to parser production lists. Can
    be used in parser rules like this (note that len(p) counts the
    left-hand-side slot p[0] as well):

        def p_somerule(p):
            """a : B C D
               | E F"""
            if len(p) == 4:
                # Must have been the first rule
                ...
            elif len(p) == 3:
                # Must be the second rule
                ...

    Suggested by Joshua Gerth and others.

Version 1.4
------------------------------
04/23/04: beazley
    Incorporated a variety of patches contributed by Eric Raymond.
    These include:

        0. Cleans up some comments so they don't wrap on an 80-column display.
        1. Directs compiler errors to stderr where they belong.
        2. Implements and documents automatic line counting when \n is ignored.
        3. Changes the way progress messages are dumped when debugging is on.
           The new format is both less verbose and conveys more information than
           the old, including shift and reduce actions.

04/23/04: beazley
    Added a Python setup.py file to simplify installation. Contributed
    by Adam Kerrison.

04/23/04: beazley
    Added patches contributed by Adam Kerrison.

        - Some output is now only shown when debugging is enabled. This
          means that PLY will be completely silent when not in debugging mode.

        - An optional parameter "write_tables" can be passed to yacc() to
          control whether or not parsing tables are written. By default,
          it is true, but it can be turned off if you don't want the yacc
          table file. Note: disabling this will cause yacc() to regenerate
          the parsing table each time.

04/23/04: beazley
    Added patches contributed by David McNab. This patch adds two
    features:

        - The parser can be supplied as a class instead of a module.
          For an example of this, see the example/classcalc directory.

        - Debugging output can be directed to a filename of the user's
          choice. Use:

              yacc(debugfile="somefile.out")

Version 1.3
------------------------------
12/10/02: jmdyck
    Various minor adjustments to the code that Dave checked in today.
    Updated test/yacc_{inf,unused}.exp to reflect today's changes.

12/10/02: beazley
    Incorporated a variety of minor bug fixes to empty production
    handling and infinite recursion checking. Contributed by
    Michael Dyck.

12/10/02: beazley
    Removed bogus recover() method call in yacc.restart().

Version 1.2
------------------------------
11/27/02: beazley
    Lexer and parser objects are now available as an attribute
    of tokens and slices respectively. For example:

        def t_NUMBER(t):
            r'\d+'
            print t.lexer

        def p_expr_plus(t):
            'expr : expr PLUS expr'
            print t.lexer
            print t.parser

    This can be used for state management (if needed).

10/31/02: beazley
    Modified yacc.py to work with Python optimize mode. To make
    this work, you need to use:

        yacc.yacc(optimize=1)

    Furthermore, you need to first run Python in normal mode
    to generate the necessary parsetab.py files. After that,
    you can use python -O or python -OO.

    Note: optimized mode turns off a lot of error checking.
    Only use when you are sure that your grammar is working.
    Make sure parsetab.py is up to date!

10/30/02: beazley
    Added cloning of Lexer objects. For example:

        import copy
        l = lex.lex()
        lc = copy.copy(l)

        l.input("Some text")
        lc.input("Some other text")
        ...

    This might be useful if the same "lexer" is meant to
    be used in different contexts---or if multiple lexers
    are running concurrently.

10/30/02: beazley
    Fixed subtle bug with first set computation and empty productions.
    Patch submitted by Michael Dyck.

10/30/02: beazley
    Fixed error messages to use "filename:line: message" instead
    of "filename:line. message". This makes error reporting more
    friendly to emacs. Patch submitted by François Pinard.

10/30/02: beazley
    Improvements to parser.out file. Terminals and nonterminals
    are sorted instead of being printed in random order.
    Patch submitted by François Pinard.

10/30/02: beazley
    Improvements to parser.out file output. Rules are now printed
    in a way that's easier to understand. Contributed by Russ Cox.

10/30/02: beazley
    Added 'nonassoc' associativity support. This can be used
    to disable the chaining of operators like a < b < c.
    To use, simply specify 'nonassoc' in the precedence table:

        precedence = (
            ('nonassoc', 'LESSTHAN', 'GREATERTHAN'),  # Nonassociative operators
            ('left', 'PLUS', 'MINUS'),
            ('left', 'TIMES', 'DIVIDE'),
            ('right', 'UMINUS'),                      # Unary minus operator
        )

    Patch contributed by Russ Cox.

10/30/02: beazley
    Modified the lexer to provide optional support for Python -O and -OO
    modes. To make this work, Python *first* needs to be run in
    unoptimized mode. This reads the lexing information and creates a
    file "lextab.py". Then, run lex like this:

        # module foo.py
        ...
        ...
        lex.lex(optimize=1)

    Once the lextab file has been created, subsequent calls to
    lex.lex() will read data from the lextab file instead of using
    introspection. In optimized mode (-O, -OO) everything should
    work normally despite the loss of doc strings.

    To change the name of the file 'lextab.py' use the following:

        lex.lex(lextab="footab")

    (this creates a file footab.py)

Version 1.1   October 25, 2001
------------------------------

10/25/01: beazley
    Modified the table generator to produce much more compact data.
    This should greatly reduce the size of the parsetab.py[c] file.
    Caveat: the tables still need to be constructed so a little more
    work is done in parsetab on import.

10/25/01: beazley
    There may be a possible bug in the cycle detector that reports errors
    about infinite recursion. I'm having a little trouble tracking it
    down, but if you get this problem, you can disable the cycle
    detector as follows:

        yacc.yacc(check_recursion=0)

10/25/01: beazley
    Fixed a bug in lex.py that sometimes caused illegal characters to be
    reported incorrectly. Reported by Sverre Jørgensen.

7/8/01: beazley
    Added a reference to the underlying lexer object when tokens are handled by
    functions. The lexer is available as the 'lexer' attribute. This
    was added to provide better lexing support for languages such as Fortran
    where certain types of tokens can't be conveniently expressed as regular
    expressions (and where the tokenizing function may want to perform a
    little backtracking). Suggested by Pearu Peterson.

6/20/01: beazley
    Modified yacc() function so that an optional starting symbol can be specified.
    For example:

        yacc.yacc(start="statement")

    Normally yacc always treats the first production rule as the starting symbol.
    However, if you are debugging your grammar it may be useful to specify
    an alternative starting symbol. Idea suggested by Rich Salz.

Version 1.0   June 18, 2001
--------------------------
Initial public offering