CookCC is a combined lexer and LALR(1) parser generator. It is written in Java, but the target languages can vary.
CookCC comes with two unique features, which were the original motivations for this work.
- CookCC uses a unique approach to storing and loading DFA tables in Java that significantly reduces startup time. Much effort has gone into maximizing the performance of the generated Java lexer and parser, painstakingly fine-tuning the lexer and parser code line by line and case by case. I believe that CookCC is the fastest lexer for Java (see the performance test).
- CookCC allows lexer/parser patterns and rules to be specified using Java annotations. This feature greatly simplifies writing lexers and parsers in Java.

Additionally, CookCC can produce highly compressed DFA tables.
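To give a feel for what specifying lexer patterns through annotations means, here is a minimal, self-contained sketch. It is not CookCC's API and does not show how CookCC works internally: CookCC is a generator that produces table-driven code, whereas this sketch simply scans with `java.util.regex` and runtime reflection. The names `TokenPattern`, `CalcLexer`, and `tokenize` are hypothetical and exist only for this illustration.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AnnotatedLexerSketch {

    /** Hypothetical annotation for this sketch only; it is NOT CookCC's annotation. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    public @interface TokenPattern {
        String value();
    }

    /** Each lexer rule is an ordinary method tagged with the pattern it handles. */
    public static class CalcLexer {
        final List<String> tokens = new ArrayList<>();

        @TokenPattern("[0-9]+")
        public void number(String text) { tokens.add("NUM(" + text + ")"); }

        @TokenPattern("[a-zA-Z_][a-zA-Z_0-9]*")
        public void identifier(String text) { tokens.add("ID(" + text + ")"); }

        @TokenPattern("[-+*/()=]")
        public void operator(String text) { tokens.add("OP(" + text + ")"); }

        @TokenPattern("[ \\t\\r\\n]+")
        public void whitespace(String text) { /* whitespace is skipped */ }
    }

    /** Scans the input, dispatching the longest match at each position to its rule method. */
    public static List<String> tokenize(String input) {
        CalcLexer lexer = new CalcLexer();
        List<Method> actions = new ArrayList<>();
        List<Pattern> patterns = new ArrayList<>();
        // Collect every method that carries a @TokenPattern annotation.
        for (Method m : CalcLexer.class.getDeclaredMethods()) {
            TokenPattern tp = m.getAnnotation(TokenPattern.class);
            if (tp != null) {
                actions.add(m);
                patterns.add(Pattern.compile(tp.value()));
            }
        }
        int pos = 0;
        while (pos < input.length()) {
            int bestLen = 0;
            Method best = null;
            // The patterns above cannot tie (they start with disjoint characters),
            // so plain longest-match selection is deterministic here.
            for (int i = 0; i < patterns.size(); i++) {
                Matcher mt = patterns.get(i).matcher(input);
                mt.region(pos, input.length());
                if (mt.lookingAt() && mt.end() - pos > bestLen) {
                    bestLen = mt.end() - pos;
                    best = actions.get(i);
                }
            }
            if (best == null) {
                throw new IllegalArgumentException("No rule matches at offset " + pos);
            }
            try {
                best.invoke(lexer, input.substring(pos, pos + bestLen));
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(e);
            }
            pos += bestLen;
        }
        return lexer.tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("x1 = 42 + y"));
        // prints [ID(x1), OP(=), NUM(42), OP(+), ID(y)]
    }
}
```

The point of the sketch is only the authoring style: patterns sit next to the Java methods that handle them, so no separate grammar file is needed. A real generator would turn such declarations into DFA tables ahead of time rather than matching regular expressions at run time.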
CookCC requires JRE 1.5+ to run, but the generated Java code can be compiled and run with earlier versions of Java.
The current release is 0.3.3.
I am currently working on the island grammar on interfaces feature of 0.4, which is taking much longer than I anticipated.
Note: the BSD license only applies to CookCC itself. The code generated belongs to you.
Road Map for 0.5

- Add a re2c-like direct code generation option for the lexer, rather than only using table lookup (as of now). Possible mixed mode of execution to reduce table size (and code size) by reducing the number of states to be stored, and possibly using fewer equivalence classes. The performance gain for Java is questionable, but I have worked out a way of doing it.
- C and C++ code generation. This is more of a long-term goal because right now I do not need it. The performance of flex for C is extremely difficult to beat anyway.

What's New

0.4 (Upcoming Release)

- Added -extend option to set the parent class of the generated class.
- Updated the debugLexer, debugLexerBackup, and debugParser signatures so that it is actually meaningful to overload these debugging functions.
- Issue 20: allowed the start symbol to be specified in Java annotation input (by default it was the LHS of the first @Rule).
- Issue 19: allowed grammar on Java interfaces.
- Issue 18: added -generics option to generate Java code that uses generics.
- Issue 17: added optional / optional list / list grammar shortcuts.
- Possible tree generator (grammar on Java annotations only).

0.3.3

- Allowed the internal buffer to be automatically increased for long matches.
- Issue 14: added yyPushInput, yyPopInput, yyInputStackSize, yyWrap functions (and yywrap option).
- Issue 13: turn on backup lex state warning only when requested.
- Issue 12: added setBOL function to set the next token to be at BOL.
- Issue 11: yacc output does not have %start.
- Issue 10: yacc output fails on empty TokensDoc.

0.3.2

- Added yacc grammar input and output.
- Added yyPushLexerState and yyPopLexerState functions.
- Added line number information to the error messages for Java input.
- Added "parserprolog" section for the generated Java code.
- Issue 9: unable to handle '\'' terminals in the grammar.
- Issue 8: incorrect lalr item lookahead calculation. Now tested against bison using several major language grammars.
- Issue 7: disable APT compile for the CookCC Ant task to prevent class files from being generated.
- Issue 6: erroneous warning of unreachable pattern when a lex pattern is shared among multiple lex states.
- Issue 5: tag state attribute did not work.

0.3.1

- Added single quoted literal strings as lex patterns.

0.3

- Added input using Java annotation.
- Issue 2: multiple incomplete states can cause an internal lex error due to reassignment of the internal pattern case values.
- Issue 1: incorrectly generated parser if the start non-terminal is not specified.

0.2

- Added parser generator.

0.1

- Initial release. Only includes the lexer generator.