CookCC is a combined lexer and LALR(1) parser generator. It is written in Java, but the target languages can vary.
CookCC comes with two unique features, which were the original
motivations for this work.
First, CookCC uses a unique approach to storing and loading DFA tables in Java that significantly reduces startup time. Much effort has gone into maximizing the performance of the generated Java lexer and parser, painstakingly fine-tuning the lexer and parser code line by line and case by case. I believe that CookCC is the fastest lexer for Java (see the performance test).
Second, CookCC allows lexer/parser patterns and rules to be specified using Java annotations. This feature greatly simplifies writing lexers and parsers for Java.
CookCC requires JRE 1.5+ to run, but the generated Java code can be compiled and run with earlier versions of Java.
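To give a rough feel for what specifying lexer patterns through Java annotations can look like, here is a minimal, self-contained sketch. It does not use CookCC's own annotations or generated code: the `TokenPattern` annotation, the `AnnotatedLexerSketch` class, and the `tokenize` method are all hypothetical names invented for illustration. The sketch also interprets the annotations at run time with reflection and `java.util.regex`, whereas CookCC generates table-driven lexer code ahead of time, so it says nothing about CookCC's performance.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AnnotatedLexerSketch {

    /** Hypothetical annotation (NOT CookCC's API): binds a regex to a handler method. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface TokenPattern {
        String value();   // regular expression for the token
        int priority();   // lower value wins when two rules match the same length
    }

    private final List<String> tokens = new ArrayList<>();

    @TokenPattern(value = "[0-9]+", priority = 0)
    void number(String text) { tokens.add("NUMBER(" + text + ")"); }

    @TokenPattern(value = "[A-Za-z_][A-Za-z0-9_]*", priority = 1)
    void identifier(String text) { tokens.add("IDENT(" + text + ")"); }

    @TokenPattern(value = "\\s+", priority = 2)
    void whitespace(String text) { /* skipped: produces no token */ }

    /** Scans the input, calling the handler of the longest-matching rule at each position. */
    static List<String> tokenize(String input) {
        AnnotatedLexerSketch lexer = new AnnotatedLexerSketch();

        // Collect the annotated handler methods and order them by priority.
        List<Method> rules = new ArrayList<>();
        for (Method m : AnnotatedLexerSketch.class.getDeclaredMethods()) {
            if (m.isAnnotationPresent(TokenPattern.class)) {
                rules.add(m);
            }
        }
        rules.sort(Comparator.comparingInt(
                (Method m) -> m.getAnnotation(TokenPattern.class).priority()));

        int pos = 0;
        while (pos < input.length()) {
            Method best = null;
            int bestEnd = pos;
            for (Method m : rules) {
                Matcher matcher = Pattern
                        .compile(m.getAnnotation(TokenPattern.class).value())
                        .matcher(input);
                matcher.region(pos, input.length());
                // Strict '>' keeps the earlier (higher-priority) rule on ties
                // and ignores zero-length matches.
                if (matcher.lookingAt() && matcher.end() > bestEnd) {
                    best = m;
                    bestEnd = matcher.end();
                }
            }
            if (best == null) {
                throw new IllegalArgumentException("No rule matches at offset " + pos);
            }
            try {
                best.invoke(lexer, input.substring(pos, bestEnd));
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(e);
            }
            pos = bestEnd;
        }
        return lexer.tokens;
    }

    public static void main(String[] args) {
        // Prints [IDENT(x1), NUMBER(42), IDENT(foo)]
        System.out.println(tokenize("x1 42 foo"));
    }
}
```

The point of the sketch is only the shape of the idea: each pattern sits next to the Java method that handles it, so the lexer specification and its actions live in one ordinary Java source file.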
The current release is 0.3.3.
What's New

0.4 (Upcoming Release)
- Issue 20: allow the start symbol to be specified in Java annotation input (by default it was the LHS of the first @Rule).
- Issue 19: allow grammar on Java interfaces.
- Issue 18: add -generics option to generate Java code that uses generics.
- Issue 17: add optional / optional list / list grammar shortcuts.
- Possible tree generator (grammar on Java annotations only).

0.3.3
- Allowed the internal buffer to be automatically increased for long matches.
- Issue 14: added yyPushInput, yyPopInput, yyInputStackSize, yyWrap functions (and yywrap option).
- Issue 13: turn on backup lex state warning only when requested.
- Issue 12: added setBOL function to set the next token to be at BOL.
- Issue 11: yacc output does not have %start.
- Issue 10: yacc output fails on empty TokensDoc.

0.3.2
- Added yacc grammar input and output.
- Added yyPushLexerState and yyPopLexerState functions.
- Added line number information to the error messages for Java input.
- Added "parserprolog" section for the generated Java code.
- Issue 9: unable to handle '\'' terminals in the grammar.
- Issue 8: incorrect LALR item lookahead calculation. Now tested against bison using several major language grammars.
- Issue 7: disable APT compile for the CookCC Ant task to prevent class files from being generated.
- Issue 6: erroneous warning of unreachable pattern when a lex pattern is shared among multiple lex states.
- Issue 5: tag state attribute did not work.

0.3.1
- Added single-quoted literal strings as lex patterns.

0.3
- Added input using Java annotations.
- Issue 2: multiple incomplete states can cause an internal lex error due to reassignment of the internal pattern case values.
- Issue 1: incorrectly generated parser if the start non-terminal is not specified.

0.2
- Added parser generator.

0.1
- Initial release. Only includes the lexer generator.