ANother Tool for Language Recognition (ANTLR) is a parser generator that uses LL(k) parsing. ANTLR is the successor to the Purdue Compiler Construction Tool Set (PCCTS), first developed in 1989, and is under active development. Its maintainer is Professor Terence Parr of the University of San Francisco.
NLTK, the Natural Language Toolkit, is a suite of open source Python modules, linguistic data and documentation for research and development in natural language processing. It supports dozens of NLP tasks and has distributions for Windows, Mac OS X and Linux.
Java Compiler Compiler (JavaCC) is the most popular parser generator for use with Java applications. A parser generator is a tool that reads a grammar specification and converts it to a Java program that can recognize matches to the grammar. In addition to the parser generator itself, JavaCC provides other ...
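To make "a program that can recognize matches to the grammar" concrete, here is a minimal hand-written sketch in Python of the kind of recognizer a parser generator would emit automatically. The toy grammar and the names `tokenize`, `Recognizer` and `recognizes` are assumptions made for illustration; none of this is JavaCC output or JavaCC syntax.

```python
import re

# Hypothetical toy grammar (an assumption, not taken from JavaCC):
#   expr := term (("+" | "-") term)*
#   term := NUMBER | "(" expr ")"
TOKEN = re.compile(r"\s*(\d+|[-+()])")


def tokenize(text):
    """Split text into tokens; raise ValueError on an unexpected character."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Recognizer:
    """Recursive-descent recognizer: one method per grammar rule."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, token):
        if self.peek() != token:
            raise ValueError(f"expected {token!r}, found {self.peek()!r}")
        self.pos += 1

    def expr(self):
        # expr := term (("+" | "-") term)*
        self.term()
        while self.peek() in ("+", "-"):
            self.pos += 1
            self.term()

    def term(self):
        # term := NUMBER | "(" expr ")"
        tok = self.peek()
        if tok is not None and tok.isdigit():
            self.pos += 1
        elif tok == "(":
            self.pos += 1
            self.expr()
            self.expect(")")
        else:
            raise ValueError(f"unexpected token {tok!r}")


def recognizes(text):
    """Return True if the whole input matches the toy grammar."""
    try:
        parser = Recognizer(tokenize(text))
        parser.expr()
        return parser.peek() is None
    except ValueError:
        return False
```

Calling `recognizes("1 + (2 - 3)")` accepts the input, while an incomplete input such as `"1 +"` is rejected. A generator saves the author from writing and maintaining this code by hand whenever the grammar changes.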
Spirit is an object-oriented, recursive descent parser generator framework implemented using template meta-programming techniques. Expression templates allow Spirit to approximate the syntax of Extended Backus-Naur Form (EBNF) completely in C++. The Spirit framework enables a target grammar to be ...
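The idea of approximating EBNF inside the host language can be sketched with operator overloading. The Python below is only an analogy under stated assumptions: it is not Spirit, it uses no templates, and the names `Parser`, `lit`, `char_in` and `star` are hypothetical. It borrows the convention of `>>` for sequence, `|` for alternative and unary `+` for one-or-more; Python cannot overload a unary `*`, so the Kleene star is a method here.

```python
class Parser:
    """Wraps a function (text, pos) -> new position, or None on failure."""

    def __init__(self, fn):
        self.fn = fn

    def __rshift__(self, other):
        # a >> b : match a, then b
        def seq(text, pos):
            mid = self.fn(text, pos)
            return None if mid is None else other.fn(text, mid)
        return Parser(seq)

    def __or__(self, other):
        # a | b : try a, fall back to b
        def alt(text, pos):
            end = self.fn(text, pos)
            return end if end is not None else other.fn(text, pos)
        return Parser(alt)

    def star(self):
        # zero or more repetitions; stops if no progress is made
        def many(text, pos):
            while True:
                nxt = self.fn(text, pos)
                if nxt is None or nxt == pos:
                    return pos
                pos = nxt
        return Parser(many)

    def __pos__(self):
        # +a : one or more repetitions
        return self >> self.star()

    def matches(self, text):
        """True if the parser consumes the entire input."""
        return self.fn(text, 0) == len(text)


def lit(s):
    """Match the literal string s."""
    return Parser(lambda text, pos: pos + len(s) if text.startswith(s, pos) else None)


def char_in(chars):
    """Match any single character from chars."""
    return Parser(
        lambda text, pos: pos + 1 if pos < len(text) and text[pos] in chars else None
    )


# The grammar reads close to EBNF:
#   integer  := digit+
#   int_list := integer ("," integer)*
digit = char_in("0123456789")
integer = +digit
int_list = integer >> (lit(",") >> integer).star()
```

With these definitions, `int_list.matches("1,22,333")` succeeds and `int_list.matches("1,,2")` fails. The grammar is ordinary host-language code, so no separate generation step is needed.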
Ragel compiles finite state machines from regular languages into executable C, C++, Objective-C, D, Java or Ruby code. Ragel state machines can not only recognize byte sequences as regular expression machines do, but can also execute code at arbitrary points in the recognition of a regular language. ...
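As an illustration of executing code at points in the recognition of a regular language, the following hand-written Python state machine recognizes `[+-]?[0-9]+` and runs a small action on each transition to build the integer value. It is a sketch of the concept only: it is not Ragel syntax or Ragel-generated code, and the function name `parse_int` and the state names are assumptions.

```python
DIGIT_CHARS = "0123456789"


def parse_int(text):
    """Recognize [+-]?[0-9]+ and return its value, or None if text does not match."""
    START, SIGNED, DIGITS = range(3)
    state, negative, value = START, False, 0

    for ch in text:
        if state == START and ch in "+-":
            # action on the sign transition: remember the sign
            negative = (ch == "-")
            state = SIGNED
        elif ch in DIGIT_CHARS:
            # action on every digit transition: accumulate the value
            value = value * 10 + (ord(ch) - ord("0"))
            state = DIGITS
        else:
            # no transition defined for this character in this state
            return None

    # DIGITS is the only accepting state
    if state != DIGITS:
        return None
    return -value if negative else value
```

A plain regular expression match would only report whether the input belongs to the language. Here the actions attached to transitions produce the result while the input is being recognized, so no second pass over the text is needed.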
Happy is a parser generator system for Haskell, similar to the tool 'yacc' for C. Like 'yacc', it takes a file containing an annotated BNF specification of a grammar and produces a Haskell module containing a parser for the grammar. Happy is flexible: you can have several Happy parsers ...
JFlex is a lexical analyzer generator (also known as a scanner generator) for Java. It is a fork of JLex and can read JLex files. JFlex is a flex-like lexer generator written in Java, with an emphasis on speed and full Unicode support. It has some less common features, such as negation in regular expressions and nested input streams.
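For readers unfamiliar with what a scanner does, here is a minimal regular-expression-based tokenizer in Python. It is an assumption-laden sketch of the general technique, not JFlex syntax or JFlex output; the token names and the `scan` function are hypothetical.

```python
import re

# Hypothetical token rules, tried in order at each position.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[-+*/=()]"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def scan(text):
    """Return a list of (token_name, lexeme) pairs, skipping whitespace."""
    tokens, pos = [], 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

A scanner generator takes rules of this kind and produces the matching loop ahead of time, typically as a compiled automaton rather than a regular expression evaluated at run time.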
Irony is a development kit for implementing languages on the .NET platform. Unlike most existing yacc/lex-style solutions, Irony does not employ any scanner or parser code generation from grammar specifications written in a specialized meta-language. In Irony the target language grammar is coded directly ...
Treex (formerly TectoMT) is a highly modular NLP software system implemented in the Perl programming language under Linux. It is primarily aimed at machine translation, making use of the ideas and technology created during the Prague Dependency Treebank project. At the same time, it is also hoped to ...
Provides an ANTLR plugin (including a grammar file editor with an outline page and a project nature with an incremental builder) for the Eclipse platform.