Select a tag to browse associated projects and drill deeper into the tag cloud.
dom4j is an easy to use, open source library for working with XML, XPath, and XSLT on the Java platform, using the Java Collections Framework, and with full support for DOM, SAX, and JAXP.
NLTK — the Natural Language Toolkit — is a suite of open source Python modules, linguistic data and documentation for research and development in natural language processing, supporting dozens of NLP tasks, with distributions for Windows, Mac OSX and Linux.
Rome is a set of Atom/RSS Java utilities that make it easy to work in Java with most syndication formats. Today it accepts all flavors of RSS (0.90, 0.91, 0.92, 0.93, 0.94, 1.0 and 2.0) and Atom 0.3 feeds. Rome includes a set of parsers and generators for the various flavors of feeds, as well as ... [More]
Java Compiler Compiler is the most popular parser generator for use with Java applications. A parser generator is a tool that reads a grammar specification and converts it to a Java program that can recognize matches to the grammar. In addition to the parser generator itself, JavaCC provides other ... [More]
args4j is a small Java class library that makes it easy to parse command line options/arguments in your CUI application.
This project is a modern C++ library with a focus on portability and program correctness. It strives to be easy to use right and hard to use wrong. Thus, it comes with extensive documentation and thorough debugging modes. The library provides a platform abstraction layer for common tasks such as ... [More]
Html Agility Pack is an agile HTML parser library that proposes a read/write DOM and supports plain XPATH or XSLT. It allows you to parse "out of the web" HTML files. The parser is very tolerant with "real world" malformed HTML. The object model is very similar to what proposes System.Xml, but for HTML documents (or streams).
Stratego/XT is a language and toolset for program transformation. The Stratego language provides rewrite rules for expressing basic transformations, programmable rewriting strategies for controlling the application of rules, concrete syntax for expressing the patterns of rules in the syntax of ... [More]