NLTK — the Natural Language Toolkit — is a suite of open source Python modules, linguistic data and documentation for research and development in natural language processing, supporting dozens of NLP tasks, with distributions for Windows, Mac OSX and Linux.
The VISL Constraint Grammar Compiler is a natural language parser generator. It is an implementation of Pasi Tapanainen's CG-2 constraint grammar formalism. VISL CG-3 is feature-wise backwards compatible with CG-2 and VISLCG.
IceNLP is an open source Natural Language Processing (NLP) toolkit for analyzing and processing Icelandic text. The toolkit is implemented in Java.
NTextCat is a free language identification API for .NET (C#), with 280+ languages available out of the box. It recognizes the language and encoding (UTF-8, Windows-1252, Big5, etc.) of text and is Mono compatible.
TestEl is a Java-based learning analyzer for HTML (and possibly other) structured documents. It can be trained to detect structures in such documents and renders hits in XML.
Assorted Thai language processing tools. Very crude, proof-of-concept version for now (0.01). Dictionary: a Thai-to-English dictionary GUI with fuzzy romanized lookup. Type an approximate word in the Latin (English) alphabet and get the possible matches in Thai, along with the English translation. ...
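The fuzzy romanized lookup described above can be sketched roughly as follows. This is only an illustration, not the project's actual code: the tiny dictionary, the romanized spellings, the `fuzzy_lookup` name and the similarity cutoff are all assumptions, and the matching uses the `difflib` module from the Python standard library.

```python
from difflib import get_close_matches

# Hypothetical toy data: romanized spelling -> (Thai word, English gloss).
# A real dictionary would be far larger and loaded from a file.
TOY_DICT = {
    "sawatdi": ("สวัสดี", "hello"),
    "khopkhun": ("ขอบคุณ", "thank you"),
    "aroi": ("อร่อย", "delicious"),
    "nam": ("น้ำ", "water"),
}


def fuzzy_lookup(query, dictionary=TOY_DICT, limit=3, cutoff=0.6):
    """Return (romanized, thai, english) tuples whose romanized key
    is similar to the approximate Latin-alphabet query."""
    keys = get_close_matches(
        query.lower(), list(dictionary), n=limit, cutoff=cutoff
    )
    return [(key, *dictionary[key]) for key in keys]


if __name__ == "__main__":
    # An approximate spelling still finds the intended entry.
    for romanized, thai, english in fuzzy_lookup("sawadee"):
        print(romanized, thai, english)
```

Lowering the assumed `cutoff` returns more candidates at the cost of more false matches, which is the usual trade-off when users type approximate transliterations.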
Various code for natural language text processing, especially Thai text. memeswimmer is intended to be a replacement for "digiboard" (deployed on sites like http://siit.net/webboard/ and http://daytag.org/webboard/). Why Meme Swimmer? Meme (/miːm/) consists of any unit of cultural ...
Attempts to determine the natural language of a selection of Unicode (UTF-8) text. Based on guesslanguage.cpp by Jacob R Rideout for KDE, which is itself based on Language::Guess by Maciej Ceglowski. Detects over 60 languages; Greek (el), Korean (ko), Japanese (ja), Chinese (zh) and all the ...
The objective of the OSDT project is to develop novel methods for incremental statistical dependency parsing that improve on the state of the art, and to implement the methods in an open-source toolkit written in Java. The proposed system will be the first parsing system that uses error-guided repair ...
The fuverse is a project geared toward developing an open-ended system of interactive fiction. Fuverse will be a set of tools for creating simulations in which a player may dynamically participate, shaping their own experience. Fuverse goals: a framework which will allow for the creation of an ...