Browsing projects by Tag(s)

NLTK, the Natural Language Toolkit, is a suite of open source Python modules, linguistic data, and documentation for research and development in natural language processing. It supports dozens of NLP tasks, with distributions for Windows, Mac OS X, and Linux.

5.0
 
  0 reviews  |  40 users  |  214,336 lines of code  |  43 current contributors  |  Analyzed 4 days ago
 
 

The VISL Constraint Grammar Compiler is a natural language parser generator. It is an implementation of Pasi Tapanainen's CG-2 constraint grammar formalism. VISL CG-3 is feature-wise backwards compatible with CG-2 and VISLCG.

5.0
 
  0 reviews  |  2 users  |  27,885 lines of code  |  2 current contributors  |  Analyzed 7 days ago
 
 

IceNLP is an open source Natural Language Processing (NLP) toolkit for analyzing and processing Icelandic text. The toolkit is implemented in Java.

5.0
 
  0 reviews  |  2 users  |  31,322 lines of code  |  2 current contributors  |  Analyzed 1 day ago
 
 

NTextCat - free Language Identification API for .NET (C#): 280+ languages available out of the box. Recognizes language and encoding (UTF-8, Windows-1252, Big5, etc.) of text. Mono compatible.

5.0
 
  0 reviews  |  1 user  |  415,249 lines of code  |  2 current contributors  |  Analyzed 6 days ago
 
 

TestEl is a Java-based learning analyzer for HTML (and possibly other) structured documents. It can be trained to detect structures in such documents and renders hits in XML.

0
 
  0 reviews  |  1 user  |  7,595 lines of code  |  0 current contributors  |  Analyzed 9 days ago
 
 

Assorted Thai language processing tools. Very crude, proof-of-concept version for now (0.01). Dictionary: Thai-to-English dictionary GUI with fuzzy romanized lookup. Type an approximate word in the Latin (English) alphabet and get the possible matches in Thai, along with the English translation (LookupImprovementIdeas). Library functions: Word breaker (wrapping the Windows API word breaker for Thai), which outputs to the console. XDict, a dictionary class with indexes. StringMapping, a many-to-many mapping class with indexes both ways (e.g. mapping a word to its pronunciation). Parsers for a few free dictionary formats (which come in formats such as SQL or mangled XML), outputting an XDict XML file. Supports Lexitron, Teddy, and CU Thai Romanization 1.25. Lookup by romanized form (transform the input by given rules, then look it up in the hashtable).

0
 
  0 reviews  |  0 users  |  3,472 lines of code  |  0 current contributors  |  Analyzed 3 months ago
 
 

Various code for natural language text processing, especially Thai text. memeswimmer is intended to be a replacement for "digiboard" (deployed on sites like http://siit.net/webboard/ and http://daytag.org/webboard/). Why Meme Swimmer? A meme (/miːm/) consists of any unit of cultural information, such as a practice or idea, that gets transmitted verbally or by repeated action from one mind to another. A gene is biological; a meme is cultural. The gene pool is the diversity of life, biologically. The meme pool is the same, culturally. And we are all "meme swimmers".

0
 
  0 reviews  |  0 users  |  7,067 lines of code  |  0 current contributors  |  Analyzed 2 days ago
 
 

Attempts to determine the natural language of a selection of Unicode (UTF-8) text. Based on guesslanguage.cpp by Jacob R Rideout for KDE, which itself is based on Language::Guess by Maciej Ceglowski. Detects over 60 languages: Greek (el), Korean (ko), Japanese (ja), Chinese (zh), and all the languages listed in the trigrams directory. Code is available from svn.

0
 
  0 reviews  |  0 users  |  499 lines of code  |  0 current contributors  |  Analyzed 4 days ago
 
 

The objective of the OSDT project is to develop novel methods for incremental statistical dependency parsing that improve the state-of-the-art, and to implement the methods in an open-source toolkit written in Java. The proposed system will be the first parsing system that uses error-guided repair operations that can change the analyses destructively; and it will be the first dependency parser that is based on a sophisticated generative statistical dependency model that includes specific submodels for the generation of complements and adjuncts, non-projective word order and island constraints, secondary dependencies (in control constructions, relative clauses, elliptic coordinations, parasitic gaps), punctuation, and the time-course of partial analyses (where dependents may be waiting for their heads to occur in the input, and vice versa).

0
 
  0 reviews  |  0 users  |  66,448 lines of code  |  0 current contributors  |  Analyzed 2 days ago
 
 

The fuverse is a project geared toward developing an open-ended system of interactive fiction. Fuverse will be a set of tools for creating simulations in which a player may dynamically participate, shaping their own experience. Fuverse goals: A framework that will allow for the creation of an interactive world, initially allowing the creator to represent objects in 2D space with sprites and textual descriptions. A role-playing rule system will be integrated with the world to allow player interaction. The player will be able to interact with the world through context-sensitive text commands. All interaction will be governed by simple rules. The player will be able to define his or her own custom actions that will affect the world. Check out the updates for the latest release notes.

0
 
  0 reviews  |  0 users  |  18 lines of code  |  0 current contributors  |  Analyzed 10 days ago
 
 
 
 

Copyright © 2013 Black Duck Software, Inc. and its contributors, Some Rights Reserved. Unless otherwise marked, this work is licensed under a Creative Commons Attribution 3.0 Unported License. Ohloh ® and the Ohloh logo are trademarks of Black Duck Software, Inc. in the United States and/or other jurisdictions. All other trademarks are the property of their respective holders.