Browsing projects by Tag(s)



NLTK, the Natural Language Toolkit, is a suite of open source Python modules, linguistic data, and documentation for research and development in natural language processing. It supports dozens of NLP tasks, with distributions for Windows, Mac OS X, and Linux.

5.0
 
  0 reviews  |  40 users  |  214,336 lines of code  |  43 current contributors  |  Analyzed 4 days ago
 
 

Apertium is an open-source machine translation platform, originally aimed at related-language pairs and later expanded to deal with more divergent language pairs. The platform provides (1) a language-independent machine translation engine, (2) tools to manage the linguistic data necessary to build a machine translation system for a given language pair, and (3) linguistic data for a growing number of language pairs. Apertium uses a shallow-transfer machine translation engine which processes the input text in stages, as in an assembly line: de-formatting, morphological analysis, part-of-speech disambiguation, shallow structural transfer, lexical transfer, morphological generation, and re-formatting.

5.0
 
  0 reviews  |  10 users  |  16,830,653 lines of code  |  41 current contributors  |  Analyzed 6 days ago
 
 

Apache OpenNLP is a Java machine learning toolkit for natural language processing (NLP).

5.0
 
  0 reviews  |  10 users  |  467,231 lines of code  |  5 current contributors  |  Analyzed 4 days ago
 
 

LanguageTool is an open source language checker for English, German, Polish, Dutch, and other languages. It is rule-based, i.e. it finds errors for which a rule is defined in its XML configuration files. Rules for more complicated errors can be written in Java.

4.0
   
  0 reviews  |  6 users  |  297,879 lines of code  |  15 current contributors  |  Analyzed about 12 hours ago
 
 

Treex (formerly TectoMT) is a highly modular NLP software system implemented in the Perl programming language under Linux. It is primarily aimed at machine translation, making use of the ideas and technology created during the Prague Dependency Treebank project. It is also intended to significantly facilitate and accelerate the development of software solutions for many other NLP tasks, especially through the reusability of its numerous integrated processing modules (called blocks), which are equipped with uniform object-oriented interfaces.

5.0
 
  0 reviews  |  3 users  |  266,984 lines of code  |  21 current contributors  |  Analyzed 9 days ago
 
 

RelEx is an English-language semantic relationship extractor, built on the Carnegie Mellon link parser. It can identify subject, object, indirect object, and many other relationships between words in a sentence. It can also provide part-of-speech tagging, noun-number tagging, verb tense tagging, gender tagging, and so on. RelEx includes a basic implementation of the Hobbs anaphora (pronoun) resolution algorithm. Optionally, it can use GATE for entity detection. RelEx also provides semantic relationship framing, similar to that of FrameNet.

0
 
  0 reviews  |  2 users  |  17,564 lines of code  |  1 current contributor  |  Analyzed 9 days ago
 
 

The Link Grammar Parser is a syntactic parser of English, based on link grammar, an original theory of English syntax. Given a sentence, the system assigns to it a syntactic structure, which consists of a set of labeled links connecting pairs of words. The parser also produces a "constituent" (Penn tree-bank style phrase tree) representation of a sentence (showing noun phrases, verb phrases, etc.).

0
 
  0 reviews  |  1 user  |  53,459 lines of code  |  1 current contributor  |  Analyzed 6 days ago
 
 

WebAnno: A Flexible, Web-based and Visually Supported System for Distributed Annotations

5.0
 
  0 reviews  |  1 user  |  41,205 lines of code  |  5 current contributors  |  Analyzed 3 days ago
 
 

webXcreta sucks down the latest entries for the currently most popular blogs on the Intarweb. It then parses each weblog entry using natural language processing (NLTK) and figures out what words are verbs, nouns, adjectives, definite articles, etc. Next, it creates weighted values based on how high-ranking each blog is (higher ranking blogs have a greater influence over sentence count, word order, and vocabulary). The reassembled bits get spit out and posted here.

5.0
 
  0 reviews  |  0 users  |  2,643 lines of code  |  0 current contributors  |  Analyzed 4 days ago
 
 

An engine for creating and annotating textual corpora

0
 
  0 reviews  |  0 users  |  21,352 lines of code  |  5 current contributors  |  Analyzed 3 days ago
 
 
 
 

Copyright © 2013 Black Duck Software, Inc. and its contributors, Some Rights Reserved. Unless otherwise marked, this work is licensed under a Creative Commons Attribution 3.0 Unported License. Ohloh® and the Ohloh logo are trademarks of Black Duck Software, Inc. in the United States and/or other jurisdictions. All other trademarks are the property of their respective holders.