Browsing projects by Tag(s)



TreeTagger for Java is a Java wrapper around the popular TreeTagger package by Helmut Schmid. Written in Java 5, it was designed with a focus on platform independence and easy integration into applications, and it has been tested on OS X, Ubuntu Linux, and Windows.

5.0
 
  0 reviews  |  11 users  |  2,460 lines of code  |  1 current contributor  |  Analyzed 8 days ago
 
 

Hunpos is an open source reimplementation of TnT, the well-known part-of-speech tagger by Thorsten Brants.

Features

Free and open source, even for commercial use.

For languages with more complex morphologies, HMM tagging can be quite competitive with the current generation of learning algorithms, such as those applying SVM and CRF methods. A major advantage is that the training/tagging cycle is orders of magnitude faster than in more complex models.

Precision of tagging on unknown and unseen words was a major priority for us during the development of Hunpos.

Works smoothly with large tag sets. For example, in Hungarian, as in other highly inflecting languages, it is important to preserve detailed morphological information in the POS tags in order to provide useful clues for higher-level processing tasks. This leads to a significantly larger tagset than is common in English (744 tags here, as opposed to the 36 standardly used in Treebank work), but it does not degrade training and tagging performance, although a tagset of this size would make the training of non-generative models computationally expensive.

Effortless integration of knowledge from morphological analyzers and dictionaries into the best-path calculation.

Contextualized lexical probabilities with a context window of any size. Unlike traditional HMM models, Hunpos estimates emission (lexical) probabilities based on the current tag and the previous tags as well.

Hunpos is implemented in OCaml, a high-level language that supports a succinct, maintainable coding style. OCaml has a high-performance compiler that produces native code with speed comparable to C/C++ implementations.

0
 
  0 reviews  |  0 users  |  2,098 lines of code  |  0 current contributors  |  Analyzed 5 days ago
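
To illustrate what "contextualized lexical probabilities" in the Hunpos description could mean in practice, here is a minimal Python sketch. It is not taken from Hunpos (which is written in OCaml) and uses no smoothing or unknown-word handling; the toy corpus, the `START` marker, and the function names `train`, `emission_plain`, and `emission_contextual` are all assumptions made for this example. It contrasts a traditional emission estimate, P(word | tag), with one that also conditions on the previous tag, P(word | previous tag, tag), using a context window of one.

```python
from collections import Counter

# Hypothetical sentence-start marker, used as the "previous tag" of the first word.
START = "<s>"

# Toy tagged corpus (made-up data): each sentence is a list of (word, tag) pairs.
CORPUS = [
    [("the", "DET"), ("can", "NOUN"), ("rusts", "VERB")],
    [("they", "PRON"), ("can", "AUX"), ("swim", "VERB")],
    [("a", "DET"), ("can", "NOUN"), ("fell", "VERB")],
    [("old", "ADJ"), ("dog", "NOUN"), ("sleeps", "VERB")],
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
]


def train(corpus):
    """Count (tag, word) and (prev_tag, tag, word) events in a tagged corpus."""
    counts = {
        "tag_word": Counter(),  # (tag, word) -> count
        "tag": Counter(),       # tag -> count
        "ctx_word": Counter(),  # (prev_tag, tag, word) -> count
        "ctx": Counter(),       # (prev_tag, tag) -> count
    }
    for sentence in corpus:
        prev_tag = START
        for word, tag in sentence:
            counts["tag_word"][(tag, word)] += 1
            counts["tag"][tag] += 1
            counts["ctx_word"][(prev_tag, tag, word)] += 1
            counts["ctx"][(prev_tag, tag)] += 1
            prev_tag = tag
    return counts


def emission_plain(counts, word, tag):
    """Traditional HMM emission: relative frequency of word given the current tag."""
    total = counts["tag"][tag]
    return counts["tag_word"][(tag, word)] / total if total else 0.0


def emission_contextual(counts, word, tag, prev_tag):
    """Contextualized emission: relative frequency of word given the
    current tag and the previous tag (unsmoothed, window of one)."""
    total = counts["ctx"][(prev_tag, tag)]
    return counts["ctx_word"][(prev_tag, tag, word)] / total if total else 0.0


COUNTS = train(CORPUS)
print(emission_plain(COUNTS, "can", "NOUN"))
print(emission_contextual(COUNTS, "can", "NOUN", "DET"))
print(emission_contextual(COUNTS, "can", "NOUN", "ADJ"))
```

In this toy data the contextual estimate for "can" as a noun differs depending on whether the previous tag is a determiner or an adjective, while the plain estimate averages over both. A real tagger would need smoothing, since raw relative frequencies assign zero probability to any unseen context.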
 
 

Free and open source, even for commercial use.

0
 
  0 reviews  |  0 users  |  0 current contributors  |  Analyzed 5 days ago
 
 

Mulm is a state-of-the-art Hidden Markov Model toolkit.

0
 
  0 reviews  |  0 users  |  5,046 lines of code  |  2 current contributors  |  Analyzed 8 days ago
 
 
 
 

Copyright © 2013 Black Duck Software, Inc. and its contributors, Some Rights Reserved. Unless otherwise marked, this work is licensed under a Creative Commons Attribution 3.0 Unported License. Ohloh® and the Ohloh logo are trademarks of Black Duck Software, Inc. in the United States and/or other jurisdictions. All other trademarks are the property of their respective holders.