Projects tagged ‘nlp’


[188 total]

19 Users
 

NLTK, the Natural Language Toolkit, is a suite of open source Python modules, linguistic data and documentation for research and development in natural language processing, supporting dozens of NLP tasks, with distributions for Windows, Mac OS X and Linux.
Created over 3 years ago.

7 Users

Apertium is an open-source machine translation platform, originally aimed at related-language pairs and since expanded to deal with more divergent language pairs. The platform provides (1) a language-independent machine translation engine, (2) tools to manage the linguistic data necessary to build a machine translation system for a given language pair, and (3) linguistic data for a growing number of language pairs. Apertium uses a shallow-transfer machine translation engine which processes the input text in stages, as in an assembly line: de-formatting, morphological analysis, part-of-speech disambiguation, shallow structural transfer, lexical transfer, morphological generation, and re-formatting.
Created about 1 year ago.
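
The assembly-line design described for Apertium can be pictured as a chain of functions, each consuming the previous stage's output. The following is only a minimal sketch of that idea: the stage names mirror the list in the description, but `make_stage`, `run_pipeline`, and the placeholder behaviour are hypothetical and are not Apertium's actual API or data formats.

```python
from functools import reduce
from typing import Callable, List

Stage = Callable[[str], str]

# Stage names follow the order given in the project description.
STAGE_NAMES = [
    "deformat",
    "morph_analysis",
    "pos_disambiguation",
    "structural_transfer",
    "lexical_transfer",
    "morph_generation",
    "reformat",
]


def make_stage(name: str) -> Stage:
    """Build a placeholder stage (hypothetical): it only tags the text
    with its own name so the order of execution is visible."""
    def stage(text: str) -> str:
        return f"{text}|{name}"
    return stage


def run_pipeline(text: str, stages: List[Stage]) -> str:
    """Feed the text through every stage in order, like an assembly line."""
    return reduce(lambda acc, stage: stage(acc), stages, text)


PIPELINE = [make_stage(name) for name in STAGE_NAMES]

if __name__ == "__main__":
    print(run_pipeline("input text", PIPELINE))
```

A real engine would replace each placeholder with a component that reads and writes an agreed intermediate format; the point of the sketch is only that stages compose in a fixed order.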

7 Users

GATE (General Architecture for Text Engineering) is an architecture, framework and development environment for developing, evaluating and embedding Human Language Technology.
Created over 3 years ago.

4 Users

A generic, language-neutral framework for extending Ruby objects with linguistic methods.
Created over 2 years ago.

3 Users

SunPinyin is an open-source (CDDL/LGPLv2.1) Chinese Pinyin input method engine based on a statistical language model (SLM). It is currently available on IIIMF, SCIM, BeCJK, and Mac OS X (Leopard only).
Created 10 months ago.

3 Users

Project Overview: The S-Space Package is a collection of algorithms for building semantic spaces. These algorithms process text corpora and map semantic representations for words onto high-dimensional vectors. These approaches are known by many names, such as word spaces, semantic spaces, or distributed semantics. The research and development is being done by the Natural Language Processing group at UCLA led by David Jurgens and Keith Stevens, advised by Dr. Michael Dyer. See the Getting Started page for a quick introduction to using the S-Space package, or see the Package Overview for information on the code and available features.
Goal: Our first goal is to provide a uniform implementation of many common semantic space algorithms in order to facilitate research in semantic spaces and provide an accurate, reproducible way to compare different algorithms. Second, we aim to provide a comprehensive framework in which researchers can easily develop new algorithms without having to replicate much of the shared software. As a part of this, we have implemented a variety of text- and numeric-based utilities for interacting with matrices, vectors, parsers, clustering algorithms and SVD. Where possible, all libraries are implemented in Java for maximum portability. We also support a limited set of Java bindings to native libraries, primarily for high-performance SVD and clustering operations. For those looking to implement their own semantic space algorithm within the S-Space package, we recommend looking at the Introduction page.
Algorithms: For a list of the currently supported algorithms, please see our algorithms page. Further resources for each algorithm may also be found on the publications page.
Contact: For help using the S-Space Package, please contact our user mailing list: s-space-users@googlegroups.com. For questions about development, bug reports or other code-specific questions, please contact our development mailing list: s-space-research-dev@googlegroups.com. If you have further questions, please feel free to contact David Jurgens or Keith Stevens.
License and Restrictions: The S-Space software package is free software released under the GPL v. 2 license. See our license and restrictions page for full details.
Created 8 months ago.
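
To make the idea of mapping words onto high-dimensional vectors concrete, here is a minimal sketch of one simple kind of semantic space: a word-by-word co-occurrence count model compared with cosine similarity. This is an illustration of the general technique only; the function names `build_space` and `cosine` are hypothetical, the code is Python rather than Java, and it does not reflect the S-Space package's API or any particular algorithm it implements.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, List


def build_space(sentences: List[List[str]], window: int = 2) -> Dict[str, Counter]:
    """Map each word to a sparse vector of co-occurrence counts with the
    words that appear within `window` positions of it."""
    space: Dict[str, Counter] = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    space[word][tokens[j]] += 1
    return dict(space)


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    corpus = [["the", "cat", "sleeps"], ["the", "dog", "sleeps"]]
    space = build_space(corpus, window=1)
    # Words used in the same contexts end up with similar vectors.
    print(cosine(space["cat"], space["dog"]))
    print(cosine(space["cat"], space["the"]))
```

Words that occur in similar contexts receive similar vectors, which is the shared intuition behind the word-space approaches the description mentions.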

3 Users

Ruby-WordNet is a Ruby interface to the WordNet® Lexical Database. WordNet® is an online lexical reference system whose design is inspired by current psycholinguistic theories of human lexical memory. English nouns, verbs, adjectives and adverbs are organized into synonym sets, each representing one underlying lexical concept. Different relations link the synonym sets.
Created over 2 years ago.
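
The organization described above, synonym sets linked by named relations, can be sketched as a small data structure. The example below is a toy model written in Python with a handful of hand-written entries; the `Synset` class, the `SYNSETS` table, and both helper functions are hypothetical and are neither the Ruby-WordNet API nor data taken from the WordNet database.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Synset:
    """One lexical concept: a set of synonymous lemmas plus named
    relations pointing at other synsets."""
    name: str
    lemmas: Set[str]
    relations: Dict[str, List[str]] = field(default_factory=dict)


# Toy, hand-written entries for illustration only.
SYNSETS: Dict[str, Synset] = {
    "dog.n": Synset("dog.n", {"dog", "domestic dog"}, {"hypernym": ["canine.n"]}),
    "canine.n": Synset("canine.n", {"canine", "canid"}, {"hypernym": ["carnivore.n"]}),
    "carnivore.n": Synset("carnivore.n", {"carnivore"}),
}


def synsets_for(word: str) -> List[str]:
    """Return the names of all synsets that contain the given lemma."""
    return sorted(s.name for s in SYNSETS.values() if word in s.lemmas)


def hypernym_chain(name: str) -> List[str]:
    """Follow the first 'hypernym' relation upward until none is left."""
    chain: List[str] = []
    current = name
    while True:
        parents = SYNSETS[current].relations.get("hypernym", [])
        if not parents:
            return chain
        current = parents[0]
        chain.append(current)


if __name__ == "__main__":
    print(synsets_for("canid"))
    print(hypernym_chain("dog.n"))
```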

3 Users
 

bamboo is a Chinese natural language processing system. It currently includes Chinese word tokenization (segmentation), part-of-speech tagging and named entity recognition.
Created about 1 year ago.

3 Users
   

Mac users: please get the latest version from GitHub at http://github.com/lukhnos/openvanilla-oranje/downloads and see the 0.9.0a1 release announcement at http://github.com/lukhnos/openvanilla-oranje/blob/master/Documents/20090826-Announcement.markdown (the source tree for the Mac version has been moved to http://github.com/lukhnos/openvanilla-oranje). OpenVanilla (開放香草輸入法計劃) is an input method project. An input method is a system software program that people use to enter characters not found on their keyboards. The project includes a large collection of Traditional and Simplified Chinese input methods as well as support for Taiwanese, Japanese, Tibetan, and Unicode symbols, among others. Input method development used to be platform-dependent. OpenVanilla provides a lightweight API that simplifies the process. Because of its minimalistic design, OpenVanilla is highly portable. Input method modules developed with OpenVanilla's API run smoothly on Mac OS X, Linux/FreeBSD, and Windows. The project also explores the possibilities of input method design. Topics include better user interface design and use cases beyond Asian languages. It is an open source project, and participation is welcome!
Created about 1 year ago.

2 Users
 

Ellogon is a multi-lingual, cross-platform, general-purpose language engineering environment, developed to aid both researchers in computational linguistics and companies that produce and deliver language engineering systems. As a language engineering platform, Ellogon offers an extensive set of facilities, including tools for processing and visualising textual/HTML/XML data and associated linguistic information, support for lexical resources (such as creating and embedding lexicons), and tools for creating annotated corpora, accessing databases, comparing annotated data, and transforming linguistic information into vectors for use with various machine learning algorithms.
Created over 2 years ago.