Projects tagged ‘corpora’ and ‘tools’


[4 total]

20 Users
 

NLTK, the Natural Language Toolkit, is a suite of open source Python modules, linguistic data and documentation for research and development in natural language processing, supporting dozens of NLP tasks, with distributions for Windows, Mac OS X and Linux.
Created over 3 years ago.

1 Users
   

Affisix is a program for automatic recognition of affixes. It takes a large number of words and, according to the user's settings, tries to determine which segments of these words are prefixes.
Created over 2 years ago.

1 Users

CORSIS (formerly Tenka Text) is a performance-oriented, open-source library for corpus analysis. It uses typed assembly, task-specific compilers and parallelization to deliver the best performance with elegant design. The project's demonstration GUI comes with Wordlister, an advanced, extremely fast graphical wordlist tool, and a regex concordance tool. CORSIS: the open-source answer to WordSmith Tools.
Created over 3 years ago.

0 Users

CorpusCatcher is a corpus collection toolset. It helps you build language-specific or topic-specific corpora from publicly available web resources. This can be useful for many purposes, especially as data for building spell checkers.
Created about 1 year ago.