Browsing projects by Tag(s)


HPCC Systems ECL Machine Learning Library

5.0
 
  0 reviews  |  2 users  |  4,331 lines of code  |  9 current contributors  |  Analyzed 8 days ago
 
 

KeplerWeka adds the functionality of WEKA, the open-source machine learning and data mining workbench, to Kepler, the free and open-source scientific workflow application.

0
 
  0 reviews  |  2 users  |  21,240 lines of code  |  0 current contributors  |  Analyzed 5 days ago
 
 

The Yooreeka project is based on the code of the book "Algorithms of the Intelligent Web" (Manning 2009 -- http://www.manning.com/marmanis). Although the term "Web" prevailed in the title, in essence the algorithms are valuable in any software application. Our goal is to maintain that source code as an open source project that everyone can use in their own projects. For that purpose, we release the Yooreeka library under the LGPL license. The library is written 100% in the Java language.

5.0
 
  0 reviews  |  1 user  |  60,704 lines of code  |  2 current contributors  |  Analyzed about 16 hours ago
 
 

Grid-Soccer Simulator is a multi-agent soccer simulator in a grid-world environment. The environment provides a test-bed for machine learning and control algorithms, especially multi-agent reinforcement learning.

0
 
  0 reviews  |  1 user  |  46,714 lines of code  |  1 current contributor  |  Analyzed 8 days ago
 
 

A collection of algorithms from my thesis work. This code is mostly not production-ready or documented, and it is under heavy development. Please contact me if you have specific questions. Thanks, Gabe. Cantor Encoder: a variant of Cantor coding. A backpropagation algorithm that allows for the use of multiplicative units. Mori: an evolutionary reinforcement algorithm to optimize real-valued vector representable entities, with specific focus on highly partitioned to pseudo-fractal error landscapes in recurrent neural networks. (This project is named after priska, in honor of her telling me forcefully to set up a version-management system.)

0
 
  0 reviews  |  1 user  |  3,316 lines of code  |  0 current contributors  |  Analyzed 1 day ago
 
 

The OpenRecommender project was started on (Canada Day) July 1st, 2008 with the mission of creating the world’s leading Free and Open Source Recommendation Engine. The goal is that many people, from many different fields, will come together in using and improving the underlying technologies of the Recommender System.

0
 
  0 reviews  |  1 user  |  0 current contributors  |  Analyzed over 1 year ago
 
 

Ensemble and tree learning library for Python.

0
 
  0 reviews  |  1 user  |  1,625 lines of code  |  0 current contributors  |  Analyzed about 7 hours ago
 
 

Wikipedia says: sentence boundary disambiguation (SBD) is the problem in natural language processing of deciding where sentences begin and end. Detecting sentence boundaries is one of the most important functions in natural language processing (NLP), like morphological analysis and part-of-speech tagging. We usually use delimiters or punctuation to segment a phrase or document, but the problem is the accuracy of the detected sentence boundaries. So I would like to create two types of sentence boundary detector (SBD): a rule-based SBD and a machine-learning-based SBD. General Information; Running environment; Pre-requirement; Install/Setup; install maxent; install weka; install sentence boundary detector (for python); Data/Corpus; QuickStart; User Documentation; Development Documentation; Roadmap; Development Environment; Related Resources.
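To make the rule-based variant concrete, here is a minimal sketch in Python. It is not this project's code: the abbreviation list, the `split_sentences` name, and the two rules (skip known abbreviations; require the next sentence to start with an uppercase letter, digit, or opening quote) are all assumptions chosen for illustration.

```python
import re

# Hypothetical abbreviation list; a real detector would need a far larger one.
ABBREVIATIONS = {"dr", "mr", "mrs", "ms", "prof", "e.g", "i.e", "etc", "vs"}

# A candidate boundary is '.', '!' or '?' followed by whitespace.
CANDIDATE = re.compile(r"[.!?]\s+")


def split_sentences(text):
    """Split text into sentences using two simple hand-written rules."""
    sentences = []
    start = 0
    for match in CANDIDATE.finditer(text):
        end = match.start() + 1  # keep the punctuation mark
        words = text[start:end].split()
        last_word = words[-1] if words else ""
        token = last_word.rstrip(".!?").lower()
        next_char = text[match.end():match.end() + 1]
        # Rule 1: a period right after a known abbreviation is not a boundary.
        if text[match.start()] == "." and token in ABBREVIATIONS:
            continue
        # Rule 2: the next sentence must start with an uppercase letter,
        # a digit, or an opening quote or parenthesis.
        if next_char and not (
            next_char.isupper() or next_char.isdigit() or next_char in "\"'("
        ):
            continue
        sentences.append(text[start:end].strip())
        start = match.end()
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences
```

Rules like these are cheap but brittle (for example, an abbreviation that really does end a sentence is missed), which is the accuracy problem the description mentions and the motivation for also building a machine-learning-based detector.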

0
 
  0 reviews  |  0 users  |  72,721 lines of code  |  0 current contributors  |  Analyzed 1 day ago
 
 

MaLeCoLi (MAchine LEarning in COmmon LIsp) is a framework for Machine Learning in Common Lisp.

0
 
  0 reviews  |  0 users  |  12,815 lines of code  |  0 current contributors  |  Analyzed 7 days ago
 
 

Overview

PyRSVD provides an efficient Python implementation of a regularized singular value decomposition solver. The module is primarily aimed at applications in collaborative filtering, in particular the Netflix competition.

Matrix Factorization

The solver is used to compute a low-rank approximation of a rating matrix R, which is usually a large partial matrix (i.e. it has many missing values). The goodness of the approximation is measured in terms of the Frobenius norm with respect to the known ratings. Minimizing the Frobenius norm between the rating matrix R and the factorization is equivalent to minimizing the squared error. Due to the huge number of parameters, overfitting is a serious problem. It is avoided by adding a regularization term to the squared error function, which penalizes large parameters. The solver uses stochastic gradient descent to minimize this regularized error function.

Matrix approximation has been applied very successfully in collaborative filtering. The factors reveal some of the latent structure in the rating data, which is subsequently used to predict user preferences. The factorization produced by the solver can be used directly to predict ratings or as a preprocessing step, e.g. to represent each user by a vector of latent factors he or she is interested in.

Dependencies

The Python module makes heavy use of numpy. The critical sections are written in cython. Although the module does not depend on pyflix, it nicely integrates into the pythonic Netflix library.

Performance

The runtime of the algorithm depends on two parameters: a) the number of latent factors k and b) the number of epochs. The table below shows the performance of the algorithm on the training set of the Netflix data. The table shows the average time per epoch as well as the RMSE of the final model on the training and probe set. The baseline of Netflix (CineMatch) scores 0.9474 on the probe set (lower is better).

    Factors  Epochs  sec/Epoch  Train RMSE  Probe RMSE
    10       100     34         0.8125      0.9260
    64       100     149        0.7785      0.9165
    128      104     190        0.694209    0.907149
    256      106     350        0.660420    0.905564

[Plot: learning curve of a model with 256 factors (learn rate=0.001, regularization=0.011). The probe RMSE is plotted against the number of training epochs. The ticks on the right mark the performance of a simple movie average predictor, CineMatch, and the qualification RMSE for the Grand Prize.]

Installation

To install the module simply run:

    python setup.py install

If you modify rsvd.pyx you have to run the cython compiler. Cython will create the file rsvd.c. The reference C compiler for the project is GCC 4.2.3. To avoid structure padding you have to invoke ./instrument.py, which adds __attribute__ ((__packed__)) to the Rating struct.

    cython rsvd.pyx
    ./instrument.py rsvd.c
    python setup.py install

Usage

To train a model, simply use the RSVD.train classmethod:

    import numpy as np
    from rsvd import RSVD, rating_t
    ratings = np.fromfile('training.arr', dtype=rating_t)
    probeRatings = np.fromfile('probe.arr', dtype=rating_t)
    model = RSVD.train(10, ratings, (17770, 480189), probeRatings)
    # predict r_ij, the rating of user j and movie i
    model(i, j)

For more information on the arguments of RSVD.train type help(RSVD.train). The rating data is assumed to be stored in a numpy record array. Each record is a triple (movieID,userID,rating) where movieID is a uint16, userID is a uint32 and rating is a uint8 float (see rating_t). Furthermore, it is assumed that the movie ids start from 1 whereas the user ids start from 0 (missing user and movie ids are not permitted). So for the Netflix dataset you can leave the movie ids as they are, but you have to map the user ids to the interval 0 to 480189. You can also use the rsvd_train shell script to train a model. For more information type rsvd_train --help.

    $ ./rsvd_train -f 10 -l 0.001 -r 0.02 --probe data/probe.arr data/training.arr 17770 480189 models/t_10_001_02_100

Further Information

PyRSVD trains the factors simultaneously; other approaches train one factor at a time. For further information on factor-at-a-time approaches see:

    LingPipe's SVDMatrix
    Timely Development
    Simon Funk SVD
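The regularized factorization described above can be illustrated with a small NumPy sketch. This is not PyRSVD's cython implementation. It assumes the commonly used form of the objective, the sum over known ratings of (r_ij - u_i . v_j)^2 + reg * (|u_i|^2 + |v_j|^2), where u_i and v_j are the k-dimensional factor vectors of movie i and user j. The function names `train_rsvd` and `rmse`, the toy data, and the 0-based ids for both movies and users are assumptions of this sketch.

```python
import numpy as np


def train_rsvd(ratings, shape, k=2, epochs=500, learn_rate=0.05, reg=0.02, seed=0):
    """Fit movie and user factors by SGD on the regularized squared error.

    ratings: list of (movie_id, user_id, rating) triples with 0-based ids
             (an assumption of this sketch, not PyRSVD's convention).
    shape:   (number of movies, number of users).
    """
    n_movies, n_users = shape
    rng = np.random.default_rng(seed)
    U = rng.normal(0.0, 0.1, size=(n_movies, k))  # movie factors
    V = rng.normal(0.0, 0.1, size=(n_users, k))   # user factors
    for _ in range(epochs):
        for i, j, r in ratings:
            err = r - U[i] @ V[j]
            u_old = U[i].copy()
            # Gradient step on all k factors at once; the reg term
            # shrinks the factors toward zero to limit overfitting.
            U[i] += learn_rate * (err * V[j] - reg * U[i])
            V[j] += learn_rate * (err * u_old - reg * V[j])
    return U, V


def rmse(ratings, U, V):
    """Root mean squared error of the factorization on the given ratings."""
    errs = [r - U[i] @ V[j] for i, j, r in ratings]
    return float(np.sqrt(np.mean(np.square(errs))))


# Toy data: 3 movies, 4 users, 8 known ratings (made up for illustration).
toy = [(0, 0, 5), (0, 1, 4), (1, 0, 4), (1, 2, 1),
       (2, 1, 2), (2, 3, 5), (0, 3, 3), (1, 3, 2)]
U, V = train_rsvd(toy, (3, 4))
print("training RMSE:", rmse(toy, U, V))
print("predicted rating of user 2 for movie 0:", U[0] @ V[2])
```

Updating every component of both factor vectors in a single step corresponds to training the factors simultaneously, in contrast to the factor-at-a-time approaches listed under Further Information.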

0
 
  0 reviews  |  0 users  |  6,600 lines of code  |  0 current contributors  |  Analyzed 5 days ago
 
 
 
 

Creative Commons License Copyright © 2013 Black Duck Software, Inc. and its contributors, Some Rights Reserved. Unless otherwise marked, this work is licensed under a Creative Commons Attribution 3.0 Unported License . Ohloh ® and the Ohloh logo are trademarks of Black Duck Software, Inc. in the United States and/or other jurisdictions. All other trademarks are the property of their respective holders.