Browsing projects by Tag(s)



Weka is a collection of machine learning algorithms for solving real-world data mining problems. It is written in Java and runs on almost any platform. The algorithms can either be applied directly to a dataset or called from your own Java code.

4.13333
   
  0 reviews  |  36 users  |  535,883 lines of code  |  4 current contributors  |  Analyzed 5 days ago
 
 

Apache Mahout's goal is to build scalable machine learning libraries. By scalable we mean scalable to reasonably large data sets. Our core algorithms for clustering, classification and batch-based collaborative filtering are implemented on top of Apache Hadoop using the map/reduce paradigm. However, we do not restrict contributions to Hadoop-based implementations: contributions that run on a single node or on a non-Hadoop cluster are welcome as well. The core libraries are highly optimized to allow for good performance also for non-distributed algorithms.

4.25
   
  0 reviews  |  23 users  |  128,512 lines of code  |  14 current contributors  |  Analyzed 4 days ago
 
 

RapidMiner (formerly YALE) is the most comprehensive open-source software for intelligent data analysis, data mining, knowledge discovery, machine learning, predictive analytics, forecasting, and analytics in business intelligence (BI). RapidMiner provides more than 400 data mining operators, a graphical user interface (GUI), an online tutorial with hands-on data mining applications, a comprehensive PDF tutorial, many visualization schemes for data sets and data mining results, many different learning and meta-learning schemes ranging from decision tree and rule learners to neural networks, SVMs, ensemble methods, etc. RapidMiner is implemented in Java and available under GPL (GNU General Public License) as well as under a developer license (OEM license) for closed-source developers.

5.0
 
  0 reviews  |  17 users  |  534,895 lines of code  |  1 current contributor  |  Analyzed over 2 years ago
 
 

Python module to ease pattern classification analyses of large datasets. It provides high-level abstraction of typical processing steps (e.g. data preparation, classification, feature selection, generalization testing), a number of implementations of some popular algorithms (e.g. kNN, Ridge Regressions, Sparse Multinomial Logistic Regression, GPR, RFE, I-RELIEF), and bindings to external ML libraries (libsvm, shogun, R). While it is not limited to neuroimaging data (e.g. fMRI), it is eminently suited for such datasets.

5.0
 
  0 reviews  |  9 users  |  113,476 lines of code  |  7 current contributors  |  Analyzed 5 days ago
 
 

MonetDB is an open-source columnar database system for high-performance applications. It comes with a feature-rich SQL interface, ready to perform analytical queries on large datasets with unusual speed.

5.0
 
  0 reviews  |  7 users  |  985,739 lines of code  |  19 current contributors  |  Analyzed about 17 hours ago
 
 

Orange is component-based data mining software. It includes a range of data visualization, exploration, preprocessing and modelling techniques. It can be used through a nice and intuitive user interface or, for more advanced users, as a module for the Python programming language.

4.5
   
  1 review  |  6 users  |  285,223 lines of code  |  25 current contributors  |  Analyzed 5 days ago
 
 

Developed by the Knowledge Discovery Lab at UMass directed by Prof. Jensen, Proximity is a tool for describing, exploring, manipulating, and modeling the interconnections of people, places, things, and events. It has a high-performance engine for storing very large datasets (tens of millions of nodes and links), a visual query language for complex searches, and a set of powerful machine learning modules that build human-readable explanations of how the nodes interact and influence each other. Proximity has modeled the Hollywood world, studied how to prevent securities fraud, helped analysts in the intelligence community, modeled protein interactions in cells, described the behaviour of P2P networks, predicted the success of NFL coaches, and improved anonymization techniques in social networks.

0
 
  0 reviews  |  2 users  |  342,580 lines of code  |  0 current contributors  |  Analyzed over 1 year ago
 
 

BioMart is a query-oriented data management system developed jointly by the European Bioinformatics Institute (EBI) and Cold Spring Harbor Laboratory (CSHL). The system can be used with any type of data and comes with a range of query interfaces and administration tools, including an 'out of the box' website that can be installed, configured and customised according to requirements. The system simplifies creating and maintaining advanced query interfaces backed by a relational database, and it is particularly suited for providing 'data mining'-like searches of complex descriptive (e.g. biological) data. BioMart can work with existing data repositories, by converting them to the required BioMart format, as well as with newly created databases.

4.0
   
  0 reviews  |  2 users  |  106,855 lines of code  |  0 current contributors  |  Analyzed about 23 hours ago
 
 

Myrrix is a complete, real-time, scalable recommender system, evolved from Apache Mahout™. Just as we take for granted easy access to powerful, economical storage and computing today, Myrrix will let you take for granted easy access to large-scale learning from data. The Serving Layer component of Myrrix is open source, and can even function as a stand-alone system suitable for moderate scale. This project hosts the open source Serving Layer from Myrrix.

5.0
 
  0 reviews  |  2 users  |  18,745 lines of code  |  1 current contributor  |  Analyzed about 5 hours ago
 
 

The Java Data Mining Package (JDMP) is an open source Java library for data analysis and machine learning. It facilitates access to data sources and machine learning algorithms (e.g. clustering, regression, classification, graphical models, optimization) and provides visualization modules. It includes a matrix library for storing and processing any kind of data, with the ability to handle very large matrices even when they do not fit into memory. Import and export interfaces are provided for JDBC databases, TXT, CSV, Excel, Matlab, LaTeX, MTX, HTML, WAV, BMP and other file formats. JDMP provides a number of algorithms and tools, but also interfaces to other machine learning and data mining packages (Weka, LibSVM, Mallet, Lucene, Octave).

0
 
  0 reviews  |  1 user  |  41,328 lines of code  |  1 current contributor  |  Analyzed 3 days ago
 
 
 
 

Copyright © 2013 Black Duck Software, Inc. and its contributors, Some Rights Reserved. Unless otherwise marked, this work is licensed under a Creative Commons Attribution 3.0 Unported License. Ohloh® and the Ohloh logo are trademarks of Black Duck Software, Inc. in the United States and/or other jurisdictions. All other trademarks are the property of their respective holders.