Select a tag to browse associated projects and drill deeper into the tag cloud.
Weka is a collection of machine learning algorithms for solving real-world data mining problems. It is written in Java and runs on almost any platform. The algorithms can either be applied directly to a dataset or called from your own Java code.
claimed by Apache Software Foundation
Apache Mahout's goal is to build scalable machine learning libraries. With scalable we mean: Scalable to reasonably large data sets. Our core algorithms for clustering, classfication and batch based collaborative filtering are implemented on top of Apache Hadoop using the map/reduce paradigm.
RapidMiner (formerly YALE) is the most comprehensive open-source software for intelligent data analysis, data mining, knowledge discovery, machine learning, predictive analytics, forecasting, and analytics in business intelligence (BI). RapidMiner provides more than 400 data mining operators, a
Python module to ease pattern classification analyses of large datasets. It provides high-level abstraction of typical processing steps (e.g. data preparation, classification, feature selection, generalization testing), a number of implementations of some popular algorithms (e.g. kNN, Ridge
MonetDB is a open-source columnar database system for high-performance applications. It comes with a feature rich SQL interface, ready to perform analytical queries on large datasets with an unusual speed.
Orange is a component-based data mining software. It includes a range of data visualization, exploration, preprocessing and modelling techniques. It can be used through a nice and intuitive user interface or, for more advanced users, as a module for Python programming language.
Developed by the Knowledge Discovery Lab at UMass directed by Prof. Jensen, Proximity is tool for describing, exploring, manipulating, and modeling the interconnections of people, places, things, and events. It has a high-performance engine for storing very large datasets (tens of M of nodes and
BioMart is a query-oriented data management system developed jointly by the European Bioinformatics Institute (EBI) and Cold Spring Harbor Laboratory (CSHL). The system can be used with any type of data and comes with a range of query interfaces and administration tools, including 'out of the
Myrrix is a complete, real-time, scalable recommender system, evolved from Apache Mahoutâ„¢. Just as we take for granted easy access to powerful, economical storage and computing today, Myrrix will let you take for granted easy access to large-scale learning from data. The Serving Layer component
The Java Data Mining Package (JDMP) is an open source Java library for data analysis and machine learning. It facilitates the access to data sources and machine learning algorithms (e.g. clustering, regression, classification, graphical models, optimization) and provides visualization modules. It
Copyright
©
2013
Black Duck Software, Inc.
and its contributors, Some Rights Reserved. Unless otherwise marked, this work is licensed under a
Creative Commons Attribution 3.0 Unported License
. Ohloh
®
and the Ohloh logo are trademarks of
Black Duck Software, Inc.
in the United States and/or other jurisdictions. All other trademarks are the property of their respective holders.