  |
Data Access Tools from the US Census Bureau - http://www.census.gov/main/www/access.html
General-purpose data display and extraction tools that work with Census Bureau data. Census data is available for pickup through Census Bureau employees only. [Free] |
  |
Pfam: Database of Protein Families and HMMs - http://pfam.janelia.org/
A large collection of multiple sequence alignments and trained hidden Markov models covering many common protein domains. Alignments are included as well as models for 8296 protein families, based on the Swissprot 48.9 and SP-TrEMBL 31.9 protein sequence databases. [GPL] |
  |
OpenCyc - http://www.opencyc.org/
Open source version of the Cyc technology, the world's largest and most complete general knowledge base and commonsense reasoning engine. Can be used as the basis of a wide variety of intelligent applications, including rapid development of an ontology in a vertical area; email prioritizing, routing, summarization, and annotation; expert systems; and game engine development. [GPL] |
  |
HMMER: Biosequence Analysis - http://hmmer.janelia.org/
A tool used to build profile HMMs from multiple sequence alignments and to calculate E-values for database matches. [GPL] |
  |
UCI Machine Learning - http://www.ics.uci.edu/~mlearn/MLRepository.html
A repository of databases, domain theories and data generators that are used by the machine learning community for the empirical analysis of machine learning algorithms at the University of California at Irvine. [Free] |
  |
WinBUGS: The BUGS Project - http://www.mrc-bsu.cam.ac.uk/bugs/winbugs/contents.shtml
A stand-alone program that makes practical MCMC methods available to applied statisticians. The analysis can be controlled through a point-and-click interface, or the model can be constructed through a graphical interface. The BUGS project also includes links to GeoBUGS for spatial analysis, PKBUGS for pharmacokinetic modelling, and OpenBUGS for the latest developments. [GPL] |
  |
Weka 3: Data Mining Software - http://www.cs.waikato.ac.nz/~ml/weka/index.html
A collection of tools that implement decision trees and tables, rule learners, Naive Bayes, support vector machines, voted perceptrons, and multi-layer perceptrons. Meta schemes include bagging, stacking, and boosting. Written in Java. [GPL] |
  |
The Torch Machine Learning Library - http://www.torch.ch
This package forms a complete gradient descent machine learning library. Modules include support vector machines for classification and regression, ensemble models such as bagging and AdaBoost, and non-parametric models such as K-nearest neighbors, Parzen regression, and Parzen density estimation. Includes speech recognition tools. Written in C++. [BSD] |
  |
GRIDLOCK: A Scalable Approach to Unifying Computer and Communications Security - http://nsl.cs.columbia.edu/projects/gridlock/
Software for specifying network access control policy, based on a globally specified and locally interpreted policy language. [GPL] |
  |
Open Source Computer Vision Library - http://www.intel.com/technology/computing/opencv/index.htm
Open source library of computer vision and image processing routines, originally developed by Intel. [BSD] |
  |
Rapid Miner - http://rapid-i.com/
The Rapid Miner toolset is an environment for machine learning built on nested operators. Experiments can be nested arbitrarily through a graphical, XML-based user interface. (Formerly YALE.) [GPL] |
  |
MEME/MAST: Motif Discovery and Search - http://meme.sdsc.edu/meme/intro.html
A software package to discover motifs (highly conserved regions) in groups of related DNA or protein sequences and to search sequence databases using those motifs. [Commercial] |
  |
MALLET: Advanced Machine Learning for Language - http://mallet.cs.umass.edu/index.php/Main_Page
An integrated collection of Java code useful for statistical natural language processing, document classification, clustering, information extraction, and other machine learning applications to text. [GPL] |
  |
LingPipe: Natural Language Processor (NLP) - http://www.alias-i.com/lingpipe/
A suite of Java libraries for the linguistic analysis of human language which can link entity mentions to database entries, uncover relations, cluster documents, and discover significant trends. [GPL] |
  |
Meta-MEME: Motif-based Hidden Markov Modeling of Biological Sequences - http://metameme.sdsc.edu/
Software toolkit for building and using motif-based hidden Markov models of DNA and proteins. There is an online interactive version. Source written in C. [GPL] |
  |
Tree Visualizer - http://www.sgi.com/tech/mlc/trees.html
Software that lets you navigate (fly) through the data tree, zoom in on interesting nodes, click on bars to get counts, and mark interesting places in the tree. Includes datasets for automobiles, voting, produce, and medical research. Uses LEDA (licensed under [AFL] only). [GPL] |
  |
Pattern Matching Pointers - http://www.cs.ucr.edu/~stelo/pattern.html
Pointers to algorithms for searching and matching strings and more complicated patterns such as trees, regular expressions, graphs, point sets, and arrays. [GPL] |
  |
GAlib: Matthew's Genetic Algorithms Library - http://lancet.mit.edu/ga/
A toolset of genetic algorithm objects for C++ to perform optimization. Works with any representation and any genetic operators. The documentation covers implementation and includes examples. Nice screenshots. Supports PVM for distributed, parallel implementations. Includes graphic examples that use the Athena or Motif widget sets, or MFC. [BSD] |
  |
WebMO - http://www.webmo.net/
Web-based interface to computational chemistry. Has support for Gaussian 94/98/03, GAMESS, MolPro 2002, MOPAC 7/93/200x, NWChem 4.6+, QChem 2.1+, and Tinker 4.2+. Unix or Linux based. [Free] |
  |
Software Packages and Toolboxes - http://www.isp.pitt.edu/information/software.html
Online software repository of the Department of Computer Science at the University of Pittsburgh. Everything from expert systems, finite-state machines, graphical models, linear programming, and machine learning through Turing machines is covered here, and all of it is downloadable. Multiple platforms. [Free] |
  |
ORANGE: Inter-active Machine Learning Data Mining components - http://www.ailab.si/orange
A component-based framework for data input/output, preprocessing, predictive modelling, ensemble methods, and model validation. [GPL] |
  |
Bayes Net Toolbox for Matlab - http://bnt.sourceforge.net/
Supports several inference algorithms and learning algorithms. Allows simulation of static and dynamic networks, including HMMs, IOHMMs, and Kalman filters. [GPL] |
  |
Intelligent Software Agent Projects at Sourceforge.net - http://sourceforge.net/softwaremap/trove_list.php?form_cat=591
A large collection of intelligent agent projects complete with source code. [GPL] |
  |
Spider: General Purpose Machine Learning Toolbox in Matlab - http://www.kyb.tuebingen.mpg.de/bs/people/spider/index.html
An object-oriented environment for machine learning in Matlab. Algorithms can be plugged together and compared using, e.g., model selection, statistical tests, and visual plots. Algorithms may be downloaded separately. [GPL] |
  |
Machine Learning and Inference Laboratory - http://www.mli.gmu.edu/msoftware.html
Downloadable group of programs comprising the EMERALD machine learning system (Experimental Machine Example-based Reasoning and Learning Disciple). It integrates five modules (or 'robots'), each displaying a capability for machine learning, visualization, or inference. [LGPL] |
  |
VIBES: Variational Inference for Bayesian Networks - http://vibes.sourceforge.net/
A software package that allows variational inference to be performed automatically on a Bayesian network. [GPL] |
  |
UMDHMM and other statistical programs - http://www.kanungo.com/software/software.html
This tool implements hidden Markov models and their application to part-of-speech tagging. Also available: multivariate hypothesis testing software for Gaussian data, and a groundtruth/metadata editing and visualization toolkit for OCR. [GPL] |
  |
C4.5 and FOIL - http://www.rulequest.com/Personal/
The home page of R. Quinlan with FTP links to FOIL (inductive logic programming) and C4.5 (learning decision trees). [LGPL] |
  |
HRE API: A Portable Handwriting Recognition Engine - http://playground.sun.com/pub/multimedia/handwriting/hre.html
This engine is a functionally complete interface for handwriting recognition. API was written in ANSI C and has minimal reliance on the Windows system. There is a version ported to Linux. [GPL] |
  |
BETSY: A Bayesian Essay Test Scoring System - http://edres.org/betsy
A Windows-based program that classifies text based on trained material. Designed for automated essay scoring, BETSY can be applied to any text classification task. [GPL] |
  |
LibML - http://savannah.nongnu.org/projects/libml/
A machine learning library. New implementations of various machine learning algorithms. [GPL] |
  |
Maximum Entropy Modeling Toolkit - http://homepages.inf.ed.ac.uk/s0450736/maxent_toolkit.html
A library of tools for constructing maximum entropy (maxent) models in either Python or C++. Features include L-BFGS and GIS parameter estimation and Gaussian prior smoothing. [GPL] |
  |
TIMBL: Tilburg Memory Based Learner - http://ilk.uvt.nl/software.html
A program implementing several memory-based learning techniques. These learners store a representation of the training set explicitly and classify new cases by extrapolation from the most similar stored cases. [AFL] |
  |
Software Packages for Graphical Models/Bayesian Networks - http://www.ai.mit.edu/~murphyk/Software/bnsoft.html
Tools for modeling graphs and Bayesian networks. Scroll down. [AFL], [Commercial] |
  |
WinMine Toolkit - http://research.microsoft.com/~dmax/WinMine/tooldoc.htm
A set of tools for Windows 2000/NT/XP that allow you to build statistical models from data. [Free] |
  |
ArrayMiner - ClassMarker - http://www.optimaldesign.com/ArrayMiner/ClassMarker.htm
Programmatically isolate similarities between scattered classes of genes. Expression driven. Utilizes a voting method along with a k-Nearest-Neighbors classification. Very rich graphical interface. Samples of an unknown class are possible given enough data. Fully functional demo. [Commercial] |
  |
BNet EngineKit - http://www.cra.com/commercial-products-services/bnet-engine-kit.asp
A developer toolkit for researchers and engineers to embed belief networks in software applications. Nice online demo. [Commercial] |
  |
SNoW: Sparse Network of Winnows - http://l2r.cs.uiuc.edu/~danr/snow.html
A learning architecture specifically tailored for learning in very high-dimensional feature spaces. The current release uses sparse variations of Winnow, Perceptron, and Naive Bayes. [AFL] |
  |
SenseClusters - http://senseclusters.sourceforge.net/
Programs to cluster similar contexts together using unsupervised knowledge-lean methods for word sense discrimination, email categorization, and name discrimination. Written in Perl. [GPL] |
  |
Machine Learning Repository - http://scalab.uc3m.es/~dborrajo/ml-systems/
Machine learning tools, including inductive systems (e.g. ILP and others), and other ML-related links. |
  |
New Scientific Brainstorming Software for Inventors -- Windows, Mac and Linux - http://www.paramind.net/pmscientificversion.html
Software developed to help your team brainstorm. The program replaces words in the user's idea sentence with new words drawn from its word categories, possibly producing ideas not thought of before. Includes word categories. [Commercial] |
  |
Self-Taught: Software That Learns By Doing - http://www.computerworld.com/softwaretopics/software/story/0,10801,108320,00.html
Short article on self-modifying software touching on co-training, partial-programming, and genetic programming (GP) methods of problem solving. [Free] |
  |
ITI: Incremental Tree Inducer - http://www.cs.umass.edu/~lrn/iti/index.html
An algorithm that incrementally constructs decision trees from labeled examples. [AFL] |
  |
Lemga: Learning Models and Generic Algorithms - http://www.work.caltech.edu/ling/lemga/
A library of classes for optimizing (training) the generic models. Written in C++. [GPL] |
  |
Sorting Algorithms for Machine Learning - http://www.csse.monash.edu.au/%7Elloyd/tildeAlgDS/Sort/
Various sorting algorithms including insertion, quick, merge, heap, Dutch National Flag, and radix with on-line demos. [Free] |
  |
Classification Toolbox for MATLAB - http://www.yom-tov.info/toolbox.html
A complete set of algorithms for classification, clustering, feature selection and reduction for Matlab. [Free] |
  |
The PNC2 Rule Induction System - http://www.newty.de/pnc2/index.html
Windows software tool that induces rules from your data using the PNC2 cluster algorithm. An integrated parameter-tuning component allows easy adjustment of the algorithm's behavior to the given problem without requiring any further knowledge. [GPL] |
  |
SUBDUE: Graph Based Knowledge Discovery - http://ailab.wsu.edu/subdue/
A program which discovers interesting and repetitive subgraphs in labeled graph representations using the minimum description length principle. Includes applications to molecular biology. [Free] |
  |
(H)HMM Library and Designer - http://connex.lip6.fr/%7ebinsztok/hhld.html
This library allows probabilistic sequence models to be constructed using hidden Markov models (HMMs) and hierarchical hidden Markov models (HHMMs) in the OCaml programming language. [GPL] |
  |
FTP Repository Site List for Cognitive and Machine Learning - http://hoohoo.ncsa.uiuc.edu/ftp/part6.html
Anonymous FTP sites from popular colleges and universities. To access the other pages, replace the 6 in the URL with a number from 1 to 23. [Free] |
  |
General Hidden Markov Model Library - http://sourceforge.net/projects/ghmm
Hidden Markov Models software library from the Center of Applied Informatics, Cologne. Includes algorithms such as Viterbi, Baum-Welch, and Forward-Backward. [GPL] |
  |
PRAPI: Pattern Recognition Application Programmer's Interface - http://www.ee.oulu.fi/~topiolli/cppdocs/
A library for many pattern recognition tasks. The package focuses on image analysis but uses a general architecture and an XML-based data interchange format. Written in C++. [GPL] |
  |
Genetic Algorithm Projects at SourceForge.net - http://sourceforge.net/softwaremap/trove_list.php?form_cat=621
A large collection of genetic algorithm projects with complete source code. [GPL] |
  |
Experience-based Language Acquisition - http://sourceforge.net/projects/ebla
Computational model of human language acquisition written in Java; currently acquires a protolanguage of nouns and verbs based on visual perception. [BSD] |
  |
ELIE: An Adaptive Information Extraction System - http://www.aidanf.net/software/elie_an_adaptive_information_extraction_system
A tool for adaptive information extraction from text. Also included are a number of other text processing tools for POS tagging, chunking, gazetteer, and stemming. [GPL] |
  |
libbpfl - A Bayesian Probability Filtering Library - http://libbpfl.sourceforge.net
A general purpose library for Bayesian filtering written in C++. [LGPL] |
  |
Nonparametric Classification with Polynomial MPMC Cascades - http://cervisia.org/machine_learning_code.php
Scalable non-parametric classification with Polynomial MPMC Cascades for use in Matlab. [GPL] |
  |
The Observable Operator Modeling Kit - http://omk.sourceforge.net
Machine learning library for 'observable operator models' (OOMs), suitable for classification and prediction of time-series and sequence data. OOMs are similar to HMMs but more powerful. Written in C++. [BSD] |
  |
SUrrogate MOdeling (SUMO) Toolbox - http://www.sumo.intec.ugent.be/?q=SUMO_toolbox
A Matlab toolbox that automatically builds surrogate models of a given data source within the accuracy and time constraints set by the user. Key features include scalable regression models, knowledge discovery, best-in-class modelling techniques, and logging and profiling tools. [Commercial] |
  |
Fully Complex Machine Learning Algorithm: ELM - http://www.ntu.edu.sg/home/egbhuang/C-ELM.pdf
Treatise on ELM, a newly discovered algorithm for feedforward neural networks. Includes a complete performance evaluation and discussion, in PDF format. [Free] |
  |
Probabilistic Networks Library (PNL), and Open Source Machine Learning Library (OpenML) by Intel - http://deviceforge.com/news/NS5332804106.html
Open source library of tools for building machine learning software based on Bayesian mathematical principles. Links are at the lower left of the article page. [GPL] |
  |
Bayes++: Open Source Bayesian Filtering Classes - http://bayesclasses.sourceforge.net/Bayes++.html
A library of C++ classes for Bayesian Filtering of discrete systems. [MIT] |
  |
JProGraM -- PRObabilistic GRAphical Models in Java - http://www.dii.unisi.it/%7efreno/JProGraM.html
JProGraM is a machine learning library which supports learning and inference algorithms for Bayesian networks, Markov random fields, hybrid random fields, probabilistic decision trees, dependency networks, and Parzen windows. [GPL] |
 |
Bow: A Toolkit for Statistical Language Modeling, Text Retrieval, Classification and Clustering - http://www.cs.cmu.edu/~mccallum/bow/
A library of C code useful for writing statistical text analysis, language modeling, and information retrieval programs. The current distribution includes the library, as well as front-ends for document classification (rainbow), document retrieval (arrow) and document clustering (crossbow). [LGPL] |
 |
FastMix - http://www.cs.cmu.edu/~psand/
Generates Gaussian mixture models for large datasets using efficient KD-clustering algorithms. [Free] |
 |
NEITHER: A Propositional Theory Refinement - http://www.cs.utexas.edu/users/ml/neither.html
A system to modify an incomplete or incorrect rule base to make it consistent with a set of input training examples. Written in C++. [Free] |
 |
Markov TheBeast - http://thebeast.googlecode.com
A Markov Logic interpreter that focuses on efficient MAP inference and online learning. Features include MAP inference using Cutting Planes combined with Max-Walk-Sat programming, parametrized weights, a shell interpreter, and cardinality constraints. [GPL] |
 |
PRODIGY: An Architecture for Planning and Learning - http://www.cs.cmu.edu/afs/cs.cmu.edu/project/prodigy/Web/prodigy-home.html
A research system for planning and learning that uses explanation-based learning, partial evaluation, experimentation, graphical knowledge acquisition, automatic abstraction, mixed-initiative planning, and case-based reasoning. [Free] |
 |
Machine Learning Software Packages - http://www.cs.cmu.edu/afs/cs/project/ai-repository/ai/areas/learning/systems/0.html
Various classes, packages, macros and other software systems related to machine learning. [GPL] |
 |
Machine Learning and Data Mining Software - http://www.sciencemag.org/feature/data/compsci/machine_learning.dtl
Index of links to machine learning software. Scroll down. [Free] |
 |
CHILL: Constructive Heuristics Induction for Language Learning - http://www.cs.utexas.edu/users/ml/chill.html
A general approach to the problem of inducing natural language parsers. It uses an annotated corpus, and produces a parser by using ILP for inducing the rules that control the actions of a shift-reduce parser. [Free] |
 |
MIX: Software for Mixture Distributions - http://icarus.math.mcmaster.ca/peter/mix/mix.html
Software for learning mixture distributions. Examples and two case studies are included. [Commercial] |
 |
NSP: N-gram Statistics Package - http://www.d.umn.edu/~tpederse/nsp.html
Software for counting and analyzing word n-grams in text. This package provides standard tests of association for identifying word n-grams in large corpora and allows users to implement other tests with minimal scripting knowledge. Written in Perl. [GPL] |
 |
KEEL: Knowledge Extraction based on Evolutionary Learning - http://sci2s.ugr.es/keel/
The aim of this project is to develop a Computational Environment for integrating the design and use of knowledge extraction models from data using evolutionary algorithms. Genetic learning may also be applied to the model. [GPL] |
 |
The AutoClass Project - http://ic-www.arc.nasa.gov/ic/projects/bayes-group/autoclass/
Takes a database of cases described by a combination of real and discrete valued attributes and automatically finds the natural classes in that data. It can be seen as a Naive Bayes classifier where the class node is hidden. [Free] |
 |
Machine Learning Research Software in LISP - http://www.cs.utexas.edu/~ml/ml-progs.html
FTP repository of Common Lisp algorithms and datasets for research. [GPL] |
 |
What If: Web-based Scientific Discovery - http://swift.cmbi.kun.nl/WIWWWI/
An algorithm engine that calculates everything from symmetry, torsion angles, and polar fraction through protein analysis and bond angles. Online version only. [Free] |
 |
Machine Learning Programs by Peter Clark - http://www.cs.utexas.edu/users/pclark/software/
A collection of downloadable packages including: KM - The Knowledge Machine, Guiding Inductive Learning with a Qualitative Model, LPE - Lazy Partial Evaluation, and CN2 - Rule induction from examples. [GPL] |
 |
Nieme: Classification, Regression, and Ranking - http://www-connex.lip6.fr/~maes/wikihomepage/pmwiki.php?n=Nieme.Nieme
A machine learning library for classification, regression and ranking. It implements several well-known algorithms and is specially designed for large-scale applications. [GPL] |
 |
HMM: Pattern search and discovery - http://bioweb.pasteur.fr/seqanal/motif/sam-uk.html
A collection of tools for creating and using HMMs for biological sequences. [AFL] |