
Best Machine Learning Software

Machine learning algorithms make predictions or decisions based on data. These learning algorithms can be embedded within applications to provide automated, artificial intelligence (AI) features or be used in an AI platform to build brand new applications. In both cases, a connection to a data source is necessary for the algorithm to learn and adapt over time. There are many different types of machine learning algorithms that perform a variety of tasks and functions. These algorithms may consist of more specific machine learning algorithms, such as association rule learning, Bayesian networks, clustering, decision tree learning, genetic algorithms, learning classifier systems, and support vector machines, among others.

These learning algorithms may be developed with supervised or unsupervised learning. Supervised learning trains an algorithm on consistently labeled example data so that it learns a general mapping from inputs to outputs; humans are typically needed to provide the labels. Unsupervised learning, on the other hand, requires no labels: the algorithm independently finds structure in its input, and many deep learning systems make use of it. Reinforcement learning is a third form of machine learning, in which algorithms learn how to act based on their situation or environment. For example, autonomous cars are often cited as an instance of reinforcement learning because they react to their surroundings on the road: if a traffic light is red, the car stops.

Developers use machine learning algorithms when building an application on an AI platform or embedding AI within an existing application. End users of intelligent applications may not be aware that an everyday software tool is using a machine learning algorithm to provide some form of automation. Additionally, machine learning solutions for businesses may be offered in a machine-learning-as-a-service model.
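
The difference between supervised and unsupervised learning can be illustrated with a minimal Python sketch. The data, the labels, and the function names below are hypothetical and chosen only for illustration; the supervised half is a nearest-centroid classifier and the unsupervised half is a two-cluster k-means, which are just two of many possible algorithms.

```python
from statistics import mean

# Toy 1-D measurements (hypothetical data) that fall into two obvious groups.
values = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
labels = ["small", "small", "small", "large", "large", "large"]

def fit_supervised(xs, ys):
    """Supervised: learn one centroid per human-provided label."""
    return {label: mean([x for x, y in zip(xs, ys) if y == label])
            for label in set(ys)}

def predict(centroids, x):
    """Return the key of the centroid closest to x."""
    return min(centroids, key=lambda key: abs(centroids[key] - x))

def fit_unsupervised(xs, iterations=10):
    """Unsupervised: two-cluster k-means, no labels are ever seen."""
    centroids = {0: min(xs), 1: max(xs)}
    for _ in range(iterations):
        groups = {0: [], 1: []}
        for x in xs:
            groups[predict(centroids, x)].append(x)
        # Keep the old centroid if a cluster happens to be empty.
        centroids = {key: mean(group) if group else centroids[key]
                     for key, group in groups.items()}
    return centroids

supervised = fit_supervised(values, labels)
unsupervised = fit_unsupervised(values)
print(predict(supervised, 4.5))    # a label name supplied by a human
print(predict(unsupervised, 4.5))  # an anonymous cluster id found in the data
```

Both models separate the same two groups, but only the supervised one can name them, because only it was given names to learn from.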

To qualify for inclusion in the Machine Learning category, a product must:

  • Offer an algorithm or product that learns and adapts based on data
  • Be the source of intelligent learning capabilities for applications
  • Consume data inputs from a variety of data pools
  • Provide an output that solves a specific issue based on the learned data

Compare Machine Learning Software
    Results: 228


    Machine Learning reviews by real, verified users. Find unbiased ratings on user satisfaction, features, and price based on the most reviews available anywhere.

    Microsoft Bing Image Search API is a service that provides a similar (but not exact) experience to Bing.com/Images (overview on MSDN). It allows partners to send a search query to Bing and get back a list of relevant images.

    Scikit-learn is a machine learning library for the Python programming language that features various classification, regression, and clustering algorithms, including support vector machines, random forests, gradient boosting, k-means, and DBSCAN. It is designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy.

    Enjoy the power of Programmatic Machine Learning

    Support vector machines (SVMs) and support vector regression (SVR) are supervised learning models with associated learning algorithms that analyze data and recognize patterns; they are used for classification and regression analysis.

    Dialogflow is an end-to-end development suite for building conversational interfaces for websites, mobile applications, popular messaging platforms, and IoT devices.

    Microsoft Bing Web Search API is a service that retrieves web documents indexed by Bing and narrows down the results by result type, freshness, and more. It brings intelligent search to apps and harnesses the ability to comb billions of webpages, images, videos, and news with a single API call.

    FloydHub is a platform designed specifically for deep learning that eliminates engineering bottlenecks.

    pyBrain is a modular machine learning library for Python that offers flexible, easy-to-use, and powerful algorithms for machine learning tasks, along with a variety of predefined environments to test and compare algorithms.

    Crab, also known as scikits.recommender, is a Python framework for building recommender engines that integrates with the world of scientific Python packages (numpy, scipy, matplotlib). It provides a rich set of components from which users can construct a customized recommender system from a set of algorithms, and it is usable in various contexts, including science and engineering.

    Use your own data to create, train, and deploy machine learning and deep learning models. Leverage an automated, collaborative workflow to grow intelligent business applications easily and with more confidence.

    Our platform leverages human-in-the-loop practices to train, test, and tune machine learning models. At Figure Eight, we know that AI isn’t magic. We know what it takes to create AI that isn’t just a science project, but AI that works in the real world. And we provide the crucial ingredients that make it happen. We believe that AI is the combination of three important components: training data, machine learning, and humans-in-the-loop.

    Cloud AutoML is a suite of machine learning products that enables developers with limited machine learning expertise to train high-quality models specific to their business needs by leveraging Google's state-of-the-art transfer learning and Neural Architecture Search technology.

    MLlib is Spark's machine learning (ML) library, which makes practical machine learning scalable and easy. It provides common learning algorithms such as classification, regression, clustering, and collaborative filtering; feature extraction, transformation, dimensionality reduction, and selection; tools for constructing, evaluating, and tuning ML Pipelines; saving and loading of algorithms, models, and Pipelines; and utilities for linear algebra, statistics, data handling, etc.

    PrediCX is a predictive analytics engine designed to take all heterogeneous data and process it dynamically to make recommendations to operators in terms of 'next best action' based ultimately on optimising the customer experience (CX).

    Weka is a collection of machine learning algorithms for data mining tasks that can either be applied directly to a dataset or called from your own Java code. It contains tools for data pre-processing, classification, regression, clustering, association rules, and visualization, and it is well suited for developing new machine learning schemes.

    XGBoost is an optimized distributed gradient boosting library that is efficient, flexible, and portable. It implements machine learning algorithms under the Gradient Boosting framework and provides parallel tree boosting (also known as GBDT or GBM) that solves many data science problems in a fast and accurate way.

    Cloud TPU empowers businesses everywhere to access this accelerator technology to speed up their machine learning workloads on Google Cloud

    IBM Watson Personality Insights is a tool that extracts and analyzes a spectrum of personality attributes to help discover actionable insights about people and entities, and in turn guides end users to highly personalized interactions.

    mlr (Machine Learning in R) is an interface to a large number of classification and regression techniques, including machine-readable parameter descriptions.

    Microsoft Bing News Search API is a tool that searches the web for news articles and returns details such as an authoritative image of the news article, related news and categories, provider info, article URL, and date added.

    Microsoft Cognitive Toolkit is an open-source, commercial-grade toolkit that empowers users to harness the intelligence within massive datasets through deep learning by providing uncompromised scaling, speed, and accuracy with commercial-grade quality and compatibility with the programming languages and algorithms they already use.

    Apache Storm is a free and open source distributed realtime computation system. Storm makes it easy to reliably process unbounded streams of data, doing for realtime processing what Hadoop did for batch processing.

    BioPy is a collection of biologically inspired algorithms written in Python. Some are more focused on artificial models of biological computation, such as Hopfield neural networks, while others are inherently more biologically focused, such as the basic genetic programming module included in this project.

    Microsoft Machine Learning Server is your flexible enterprise platform for analyzing data at scale, building intelligent apps, and discovering valuable insights across your business with full support for Python and R. Machine Learning Server meets the needs of all constituents of the process – from data engineers and data scientists to line-of-business programmers and IT professionals. It offers a choice of languages and features algorithmic innovation that brings the best of open-source and proprietary worlds together

    Nilearn is a Python module for fast and easy statistical learning on NeuroImaging data that leverages the scikit-learn Python toolbox for multivariate statistics with applications such as predictive modelling, classification, decoding, or connectivity analysis.

    The ML-Agents SDK allows researchers and developers to transform games and simulations created using the Unity Editor into environments where intelligent agents can be trained using Deep Reinforcement Learning, Evolutionary Strategies, or other machine learning methods through a simple to use Python API.

    AForge.Video is a library that contains interfaces and classes to access different video sources, such as IP video cameras (MJPEG streams).

    Bolt performs discriminative learning of linear predictors (e.g. SVM or logistic regression) using fast online learning algorithms. It is aimed at large-scale, high-dimensional, and sparse machine learning problems, in particular problems encountered in information retrieval and natural language processing.

    clj-ml is a machine learning library for Clojure that can be applied to data sets to modify the dataset in some way: transforming nominal attributes into binary attributes, removing attributes etc.

    GoLearn is a 'batteries included' machine learning library for Go that implements the scikit-learn interface of Fit/Predict, making it easy to swap out estimators for trial and error. It includes helper functions for data, like cross validation and train and test splitting.
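
    The value of a shared Fit/Predict interface can be shown with a small sketch. It is written in Python rather than Go and does not use GoLearn's or scikit-learn's actual API; the two estimator classes, the scoring helper, and the data are hypothetical stand-ins that only demonstrate how a common interface lets estimators be swapped without changing the evaluation code.

```python
from statistics import mean

class MeanRegressor:
    """Baseline estimator: predicts the mean training target for every input."""
    def fit(self, xs, ys):
        self.value = mean(ys)
        return self

    def predict(self, xs):
        return [self.value for _ in xs]

class NearestNeighborRegressor:
    """Predicts the target of the single closest training input."""
    def fit(self, xs, ys):
        self.pairs = list(zip(xs, ys))
        return self

    def predict(self, xs):
        return [min(self.pairs, key=lambda pair: abs(pair[0] - x))[1]
                for x in xs]

def mean_absolute_error(model, train, test):
    """Fit on the training split, then score on the held-out split."""
    (train_x, train_y), (test_x, test_y) = train, test
    predictions = model.fit(train_x, train_y).predict(test_x)
    return mean([abs(p - y) for p, y in zip(predictions, test_y)])

train = ([1, 2, 3, 4], [2, 4, 6, 8])
test = ([2.1, 3.9], [4.2, 7.8])

# The same loop evaluates any object that offers fit() and predict().
scores = {type(model).__name__: mean_absolute_error(model, train, test)
          for model in (MeanRegressor(), NearestNeighborRegressor())}
print(scores)
```

    Because the scoring helper depends only on the two method names, trying a new estimator means adding one more object to the tuple.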

    GraphLab Create is a Python library, backed by a C++ engine, for quickly building large-scale, high-performance data products.

    HLearn is a high performance machine learning library written in Haskell to discover the "best possible" interface for machine learning. This involves two competing demands: The library should be as fast as low-level libraries written in C/C++/Fortran/Assembly; but it should be as flexible as libraries written in high level languages like Python/R/Matlab.

    htm.java is a Hierarchical Temporal Memory implementation in Java: an official community-driven Java port of the Numenta Platform for Intelligent Computing (NuPIC). It provides a Java version of NuPIC that has a 1-to-1 correspondence to all systems, functionality, and tests provided by Numenta's open source implementation, while observing the tenets, standards, and conventions of Java language best practices and development.

    Intel Data Analytics Acceleration Library (or Intel DAAL) is a software development library that is highly optimized for Intel architecture processors. It provides building blocks for all data analytics stages, from data preparation to data mining and machine learning.

    kernlab provides kernel-based machine learning methods for classification, regression, clustering, novelty detection, quantile regression, and dimensionality reduction. Its methods include Support Vector Machines, Spectral Clustering, Kernel PCA, Gaussian Processes, and a QP solver.

    Learning Based Java is a modeling language for the rapid development of software systems with one or more learned functions, designed for use with the Java™ programming language. It offers a convenient, declarative syntax for classifier and constraint definition directly in terms of the objects in the programmer's application.

    Microsoft Academic Knowledge API is a service that allows users to interpret queries for academic intent and retrieve rich information from the Microsoft Academic Graph (MAG). MAG is a web-scale heterogeneous entity graph comprised of entities that model scholarly activities: field of study, author, institution, paper, venue, and event.

    Recommendations API is a tool that helps customers discover items in a user's catalog. Customer activity in the user's digital store is used to recommend items and to improve conversion.

    Naive Bayesian Classification for Golang performs classification into an arbitrary number of classes on sets of strings.

    Pattern Recognition and Machine Learning is a Matlab implementation of the algorithms.

    Pattern is a web mining module for the Python programming language that has tools for data mining (Google, Twitter and Wikipedia API, a web crawler, an HTML DOM parser), natural language processing (part-of-speech taggers, n-gram search, sentiment analysis, WordNet), machine learning (vector space model, clustering, SVM), network analysis, and visualization.

    Pebl is a Python library and command line application for learning the structure of a Bayesian network given prior knowledge and observations. It can learn with observational and interventional data; handles missing values and hidden variables using exact and heuristic methods; provides several learning algorithms and makes creating new ones simple; has facilities for transparent parallel execution using several cluster and cloud resources; calculates edge marginals and consensus networks; and presents results in a variety of formats.

    pyhsmm is a Python library for approximate unsupervised inference in Bayesian Hidden Markov Models (HMMs) and explicit-duration Hidden semi-Markov Models (HSMMs), focusing on the Bayesian nonparametric extensions, the HDP-HMM and HDP-HSMM, mostly with weak-limit approximations.

    Pylearn2 is a library for machine learning research.

    Sparkling Water is a tool that allows users to combine the fast, scalable machine learning algorithms of H2O with the capabilities of Spark. Users can drive computation from Scala, R, or Python and utilize the H2O Flow UI, providing an ideal machine learning platform for application developers.

    Torch is a scientific computing framework with wide support for machine learning algorithms that puts GPUs first.

    tree provides classification and regression trees, which can be used as alternatives to logistic regression and as a way to show the probability of being in any hierarchical group.

    Vowpal Wabbit is a machine learning system that pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning.

    wAInut is a work in progress towards an artificial general intelligence (AGI) for everyone.

    warpt-ctc is a loss function useful for performing supervised learning on sequence data without needing an alignment between input data and labels. It can be used to train end-to-end systems for speech recognition.

    Accord.MachineLearning contains Support Vector Machines, Decision Trees, Naive Bayesian models, K-means, Gaussian Mixture models and general algorithms such as Ransac, Cross-validation and Grid-Search for machine-learning applications.

    ADAM is a scalable genomics tool built on Spark that provides both an application programming interface (API) and a command line interface (CLI) for manipulating genomic data on a computing cluster.

    AForge.MachineLearning is a namespace that contains interfaces and classes for different algorithms of machine learning.

    AIToolbox is a toolbox of AI modules written in Swift: Graphs/Trees, Linear Regression, Support Vector Machines, Neural Networks, PCA, KMeans, Genetic Algorithms, MDP, Mixture of Gaussians, and Logistic Regression.

    Amazon DSSTNE (Deep Scalable Sparse Tensor Network Engine) is an open source software library for training and deploying recommendation models with sparse inputs, fully connected hidden layers, and sparse outputs. Models with weight matrices that are too large for a single GPU can still be trained on a single host. It has been used at Amazon to generate personalized product recommendations for Amazon customers. It is designed for production deployment of real-world applications that need to emphasize speed and scale over experimental flexibility.

    Annoy (Approximate Nearest Neighbors Oh Yeah) is a C++ library with Python bindings to search for points in space that are close to a given query point. It also creates large read-only file-based data structures that are mmapped into memory so that many processes may share the same data.

    Apache PredictionIO is an open source Machine Learning Server built on top of a state-of-the-art open source stack for developers and data scientists to create predictive engines for any machine learning task.

    Apache SAMOA is a distributed streaming machine learning (ML) framework that contains a programming abstraction for distributed streaming ML algorithms. It enables development of new ML algorithms without directly dealing with the complexity of underlying distributed stream processing engines (DSPEs, such as Apache Storm, Apache Flink, and Apache Samza). Users can develop distributed streaming ML algorithms once and execute them on multiple DSPEs.

    Apache SystemML is a machine learning platform that provides an optimal workplace for machine learning using big data. It can be run on top of Apache Spark, where it automatically scales your data, line by line, determining whether your code should be run on the driver or an Apache Spark cluster.

    Azure Bing Custom Search is an easy-to-use, ad-free custom search tool that lets you deliver the search results you want. Bing Custom Search allows you to select the slices of the web that you want to search over and control the ranking when searching over your targeted web space.

    Azure Bing Entity Search API identifies relevant entities based on your search term, spanning multiple entity types such as famous people, places, movies, TV shows, video games, books, and even local businesses near you.

    Azure Content Moderator from Microsoft Azure provides automated moderation APIs for images, text and videos as well as a human review tool.

    Azure Custom Decision Service helps you create intelligent systems with a cloud-based contextual decision-making API that sharpens with experience.

    Azure QnA Maker API is a free, easy-to-use REST API and web-based service that trains AI to respond to users' questions in a more natural, conversational way.

    bayesian-bandit.js is an adaptation of the Bayesian Bandit code from Probabilistic Programming and Bayesian Methods for Hackers, specifically d3bandits.js. The code has been rewritten to be more idiomatic and also usable as a browser script or npm package, and it includes unit tests.
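
    For readers unfamiliar with the idea, one common Bayesian bandit strategy is Thompson sampling with Beta posteriors. The sketch below is a Python illustration of that general technique under assumed payout rates; it is not the code of bayesian-bandit.js or d3bandits.js, and the function name, arm rates, and seed are hypothetical.

```python
import random

def thompson_sampling(true_rates, pulls, seed=0):
    """Simulate a Bernoulli bandit and return how often each arm was pulled."""
    rng = random.Random(seed)
    wins = [0] * len(true_rates)
    losses = [0] * len(true_rates)
    for _ in range(pulls):
        # Draw a plausible payout rate for each arm from its Beta posterior
        # (starting from a uniform Beta(1, 1) prior), then play the arm
        # whose sampled rate is highest.
        samples = [rng.betavariate(w + 1, l + 1)
                   for w, l in zip(wins, losses)]
        arm = samples.index(max(samples))
        # The simulated environment pays out with the arm's true rate.
        if rng.random() < true_rates[arm]:
            wins[arm] += 1
        else:
            losses[arm] += 1
    return [w + l for w, l in zip(wins, losses)]

# Hypothetical arms: the third one pays out most often.
counts = thompson_sampling([0.2, 0.5, 0.8], pulls=2000)
print(counts)
```

    Early pulls are spread across all arms while the posteriors are wide; as evidence accumulates, most pulls concentrate on the arm with the highest payout rate.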

    BIDMach is a machine learning toolkit that offers "rooflined" (optimized to the limit) compute primitives and competitive performance on learning tasks like regression, clustering, classification, and matrix factorization.

    Brushfire is a framework for distributed supervised learning of decision tree ensemble models in Scala.

    C5.0 provides decision trees and rule-based models for pattern recognition that extract informative patterns from data.

    Classifier is a general module to allow Bayesian and other types of classifications.

    Clojush is a version of the Push programming language for evolutionary computation, and the PushGP genetic programming system, implemented in Clojure.

    CloudForest allows for a number of related algorithms for classification, regression, feature selection and structure analysis on heterogeneous numerical / categorical data with missing values.

    Commerce Cloud Einstein platform allows retailers to support personalized, AI-powered experiences for shoppers that span web, mobile, social, in-store, and more.

    Conjecture is a framework for building machine learning models in Hadoop using the Scalding DSL that enables the development of statistical models as viable components in a wide range of product settings.

    CORElearn is a suite of machine learning algorithms written in C++ with an R interface. It contains several model learning techniques for classification and regression; these methods can be used, for example, to discretize numeric attributes. An additional feature is the OrdEval algorithm and its visualization, used for evaluation of data sets with ordinal features and class, enabling analysis according to the Kano model of customer satisfaction.

    CoxBoost is a package that provides routines for fitting Cox models by likelihood-based boosting for a single endpoint or in the presence of competing risks.

    The DataRobot automated machine learning platform captures the knowledge, experience and best practices of the world’s leading data scientists to deliver unmatched levels of automation and ease-of-use for machine learning initiatives. DataRobot enables users of all skill levels – from business people to analysts to data scientists – to build and deploy highly-accurate machine learning models in a fraction of the time of traditional modeling methods.

    DecisionTree.jl is a Julia classifier with an implementation of the ID3 algorithm with post-pruning (pessimistic pruning), parallelized bagging (random forests), adaptive boosting (decision stumps), cross validation (n-fold), and support for mixed nominal and numerical data.

    Disco is a lightweight, open-source framework for distributed computing based on the MapReduce paradigm. It distributes and replicates data and schedules jobs efficiently, and it includes the tools needed to index billions of data points and query them in real time.

    Dlib Machine Learning is a tool that contains a wide range of machine learning algorithms, designed to be highly modular, quick to execute, and simple to use via a clean and modern C++ API. It is used in a wide range of applications including robotics, embedded devices, mobile phones, and large high performance computing environments.

    Eggplant AI automatically generates test cases and optimizes test execution to find defects and maximize coverage of user journeys.

    Encog is an advanced machine learning framework that supports a variety of advanced algorithms, as well as support classes to normalize and process data. Its training algorithms are multi-threaded, scale well to multicore hardware, and can also make use of a GPU to further speed processing time. A GUI-based workbench is also provided to help model and train machine learning algorithms.

    evtree performs evolutionary learning of globally optimal trees. Commonly used classification and regression tree methods like the CART algorithm are recursive partitioning methods that build the model in a forward stepwise search; evtree instead searches for globally optimal trees with an evolutionary algorithm.

    FACTORIE is a toolkit for deployable probabilistic modeling, implemented as a software library in Scala that provides its users with a succinct language for creating relational factor graphs, estimating parameters and performing inference.

    Feature Forge is a set of tools for creating and testing machine learning features, with a scikit-learn compatible API.

    Figaro is a probabilistic programming language that supports development of very rich probabilistic models and provides reasoning algorithms that can be applied to models to draw useful conclusions from evidence.

    FlinkML is the machine learning (ML) library for Flink. It has a growing list of algorithms and contributors, and it aims to provide scalable ML algorithms, an intuitive API, and tools that help minimize glue code in end-to-end ML systems.

    fpc provides flexible procedures for clustering and cluster validation.

    fungp is a genetic programming library implemented in the Clojure programming language. It uses a process of evolution (mimicking natural selection) to create and rewrite Clojure code.

    gago is a genetic algorithm library written in Go. It is architected in a modular way; allows using different evolutionary models; includes speciation, migration, and parallel populations; allows implementing custom genetic operators; has no external dependencies; has high test coverage; and is actively maintained, which its author describes as a long-term priority.

    Ganitha (derived from the Sanskrit word for mathematics, or science of computation) is an open-source Scalding library with a focus on machine learning and statistical analysis.

    Gensim is a Python library that analyzes plain-text documents for semantic structure and retrieves semantically similar documents.

    haskell-ml is a collection of implementations of various ML algorithms.

    Quickly create, deploy and manage high quality self-learning behavioral models to extract hidden value from enterprise data securely, in place and in real time.

    igraph is a collection of network analysis tools with an emphasis on efficiency, portability, and ease of use. It can be programmed in R, Python, and C/C++.

    Jruby Mahout is a gem that unleashes the power of Apache Mahout in the world of JRuby and supports Mahout recommendations. It includes a simple Postgres manager that can be used to manage the appropriate recommendation tables.

    kNear is a JavaScript implementation of the k-nearest neighbors algorithm, a supervised machine learning algorithm. Numeric points are assigned a classification and 'learned' by the machine. New points are classified based on their proximity to points which have been 'learned' by the machine.
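
    The k-nearest neighbors idea described here fits in a few lines. The sketch below is a generic Python illustration, not kNear's JavaScript API; the function name, the choice of k, and the sample points are assumptions made for the example.

```python
import math
from collections import Counter

def knn_classify(learned, point, k=3):
    """Classify `point` by majority vote among its k closest learned points.

    `learned` is a list of (coordinates, label) pairs.
    """
    nearest = sorted(learned,
                     key=lambda item: math.dist(item[0], point))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Points that have already been 'learned', each with its classification.
learned = [((1, 1), "a"), ((1, 2), "a"), ((2, 1), "a"),
           ((8, 8), "b"), ((8, 9), "b"), ((9, 8), "b")]

print(knn_classify(learned, (2, 2)))
print(knn_classify(learned, (7, 9)))
```

    There is no separate training step: 'learning' a point simply means storing it, and all the work happens when a new point is classified.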

    KRFuzzyCMeans implements Fuzzy C-Means (FCM), a fuzzy clustering / classification algorithm from machine learning. It can be used in data mining and image compression.

    KRKmeans-Algorithm implements K-Means, the clustering algorithm, and supports multi-dimensional clustering that can be used in data mining, image compression, and classification.

    LDA is a machine learning algorithm that extracts topics and their related keywords from a collection of documents.

    Learning.js is a JavaScript implementation of several machine learning algorithms, including Decision Tree and Logistic Regression.

    LogicReg provides routines for fitting Logic Regression models.

    MAChineLearning is a framework that provides a quick and easy way to experiment with machine learning with native code on the Mac. It is written in Objective-C and it is usable by Swift.

    Machine learning is a type of artificial intelligence (AI) that provides computers with the ability to learn without being explicitly programmed. It focuses on the development of computer programs that can teach themselves to grow and change when exposed to new data.

    MachineLearning is a package that represents the very beginnings of an attempt to consolidate common machine learning algorithms written in pure Julia and present a consistent API. It is targeted towards the machine learning practitioner working with a dataset that fits in memory on a single machine.

    Apache Mahout is software that builds an environment for quickly creating scalable, performant machine learning applications. It provides three major features: a simple and extensible programming environment and framework for building scalable algorithms; a wide variety of premade algorithms for Scala + Apache Spark, H2O, and Apache Flink; and Samsara, a vector math experimentation environment with R-like syntax which works at scale.

    maptre provides functions with example data for graphing, pruning, and mapping models from hierarchical clustering and from classification and regression trees.

    mboost functions as a gradient descent algorithm (boosting) for optimizing general risk functions, utilizing component-wise (penalised) least squares estimates or regression trees as base-learners, for fitting generalized linear, additive, and interaction models to potentially high-dimensional data.

    metric-learn implements algorithms for metric learning, the sub-field of machine learning dedicated to automatically constructing optimal distance metrics.

    MGL-GPR is a library of evolutionary algorithms such as Genetic Programming (evolving typed expressions from a set of operators and constants) and Differential Evolution.

    Microsoft Bing Autosuggest API is a tool that helps users complete queries faster by adding intelligent type-ahead capabilities to an app or website.

    Microsoft Bing Video Search API is an API tool that finds videos across the web and provides useful metadata including creator, encoding format, video length, view count, and more.

    Microsoft Entity Linking Intelligence Service is a web service that helps developers with tasks relating to entity linking. Given a specific paragraph within a document, this service will recognize and identify each separate entity based on its context.

    Microsoft Knowledge Exploration Service is a service that offers a fast and effective way to add interactive search and refinement to applications. It allows users to build a compressed index from structured data, author a grammar that interprets natural language queries, and provide interactive query formulation with auto-completion suggestions.

    Milk is a machine learning toolkit in Python that focuses on supervised classification with several classifiers available: SVMs (based on libsvm), k-NN, random forests, decision trees. It also performs feature selection. These classifiers can be combined in many ways to form different classification systems.

    MLBase.jl is a Swiss Army knife for machine learning. It does not implement specific machine learning algorithms; instead, it provides a collection of useful tools to support machine learning programs, including data manipulation and preprocessing, score-based classification, performance evaluation (e.g. evaluating ROC), cross validation, and model tuning (i.e. searching for the best settings of parameters).
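
    Cross validation, one of the supporting tools named in this entry, can be sketched independently of any particular model. The example below is written in Python rather than Julia and is not MLBase.jl's API; the function name and the unshuffled, contiguous fold layout are assumptions made to keep the illustration short.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

folds = list(k_fold_indices(10, 3))
for train, test in folds:
    print("train:", train, "test:", test)
```

    Every sample lands in exactly one test fold, so averaging a model's score over the folds uses all of the data for evaluation without ever testing on a sample the model was trained on.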

    ml.js is a set of machine learning and numeric analysis tools in JavaScript for Node.js and the browser.

    MLKit is a machine learning framework written in Swift that features machine learning algorithms that deal with the topic of regression to provide developers with a toolkit to create products that can learn from data.

    mlpack is a scalable machine learning library, written in C++, that aims to provide fast, extensible implementations of cutting-edge machine learning algorithms. It provides these algorithms as simple command-line programs and C++ classes, which can then be integrated into larger-scale machine learning solutions.

    Mlxtend (machine learning extensions) is a Python library of useful tools for the day-to-day data science tasks.

    Naive Bayesian Classifier in APL is a tool that makes independent probabilistic assumptions on test input and requires precisely 2 groups with training data.

    All the talk about qualitative data analysis is for naught if you can’t understand language as it is spoken. That is what Natural Language Processing (NLP) is all about. NewSci NLP brings this power to organizations seeking to extract insights from their unstructured data. Just as you know what a person is saying when you hear, “I’m hungry, I want an apple” vs. “I really want an Apple™ instead of a PC,” so now can a computer. NewSci NLP enables a computer to understand the people, places, and things important to your organization. This, in turn, allows your unstructured data to be analyzed just like your structured data. With NewSci NLP your organization will enjoy qualitative analysis (the Why behind the numbers) alongside your quantitative analytics. It uses models customized to your organization, the domain in which you operate, the quality of your recordings, and even local and regional dialects to deliver the highest level of transcription accuracy. It captures your organization’s domain and unique characteristics to enable deep Natural Language Understanding analysis and Natural Language Generation. Your NewSci Ontology will be your Rosetta Stone for unlocking the value hidden in your unstructured data. The NewSci Insight Reservoir™ brings governance and insight to the data lake. You enjoy all the benefits of a state-of-the-art Big Data lake, including access to hundreds of data connectors for ingesting information, transformation tools for quality assurance and data enhancement, and cataloging of your data down to the field level, while at the same time having unmatched data governance capabilities. Unlike a passive data lake, the NewSci Insight Reservoir™ is a powerful cognitive computing platform where you can perform machine learning, deep learning, and natural language processing on all your structured and unstructured data. NewSci NLP connects directly to your NewSci Insight Reservoir™ to extract meaning from your text and make it available for analysis. Machine and Deep Learning algorithms can be created, and perfected, as data enters the Insight Reservoir™, increasing the value in real time. And all of the insights can easily be made available for visualization tools including Tableau®, Qlik®, and MS Power BI®. Jump out of the data lake and get your organization into the NewSci Insight Reservoir™.

    oblique.tree is a function that grows oblique trees for classification data.

    OpenAI Gym is a toolkit for developing and comparing reinforcement learning algorithms that supports teaching agents everything from walking to playing games like Pong or Go.

    partykit: A Toolkit for Recursive Partytioning provides infrastructure for representing, summarizing, and visualizing tree-structured regression and classification models. This unified infrastructure can be used for reading/coercing tree models from different sources ('rpart', 'RWeka', 'PMML'), yielding objects that share functionality for print()/plot()/predict() methods.

    PredictionBuilder is a machine learning library that builds predictions using linear regression.
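
As a rough illustration of what "building predictions using linear regression" means, here is a minimal ordinary least squares sketch in plain Python. It is not PredictionBuilder's API; the function names `fit_line` and `predict` are hypothetical and chosen only for this example.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and the x/y co-deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Evaluate the fitted line at a new x value."""
    slope, intercept = model
    return slope * x + intercept


# Toy data lying exactly on y = 2x + 1.
model = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(predict(model, 5))
```

The closed-form fit works for a single input variable; several inputs would need a matrix formulation instead.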

    Push is a programming language designed for evolutionary computation, to be used as the language in which evolving programs are expressed. It has a stack-based execution architecture with a separate stack for each data type, which allows programs to manipulate their own code as they run and thereby to implement arbitrary and potentially novel control structures.

    python-recsys is a Python library for implementing a recommender system.

    Quantile Regression Forests is a tree-based ensemble method for estimation of conditional quantiles, suited for high-dimensional data.

    Random Forest is an ensemble learning method for classification, regression, and other tasks that operates by constructing a multitude of decision trees at training time and outputting the class that is the mode of the classes (classification) or the mean prediction (regression) of the individual trees.
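
The aggregation step described above (mode for classification, mean for regression) can be sketched as follows. This covers only how the individual trees' outputs are combined, not how the trees are trained, and the function names are hypothetical.

```python
from collections import Counter
from statistics import mean


def aggregate_classification(tree_votes):
    """Return the most common class label among the trees' votes."""
    return Counter(tree_votes).most_common(1)[0][0]


def aggregate_regression(tree_predictions):
    """Return the mean of the trees' numeric predictions."""
    return mean(tree_predictions)


# Hypothetical outputs from three already-trained trees.
print(aggregate_classification(["cat", "dog", "cat"]))
print(aggregate_regression([1.0, 2.0, 3.0]))
```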

    rapaio is a statistics, data mining, and machine learning toolbox.

    Recommender is a tool that analyzes the feedback of some users (implicit and explicit) and their preferences for some items to learn patterns and predict the most suitable products for a particular user.

    Reproducible Experiment Platform (REP) is a software infrastructure to support a collaborative ecosystem for computational science. It is a Python-based solution for research teams that allows running computational experiments on shared datasets, obtaining repeatable results, and making consistent comparisons of the obtained results.

    rgenoud is a function that combines evolutionary search algorithms with derivative-based (Newton or quasi-Newton) methods to solve difficult optimization problems. It can also be used for optimization problems for which derivatives do not exist.

    RGP is a simple modular Genetic Programming (GP) system built in pure R. The system supports symbolic regression by GP through the familiar R model formula interface. GP individuals are represented as R expressions, an (optional) type system enables domain-specific function sets containing functions of diverse domain and range types, and a basic set of genetic operators for variation (mutation and crossover) and selection is provided.

    ROOT is a modular scientific software framework that provides the functionality needed for big data processing, statistical analysis, visualisation, and storage. It is mainly written in C++ but is integrated with other languages such as Python and R.

    Recursively Partitioned Mixture Model (RPMM) for Beta and Gaussian Mixtures is a model-based clustering algorithm that returns a hierarchy of classes, similar to hierarchical clustering, but also similar to finite mixture models.

    rusty-machine is a general-purpose machine learning library implemented in Rust without requiring a huge number of external dependencies.

    shaman is a Machine Learning library for node.js that supports both simple linear regression and multiple linear regression.

    SHARK is a fast, modular, feature-rich open-source C++ machine learning library that provides methods for linear and nonlinear optimization, kernel-based learning algorithms, neural networks, and various other machine learning techniques and is compatible with Windows, Solaris, MacOS X, and Linux.

    SimpleAI is a set of simple artificial intelligence utilities that allow users to define problems and look for solutions with different strategies.

    Simple Bayes is a Naive Bayes machine learning implementation in Elixir.

    Spearmint is a software package for Bayesian optimization that automatically runs experiments (thus the code name spearmint) in a manner that iteratively adjusts a number of parameters so as to minimize some objective in as few runs as possible.

    Stan is a tool that is used for statistical modeling, data analysis, and prediction in the social, biological, and physical sciences, engineering, and business.

    Stanford Classifier is a machine learning tool that takes data items and places them into one of k classes. It can also give a probability distribution over the class assignment for a data item.

    Statistiker is a library for doing statistics in Clojure that aims to provide implementations of all popular algorithms for mining datasets.

    Steam AI engine is an end-to-end platform that streamlines the entire process of building and deploying smart applications. Data scientists and developers can launch turnkey compute environments for collaboratively training and deploying predictive models and integrate those models into real-time smart applications.

    SuperLearner is a package that implements the super learner prediction method and contains a library of prediction algorithms to be used in the super learner.

    svmpath is a package that computes the entire regularization path for the two-class SVM classifier with essentially the same cost as a single SVM fit.

    Swift AI is a high-performance AI and machine learning library written entirely in Swift that includes a set of common tools used for machine learning and artificial intelligence research.

    Swift Brain is a neural network and machine learning library written in Swift for iOS and OS X development. It includes algorithms focused on Bayes' theorem, neural networks, SVMs, matrices, etc.

    SwiftLearner is a Scala machine learning library that is easier to follow and easier to tweak than the optimized libraries. It uses plain Java types and has few or no dependencies.

    tgp is a package for Bayesian nonstationary, semiparametric nonlinear regression and design by treed Gaussian processes (GPs) with jumps to the limiting linear model (LLM). Special cases also implemented include Bayesian linear models, CART, treed linear models, stationary separable and isotropic GPs, and GP single-index models.

    Theano is a Python library that allows users to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently.

    Libra Toolkit is a collection of algorithms for learning and inference with discrete probabilistic models, including Bayesian networks (BNs), Markov networks (MNs), dependency networks (DNs), sum-product networks (SPNs), and arithmetic circuits (ACs). It focuses on structure learning, especially for tractable models in which exact inference is efficient. Each algorithm in Libra is implemented as a command-line program suitable for interactive use or scripting, with consistent options and file formats throughout the toolkit.

    Topic models are Bayesian, hierarchical mixture models of discrete data. This tool implements utilities for reading and manipulating data commonly associated with topic models, as well as inference and prediction procedures for such models.

    topik is a topic modeling toolbox that provides a full suite and high-level interface for applying topic modeling. It includes many utilities beyond statistical modeling algorithms and wraps all of its features into an easily callable function and a command line interface. It is built on top of existing natural language and topic modeling libraries and primarily provides a wrapper around them for quick and easy exploratory analysis of your text data sets.

    ToPS is an object-oriented framework implemented in C++ that facilitates the integration of probabilistic models for sequences over a user-defined alphabet. It contains implementations of eight distinct models for analyzing discrete sequences: Independent and Identically Distributed model, Variable-Length Markov Chain (VLMC), Inhomogeneous Markov Chain, Hidden Markov Model, Pair Hidden Markov Model, Profile Hidden Markov Model, Similarity Based Sequence Weighting, and Generalized Hidden Markov Model (GHMM).

    TPOT is a Python tool that automatically creates and optimizes machine learning pipelines using genetic programming.

    Vulpes is an implementation of deep belief and deep learning, written in F# and using Alea.cuBase to connect to the PC's GPU device.

    yahmm is a module that implements Hidden Markov Models (HMMs) with a compositional, graph-based interface. Models can be constructed node by node and edge by edge, built up from smaller models, loaded from files, baked (into a form that can be used to calculate probabilities efficiently), trained on data, and saved.

    YCML is an Artificial Intelligence, Machine Learning, and Optimization framework written in Objective-C. It can be used from both Objective-C and Swift and has been verified to run on macOS and iOS.

    Accord.NET Framework is a .NET machine learning framework combined with audio and image processing libraries, written entirely in C#. It is a framework for building production-grade computer vision, computer audition, signal processing, and statistics applications, even for commercial use.

    Aerosolve is a machine learning package built for humans. Its library is meant to be used with sparse, interpretable features such as those that commonly occur in search (search keywords, filters) or pricing (number of rooms, location, price). It is not as interpretable on problems with very dense, non-human-interpretable features such as raw pixels or audio samples.

    AForge.Genetic is a library that contains classes to run Genetic Algorithms (GA), Genetic Programming (GP), and Gene Expression Programming (GEP).

    AForge.Imaging is a namespace that contains interfaces and classes for different image processing routines.

    AForge.Robotics is a library that provides support for some robotics kits.

    Faster, easier machine learning models for busy business professionals

    APEX is an AI-enhanced technology platform intended to provide solutions for your business end to end. With APEX you gain access to the same powerful AI capabilities and tools used by the tech unicorns at a fraction of the cost. APEX allows you to realize the full benefits of the AI technologies, while sustaining governance, flexibility, scalability, tool compatibility, and collaboration. Through the integration of the most advanced open source and proprietary 2021.AI technological components, APEX enhances data governance, increases maintainability and quality of the AI models. APEX can be installed either on-premises, or consumed in private or public cloud. APEX offers 3 editions: Front, Go, and Enterprise, all capable of delivering immediate business value for companies of all sizes, in all the stages of AI maturity and ambitions.

    Machine Learning Platform For AI provides end-to-end machine learning services, including data processing, feature engineering, model training, model prediction, and model evaluation. Machine Learning Platform For AI combines all of these services to make AI more accessible than ever.

    AstroML is a Python module for machine learning and data mining that provides a community repository of fast Python implementations of common tools and routines used for statistical data analysis in astronomy and astrophysics, along with a uniform and easy-to-use interface to freely available astronomical datasets.

    Breeze is a Scala library for numerical processing that aims to be generic, clean, and powerful without sacrificing (much) efficiency.

    bst: Gradient Boosting is a functional gradient descent algorithm for a variety of convex and non-convex loss functions, for both classical and robust regression and classification problems.

    CheckRecipient is an Artificial Intelligence & Machine Learning Email Security Platform that prevents highly sensitive information from being sent to the wrong people.

    ClearTK (Machine Learning for UIMA) is a framework for developing machine learning and natural language processing components within the Apache Unstructured Information Management Architecture.

    clusterfck is a JavaScript library for hierarchical clustering, which is used to group similar items together. Hierarchical clustering in particular is used when a hierarchy of items is needed or when the number of clusters isn't known ahead of time.
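
To make the idea of hierarchical clustering concrete, here is a minimal bottom-up (agglomerative) sketch in Python. It is not clusterfck's API. It assumes one-dimensional numeric points, single-linkage distance, and a hypothetical `threshold` stopping rule, which is one way to avoid fixing the number of clusters in advance.

```python
def single_linkage(points, threshold):
    """Merge the two closest clusters until none are within `threshold`."""
    clusters = [[p] for p in points]  # start with one cluster per point
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest members.
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > threshold:
            break  # remaining clusters are too far apart to merge
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


groups = single_linkage([1.0, 1.2, 5.0, 5.1, 9.0], threshold=1.0)
print(groups)
```

Recording the order of merges instead of stopping at a threshold would give the full hierarchy.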

    Clusterone is a cloud-agnostic deep learning platform that enables teams to bridge the AI gap through scalable model training, distributed computing or many concurrent experiments, flexible infrastructure with zero DevOps, and the lowest computing cost.

    Comportex is an implementation of Hierarchical Temporal Memory as a Clojure library. It is a separate implementation, based initially on the Numenta CLA white paper but significantly evolved since.

    CRFsuite is an implementation of Conditional Random Fields (CRFs) for labeling sequential data.

    Cubist is a tool for regression modeling using rules with added instance-based corrections.

    Intelligent recognition technology is provided as a service for organizations looking to extract data from any forms and documents.

    DEAP is a novel evolutionary computation framework for rapid prototyping and testing of ideas. It seeks to make algorithms explicit and data structures transparent. It works in perfect harmony with parallelisation mechanisms such as multiprocessing and SCOOP.

    Decision Trees is a Node.js implementation of a decision tree using the ID3 algorithm.
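
ID3 chooses the attribute to split on by information gain, that is, the reduction in label entropy after the split. The Python sketch below shows only that selection step on a made-up toy dataset. It is not this package's code, and all names and data are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Entropy of the labels minus the weighted entropy after splitting."""
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[attribute], []).append(label)
    remainder = sum(len(s) / len(rows) * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder


def best_attribute(rows, labels, attributes):
    """ID3 splits on the attribute with the highest information gain."""
    return max(attributes, key=lambda a: information_gain(rows, labels, a))


# Toy data: the label depends only on "outlook".
rows = [
    {"outlook": "sunny", "windy": False},
    {"outlook": "sunny", "windy": True},
    {"outlook": "rainy", "windy": False},
    {"outlook": "rainy", "windy": True},
]
labels = ["no", "no", "yes", "yes"]
print(best_attribute(rows, labels, ["outlook", "windy"]))
```

A full ID3 implementation would apply this choice recursively to each subset until the labels in a subset are pure or no attributes remain.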

    DeepDetect is a deep learning API and server written in C++11 that makes deep learning easy to work with and integrate into existing applications.

    While most methods for treating data rely on old, expensive software and valuable human resources, the AI will transform your big data into powerful predictive inferences to be deployed in real time, 24/7, at a competitive price.

    DiffSharp is a functional automatic differentiation (AD) library that allows exact and efficient calculation of derivatives, by systematically invoking the chain rule of calculus at the elementary operator level during program execution.
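
One common way to apply the chain rule at the level of elementary operators during execution is forward-mode differentiation with dual numbers. The Python sketch below illustrates the idea only. It is not DiffSharp's API (DiffSharp is an F# library), and the `Dual` class and helper names are hypothetical.

```python
import math


class Dual:
    """A value paired with its derivative with respect to the input."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Sum rule: (u + v)' = u' + v'
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule: (u * v)' = u' * v + u * v'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__


def sin(x):
    # Chain rule: sin(u)' = cos(u) * u'
    return Dual(math.sin(x.value), math.cos(x.value) * x.deriv)


def derivative(f, x):
    """Evaluate f'(x) by seeding the input's derivative with 1."""
    return f(Dual(x, 1.0)).deriv


def f(x):
    return x * x + 3 * x + sin(x)


print(derivative(f, 0.0))  # analytically, f'(x) = 2x + 3 + cos(x)
```

The derivative comes out exact up to floating-point rounding because each operator propagates its own derivative rule rather than using finite differences.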

    Microsoft Distributed Machine Learning Toolkit (DMTK) is a framework that contains both algorithmic and system innovations to make machine learning tasks on big data highly scalable, efficient, and flexible.

    Dynamic Predictive Audiences is a customer segmentation software from simMachines. This machine learning platform is capable of making actionable predictions and recommendations with the justification behind each recommendation.

    DynaML is a Scala environment for conducting research and education in machine learning. It comes packaged with a library of classes for various predictive models and a Scala REPL where one can not only build custom models but also play around with data workflows.

    Fido is a lightweight, open-source, and highly modular C++ machine learning library targeted towards embedded electronics and robotics. It includes implementations of trainable neural networks, reinforcement learning methods, genetic algorithms, and a full-fledged robotic simulator.

    frbs is an implementation of various learning algorithms based on fuzzy rule-based systems (FRBSs) for dealing with classification and regression tasks. It also allows an FRBS model defined by human experts to be constructed.

    Cloud Datalab is a powerful interactive tool created to explore, analyze, transform and visualize data and build machine learning models on Google Cloud Platform. It runs on Google Compute Engine and connects to multiple cloud services easily so you can focus on your data science tasks.

    H2O is a tool that makes it possible for anyone to easily apply machine learning and predictive analytics to solve today's most challenging business problems. It combines the power of highly advanced algorithms, the freedom of open source, and the capacity of truly scalable in-memory processing for big data on one or many nodes.

    imbalanced-learn is a Python package that offers a number of re-sampling techniques commonly used in datasets showing strong between-class imbalance.
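
As an illustration of one simple re-sampling technique, random oversampling of the minority class, here is a plain-Python sketch. It deliberately avoids imbalanced-learn's own API. The function name `random_oversample` and the toy data are hypothetical.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority samples until classes are balanced."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    # Every class is grown to the size of the largest class.
    target = max(len(group) for group in by_class.values())
    out_samples, out_labels = [], []
    for label, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for sample in group + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Four samples of class "a" and only one of class "b".
X, y = random_oversample([[0], [1], [2], [3], [10]], ["a", "a", "a", "a", "b"])
print(Counter(y))
```

Oversampling by duplication keeps all original data but can encourage overfitting to the repeated minority samples, which is why other re-sampling strategies also exist.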

    The Intelligent Automation Engine ushers in a new era of workplace productivity. Machine learning drives new levels of automation pushing machines to get work done smarter and faster.

    KRHebbian implements the Hebbian algorithm, an unsupervised, self-organizing machine learning algorithm.

    Leaf is an open, modular, and clearly designed Machine Intelligence Framework that provides performance for distributed (Deep|Machine) Learning, sharing concepts from TensorFlow and Caffe.

    The leading Real-Time Personalization platform utilizing Augmented Intelligence to help enterprises transform their customer experiences across every touch point. Leveraging powerful machine learning algorithms, LiftIgniter helps clients deliver end to end real-time personalized experiences and true customer centricity. We empower Marketing teams by combining massive machine learning processing power and scale with human insights for continuous A/B testing through Augmented Intelligence. - Real-time marketing and personalization platform leveraging powerful and dynamic machine learning models to continuously process billions of signals at any point in time. - Augmented Intelligence for real-time personalization at scale that transforms the customer experience across every touch point - Continuous A/B testing platform that pairs machine learning with human intuition. CI/CD for the Marketing team. - Our integration is easy and lightweight with no need to port over years of historical data to be successful. - Flexible solution that works well for sparse data sets, recent user behavior, as well as a cold start or new site users.

    LIONoso is a comprehensive Machine Learning and Intelligent Optimization tool for non-profit research and academic use. Users adopt it for orchestrating heterogeneous components: automating processes and arranging, coordinating, and managing complex software components that connect data, experiments, simulators, models, and decisions.

    MachineMetrics is a real-time manufacturing analytics platform designed to make monitoring CNC machines simple.

    Metis Machine is a data science platform designed to support machine learning, deep learning, neural networks, predictive and prescriptive analytics, and cognitive intelligence.

    MLDB is an open-source database designed for machine learning. It can be installed on any device and accepts commands over a RESTful API to store data, explore it using SQL, then train machine learning models and expose them as APIs.

    MLJAR is a predictive analytics platform. It facilitates machine learning algorithm search and tuning.

    Numenta is a machine intelligence solution that delivers capabilities and demonstrates a computing approach based on biological learning principles to help you manage your business.

    NuPIC is an open source project based on a theory of the neocortex called Hierarchical Temporal Memory (HTM) that can be used to analyze streaming data. It learns the time-based patterns in data, predicts future values, and detects anomalies. The project includes discussion groups on HTM theory, research on extending HTM, and source code for complete applications based on HTM.

    Onyx is a framework for building applications that includes packaged, end-to-end applications for collaborative filtering, classification, regression and clustering.

    OsmosisAI is a friendly, UI-driven platform that dramatically simplifies the process of training and deploying Vision AI classifiers for the enterprise.

    Pattern Recognition Toolbox for MATLAB is a tool that provides an easy-to-use and robust interface to dozens of pattern classification tools, making cross-validation, data exploration, and classifier development rapid and simple. It gives users the power to apply sophisticated data analysis techniques to their problems.

    The Pendo Systems Machine Learning Platform turns unstructured data into AI-ready, structured datasets at machine scale.

    pgmpy is a Python library for working with graphical models that allows users to create their own graphical models and answer inference or MAP queries over them.

    Qubole delivers a Self-Service Platform for Big Data Analytics built on Amazon, Microsoft, and Google Clouds.

    Recommendation360™ Retail is a recommendation engine that enables e-commerce retailers to generate hyper-personalized product recommendations to their customers.

    The Remesh platform uses machine learning to understand and engage groups of people with real-time conversation.

    Rmalschains is a package that implements an algorithm family for continuous optimization called memetic algorithms with local search chains (MA-LS-Chains). Memetic algorithms are hybridizations of genetic algorithms with local search methods and are suited for continuous optimization.

    rustlearn is a machine learning crate for Rust that contains reasonably effective implementations of a number of common machine learning algorithms.

    RXA is a cloud-based software company that offers machine learning and artificial intelligence applications to help you make smarter, faster decisions.

    SAS Factory Miner is software that automatically builds and retrains hundreds of predictive models across multiple segments and picks the best model for each segment to reveal new opportunities, expose hidden risks, and fuel smarter, well-timed decisions.

    Saul is a modeling language implemented as a domain-specific language (DSL) in Scala that facilitates designing machine learning models with arbitrary configurations for the application programmer. This includes interacting with raw data and setting it in a flexible graph structure (i.e., a data model) using the original available data structures, relational feature extraction by flexible querying from the data model graph, and designing flexible learning models, including various configurations in which learners interact.

    Scribe helps sales people save ~2 hours every day and do more sales calls instead, by bringing sales to Slack. Scribe is used by people at Salesforce, Uber, General Assembly and other top companies, and we're backed by Y Combinator. This In-Slack Sales Bot brings emails to Slack, suggests smart replies that you can edit and send directly via Slack, and lets you update your CRM with the click of a button in Slack itself, so that anyone can scale their sales conversations and actually save time while doing it (approx 2 hrs a day) ⏳💪

    SHOGUN is a large-scale machine learning toolbox that unifies large-scale learning for a broad range of feature types and learning settings, like classification, regression, or explorative data analysis.

    SciKit-Learn Laboratory (SKLL) provides a number of utilities to make it simpler to run common scikit-learn experiments with pre-generated features.

    Smile is a fast and comprehensive machine learning system that provides hundreds of advanced algorithms with a clean interface. Its Scala API also offers high-level operators that make it easy to build machine learning apps.

    sofia-ml is a suite of fast incremental algorithms for machine learning that can be used for training models for classification, regression, ranking, or combined regression and ranking. It is intended to aid researchers and practitioners who require fast methods for classification and ranking on large, sparse data sets.

    Swix is a general matrix language for Apple's Swift that aims to make the Matlab-iOS conversion easy.

    Training your models on a massive scale is one thing. Running hundreds of iterations on multiple GPUs in parallel and managing the whole machine learning pipeline effectively is what sets you apart from the competition.

    The Xilinx ML Suite enables developers to optimize and deploy accelerated ML inference. It provides support for many common machine learning frameworks such as Caffe, MXNet, and TensorFlow, as well as Python and RESTful APIs.