Natural language processing (NLP) allows applications to interact with human language using deep learning algorithms. NLP algorithms take language as input and can produce a variety of outputs depending on the task they have learned. These outputs include automatic summarization, language translation, part-of-speech tagging, parsing or grammatical analysis, and sentiment analysis, among others. NLP algorithms can also provide voice recognition and natural language generation, which converts data into understandable human language. Examples of NLP uses include chatbots, translation applications, and social media monitoring tools that scan Facebook and Twitter for mentions. Natural language processing algorithms are an example of deep learning and may be offered pre-built in an AI platform.
TokensRegex is a generic framework for defining patterns over text (sequences of tokens) and mapping them to semantic objects represented as Java objects. It emphasizes describing text as a sequence of tokens (words, punctuation marks, etc.), which may have additional attributes, and writing patterns over those tokens, rather than working at the character level as standard regular expression packages do.
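The idea of matching over tokens rather than characters can be sketched in a few lines of Python. This toy matcher is purely illustrative and is not the TokensRegex API; in TokensRegex itself, patterns are written in a regex-like string syntax and compiled against Java token objects.

```python
# Hypothetical sketch: match a pattern over a token sequence, where each
# pattern element is a predicate on one token (not a character-level regex).

def match_tokens(tokens, pattern):
    """Return the first span (start, end) where every predicate in
    `pattern` accepts the corresponding token, or None if no span matches."""
    n, m = len(tokens), len(pattern)
    for start in range(n - m + 1):
        if all(pred(tok) for pred, tok in zip(pattern, tokens[start:start + m])):
            return (start, start + m)
    return None

tokens = ["The", "quick", "brown", "fox", "jumps", "."]
# Pattern: a lowercase word immediately followed by the literal token "fox"
pattern = [lambda t: t.islower(), lambda t: t == "fox"]
print(match_tokens(tokens, pattern))  # (2, 4)
```

Because each pattern element is a predicate on a whole token, the same machinery can test token attributes (lemma, part of speech, named-entity tag) rather than surface characters, which is the point of token-level patterns.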
Stanford Topic Modeling Toolbox (TMT) brings topic modeling tools to social scientists and others who wish to analyze datasets that have a substantial textual component. It can import and manipulate text from cells in Excel and other spreadsheets, train topic models (LDA, Labeled LDA, and PLDA) to create summaries of the text, select parameters (such as the number of topics) via a data-driven process, and generate rich Excel-compatible outputs for tracking word usage across topics, time, and other groupings of data.
Stanford Word Segmenter currently supports Arabic and Chinese. The provided segmentation schemes have been found to work well for a variety of applications. The system requires Java 1.8+ to be installed, and at least 1 GB of memory is recommended for documents that contain long sentences. For files with shorter sentences (e.g., 20 tokens), the memory requirement can be decreased by changing the option java -mx1g in the run scripts.
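As a sketch of what adjusting that option looks like, the segmenter is typically run through its bundled script, and the heap flag lives inside that script (the input file name below is illustrative):

```shell
# Segment a simplified-Chinese file with the CTB-trained model
# (segment.sh and its arguments follow the segmenter's own run-script
# conventions; my_input.txt is an illustrative file name)
./segment.sh ctb my_input.txt UTF-8 0

# To lower the memory requirement for short-sentence input, edit the
# run script and shrink the Java heap option, e.g.:
#   java -mx1g ...   ->   java -mx512m ...
```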
textacy is a Python library for performing higher-level natural language processing (NLP) tasks, built on the high-performance spaCy library. With tokenization, part-of-speech tagging, dependency parsing, etc. offloaded to spaCy, textacy focuses on tasks facilitated by the ready availability of tokenized, POS-tagged, and parsed text.
Treat is a toolkit for natural language processing and computational linguistics in Ruby. It aims to build a language- and algorithm-agnostic NLP framework for Ruby, with support for tasks such as document retrieval, text chunking, segmentation and tokenization, natural language parsing, part-of-speech tagging, keyword extraction, and named entity recognition.
Tregex is a utility for matching patterns in trees, based on tree relationships and regular expression matches on nodes (the name is short for "tree regular expressions"). Tregex comes with Tsurgeon, a tree transformation language. Also included from version 2.0 onward is a similar package, semgrex, which operates on dependency graphs (class SemanticGraph).
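A few examples of the pattern syntax give the flavor (the relational operators below are part of Tregex; the node labels are standard Penn Treebank tags, and the example patterns themselves are illustrative):

```
NP < NN      a noun phrase immediately dominating a common noun
NP << NN     a noun phrase dominating a common noun at any depth
VP $ NP      a verb phrase that is a sister of a noun phrase
```

Node labels can themselves be regular expressions, so patterns combine character-level matching on labels with structural relations between tree nodes.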
VoiceBase is defining the future of deep learning and communications by providing unparalleled access to spoken information for businesses to make better decisions. With flexible APIs, developers and enterprises build scalable solutions with VoiceBase by embedding speech-to-text, conversational analytics, and predictive analytics capabilities into any big voice application. VoiceBase’s customers include Amazon Web Services, Twilio, Nasdaq, HireVue and Veritone. The company is privately held and is based in San Francisco, California.
Intelligent Service Robot is a dialog platform that enables smart dialog through various dialog-enabling clients, such as websites, mobile apps, and robots. Users can draw on domain-specific knowledge bases, configure their own knowledge base for customized smart dialogs, and use Intelligent Service Robot to facilitate self-service through multi-round dialog. Intelligent Service Robot can also integrate with third-party APIs to enable complex scenarios such as order search, shipping tracking, and self-service returns.
Cogito API is a ready-to-deploy and fully configured API series that helps developers accelerate the creation and deployment of unique applications that leverage large volumes of unstructured information from multiple sources. Cogito API is easily deployed or integrated for faster evaluation and analysis of content such as web pages, social media data, or any big data sets or real-time information streams.
Cortical.io has wrapped its Retina Engine into an easy-to-use, powerful platform for fast semantic search, semantic classification, and semantic filtering that can process any kind of text, independently of language and length. It enables users to process terabytes of data orders of magnitude faster than other methods.
CRF++ is a simple, customizable, open-source implementation of Conditional Random Fields (CRFs) for segmenting/labeling sequential data. CRF++ is designed to be general-purpose and can be applied to a variety of NLP tasks, such as named entity recognition, information extraction, and text chunking.
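CRF++ is driven by a column-formatted training file plus a feature template. A minimal sketch, following the format described in the CRF++ documentation (the file contents here are illustrative, using a text-chunking example where the last column is the label and blank lines separate sequences):

```
# train.data: token, POS tag, chunk label (one token per line)
He        PRP  B-NP
reckons   VBZ  B-VP
the       DT   B-NP
deficit   NN   I-NP
.         .    O

# template: U* rows define unigram features; %x[row,col] addresses a
# column relative to the current token; B adds a label-bigram feature
U00:%x[-1,0]
U01:%x[0,0]
U02:%x[1,0]
U10:%x[0,1]
B
```

Training and tagging are then run with the bundled command-line tools, e.g. `crf_learn template train.data model` followed by `crf_test -m model test.data`.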
Datumbox API offers a large number of off-the-shelf Classifiers and Natural Language Processing services which can be used in a broad spectrum of applications including: Sentiment Analysis, Topic Classification, Language Detection, Subjectivity Analysis, Spam Detection, Reading Assessment, Keyword and Text Extraction and more.
Inbenta, a global leader in artificial intelligence, utilizes patented natural language processing technology to provide a highly accurate search solution for customer support, e-commerce and chatbots. Inbenta's semantic search engine understands & delivers results based on the meaning behind customers’ search queries, not the individual keywords, leading to improved customer satisfaction, lower support costs and stronger ROI. The result: industry-leading 90%+ self-service rates.
MeTA is a modern C++ data science toolkit that provides text tokenization (including deep semantic features like parse trees), inverted and forward indexes with compression and various caching strategies, a collection of ranking functions for searching the indexes, topic models, classification algorithms, graph algorithms, language models, a CRF implementation (POS tagging, shallow parsing), wrappers for liblinear and libsvm (including libsvm dataset parsers), UTF-8 support for analysis of various languages, and multithreaded algorithms.
MXNet is a flexible and efficient library for deep learning that supports both imperative and symbolic programming, calculates gradients automatically for training a model, runs on CPUs or GPUs on clusters, servers, desktops, or mobile phones, and supports distributed training on multiple CPU/GPU machines, including AWS, GCE, Azure, and Yarn clusters.
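"Calculates gradients automatically" refers to automatic differentiation. A minimal reverse-mode sketch in pure Python shows the idea; this toy Value class is illustrative only and is not MXNet's NDArray/autograd API.

```python
# Toy reverse-mode automatic differentiation: each operation records its
# inputs and the local partial derivatives, and backward() applies the
# chain rule through the recorded graph.

class Value:
    def __init__(self, data, parents=(), local_grads=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._local_grads = local_grads

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        # d(a+b)/da = 1, d(a+b)/db = 1
        return Value(self.data + other.data, (self, other), (1.0, 1.0))

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        # d(a*b)/da = b, d(a*b)/db = a
        return Value(self.data * other.data, (self, other),
                     (other.data, self.data))

    def backward(self, upstream=1.0):
        # Accumulate the chain-rule product into this node and its parents.
        self.grad += upstream
        for parent, local in zip(self._parents, self._local_grads):
            parent.backward(upstream * local)

x = Value(3.0)
y = x * x + x          # y = x^2 + x, so dy/dx = 2x + 1 = 7 at x = 3
y.backward()
print(x.grad)  # 7.0
```

Frameworks like MXNet do the same bookkeeping at the tensor level, either eagerly (imperative style) or over a precompiled graph (symbolic style).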
Natural Language Processing for JVM languages (NLP4J) provides tools readily available for research in various disciplines, frameworks for fast development of efficient and robust NLP components, and an API for manipulating computational structures in NLP (e.g., dependency graphs).
Omnitraq uses its award-winning, patented technology to extract critical business insights from call center calls, web media, video, audio, and text data. By delivering these insights at low cost, with speed, and at scale, Omnitraq can provide both SMB and enterprise clients with a suite of affordable and high-impact BI tools.
Sonix is an online platform that combines automated transcription and editing. We built the world's first AudioText Editor™ that allows users to edit audio in a revolutionary new way: Edit audio by editing text. Sonix integrates with Adobe Audition, Adobe Premiere, Final Cut Pro, Audacity, and Hindenburg.
Synthesys is a solution that adds the brainpower of thousands of people to a team. It reads through all of a team's data and highlights the important people, places, organizations, events, and facts being discussed; resolves the highlighted points and determines what's important; and connects the dots to figure out what the final picture means by comparing it with the opportunities, risks, and anomalies the team is looking for.
Ucto is a tool that tokenizes text files: it separates words from punctuation and splits sentences. It also offers several other basic preprocessing steps, such as changing case, that can be used to make text suitable for further processing such as indexing, part-of-speech tagging, or machine translation.
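What "separating words from punctuation and splitting sentences" means can be sketched with a toy tokenizer. Real ucto is rule-based and language-aware; this regex-based Python sketch is only illustrative.

```python
import re

def tokenize(text):
    # Match words (allowing an internal apostrophe, as in "it's")
    # or any single non-word, non-space character as its own token.
    return re.findall(r"\w+(?:'\w+)?|[^\w\s]", text)

def split_sentences(tokens):
    # Naively end a sentence at ., ! or ? (real tokenizers handle
    # abbreviations, quotes, and much more).
    sentences, current = [], []
    for tok in tokens:
        current.append(tok)
        if tok in {".", "!", "?"}:
            sentences.append(current)
            current = []
    if current:
        sentences.append(current)
    return sentences

tokens = tokenize("Hello, world! It's fine.")
print(tokens)             # ['Hello', ',', 'world', '!', "It's", 'fine', '.']
print(split_sentences(tokens))
```

Output like this, with punctuation as separate tokens and sentence boundaries made explicit, is exactly what downstream steps such as indexing or part-of-speech tagging expect as input.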