Subversion is an open source version control system. Founded in 2000 by CollabNet, Inc., the Subversion project and software have seen incredible success over the past decade. Subversion has enjoyed and continues to enjoy widespread adoption in both the open source arena and the corporate world.
Base is a desktop database management system, designed to meet the needs of a broad array of users. Base offers wizards to help users new to database design (or Base) to create Tables, Queries, Forms and Reports, along with a set of predefined table definitions for tracking Assets, Customers, Sales Orders, Invoices and much more.
Hive provides a mechanism to project structure onto this data and query the data using a SQL-like language called HiveQL. At the same time this language also allows traditional map/reduce programmers to plug in their custom mappers and reducers when it is inconvenient or inefficient to express this logic in HiveQL.
Apache Ant is a Java library and command-line tool whose mission is to drive processes described in build files as targets and extension points dependent upon each other. The main known usage of Ant is the build of Java applications. Ant supplies a number of built-in tasks allowing to compile, assemble, test and run Java applications. Ant can also be used effectively to build non Java applications, for instance C or C++ applications. More generally, Ant can be used to pilot any type of process which can be described in terms of targets and tasks.
Apache Pig is a platform for analyzing large data sets that consists of a high-level language for expressing data analysis programs, coupled with infrastructure for evaluating these programs. The salient property of Pig programs is that their structure is amenable to substantial parallelization, which in turns enables them to handle very large data sets.
Apache Maven Doxia is a content generation framework which aims to provide its users with powerful techniques for generating static and dynamic content: Doxia can be used in web-based publishing context to generate static sites, in addition to being incorporated into dynamic content generation systems like blogs, wikis and content management systems.
MLlib is Spark's machine learning (ML) library that make practical machine learning scalable and easy it provides ML Algorithms: common learning algorithms such as classification, regression, clustering, and collaborative filtering, feature extraction, transformation, dimensionality reduction, and selection, tools for constructing, evaluating, and tuning ML Pipelines, saving and load algorithms, models, and Pipelines and linear algebra, statistics, data handling, etc.
The Apache Jackrabbit content repository is a fully conforming implementation of the Content Repository for Java Technology API.It is an effort to implement a scalable and performant hierarchical content repository for use as the foundation of modern world-class web sites and other demanding content applications
Apache Bigtop is an Apache Foundation project for Infrastructure Engineers and Data Scientists looking for comprehensive packaging, testing, and configuration of the leading open source big data components. Bigtop supports a wide range of components/projects, including, but not limited to, Hadoop, HBase and Spark.
The Apache Gump continuous integration tool was the first one developed at the Apache Software Foundation. It is written in Python and fully supports Apache Ant, Apache Maven (1.x to 3.x) and other build tools. Gump is unique in that it builds and compiles software against the latest development versions of those projects. This allows gump to detect potentially incompatible changes to that software just a few hours after those changes are checked into the version control system.
The Apache Axiom library provides an XML Infoset compliant object model implementation which supports on-demand building of the object tree. It supports a novel "pull-through" model which allows one to turn off the tree building and directly access the underlying pull event stream using the StAX API.
Apache OFBiz is an open source product for the automation of enterprise processes that includes framework components and business applications for ERP (Enterprise Resource Planning), CRM (Customer Relationship Management), E-Business / E-Commerce, SCM (Supply Chain Management), MRP (Manufacturing Resource Planning), MMS/EAM (Maintenance Management System/Enterprise Asset Management), POS (Point Of Sale).
Apache REEF (Retainable Evaluator Execution Framework) is a library for developing portable applications for cluster resource managers such as Apache Hadoop YARN or Apache Mesos. Apache REEF simplifies development of those resource managers through Centralized Control Flow , Task runtime , Support for multiple resource managers , NET and Java API and Plugins.
Apache SAMOA is a distributed streaming machine learning (ML) framework that contains a programing abstraction for distributed streaming ML algorithms it enables development of new ML algorithms without directly dealing with the complexity of underlying distributed stream processing engines (DSPEe, such as Apache Storm, Apache Flink, and Apache Samza) users can develop distributed streaming ML algorithms once and execute them on multiple DSPEs.
Apache ServiceMix is a flexible, open-source integration container that unifies the features and functionality of Apache ActiveMQ, Camel, CXF, and Karaf into a powerful runtime platform you can use to build your own integrations solutions. It provides a complete, enterprise ready ESB.
Apache SystemML is a machine learning platform optimal for big data that provides an optimal workplace for machine learning using big data, it can be run on top of Apache Spark, where it automatically scales your data, line by line, determining whether your code should be run on the driver or an Apache Spark cluster.
Apache Usergrid is an open-source Backend-as-a-Service) composed of an integrated distributed NoSQL database, application layer and client tier with SDKs for developers looking to rapidly build web and/or mobile applications. It provides elementary services (user registration & management, data storage, file storage, queues) and retrieval features (full text search, geolocation search, joins) to power common app features.
Apache Mahout is a software that build an environment for quickly creating scalable performant machine learning applications, it provides three major features: A simple and extensible programming environment and framework for building scalable algorithms, A wide variety of premade algorithms for Scala + Apache Spark, H2O, Apache Flink and Samsara, a vector math experimentation environment with R-like syntax which works at scale
Apache OpenNLP library is a machine learning based toolkit for the processing of natural language text that supports the common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, and coreference resolution these tasks are usually required to build more advanced text processing services and includes maximum entropy and perceptron based machine learning.
Apache Batik is a Java-based toolkit for applications or applets that want to use images in the Scalable Vector Graphics (SVG) format for various purposes, such as display, generation or manipulation.
Apache Celix is an implementation of the OSGi specification adapted to C and C++. It is a provides a framework to develop (dynamic) modular software applications using component and/or service-oriented programming.
Apache Eagle is an open source analytics solution for identifying security and performance issues instantly on big data platforms, e.g. Apache Hadoop, Apache Spark etc. It analyzes data activities, yarn applications, jmx metrics, and daemon logs etc.and provides state-of-the-art alert engine to identify security breach, performance issues and shows insights.
Apache Edgent is a programming model and micro-kernel style runtime that can be embedded in gateways and small footprint edge devices enabling local, real-time, analytics on the continuous streams of data coming from equipment, vehicles, systems, appliances, devices and sensors of all kinds (for example, Raspberry Pis or smart phones).
Apache Fineract is an open source system for core banking as a platform. Fineract provides a reliable, robust, and affordable solution for entrepreneurs, financial institutions, and service providers to offer financial services to the world 2 billion underbanked and unbanked.
Apache Gobblin is a distributed data integration framework designed to simplify common aspects of big data integration such as data ingestion, replication, organization and lifecycle management for both streaming and batch data ecosystems.
Apache HAWQ is a Hadoop native SQL query engine that combines the key technological advantages of MPP database with the scalability and convenience of Hadoop. It provides users the tools to confidently and successfully interact with petabyte range data sets.
Calc is the spreadsheet application you've always wanted. Newcomers find it intuitive and easy to learn; professional data miners and number crunchers will appreciate the comprehensive range of advanced functions.
Apache Trafodion is a webscale SQL-on-Hadoop solution enabling transactional or operational workloads on Apache Hadoop.Trafodion provides SQL access to structured, semi-structured, and unstructured data allowing you to run operational, historical, and analytical workloads on a single platform.
Apache Websh is a rapid development environment for building powerful, fast, and reliable web applications in Tcl. Websh is versatile and handles everything from HTML generation to data-base driven one-to-one page customization.
Apache Yetus is a collection of libraries and tools that enable contribution and release processes for software projects.It provides a robust system for automatically checking new contributions against a variety of community accepted requirements