To explain what I like best, I am going to refer to two aspects: A) NoSQL vs RDBMS and B) Freeware and Licensed Engines
A) Distributed data is based on replication, and data partitioning lets you not only scale up your overall performance but also improve availability. There are two main streams: one based on Hadoop, the other based on sharding. The Hadoop stream requires a huge amount of disk space, an invisible cost that initially nobody cares about; as storage costs drop, nobody worries about data duplication and wasted space. Beyond the space issue, the data is open, and there are not yet many security tools to hide and secure it. I am not sure everyone shares or even knows these things, so again this confronts people with different ways of thinking. On the other hand, maintaining replication in Hadoop is not a plug-and-play matter; it has its own challenges and requires complex Java coding. Finally, if we use Hadoop we are heading toward a NoSQL, unstructured database, and again, this is where many people get confused. Why turn the wheel in the opposite direction when we can use distributed data implemented in an RDBMS (sharding)? We also have to take into account that NoSQL sacrifices consistency, mainly through a very space-hungry optimistic locking method, to avoid dealing with atomicity and isolation.
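To make the sharding idea concrete, here is a minimal sketch of hash-based partitioning. The shard names and key format are invented for illustration; in a real deployment the engine (for example, an RDBMS with built-in sharding) routes rows server-side rather than in application code.

```python
# Hypothetical sketch: deterministic hash-based sharding across N nodes.
import hashlib

SHARDS = ["shard_a", "shard_b", "shard_c"]  # three made-up database nodes

def shard_for(key: str) -> str:
    """Pick a shard deterministically from the partitioning key."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

# Every writer and reader computes the same shard for the same key,
# so data is partitioned across nodes without being duplicated.
assert shard_for("customer:1042") == shard_for("customer:1042")

# Different keys spread across the shards, scaling capacity horizontally.
placement = {k: shard_for(k) for k in ("customer:1", "customer:2", "customer:3")}
print(placement)
```

The point of the sketch is the contrast with Hadoop-style replication: each row lives on exactly one shard, so there is no duplicated storage, but availability then depends on replicating each shard, which is where a strong replication technology matters.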
Definitely NoSQL can’t replace RDBMS. The BASE model exists for some specific use cases, and for companies that built their own solutions when nothing else on the market could, but that was then! RDBMS vendors have since powered up their engines, and in one case have even started to integrate and improve the unstructured model alongside the structured one, in the same data repository, combining the best of both worlds: keeping up in performance and capacity without sacrificing consistency, atomicity, or isolation!
There’s just one company that understood this problem and started working on a new paradigm: supporting the unstructured data model together with the structured one. This company is IBM, with its powerful INFORMIX engine. In either scenario, whether sharding or the data partitioning required to scale out horizontally, a database engine needs a very strong replication technology.
INFORMIX deployment has been driven by its strengths in replication – Informix employs one of the industry’s most scalable and efficient replication architectures – and data management. It also combines strengths in embedded data management with low administrative overhead, highly effective use of server resources and enterprise-class resilience.
INFORMIX is oriented to solving the massive change that is occurring in computing environments around the world. Specifically, unstructured data is flooding into organizations and companies are struggling to choose between a relational or NoSQL database to handle it. While NoSQL offers the ability to have dynamic elasticity with data and its organization, these databases lack essential elements such as scalability, security, flexible deployment, transaction support and full data integration. INFORMIX offers a solution that can natively handle NoSQL and structured data simultaneously, providing a single system with seamless integration of all data within the enterprise. It is meant to help developers use native JSON and BSON data types as well as other NoSQL functions (including sharding) for complete plug-and-play compatibility with MongoDB and other NoSQL databases. Application developers can access structured and unstructured data in a single statement and join them together as needed. NoSQL applications can use the enterprise-level capabilities of a structured database system, including transaction support, backup and recovery, H/A, compression, enhanced scalability, security and access control. These last features are not minor, and they are not characteristic of any other NoSQL engine.
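The "single statement" join of structured and unstructured data described above can be sketched, engine-independently, in application terms. The table layout, collection, and field names below are all invented for illustration; in Informix itself this matching would happen server-side in SQL against a BSON collection, not in client code.

```python
# Rough conceptual sketch (hypothetical data): joining relational rows
# with JSON documents the way a hybrid engine would do in one statement.
import json

# Hypothetical relational table: (customer_id, name, credit_limit)
customers = [
    (1, "Acme Corp", 50000),
    (2, "Globex", 75000),
]

# Hypothetical JSON collection, as a MongoDB-style app would store it.
orders_json = [
    '{"customer_id": 1, "sku": "X-100", "qty": 3}',
    '{"customer_id": 2, "sku": "Y-200", "qty": 1}',
    '{"customer_id": 1, "sku": "Z-300", "qty": 7}',
]

# The join: structured rows matched with unstructured documents by key.
by_id = {cid: name for cid, name, _limit in customers}
joined = [
    {"customer": by_id[doc["customer_id"]], "sku": doc["sku"], "qty": doc["qty"]}
    for doc in map(json.loads, orders_json)
]
print(joined)
```

The design point is that the application sees one result set; it does not have to run a daily ETL to copy the JSON side into the relational side (or vice versa) before it can combine them.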
I think that the hybrid development capability of INFORMIX-NoSQL to manage and present both structured and unstructured data repositories is going to be very important as organizations realize that 1) there is value in the new flexible-schema paradigms like JSON, 2) existing apps can benefit from being able to present this data, 3) JSON-based applications could benefit from access to the relational data that is already in the organization's relational databases, and 4) NO new hires will be required to work in the NoSQL/JSON space, simplifying the complex, expensive, slow, and error-prone daily ETL from multiple data sources.
B) This is about TCO (Total Cost of Ownership). It has to do with how easy an engine is to maintain, which in turn relates to MTBF (Mean Time Between Failures), or unplanned downtime, and how much the technology lets you avoid downtime, or plan it so as to reduce it. All of this in light of the expenses incurred to maintain and support the engine versus its licensing costs.
I always like to call these DIRECT and INDIRECT COSTS.
DIRECT COSTS are very simple: they are represented by the license costs.
INDIRECT COSTS are the ones that NOBODY seems to see; it is incredible! This is why one platform needs more resources than others. It is simple math: if a platform constantly requires an engineer to touch here and there, and next month the same thing in the same place again, that is, in one word, REDOING! This consumes most of the DBA's time, time that could instead be used to grow the business. If the DBA spends most of his time on recurrent fixes, then it is simple: another DBA is required. It is not just what we redo, but also what we can't do because we are stuck fixing. INDIRECT COSTS are invisible! Incredible, but real.
There are also many other aspects to consider as INDIRECT COSTS, such as performance and capacity: how much an engine can do with the same hardware and the same data. How much it is able to process is key, so the question is HOW MUCH THE ENGINE CAN DO WITH LESS!
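The "simple math" above can be made explicit. All the figures below are purely hypothetical, invented to show how an invisible indirect cost can rival the visible license line item:

```python
# Hypothetical illustration of DIRECT vs INDIRECT COSTS (all numbers made up).
license_cost = 40000          # yearly license fee: the visible DIRECT cost
dba_hourly = 80               # fully loaded cost of a DBA per hour
recurrent_fix_hours = 10      # hours per week spent REDOING the same fixes

# The invisible INDIRECT cost: recurrent firefighting, week after week.
indirect_cost = dba_hourly * recurrent_fix_hours * 52

tco = license_cost + indirect_cost
print(f"direct: {license_cost}, indirect: {indirect_cost}, TCO: {tco}")
```

With these made-up numbers, the indirect cost (41,600) already exceeds the license itself, which is exactly the point: a "cheap" engine that demands constant hands-on fixing can cost more overall than a licensed one that runs itself.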