Aerospike is well optimized to exploit SSD performance. With proper capacity planning, your production environment will be pretty stable. We store around 20-30 TB of data in Aerospike. Here are the points I like most about it:
1. With a peak of 300-400k TPS, 99% of incoming requests still finish in under 1-2 ms. I believe it is the fastest NoSQL data store to date.
2. Cluster Setup is very easy.
3. Absolutely zero maintenance compared with Cassandra or HBase.
4. Easy to Monitor the cluster health and performance.
5. Enterprise Support is pretty good.
6. Version upgrades are a very seamless process.
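A latency claim like the one in point 1 can be checked against raw request timings. The following is only a minimal sketch using the nearest-rank percentile method; the function name `percentile` and the sample latencies are hypothetical, not taken from our cluster or from any Aerospike tool.

```python
def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it.
    pct is an integer between 1 and 100."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100 avoids floating-point rounding.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


# Hypothetical request latencies in milliseconds:
# 990 fast requests and 10 slow outliers.
latencies_ms = [0.4] * 990 + [5.0] * 10

p99 = percentile(latencies_ms, 99)
meets_target = p99 < 2.0  # "99% of requests finish in under 2 ms"
```

With these made-up samples the ten slow outliers fall above the 99th percentile, so the target is met even though the maximum latency is 5 ms.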
However, there are many things that are still work in progress in Aerospike:
1. Migrations : Migration is the activity that rebalances and redistributes data when a node goes down and leaves the cluster. This can happen because of network issues or system-level bugs that cause service downtime on that server. It is an expensive operation because it directly impacts performance with a spike in latencies. This is further complicated because even the latest Aerospike version is not smart enough to tell the difference between a manual service restart and a node going down.
2. Namespace Addition Needs Downtime : A namespace in Aerospike is equivalent to a database in an RDBMS. We have many different applications using their respective namespaces in production. Adding a new namespace is still not dynamic, and it requires a service restart on all nodes in the cluster.
3. Isolation at Namespace Level : For a better and more stable production environment, we must have better control of all the key metrics that are critical and need special attention. As of now, we can monitor TPS at the cluster level, but we cannot determine how many TPS are happening at the namespace level. Also, there is no way to isolate a namespace or put constraints on the number of read/write operations that can happen in it. We have faced situations in production where a TPS spike in one less critical namespace impacted the performance of all other namespaces.
4. IP Tracking in Connections : Aerospike provides a security feature in the Enterprise version that lets us enable security on our distributed cluster, but we cannot track the client hostname or IP address a connection is coming from. Having this would help us narrow down the search for a host that is generating too many connections.
5. XDR Limitations : XDR replication is a feature available in Enterprise Aerospike that allows us to replicate from one datacenter to another. However, it is not possible to enable any security while XDR is enabled in the cluster. Also, XDR cannot replicate updates at the bin level; it replicates the whole record again, which is inefficient at the network level.
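For the namespace isolation gap described in point 3, one workaround is to throttle on the application side before a request ever reaches the cluster. The following is only a rough sketch of a per-namespace token bucket. The class names `TokenBucket` and `NamespaceLimiter`, the namespace names, and the limits are all hypothetical; nothing here is an Aerospike API, and it does not stop other clients from overloading the cluster.

```python
import time


class TokenBucket:
    """Allows `rate_per_sec` operations per second on average,
    with bursts of up to `burst` operations."""

    def __init__(self, rate_per_sec, burst, clock=time.monotonic):
        self.rate = rate_per_sec
        self.capacity = burst
        self.tokens = float(burst)
        self.clock = clock
        self.last = clock()

    def try_acquire(self, n=1):
        # Refill tokens for the time elapsed since the last call.
        now = self.clock()
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= n:
            self.tokens -= n
            return True
        return False


class NamespaceLimiter:
    """Maps namespace name -> token bucket. Namespaces without a
    configured limit are never throttled."""

    def __init__(self, limits, clock=time.monotonic):
        # limits: {namespace: (rate_per_sec, burst)}
        self.buckets = {
            ns: TokenBucket(rate, burst, clock)
            for ns, (rate, burst) in limits.items()
        }

    def allow(self, namespace):
        bucket = self.buckets.get(namespace)
        return True if bucket is None else bucket.try_acquire()


# Hypothetical setup: cap a less critical namespace at 5,000 ops/sec,
# leave every other namespace unrestricted.
limiter = NamespaceLimiter({"reports": (5000, 5000)})

if limiter.allow("reports"):
    pass  # issue the read/write to the cluster here
```

The sketch is not thread-safe; a multi-threaded client would need a lock around `try_acquire`. The `clock` parameter exists only so that the refill logic can be exercised with a fake clock.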
We are solving one of the biggest problems of the mobile advertising industry. The scale of our data is directly proportional to the number of smartphones across the world. Aerospike gives us fast performance at a large scale, which helps us solve the larger problems of one of the fastest growing tech companies in the advertising world.