Hadoop is one of the most common terms in the big data world. Whatever you do with big data, Hadoop is likely to be an underlying element, so it plays an important role in most data-related activities.
First of all, Hadoop does not replace a regular RDBMS. It is better suited to batch operations than to real-time processing. That said, newer tools such as Kafka and Spark Streaming fit into the Hadoop stack and make near-real-time analytics possible.
Hadoop is at its best processing very large datasets, and it is cheaper than traditional alternatives. Once the cluster is set up, adding or removing nodes is simple.
There are very active user and developer communities working on Hadoop.
With recent releases adding YARN, NameNode high availability, and HDFS federation, the product keeps getting better and better.
The published benchmarks for Hadoop and MapReduce jobs are very promising.
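To make the MapReduce model behind those benchmarks concrete, here is a minimal word-count sketch in plain Python. The mapper and reducer mirror what a Hadoop Streaming job would run, but everything below executes locally and the sample lines are made up for illustration:

```python
from itertools import groupby
from operator import itemgetter

def mapper(line):
    """Map phase: emit a (word, 1) pair for every word in the line."""
    for word in line.split():
        yield word.lower(), 1

def reducer(word, counts):
    """Reduce phase: sum all counts emitted for one word."""
    return word, sum(counts)

def run_job(lines):
    """Simulate the map -> shuffle/sort -> reduce pipeline locally."""
    # Map: flatten all (word, 1) pairs from every input line.
    pairs = [kv for line in lines for kv in mapper(line)]
    # Shuffle/sort: group pairs by key, as Hadoop does between phases.
    pairs.sort(key=itemgetter(0))
    # Reduce: one reducer call per distinct word.
    return dict(
        reducer(word, (count for _, count in group))
        for word, group in groupby(pairs, key=itemgetter(0))
    )

counts = run_job(["big data big cluster", "data everywhere"])
print(counts)  # {'big': 2, 'cluster': 1, 'data': 2, 'everywhere': 1}
```

On a real cluster the map and reduce functions run in parallel across nodes, and the shuffle/sort moves data between them; the logic per record is exactly this simple.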
Most reporting and data warehouse workloads can be shifted to the HDFS/Hive/Hadoop stack.
Hadoop handles many file-level tasks: it splits, merges, archives, and unarchives files, and almost every common file operation is supported by Hadoop and HDFS.
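For instance, assuming a running cluster, the everyday file operations look like this (the paths here are hypothetical placeholders):

```
# Copy a local file into HDFS, then list the directory.
hdfs dfs -put events.log /data/logs/
hdfs dfs -ls /data/logs

# Merge the files under an HDFS directory into one local file.
hdfs dfs -getmerge /data/logs merged.log

# Pack a directory of small files into a Hadoop archive (HAR).
hadoop archive -archiveName logs.har -p /data/logs /archives
```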
Several use cases, such as recommendation systems, social network analysis (graph data), and sentiment analysis, have been implemented successfully on Hadoop.
With Hive, things get much simpler: just point tables at delimited data on HDFS and use it like a database or data warehouse.
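As a sketch of that idea, the following HiveQL defines an external table over comma-delimited files already sitting on HDFS; the table name, columns, and path are hypothetical:

```sql
-- External table: Hive reads the files in place, nothing is copied.
CREATE EXTERNAL TABLE page_views (
  user_id  STRING,
  url      STRING,
  view_ts  STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/data/page_views';

-- Then query it like any database table.
SELECT url, COUNT(*) AS hits
FROM page_views
GROUP BY url;
```

Because the table is EXTERNAL, dropping it removes only the metadata; the underlying HDFS files stay where they are.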
Tools like Sqoop and Flume make it easy to move streaming data and legacy RDBMS data onto HDFS.
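A typical Sqoop import of an RDBMS table into HDFS looks roughly like this; the JDBC connection string, username, table, and target directory are all placeholders for illustration:

```
# Pull the "orders" table from MySQL into HDFS (-P prompts for a password).
sqoop import \
  --connect jdbc:mysql://dbhost/sales \
  --username etl_user -P \
  --table orders \
  --target-dir /data/orders
```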
Frankly, I do not see many drawbacks to using Hadoop — it's simply great. But there are some use cases it is not suitable for:
1. Transactional (OLTP) workloads
2. Small datasets, or structured data that fits comfortably in an RDBMS
Check for yourself:
1. Data size — do you have terabytes to petabytes of data?
2. Latency — how long can you wait? Hadoop is not an instant-query tool.
3. What data growth do you expect?
4. Can you manage without any real-time operations?
5. What percentage of your data is structured? The lower, the better.
If you answered yes to most of these questions, Hadoop is recommended for:
1. Building data pipelines.
2. Generating reports on very large datasets.
3. Processing huge datasets and extracting results.
4. Data analytics.