HBase, a Distributed, Scalable, Big Data Store
HBase is a non-relational (NoSQL) database written in Java and modeled after Google’s Bigtable. It began as part of the Apache Hadoop project and provides Hadoop with Bigtable-like capabilities, running on top of the Hadoop Distributed File System (HDFS). HBase offers a fault-tolerant way to store large amounts of sparse data and allows real-time read/write access to it. In terms of storage, it is a column-oriented database (more precisely, columns are grouped into column families), and every table is kept sorted by row key, which makes it well suited to very large tables. It is often described as suitable for Online Analytical Processing (OLAP), although its core strength is fast random access by row key rather than ad hoc analytical queries.
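To make "sparse" and "sorted by row key" more concrete, here is a minimal in-memory sketch in Python. It is a toy model, not the HBase client API: the class `SparseSortedTable`, its methods, and the sample row keys and column names are all hypothetical and exist only for illustration.

```python
import bisect


class SparseSortedTable:
    """Toy model of a sparse table kept sorted by row key.

    NOT the HBase API; every name here is an illustrative assumption.
    """

    def __init__(self):
        self._row_keys = []   # row keys, kept in sorted order
        self._cells = {}      # row key -> {"family:qualifier": value}

    def put(self, row_key, column, value):
        # Only cells that are actually written take up space (sparse storage).
        if row_key not in self._cells:
            bisect.insort(self._row_keys, row_key)
            self._cells[row_key] = {}
        self._cells[row_key][column] = value

    def get(self, row_key, column):
        # An absent cell is simply missing; no placeholder value is stored.
        return self._cells.get(row_key, {}).get(column)

    def scan(self, start_row, stop_row):
        # Range scan over [start_row, stop_row); cheap because keys are sorted.
        lo = bisect.bisect_left(self._row_keys, start_row)
        hi = bisect.bisect_left(self._row_keys, stop_row)
        return [(key, dict(self._cells[key])) for key in self._row_keys[lo:hi]]


table = SparseSortedTable()
table.put("user#0042", "info:name", "Ada")
table.put("user#0007", "info:name", "Grace")
table.put("user#0007", "stats:logins", 3)
table.put("video#0001", "info:title", "Intro")

# All rows whose key starts with "user#" ('$' sorts directly after '#').
user_rows = table.scan("user#", "user$")
```

Because rows are ordered by key, rows that share a key prefix sit next to each other, so a prefix scan touches only a contiguous slice of the table. This is why row key design matters so much in a store of this kind.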
Technically speaking, HBase is a “data store” rather than a database, because it lacks many features that are common in an RDBMS, such as an advanced query language, triggers, secondary indexes, and typed columns. HBase differs from traditional relational databases and does not natively support SQL. Instead, applications are typically written in Java against the client API, much as a MapReduce application would be.
Apache HBase has the following key features:
- It supports automatic failover between RegionServers.
- It provides strongly consistent reads and writes, which makes it well suited to tasks such as high-speed counter aggregation.
- It features automatic sharding of HBase tables as the amount of data grows.
- It is linearly and modularly scalable. HBase clusters can be expanded by adding RegionServers hosted on commodity-class servers.
- It provides an easy-to-use Java API for client access.
- It supports Thrift and REST APIs for non-Java front ends.
- It integrates with Hadoop MapReduce as both a source and a destination for jobs.
- It uses Bloom filters and a block cache to speed up high-volume queries.
- It allows replication of data across clusters.
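The automatic sharding mentioned in the list above can be sketched as follows. In HBase, a table is divided into regions, each covering a contiguous range of row keys, and a region is split once it grows past a size threshold. The Python model below is a simplified illustration under stated assumptions: it counts rows instead of measuring bytes, splits at the median key, and does not model assigning regions to RegionServers. The names `Region`, `ShardedTable`, and `max_rows_per_region` are hypothetical.

```python
import bisect


class Region:
    """A contiguous slice of the row-key space (toy model, not HBase code)."""

    def __init__(self, start_key, rows=None):
        self.start_key = start_key   # inclusive lower bound of this region
        self.rows = rows or {}       # row key -> value


class ShardedTable:
    def __init__(self, max_rows_per_region=4):
        # Hypothetical stand-in for a size-based split threshold.
        self.max_rows = max_rows_per_region
        self.regions = [Region("")]  # one region initially covers all keys

    def _region_index(self, row_key):
        # Pick the last region whose start key is <= row_key.
        starts = [region.start_key for region in self.regions]
        return bisect.bisect_right(starts, row_key) - 1

    def put(self, row_key, value):
        idx = self._region_index(row_key)
        region = self.regions[idx]
        region.rows[row_key] = value
        if len(region.rows) > self.max_rows:
            self._split(idx)

    def _split(self, idx):
        # Split at the median key: the lower half stays in the existing
        # region, the upper half moves to a new region placed right after it.
        region = self.regions[idx]
        keys = sorted(region.rows)
        mid_key = keys[len(keys) // 2]
        upper = {k: v for k, v in region.rows.items() if k >= mid_key}
        region.rows = {k: v for k, v in region.rows.items() if k < mid_key}
        self.regions.insert(idx + 1, Region(mid_key, upper))

    def get(self, row_key):
        return self.regions[self._region_index(row_key)].rows.get(row_key)


table = ShardedTable(max_rows_per_region=4)
for i in range(10):
    table.put(f"row{i:03d}", i)

region_starts = [region.start_key for region in table.regions]
```

With these ten sequential keys and a threshold of four rows, the toy table ends up with four regions, and every lookup still needs to consult only the one region whose key range contains the requested row. Spreading such regions across more servers is what lets a cluster grow by adding machines.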
HBase is a good fit for write-heavy applications and for workloads that need fast, random, real-time access to Big Data. Enterprises use its low-latency storage for tasks that require real-time analysis of tabular data for different end-user applications. Companies such as Yahoo, Adobe, Twitter, and Facebook use HBase internally. It is also used by data-driven services including Netflix, Facebook's Messaging Platform, HubSpot, Meetup, Airbnb, Salesforce.com, and more.