Apache Spark, HBase

Multiple WAL in Apache HBase 1.3 and performance enhancements!!!

Apache HBase 1.3.0 was released in mid-January 2017 and ships with support for date-based tiered compaction, along with improvements in several areas, including the write-ahead log (WAL) and a new RPC scheduler. The release includes almost 1,700 resolved issues in total. Below are some highlights of the enhancements made in HBase 1.3.0: The date-based tiered compaction support shipped in HBase 1.3.0 is beneficial where data is infrequently deleted or updated and recent data is scanned more often than older data. Record time-to-live (TTL) can be easily enforced with this new compaction strategy. Improved multiple WAL support in Apache HBase […]

Hadoop, HBase, Hive

JRuby code to purge data from a Hive-over-HBase table…

Problem to solve: How to delete, update, or query values stored in binary format in a column of an HBase column family. For a Hive-over-HBase table, where we can't use the standard API and are unable to apply filters on binary values, you can use the solution below to do this programmatically. Find the JRuby source code on GitHub at github.com/mkjmkumar/JRuby_HBase_API. This program is written in JRuby; it purges data using the HBase shell, deleting the required data by applying a filter on a given binary column. So you have already heard about the many advantages of storing data in HBase (especially in binary block format) and creating a Hive table on top of it to query your data. I am not going to explain the use case for this, why […]

Database, HBase, Tephra

Tephra is an open-source project that adds complete transaction support to Apache HBase…

Transaction support in HBase? Yes, a wide range of use cases require transaction support. Firstly, we want the client to have great insight into, and fine-grained control of, what the transaction system can do. Having full control on the client side not only allows you to make the best decisions when optimizing for specific use cases, but also makes integration with third-party systems simpler. Secondly, when different types of components in your application share data and update it in multiple data stores in many different ways (as Hadoop applications do), it is important for the transaction system to support you. Thirdly, […]

Hadoop, HDFS, Hive

HBase Replication and comparison with popular online backup programs…

Short Description: HBase replication can address cluster security, data security, read/write separation, and operations. Article: This article is the first in a series of three; the upcoming articles will cover some code and the mechanisms in the latest version of HBase that support replication. HBase Replication: An HBase replication solution can address cluster security, data security, read/write separation, operation and maintenance, and user operation errors, and its ease of management and configuration provides powerful support for online applications. HBase replication is currently rarely used in the industry, for many reasons, such […]