Hadoop, HDFS

Heterogeneous Storage in HDFS (Part-1)…

An introduction to heterogeneous storage types and the flexible configuration of heterogeneous storage. Heterogeneous Storage in HDFS: Hadoop version 2.6.0 introduced a new feature, heterogeneous storage. Heterogeneous storage lets each storage medium play to its own strengths according to its read and write characteristics. This is very suitable for cold data storage. Cold data calls for storage with large capacity where high read and write performance is not required, such as the most common disk; for hot data, SSD can be used instead. On the other hand, when we require […]

Cloudera, encryption, HDFS, Security

A Step-by-Step Guide to HDFS Data Protection Solution for Your Organization on Cloudera CDH

An enterprise-ready encryption solution should provide the following. Comprehensive encryption: an offering that protects data wherever it resides, including structured and unstructured data at rest and data in motion. HDFS Encryption implements transparent, end-to-end encryption of data read from and written to HDFS, without requiring changes to application code. Centralized encryption and key management: a centralized solution enables you to protect and manage both the data and the keys. Secure the data by encrypting or tokenizing it, while controlling access to the protected data. This guide walks you through enabling HDFS encryption on your cluster, using the default Java KeyStore KMS. If […]

Hadoop, HDFS

HDFS is really not designed for many small files!!!

A few of my friends who are new to Hadoop frequently ask what a good file size is for Hadoop and how to decide on it. Obviously, files should not be small; the file size should be in line with the block size. HDFS is really not designed for many small files. For each file, the client has to talk to the namenode, which gives it the location(s) of the block(s) of the file, and then the client streams the data from the datanode. Now, in the best case, the client does this once, and then finds that it is the machine with […]
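As a rough illustration of the point in this excerpt, the sketch below compares reading the same 10 GiB stored as many small files versus one large file. It is a simplified model, not HDFS client code: it assumes one namenode lookup per file and a 128 MiB block size (a common default, but an assumption here), and the function names are made up for this example.

```python
# Assumed block size: 128 MiB. Adjust to match your cluster's dfs.blocksize.
BLOCK_SIZE = 128 * 1024 * 1024
MIB = 1024 * 1024


def blocks_for_file(size_bytes, block_size=BLOCK_SIZE):
    """Number of blocks a file of the given size is split into."""
    # Integer ceiling division: a partly filled last block still counts.
    return (size_bytes + block_size - 1) // block_size


def read_cost(file_sizes, block_size=BLOCK_SIZE):
    """Return (namenode_lookups, total_blocks) for reading every file once.

    Simplified model: the client asks the namenode for block locations
    once per file, then streams each block from a datanode.
    """
    lookups = len(file_sizes)
    blocks = sum(blocks_for_file(size, block_size) for size in file_sizes)
    return lookups, blocks


many_small = [1 * MIB] * 10_240  # 10 GiB stored as 10,240 files of 1 MiB
one_large = [10_240 * MIB]       # the same 10 GiB stored as a single file

print(read_cost(many_small))  # (10240, 10240)
print(read_cost(one_large))   # (1, 80)
```

Under these assumptions the small-file layout needs 10,240 namenode lookups and 10,240 block reads, while the single large file needs one lookup and 80 block reads for the same amount of data.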

Hadoop, HDFS, Hive

HBase Replication and comparison with popular online backup programs…

Short Description: HBase Replication: The HBase Replication solution can address cluster security, data security, read and write separation, and operations. Article: This article is the first in a series of three; the coming articles will cover some code and the mechanisms in the latest version of HBase that support HBase Replication. HBase Replication: The HBase Replication solution can address cluster security, data security, read and write separation, operation and maintenance, and guest operating errors, and with its ease of management and configuration it provides powerful support for online applications. HBase replication is currently rarely used in the industry, for many reasons, such […]