Recent Blogs

GoLang ORM – GORM...

In my blogs, I am creating a Kubernetes client using GoLang to add custom features that are not present in vanilla Kubernetes installations or Kubernetes flavours such as EKS or AKS. To achieve the end goal, it is required to interact with a database using Go. In this blog I'll showcase how we can use Go-ORM, or GORM, to interact with a sqlite3/Postgres database in a simple manner. ORMs, or Object-Relational Mappers…

Error syncing load balancer: failed to ensure load balancer: could not find any suitable s...

This is very annoying when you can't remember an issue you faced in the past. I completely forgot where the annotation needs to be set to get rid of this issue. And as always, the AWS documentation is super confusing when we need something that is important. We can get better results outside the AWS documentation, and I don't remember the AWS documentation solving any problem for me the last time. But anyway…

Keycloak: Centralized Authorization

In this blog, I am going to explain how we can implement a centralized authorization strategy for all our endpoints using the Keycloak tool. All references and links have been taken from the official Keycloak documentation, which can be accessed from the Keycloak website. Before seeing how to configure the Keycloak clients to be able to implement this authorization strategy, we are going to name a…

Making an automatic ON Switch using only a relay and a charger!!!

Hi everyone, I will be showing you how to make a switch that automatically switches on when there is a power outage, if you have a backup battery power supply in your home (you should read this if you have frequent power outages). Application: I have been having many power outages in my place, and it is frustrating when there is a power outage and the AC goes off and I need to manually switch on the fan, that…

Keycloak: A Brief Introduction to OAuth 2.0, OpenID Connect, SAML 2.0, and JWT (Part 1)

In this article we are going to get a brief introduction to the standards that enable us to integrate web applications securely and easily with Keycloak. This blog will give you a gentle introduction without going too much into detail. Even if you are new to these standards, you may still want to skim through them. Authorizing application access with OAuth 2.0: OAuth 2.0 is by now a massively popular…

ARM64-based Graviton worker nodes in EKS and running a Postgres cluster using a StatefulSet

I am writing this blog as a series, as covering everything in one blog is difficult. In this blog we will see some encouragement to move to ARM64, along with a proof-of-concept (PoC) of running a Postgres cluster on ARM-based Graviton EC2 instances in the AWS cloud. We will proceed through running multi-architecture Docker images to leverage the latest AWS Graviton2 processors. The AWS projected performance and…

Kubernetes Keycloak: Add admin console URL

After creating the Keycloak application on AWS EKS, it seems some additional steps are required to get the Admin Console. Before discussing the issue, let me give you below the links in order to create and run the Keycloak application. I am using the steps defined in the codecentric Helm repo below for installing the application. Download the values.yaml and make the changes related to ingress. I'll update the values…

unsupported Kubernetes version (Service: Eks, Status Code: 400, Request ID) but its actual...

Firstly, sorry for the misleading error code, as this forced even me to scratch my head and spend a few hours figuring out the confusion, with CloudFormation always leading me to the error below. I found it surprising that there wasn't a single Google result revealing the actual cause of this error, and by the hit-and-trial method I have come up with a solution. The below document does not…

Unable to create ArgoCD application, Error: manifest does not contain a layer with mediaty...

We are utilizing ArgoCD for GitOps, and everything works fine: the application gets created with Docker images stored in the ECR repository. But when we create an application using an OCI-enabled Helm ECR repository in ArgoCD, we face issues. The version of ArgoCD is v2.0.5. This is the command that I use to create an application in ArgoCD using a Helm chart stored in the OCI registry as a…

Helm Push/Pull Error: scheme OCI not supported

There are a few issues reported by developers when storing Helm charts in AWS ECR. Common errors occur when they are using an old version of the Helm client. Below is one of the many scenarios; the others can be solved by just updating Helm. In case you would like to know what OCI is and how to install and add charts into the ECR repo, please visit my other blog, Helm OCI based charts into…

Helm OCI based charts into AWS ECR and the OCI feature

We are using AWS ECR for storing our application images, and AWS ECR's support for Open Container Initiative (OCI) artifacts for Helm has greatly boosted the development effort on our Helm charts. This article will show you how to create, push, and pull a Helm chart into the AWS ECR repository. I will also showcase, if you face any error related to OCI, what necessary steps you need to take in order to resolve…

pgAdmin hosting on Kubernetes and design advantages

pgAdmin 4 is a free, open-source graphical management tool for PostgreSQL. This article will help you understand the advantages of hosting a pgAdmin application as a web application in your cloud environment, and the steps involved in installing pgAdmin on Kubernetes using a Helm chart. For documentation on pgAdmin, visit the official website. My web application developers required access to the…

How to avoid Helm warning due to config permissions

When a developer is working on Kubernetes, he/she generally forgets to secure the Kubernetes config file, which contains the cluster tokens of environments and can be sensitive. So if you are seeing the below warning from Helm, you are one of them: WARNING: Kubernetes configuration file is group-readable. This is insecure. Location: /home/mukesh/.kube/config WARNING: Kubernetes configuration fil…
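Not from the original excerpt, but as a quick illustration: the warning is about the group/other permission bits on the kubeconfig file, and a minimal Python sketch (assuming the default ~/.kube/config path) that tightens them could look like this:

    import os
    import stat

    # Restrict kubeconfig to owner read/write only, which is what
    # the Helm warning asks for (shell equivalent: chmod 600 ~/.kube/config).
    cfg = os.path.expanduser("~/.kube/config")
    mode = stat.S_IMODE(os.stat(cfg).st_mode)
    if mode & (stat.S_IRWXG | stat.S_IRWXO):
        os.chmod(cfg, 0o600)
        print(f"Tightened {cfg} from {oct(mode)} to 0o600")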

GitOps using ArgoCD with Azure Git repository

To learn more about ArgoCD, you can visit the documentation at https://argo-cd.readthedocs.io/en/stable/. Let us start by generating the PAT for the Azure Git repository; please follow the steps below. A personal access token (PAT) is used as an alternate password to authenticate into Azure DevOps. Learn how to create, use, modify, and revoke PATs for Azure DevOps. Sign in to your organization…

Encouraging you to switch to Jupyter Lab...

Notebooks are great for prototyping and for longer pipelines or processes. If you are a user of PyCharm or Jupyter Notebook and an exploratory data scientist, I would encourage you to switch to Jupyter Lab. For Jupyter Lab installation steps, go here. Below are some of the advantages that I see in using Jupyter Lab over Jupyter Notebook: the new terminal opens in a tab view, compared…

Python Lists and Lambda Learning…

Use a list as a stack: with these easy-to-use methods, we can utilize lists as stacks. The last member to join the stack is the first member to be fetched (the last-in, first-out rule). To append a member to the top of the stack, you can use append(). To remove a member from the top of the stack, you can use pop() (note: you do not need to include parameters). Examples are as follows:
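The example itself was cut off in this listing; a minimal reconstruction of the append()/pop() usage described above:

    # Using a Python list as a stack (last-in, first-out).
    stack = []
    stack.append(3)    # push 3
    stack.append(7)    # push 7
    stack.append(9)    # push 9
    top = stack.pop()  # removes and returns 9, the last member pushed
    print(top)         # 9
    print(stack)       # [3, 7]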

Apache Spark RDD API using PySpark…

In my previous article, I used Scala to show the usability of the Spark RDD API. Many of us utilize PySpark to work with RDDs and lambda functions. Though the function names and the output are the same as in Scala, the PySpark syntax for RDD operations is different. I'll explain PySpark RDDs here using a different approach and with a different perspective to solve the problem. Let us consider we are st…
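Not part of the original excerpt; a minimal sketch (assuming a local SparkContext) of the lambda-driven RDD style the article describes:

    from pyspark import SparkContext

    sc = SparkContext("local", "rdd-demo")
    rdd = sc.parallelize([1, 2, 3, 4, 5])
    # Lambda functions drive the RDD transformations.
    squares = rdd.map(lambda x: x * x)
    evens = squares.filter(lambda x: x % 2 == 0)
    print(evens.collect())  # [4, 16]
    sc.stop()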

How to convert Python lists, tuples, and strings to each other…

Lists, tuples, and strings are three built-in sequence types in Python. The built-in functions str(), tuple(), and list() convert between them, as in the following example:
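The example was truncated in this listing; a minimal reconstruction of the conversions meant here (note that turning a list of characters back into a plain string uses "".join(), since str() would return the printable representation):

    s = "abc"
    t = tuple(s)     # ('a', 'b', 'c')  string -> tuple
    l = list(s)      # ['a', 'b', 'c']  string -> list
    t2 = tuple(l)    # ('a', 'b', 'c')  list   -> tuple
    l2 = list(t)     # ['a', 'b', 'c']  tuple  -> list
    s2 = "".join(l)  # 'abc'            list of chars -> string
    print(t, l, t2, l2, s2)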

Cloud Databases and Cloud Blob...

Cloud computing is the next stage in the evolution of the Internet. The cloud in cloud computing provides the means through which everything, from computing power and computing infrastructure to applications, business processes, and personal collaboration, can be delivered to you as a service wherever and whenever you need it. Cloud databases are web-based services designed for running queries on structured…

A Step-by-Step Guide to HDFS Data Protection Solution for Your Organization on Cloudera CH...

An enterprise-ready encryption solution should provide comprehensive encryption of data wherever it resides, including structured and unstructured data at rest and data in motion. HDFS encryption implements transparent, end-to-end encryption of data read from and written to HDFS, without requiring changes to application code. Centralized encryption and key management: a centralized…

Past and Future of Apache Kylin!!!

Apache Kylin origin: In today's era of big data, Hadoop has become the de facto standard, and a large number of tools have been built around the Hadoop platform, one after another, to address the needs of different scenarios. For example, Hadoop Hive is a data warehouse tool: data files stored on the HDFS distributed file system can be mapped to a database table, and it provides SQL queries. Hive's execution engine…

PG-Storm: Let PostgreSQL run faster on the GPU

The PostgreSQL extension PG-Storm allows users to customize the data scan and run queries faster. CPU-intensive workloads are identified and transferred to the GPU to take advantage of the GPU's powerful parallel execution ability to complete the data task. Compared with the small number of cores and the RAM bandwidth of a CPU, the GPU has a unique advantage: GPUs typically have hundreds of processor cores…

Tephra is an open-source project that adds complete transaction support to Apache HBase…

Transaction support in HBase? Yes, a wide range of use cases require transaction support. Firstly, we want the client to have great insight into, and fine-grained control over, what the transaction system can do. Having full control on the client side not only allows you to make the best decisions for optimizing specific use cases, but it also makes integration with third-party systems simpler. Secondly…

Hive Naming conventions and database naming…

Short Description: Naming conventions help programmers and architects understand what is going on inside a business. Article: I have worked with almost 20 to 25 applications. Whenever I start working, I first have to understand each application's naming convention, and I keep thinking: why do we all not follow a single naming convention? As Hadoop is evolving rapidly, I would like to share…

HBase Replication and comparison with popular online backup programs…

Short Description: The HBase replication solution can solve cluster security, data security, read/write separation, and operation. Article: This article is the first in a series of three; the coming articles will include some code and the mechanisms present in the latest version of HBase that support HBase replication. HBase Replication: The HBase replication solution can solve cluster security…

Apache Shiro's design is intuitive and a simple way to ensure the safety of the application…

Short Description: Apache Shiro's design goals are to simplify application security by being intuitive and easy to use… Article: Apache Shiro's design is an intuitive and simple way to ensure the safety of the application. Software design is generally based on user stories; that is, the user interface or service API is designed based on how users interact with the system. For example, a user…

Heterogeneous Storage in HDFS (Part 1)…

An introduction to heterogeneous storage types and the flexible configuration of heterogeneous storage! Heterogeneous Storage in HDFS: Hadoop version 2.6.0 introduced a new feature, heterogeneous storage. Heterogeneous storage can play to the respective advantages of each storage medium according to its read and write characteristics. This is very suitable for cold storage of data.

The ACID properties and the CAP theorem are two concepts in data management for distributed...

Started working on HBase again!! Thought, why not refresh a few concepts before proceeding to the actual work? Important things that come to mind when we work with NoSQL in a distributed environment are sharding and partitions. Let's dive into the ACID properties of databases and the CAP theorem for distributed systems. The ACID properties and the CAP theorem are two concepts in data management for distributed system…

Coding Tips and Best Practices in Hive and Oozie…

Many times during code reviews I have found some common mistakes made by developers. Here are a few of them… Workflow mandatory item: add this property in all workflows that have a Hive action. This property will make sure that the Hive job runs with the necessary number of reducers instead of just 1. HQL items: keep the set properties in the HQL to a minimum. Let it take the default…

HPL/SQL: Make SQL-on-Hadoop More Dynamic

Think about the old days when we solved many business problems using dynamic SQL, exception handling, flow-of-control, and iterations. When I worked with a couple of migration projects, I found a few business rules that needed to be transformed to be Hive compatible (some of them very complex and nearly impossible). The solution is HPL/SQL (formerly PL/HQL), a language translation and execution layer developed…

Best Practices for Hive Authorization when using a connector to HiveServer2

Recently we have been working with Presto and configuring the Hive connector for it. It got connected successfully with the steps given at prestodb.io/docs/current/connector/hive.html. An overview of our architecture: Presto is running on a different machine (the Presto machine) and uses the Hive connector to communicate with the Hadoop cluster, which is running on different machines. The Presto machine has hive.prop…

HDFS is really not designed for many small files!!!

A few of my friends who are new to Hadoop frequently ask what a good file size is for Hadoop and how to decide on file size. Obviously it should not be small, and the file size should be in line with the block size. HDFS is really not designed for many small files. For each file, the client has to talk to the namenode, which gives it the location(s) of the block(s) of the file, and then the client streams the data…
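As a rough illustration (not from the original excerpt): each file and block object is commonly estimated to occupy on the order of 150 bytes of namenode heap, so many small files add up quickly:

    # Back-of-the-envelope namenode memory estimate.
    # Assumption: ~150 bytes of namenode heap per file/block object,
    # a commonly cited approximation rather than an exact figure.
    BYTES_PER_OBJECT = 150
    num_files = 10_000_000   # ten million small files
    objects = num_files * 2  # one file object + one block object each
    heap_gib = objects * BYTES_PER_OBJECT / 1024**3
    print(f"~{heap_gib:.1f} GiB of namenode heap")  # ~2.8 GiB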

Kafka: A detailed introduction

I'll cover Kafka in detail, with an introduction to programmability, and will try to cover almost the full architecture of it. So here it goes: we need Kafka when there is a need to build a real-time processing system, as Kafka is a high-performance publisher-subscriber-based messaging system with highly scalable properties. Traditional systems are unable to process this large data and are mainly used for offline…
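To make the publisher-subscriber idea concrete, here is a minimal sketch using the third-party kafka-python package; the broker address and topic name are illustrative assumptions, not details from the original post:

    from kafka import KafkaProducer, KafkaConsumer

    # Publisher: send one message to a topic.
    producer = KafkaProducer(bootstrap_servers="localhost:9092")
    producer.send("events", b"sensor-reading-42")
    producer.flush()

    # Subscriber: read messages from the same topic.
    consumer = KafkaConsumer("events",
                             bootstrap_servers="localhost:9092",
                             auto_offset_reset="earliest")
    for record in consumer:
        print(record.value)  # b'sensor-reading-42'
        break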

Out of the Box (Why Women Live Longer than Men)

The fact is that men enjoy life more, but in the end the winners are women, because they always get extra bits of years (these bits are sometimes in GBs of ten years of extra life compared to men). I am not a subject matter expert, but some questions around me led me to dig more and find a few possible connections to life. Hope you'll enjoy this article and gain more understanding of life (after all, we all have one). Below…

Introduction to Spark

Introduction to Apache Spark: Spark, as a unified stack and computational engine, is responsible for scheduling, distributing, and monitoring applications consisting of many computational tasks across many worker machines. Eventually, big data experts around the world derived specialized systems on top of Hadoop to solve certain problems like graph processing and the implementation of efficient…

Performance utilities in Hive

Before taking you into the details of the utilities provided by Hive, let me explain a few components to show the execution flow and where the related information is stored in the system. Hive is data warehouse software best suited for OLAP (OnLine Analytical Processing) workloads, handling and querying over vast volumes of data residing in distributed storage. The Hadoop Distributed File System (HDFS) is the ecosystem…

Data Analysis Approach to a successful outcome...

I have done data analysis for one of my projects using the approach below, and hopefully it may help you understand the underlying subject. Soon I'll post my project on data analysis with a detailed description of the technology used: Python (web scraping for data collection), Hadoop, Spark, and R. Data analysis is a highly iterative and non-linear process, better reflected by a series of cyclic processes, in which information…

Why and when we need Machine Learning...

I have been in data management/data quality for several years. When I ask some people what their data management processes are, they simply reply, "Well, we have some of our data stored in a database and other data stored on file shares with proper permissions." This isn't data management… it's data storage. If you and/or your organization don't have good, clean data, you are most definitely not ready for machine learning…

You have reached the end of this page, but there are many more blogs. Click on read more to go to all blogs.