The ACID properties and the CAP theorem are two concepts in data management to distributed system.

Started working on HBase again!! Thought why not refresh few concepts before proceeding to actual work. Important things comes into mind when we work with NoSQL is distributed environment are sharding and partitions.  Let’s dive into ACID properties of database and CAP theorem for distributed system. The ACID properties and the CAP theorem are two… Continue reading The ACID properties and the CAP theorem are two concepts in data management to distributed system.

Published

Data Analysis Approach to a successful outcome

have done data analysis for one of my project using below approach and hopefully it may help you understand underlying subject.Data analysis is a highly iterative and non-linear process, better reflected by a series of cyclic process, in which information is learned at each step, which then informs w

Published

Encourage you to switch to Jupyter Lab…

Notebooks are great for prototyping, longer pipelines or processes. If you are a user of PyCharm or Jupyter Notebook and an exploratory data scientist, I would encourage you to switch you to Jupyter Lab. For Jupyter Lab installation steps go here Below are some of the advantages that I see using Jupyter Lab over Jupyter Noteb

Published

Why and when we need Machine Learning…

I’m into the data management/data quality from several years. When I ask some people what is data management processes they simply reply, “well, we have some of our data stored in a database and other data stored on file shares with proper permissions.” This isn’t data management…it’s data storage. If you and/or your organization don’t have good, clean data, you are most definitely not ready for machine learning. Data management should be your first step before diving into any other data project(s).

Published