
BLOGS
CatBoost vs. Light GBM vs. XGBoost
March 13, 2018
Evaluating Metrics for Machine Learning Models — Part 1
May 02, 2018
Well, in this post, I will be discussing the usefulness of each error metric depending on the objective and the problem we are trying to solve. When someone tells you that “USA is the best country”, the first question that you should ask is on what basis is this statement being made. Are we judging each country on the basis of their economic status, or their health facilities etc.? Similarly each machine learning model is trying to solve a problem with a different objective using a different dataset and hence, it is important to understand the context before choosing a metric
Evaluating Metrics for Machine Learning Models — Part 2
May 02, 2018
In the first blog, we discussed some important metrics used in regression, their pros and cons, and use cases. This part will focus on commonly used metrics in classification, why should we prefer some over others with context.
How to Handle Missing Data
January 30, 2018
One of the most common problems I have faced in Data Cleaning/Exploratory Analysis is handling the missing values. Firstly, understand that there is NO good way to deal with missing data. I have come across different solutions for data imputation depending on the kind of problem — Time series Analysis, ML, Regression etc. and it is difficult to provide a general solution. In this blog, I am attempting to summarize the most commonly used methods and trying to find a structural solution.