In the last blog post of this series, we discussed classifiers: their categories and how they are evaluated. We have also discussed regression models in depth. In this post, we delve a little deeper into how regression models can be used for classification tasks. Logistic Regression is a widely used regression […]

# Data Science Simplified Part 9: Interactions and Limitations of Regression Models

The last few blog posts of this series discussed regression models at length. Fernando has built a multivariate regression model. The model takes the following shape: price = -55089.98 + 87.34 engineSize + 60.93 horse power + 770.42 width. The model predicts or estimates price (target) as a function of engine size, horse power, […]
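As a minimal sketch, the model equation quoted in this excerpt can be written as a small Python function. The function name `predict_price` and its parameter names are hypothetical, chosen here for illustration; only the coefficients come from the excerpt. The excerpt does not state the units of the inputs, so the values in the usage line are placeholders, not real car data.

```python
# Coefficients copied from the model equation in the excerpt.
INTERCEPT = -55089.98
COEF_ENGINE_SIZE = 87.34
COEF_HORSE_POWER = 60.93
COEF_WIDTH = 770.42


def predict_price(engine_size: float, horse_power: float, width: float) -> float:
    """Evaluate the linear model: intercept plus one weighted term per input.

    Hypothetical helper; the input units are an assumption left to the caller.
    """
    return (
        INTERCEPT
        + COEF_ENGINE_SIZE * engine_size
        + COEF_HORSE_POWER * horse_power
        + COEF_WIDTH * width
    )


if __name__ == "__main__":
    # Placeholder inputs, for illustration only.
    print(predict_price(engine_size=100.0, horse_power=100.0, width=60.0))
```

Each coefficient is the estimated change in price for a one-unit increase in that input while the other two are held fixed, which is why the function is a plain weighted sum.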

# Data Science Simplified Part 8: Qualitative Variables in Regression Models

The last few blog posts of this series discussed regression models. Fernando has selected the best model. He has built a multivariate regression model. The model takes the following shape: price = -55089.98 + 87.34 engineSize + 60.93 horse power + 770.42 width. The model predicts or estimates price (target) as a function of engine […]

# Data Science Simplified Part 7: Log-Log Regression Models

In the last few blog posts of this series, we discussed the simple linear regression model. We also discussed the multivariate regression model and methods for selecting the right model. This article will address that question. It elaborates on Log-Log regression models.

# Data Science Simplified Part 6: Model Selection Methods

In the last article of this series, we discussed the multivariate linear regression model. Fernando created a model that estimates the price of the car based on five input parameters. Fernando indeed has a better model. Yet, he wants to select the best set of variables for input. This article will elaborate on model selection […]

# Data Science Simplified Part 2: Key Concepts of Statistical Learning

In the first article of this series, I touched upon key concepts and processes of Data Science. In this article, I will dive in a bit deeper. First, I will define what statistical learning is. Then, we will dive into key concepts in statistical learning. Believe me; it is simple. As per Wikipedia, Statistical […]

# Data Science Simplified Part 5: Multivariate Regression Models

In the last article of this series, we discussed the story of Fernando, a data scientist who wants to buy a car. He uses a Simple Linear Regression model to estimate the price of the car. The regression model created by Fernando predicts price based on the engine size. One dependent variable predicted using one independent […]

# Data Science Simplified Part 4: Simple Linear Regression Models

In the previous posts of this series, we discussed the concepts of statistical learning and hypothesis testing. In this article, we dive into linear regression models. Before we dive in, let us recall some important aspects of statistical learning. Independent and Dependent variables: In the context of Statistical learning, there are two types of data: […]

# Data Science Simplified Part 3: Hypothesis Testing

Edward Teller, the famous Hungarian-American physicist, once said: “A fact is a simple statement that everyone believes. It is innocent, unless found guilty. A hypothesis is a novel suggestion that no one wants to believe. It is guilty, until found effective.” Application of hypothesis testing is predominant in Data Science. It is imperative to simplify […]

# Data Science Simplified Part 1: Principles and Process

In 2006, Clive Humby, UK mathematician and architect of Tesco’s Clubcard, coined the phrase “Data is the new oil.” He said the following: “Data is the new oil. It’s valuable, but if unrefined it cannot be used. It has to be changed into gas, plastic, chemicals, etc. to create a valuable entity that drives profitable […]