Concept of Decision Tree Classification | Machine Learning | Data Science | Mathematics

Decision Tree Algorithm for Classification

The Decision Tree algorithm is one of the most popular and widely used algorithms in machine learning. It is a supervised learning algorithm that can be used for both classification and regression.

Photo by Fabrice Villard on Unsplash


First, let's see how it works.

A simple decision tree example


So, now that we have a basic idea of the decision tree, let's go deeper.

Impurity

Impurity is a measure of how mixed the classes in your data are: the less homogeneous the data at a node, the higher its impurity.

Image Source: Research Gate

There are several measures of impurity; we will learn these two:

1. Entropy:

Entropy is nothing but the randomness in your dataset, and more randomness means less predictability. It is directly proportional to the non-homogeneity in your dataset, and we use it to measure the purity of a split.

Use: We analyse the entropy at every node of the decision tree to determine its impurity. For a two-class problem it ranges from 0 to 1. If the entropy is exactly 1, the node is completely impure (the classes are evenly mixed). If the entropy at a node is 0, the node is pure, meaning it contains only homogeneous values.

The formula for Entropy:

Entropy(S) = − Σ pᵢ · log₂(pᵢ)

where pᵢ is the proportion of samples in S that belong to class i.
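As a quick illustration, here is a minimal Python sketch of this formula (the entropy helper and the use of NumPy are my own choices for illustration, not something from the original post):

```python
import numpy as np

def entropy(labels):
    """Entropy of a set of class labels, using log base 2."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()          # proportion of each class
    return -np.sum(p * np.log2(p))

print(entropy(["yes", "yes", "yes", "yes"]))  # 0 (pure node)
print(entropy(["yes", "yes", "no", "no"]))    # 1.0 (completely impure, two classes)
```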
2. Gini Impurity:

Gini Impurity is a measure of inequality in a sample. It ranges between 0 and 1, where 0 means the sample is completely pure. It works much the same way as entropy.

So why do we use it?
Basically, it takes less computation time than entropy (there is no logarithm to evaluate), which is why it is used more often, for example in random forests.

The formula for Gini Index:

Gini(S) = 1 − Σ pᵢ²

where pᵢ is again the proportion of samples in S that belong to class i.
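And a matching Python sketch for the Gini Index, again just an illustrative helper of my own:

```python
import numpy as np

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions. 0 = pure."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

print(gini(["yes", "yes", "yes", "yes"]))  # 0.0 (pure node)
print(gini(["yes", "yes", "no", "no"]))    # 0.5 (the maximum for two classes)
```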
Information Gain:

When we split our dataset into further branches, the data in each branch becomes more homogeneous, which means its entropy decreases. This decrease in entropy is called Information Gain.


Information Gain = Entropy(parent) − [weighted average] × Entropy(children)

Basically, when we split the dataset a problem arises: we can split it in many different ways, so we have to find the split that gives the best result. That is why we compute the information gain: it measures the difference in entropy between the parent node and its child nodes.
Information gain is simply the expected reduction in entropy caused by partitioning the examples according to a given attribute.

Finally, the higher the information gain, the better the split, and the more accurately the model can predict.
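Here is a minimal sketch of the formula above; the helper names are hypothetical, and the entropy function is the same one sketched earlier:

```python
import numpy as np

def entropy(labels):
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(parent, children):
    """Entropy(parent) minus the weighted average entropy of the children."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted

# Splitting a 50/50 parent into two pure children gains a full bit:
print(information_gain(["yes", "yes", "no", "no"],
                       [["yes", "yes"], ["no", "no"]]))  # 1.0
```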

A decision tree can solve both classification and regression problems.

Classification:

A decision tree solves it by applying a sequence of if-then conditions, and as a result it can classify your samples, as shown in the image above.

So let's understand how it works. Suppose you have this kind of data.


In this data, I have to decide which value of marks to split on so that I get the most information gain (that is, the biggest drop in entropy).
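To make this concrete, here is a sketch that scans candidate thresholds on a made-up marks column and keeps the one with the highest information gain. The data, column name, and helpers are hypothetical, since the original table is an image:

```python
import numpy as np

def entropy(labels):
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def best_split(values, labels):
    """Try thresholds (midpoints between adjacent sorted values)
    and return the one with the highest information gain."""
    values, labels = np.asarray(values), np.asarray(labels)
    order = np.argsort(values)
    values, labels = values[order], labels[order]
    n, parent_entropy = len(labels), entropy(labels)
    best_gain, best_threshold = -1.0, None
    for i in range(1, n):
        if values[i] == values[i - 1]:
            continue  # no threshold can fall between equal values
        threshold = (values[i] + values[i - 1]) / 2
        left, right = labels[:i], labels[i:]
        gain = parent_entropy - (len(left) / n * entropy(left)
                                 + len(right) / n * entropy(right))
        if gain > best_gain:
            best_gain, best_threshold = gain, threshold
    return float(best_threshold), float(best_gain)

# Hypothetical data: students' marks and whether they passed.
marks  = [35, 42, 55, 60, 72, 80]
passed = ["no", "no", "no", "yes", "yes", "yes"]
print(best_split(marks, passed))  # (57.5, 1.0)
```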

From this diagram, you can work out how to select the split. And that is pretty much the Decision Tree algorithm.
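To wrap up, here is a short scikit-learn sketch of the same idea. The toy marks data is hypothetical, but export_text is a real scikit-learn utility that prints the tree's learned if-then rules:

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical toy data: one feature (marks) and a pass/fail label.
X = [[35], [42], [55], [60], [72], [80]]
y = ["fail", "fail", "fail", "pass", "pass", "pass"]

clf = DecisionTreeClassifier(criterion="entropy").fit(X, y)

# export_text prints the learned if-then rules of the fitted tree.
print(export_text(clf, feature_names=["marks"]))
print(clf.predict([[65]]))  # ['pass']
```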



