Posts

Decision Tree Regression Full Concepts | Machine Learning | Data Science

Decision Tree Regression with Example | Machine Learning Algorithm | Data Science. Tree-based methods are simple and useful for interpretation, although they are usually not competitive with other supervised learning algorithms in terms of prediction accuracy. Decision-tree-based methods involve segmenting the predictor space into a number of simple regions. To make a prediction for an observation, we typically use the mean or the mode of the training observations in the region it falls into. This set of rules can be summarised as a tree, and such approaches are known as decision tree methods. In the figure, sal denotes the mean log salary: we are predicting a player's salary from the number of years he has played and the number of hits he made in his whole career. The figure shows the regression tree fitted to this data; it consists of a series of splitting rules, starting at the top of the tree and ending at the terminal nodes. In this regression tree, the first split is based on experience in years, and for players with more than 4.5 years of experience the tree is further split into…
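To make the splitting idea concrete, here is a minimal regression-tree sketch in scikit-learn. The Years/Hits values and salaries below are invented for illustration; they are not the Hitters data discussed in the post.

```python
# A minimal sketch of a regression tree, assuming a toy
# Years/Hits -> log(salary) dataset (hypothetical numbers).
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Hypothetical training data: [years played, career hits]
X = np.array([[1, 50], [3, 80], [5, 100], [6, 150], [10, 200], [12, 120]])
y = np.log([80, 120, 300, 450, 700, 500])  # log salary (in thousands)

tree = DecisionTreeRegressor(max_depth=2)   # shallow tree: a few splitting rules
tree.fit(X, y)

# Prediction for a player with 7 years of experience and 130 hits:
# the tree returns the mean log salary of the training points in his region.
print(tree.predict([[7, 130]]))
```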

K Means Clustering Mathematics & Elbow Method to find optimal value of K | Data Science | Machine Learning | Explanation |

K Means Clustering Mathematics & Elbow Method to find optimal value of K | Data Science | Machine Learning | Explanation. K-means clustering (KMC) is an unsupervised machine learning algorithm, which simply means there is no supervision (no labels) to help the model learn. In KMC the whole grouping process is done by the model itself: it recognises the features of the data points and puts points with similar features together in groups called clusters. Basically, it builds multiple clusters of related data by identifying the features in the dataset, and each cluster contains similar types of data points. For example, if you have some apples, oranges and bananas and you feed them to KMC, it will group the fruits that look alike: bananas are long and yellow, oranges are spherical and the colour is orange…
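A small sketch of K-means plus the elbow method with scikit-learn is shown below. The 2-D points are made up for illustration (stand-ins for the "fruit features" in the example), so treat it as a sketch rather than the post's own code.

```python
# K-means and the elbow method on invented 2-D points.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Three loose blobs of points
X = np.vstack([rng.normal(loc, 0.5, size=(30, 2)) for loc in ([0, 0], [5, 5], [0, 5])])

# Elbow method: fit K-means for several k and track the inertia (within-cluster
# sum of squared distances); the "elbow" in this curve suggests a good k.
for k in range(1, 7):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    print(k, round(km.inertia_, 1))
```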

All about naive Bayes Classifier Algorithm | Mathematics | Machine Learning | Data Science

It is a classification algorithm that is extremely fast compared with other algorithms; if you are stuck with a large dataset, go with it and it will handle it very effectively. Actually, it is a family of classifier algorithms in which every single algorithm assumes that the presence of one feature is independent of the presence of the other features. Photo by Crissy Jarvis on Unsplash. However, the assumption made by the naive algorithm is not really correct in the real world, so we can also get undesired results: it assumes every feature is independent of the others, but there may be cases where features depend directly on each other. Still, it works well in practice and we use it. Bayes Theorem: Bayes' theorem is used to find the probability of an event occurring given that another event has already occurred. We can apply Bayes' theorem like this... Here, c = the target class, x = the dependent feature vector of size n, c = {y}, x = {x1, x2, x3, ...
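As a rough illustration of the independence assumption in action, here is a minimal Gaussian naive Bayes sketch with scikit-learn; the feature vectors and classes are invented, not taken from the post.

```python
# A minimal Gaussian naive Bayes sketch on invented toy data.
from sklearn.naive_bayes import GaussianNB

X = [[1.0, 2.1], [1.2, 1.9], [3.0, 3.5], [3.2, 3.8]]  # feature vectors x = (x1, x2)
y = [0, 0, 1, 1]                                       # target class c

clf = GaussianNB()            # assumes features are conditionally independent given c
clf.fit(X, y)

# P(c | x) for a new point, computed via Bayes' theorem under that assumption
print(clf.predict_proba([[2.0, 2.5]]))
```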

Something you don't know about the Support Vector Machine(SVM) is here | Full Mathematics Intuition| Machine Learning | Data Science

Something you don't know about the Support Vector Machine (SVM) is here. Photo by Liam Tucker on Unsplash. Key Terms: Hyperplane: A hyperplane is a flat subspace of dimension (n - 1) in an n-dimensional space. In 2D, the hyperplane is 1-dimensional, i.e. a line; in 3D space, the hyperplane is 2-dimensional, i.e. a flat plane; higher dimensions cannot be visualised, but they can still be written down on paper. Let's have a look at how it works. The equation of a hyperplane in two dimensions (2D) is as follows: β0 + β1X1 + β2X2 = 0    ... (i) This is as simple as the equation of a line, right? That is because this hyperplane actually is a line. For the higher-dimensional case, we have a generalised equation for the hyperplane in n dimensions: β0 + β1X1 + β2X2 + ... + βnXn = 0    ... (ii) A point X in this n-dimensional space satisfies equations (i) and (ii), i.e. makes them equal to 0, only if it lies on the hyperplane. But if... β0 + β1X1 + β2X2 + ...
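To connect equation (i) to something runnable, the sketch below fits a linear SVM with scikit-learn and reads β0, β1, β2 off the fitted model; the two blobs of points are invented for illustration.

```python
# Relating the hyperplane equation b0 + b1*x1 + b2*x2 = 0 to a fitted linear SVM.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.5, (20, 2)), rng.normal(3, 0.5, (20, 2))])
y = [0] * 20 + [1] * 20

model = SVC(kernel="linear").fit(X, y)
b1, b2 = model.coef_[0]        # beta_1, beta_2
b0 = model.intercept_[0]       # beta_0

# For any point (x1, x2), the sign of b0 + b1*x1 + b2*x2 tells us which side
# of the separating hyperplane (here, a line) the point falls on.
x = np.array([1.5, 1.5])
print(b0 + b1 * x[0] + b2 * x[1])
```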

Concept of Random Forest | Mathematics | Machine Learning | ML Algorithm | Data Science

Concept of Random Forest | Mathematics | Machine Learning | ML Algorithm | Data Science. Photo by David Kovalenko on Unsplash. Individual trees do not have the same level of accuracy as other prediction algorithms, so the random forest came into the limelight: it uses trees as building blocks to form a more powerful algorithm. In a random forest, the process of choosing the root node and split features runs randomly, and the model is made of more than one decision tree; that is why it is called a Random Forest. Ensemble Technique: Basically, sometimes we use more than one model together to increase the efficiency of the model and the accuracy of the predictions; this is called an Ensemble Technique. It has two main types, i.e. Bagging and Boosting. Bagging is also known as bootstrap aggregation. In bagging, the different base models are each fed a different sample of data drawn from the main dataset for training. After training all the models, a test dataset is fed to all the trained models…
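A brief sketch of bagged decision trees in the random forest sense, using scikit-learn; the built-in iris dataset is used here only as a convenient stand-in, not as data from the post.

```python
# A random forest = many decision trees, each trained on a bootstrap sample,
# with their votes aggregated (plus random feature selection at each split).
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)

# Feed held-out test data to the trained ensemble and score its predictions.
print(forest.score(X_test, y_test))
```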

Concept of Decision Tree Classification | Machine Learning | Data Science | Mathematics

Concept of Decision Tree Algorithm | Machine Learning | Data Science | Mathematics. Decision Tree Algorithm for Classification: the decision tree algorithm is one of the most popular algorithms and is widely used in machine learning. It is a type of supervised learning algorithm that can be used for both classification and regression. Photo by Fabrice Villard on Unsplash. Let's first see how it works with a simple decision tree example. So, now that we are aware enough of the decision tree, let's go deeper. Impurity: it is a measure of how impure our data is, i.e. how little homogeneity is present in your data. Image Source: ResearchGate. For measuring impurity we have several measures, of which we will learn these two: 1. Entropy: entropy is nothing but the randomness in your dataset, which decreases predictability. It is directly proportional to the non-homogeneity in your dataset, and it measures how pure a split is: the lower the entropy, the purer the split. Use: we analyse the entropy at every node in the decision tree…
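As a quick check on the entropy idea, here is a tiny sketch that computes entropy from the class proportions at a node; the label lists are invented examples.

```python
# Entropy of a node: sum over classes of p_i * log2(1 / p_i),
# where p_i is the proportion of labels belonging to class i.
import numpy as np

def entropy(labels):
    """Entropy of a list of class labels: 0 for a pure node,
    1 for a perfectly mixed binary node."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(np.sum(p * np.log2(1.0 / p)))

print(entropy([1, 1, 1, 1]))   # pure node -> 0.0
print(entropy([1, 1, 0, 0]))   # evenly mixed binary node -> 1.0
```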

Deep insight of K-Nearest Neighbour(KNN) | The Lazy algorithm | Mathematics | Machine Learning

Deep insight of K-Nearest Neighbour (KNN) | The Lazy algorithm | Mathematics | Machine Learning. Photo by Nina Strehl on Unsplash. It is an instance-based supervised machine learning algorithm: for a new value, we find which training instances are its nearest neighbours. It is used for both regression and classification. KNN Classification: Have a look at the following diagram (The Graph) and focus on the heart-shaped marker (💜); try to judge visually which category it lies in, the red class or the blue class. If we assume 5 nearest neighbours around the heart, then visually you can determine that it is more likely to belong to the blue class, because three of the heart's nearest neighbours are from the blue class and only two are from the red class (see The Graph). So, this is all about K-nearest neighbour classification. K-nearest neighbour is a simple supervised machine learning algorithm that stores all the available cases and cl…
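A minimal KNN sketch mirroring the heart-marker example with k = 5, using scikit-learn; the coordinates and class names below are invented for illustration.

```python
# With k = 5, the query point is assigned to the majority class
# among its five nearest training points.
from sklearn.neighbors import KNeighborsClassifier

X = [[1, 1], [1, 2], [2, 1], [2, 2],      # "blue" class
     [5, 5], [5, 6], [6, 5]]              # "red" class
y = ["blue"] * 4 + ["red"] * 3

knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X, y)

# The query sits near the blue cluster, so most of its 5 neighbours are blue.
print(knn.predict([[2.5, 2.5]]))          # -> ['blue']
```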