A machine learning model is trained on a large amount of data to make predictions or to gain knowledge from that data.
A model that makes predictions is called a predictive model, and a model that gains knowledge from data is called a descriptive model.
Based on how the models are trained, machine learning is broadly classified into the following categories:
- Supervised Machine Learning
- Unsupervised Machine Learning
- Reinforcement Machine Learning
- Semi-Supervised Learning
The sections below describe each type in detail, along with the algorithms commonly used for it.
1. Supervised Machine Learning
Supervised machine learning is a type of machine learning in which machines are trained using labelled datasets.
By labelled data, we mean a dataset in which each input is paired with its correct output value.
Formally, a labelled dataset of N examples can be written as {(x_i, y_i)}, i = 1, ..., N, where each element x_i is called a feature vector.
A feature vector has D dimensions, each of which describes one aspect of the example, and y_i is the output value (label) for example i.
For example, to train a machine to distinguish girls from boys, we first give it a dataset in which each example is described by some attributes, i.e. dimensions (height, hair length, face shape, eyes, etc.), together with the correct output for that example.
After training, we give the machine a photo, and it should identify whether the photo shows a boy or a girl.
To do so, the machine checks the same features on which it was trained (height, hair length, eyes, etc.) and uses them to produce the output.
The goal of a supervised machine learning algorithm is to use the dataset to produce a model that takes a feature vector x as input and outputs information that allows deducing the label (output) for this feature vector.
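To make the idea of "feature vector in, label out" concrete, here is a minimal sketch of a supervised model. It uses a 1-nearest-neighbour rule, which is only one possible choice and is not an algorithm named in this article. The dataset, the two features (height and hair length in cm) and the function name `predict` are invented for illustration.

```python
import math

# Hypothetical labelled dataset: each example is (feature vector, label),
# where the feature vector is (height in cm, hair length in cm).
dataset = [
    ((180.0, 5.0), "boy"),
    ((175.0, 8.0), "boy"),
    ((162.0, 40.0), "girl"),
    ((158.0, 35.0), "girl"),
]

def predict(x, labelled_examples):
    """Return the label of the training example closest to feature vector x."""
    _, nearest_label = min(
        labelled_examples,
        key=lambda example: math.dist(x, example[0]),
    )
    return nearest_label

first_prediction = predict((178.0, 6.0), dataset)
second_prediction = predict((160.0, 38.0), dataset)
print(first_prediction, second_prediction)
```

The "training" here is simply storing the labelled examples; most real algorithms instead learn parameters from them.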
Categories of Supervised Machine Learning
- Classification
- Regression
1. Classification
In a classification problem, the output is one of a fixed set of classes. In the simplest case there are two classes: yes or no, 1 or 0, boy or girl, male or female, etc. The classification algorithm assigns the input to one of these classes.
For example, in banking, credit scoring is an application of supervised machine learning in which the model classifies a customer as low-risk or high-risk, i.e. the model assigns the customer to one of the two categories.
Some algorithms used in classification are given below.
- Random Forest Algorithm
- Decision Tree Algorithm
- Logistic Regression Algorithm
- Support Vector Machine Algorithm
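As an illustration of one algorithm from this list, the following is a minimal sketch of logistic regression trained by batch gradient descent, applied to the credit-scoring example. The two features (debt-to-income ratio and fraction of missed payments), the data values, and the hyperparameters are all assumptions made up for this sketch, not taken from a real scoring system.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic_regression(xs, ys, learning_rate=0.5, epochs=5000):
    """Fit weights w and bias b by batch gradient descent on the log loss."""
    n_features = len(xs[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(xs, ys):
            # Prediction error for this example (predicted probability minus label).
            error = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j in range(n_features):
                grad_w[j] += error * x[j]
            grad_b += error
        w = [wi - learning_rate * g / len(xs) for wi, g in zip(w, grad_w)]
        b -= learning_rate * grad_b / len(xs)
    return w, b

def classify(x, w, b):
    """Assign the customer to one of the two classes."""
    probability_high_risk = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return "high-risk" if probability_high_risk >= 0.5 else "low-risk"

# Hypothetical customers: (debt-to-income ratio, fraction of missed payments).
features = [(0.1, 0.0), (0.2, 0.1), (0.3, 0.0), (0.8, 0.5), (0.9, 0.4), (0.7, 0.6)]
labels = [0, 0, 0, 1, 1, 1]  # 0 = low-risk, 1 = high-risk

weights, bias = train_logistic_regression(features, labels)
print(classify((0.15, 0.05), weights, bias))
print(classify((0.85, 0.45), weights, bias))
```

The model outputs a probability, and the 0.5 threshold turns it into a class; a bank could choose a different threshold depending on how costly each kind of mistake is.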
2. Regression
Problems in which the output is a number are regression problems.
For example, if we want a system that can predict the price of a house, the inputs are house attributes like area, the number of bedrooms, bathrooms, room dimensions, place, etc. and the output is the price of the house.
If ‘X’ denotes the attributes of the house and ‘Y’ is the output (a price in this case), the task of the regression algorithm is to find a mapping from the input (X) to the output (Y).
Some algorithms used in regression are given below.
- Simple Linear Regression Algorithm
- Multivariate Regression Algorithm
- Decision Tree Algorithm
- Lasso Regression
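The first algorithm in this list, simple linear regression, can be sketched in a few lines using the ordinary least-squares formulas for one input variable. The house areas and prices below are invented, and were chosen to lie exactly on a straight line so the result is easy to check.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: house area in square metres and price in thousands.
areas = [50.0, 80.0, 100.0, 120.0]
prices = [150.0, 210.0, 250.0, 290.0]

slope, intercept = fit_simple_linear_regression(areas, prices)
predicted_price = slope * 90.0 + intercept  # price estimate for a 90 m² house
print(slope, intercept, predicted_price)
```

Here the learned mapping from X to Y is the pair (slope, intercept); real house prices depend on many attributes, which is where multivariate regression comes in.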
2. Unsupervised Machine Learning
In unsupervised machine learning, the dataset is a collection of unlabelled examples, i.e., we only have input data and the outputs are not given.
The model takes a feature vector ‘X’ as input and transforms it either into another vector or into a value that can be used to solve a practical problem.
For example, in dimensionality reduction, the output of the model is a feature vector that has fewer features than the input ‘X’.
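A minimal sketch of dimensionality reduction, assuming 2-D inputs that are reduced to a single feature by projecting each point onto the direction of largest variance (the idea behind Principal Component Analysis). The closed-form eigenvector calculation only works for the 2x2 case shown here; the function name and the data are illustrative.

```python
import math

def project_onto_first_component(points):
    """Reduce 2-D points to 1-D by projecting onto the direction of largest variance."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    centred = [(p[0] - mean_x, p[1] - mean_y) for p in points]

    # Entries of the 2x2 covariance matrix [[a, b], [b, c]].
    a = sum(x * x for x, _ in centred) / n
    b = sum(x * y for x, y in centred) / n
    c = sum(y * y for _, y in centred) / n

    # Largest eigenvalue of a symmetric 2x2 matrix, in closed form.
    largest = (a + c) / 2 + math.sqrt(((a - c) / 2) ** 2 + b ** 2)

    # Matching eigenvector; if b is zero, the axes are already the components.
    if b != 0:
        direction = (b, largest - a)
    else:
        direction = (1.0, 0.0) if a >= c else (0.0, 1.0)
    norm = math.hypot(*direction)
    unit = (direction[0] / norm, direction[1] / norm)

    # Each output is a single number: one feature instead of two.
    return [x * unit[0] + y * unit[1] for x, y in centred]

points = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
reduced = project_onto_first_component(points)
print(reduced)
```

Because these three points lie on one line, the single remaining feature keeps all the information about their relative positions; with real data some information is lost.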
Put simply, the aim of the model is to find regularities in the input. The input space has a structure such that certain patterns occur more often than others, and we want to see which patterns occur and which do not.
For example, in clustering, the aim is to find groupings in the input.
Companies use a clustering model to allocate customers with similar attributes (such as past transactions, demographic information, etc.) to the same group.
Categories of Unsupervised Machine Learning
- Clustering
- Association
a. Clustering
In clustering, the aim is to find groups of objects that share similarities, i.e. the objects in one group are more similar to each other than to the objects in other groups.
For example, in document clustering, the aim is to group similar documents: news reports can be subdivided into those related to politics, sports, arts, fashion and so on.
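Some popular unsupervised algorithms are given below. Of these, K-Means is a clustering algorithm, while Principal Component Analysis and Independent Component Analysis transform the features rather than group the examples: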
- K-Means Clustering algorithm
- Principal Component Analysis
- Independent Component Analysis
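The K-Means algorithm from this list can be sketched as a loop that alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its points. The customer data (age, monthly spend), the choice of two clusters, and the fixed initial centroids are assumptions for illustration; practical implementations usually pick initial centroids randomly.

```python
import math

def k_means(points, initial_centroids, iterations=10):
    """Return (centroids, clusters) after a fixed number of K-Means iterations."""
    centroids = list(initial_centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: math.dist(point, centroids[i]),
            )
            clusters[nearest].append(point)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            tuple(sum(values) / len(cluster) for values in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical customers: (age, monthly spend).
customers = [(25, 20), (27, 22), (24, 18), (55, 80), (58, 85), (60, 78)]
centroids, clusters = k_means(customers, initial_centroids=[customers[0], customers[3]])
print(clusters)
```

Note that no labels are given: the two groups emerge only from the distances between the feature vectors.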
b. Association
In association, the main aim is to find relationships between variables within a dataset.
For example, in basket analysis, we look for associations between the products that customers buy. If people who buy A typically also buy B, then a customer who buys A but does not buy B is a potential B customer, i.e. a customer who is likely to buy item B.
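The basket-analysis idea can be sketched by counting how often B appears in baskets that contain A. This ratio is commonly called the confidence of the rule "A implies B"; that term, the product names, and the transactions below are not from this article and are used only for illustration.

```python
def count_containing(transactions, items):
    """Number of transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for basket in transactions if items <= basket)

def confidence(transactions, antecedent, consequent):
    """Fraction of baskets containing the antecedent that also contain the consequent."""
    both = count_containing(transactions, set(antecedent) | set(consequent))
    return both / count_containing(transactions, antecedent)

# Hypothetical baskets, one set of products per customer.
transactions = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread"},
    {"milk"},
    {"bread", "butter"},
]

rule_confidence = confidence(transactions, {"bread"}, {"butter"})

# Customers who bought A (bread) but not B (butter) are potential B customers.
potential_butter_customers = [
    index for index, basket in enumerate(transactions)
    if "bread" in basket and "butter" not in basket
]
print(rule_confidence, potential_butter_customers)
```

In this toy data, three of the four bread baskets also contain butter, so the remaining bread buyer is the candidate for a butter recommendation.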
3. Reinforcement Machine Learning
Reinforcement machine learning is a type of machine learning in which the machine works in an environment and is capable of perceiving the state of that environment as a vector of features.
The machine can execute actions in every state. Different actions bring different rewards and could also move the machine to another state of the environment.
The goal of a reinforcement learning algorithm is to learn a policy, that is, a rule that tells the machine which action to take in each state so that the resulting sequence of actions reaches the goal.
For example, in game playing (chess), a single move by itself is not that important; it is the sequence of right moves that makes the play good.
A move is good if it is part of a good game-playing policy. Another good application of reinforcement learning is a robot navigating an environment in search of a goal location.
The robot’s movement in one direction at a time is one move of a policy, and after a number of trial movements, it should learn the correct sequence of actions (movements) to reach the goal.
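The robot example can be sketched with tabular Q-learning, a standard reinforcement learning algorithm that this article does not name; it is used here only as one possible illustration. The environment is an assumed five-cell corridor with the goal in the last cell, a reward of 1 for reaching it, and purely random trial movements during learning. All constants are illustrative.

```python
import random

N_STATES = 5                      # corridor cells 0..4
GOAL = 4                          # goal location is the last cell
ACTIONS = {"left": -1, "right": +1}

def step(state, action):
    """Move one cell (walls block movement) and return (next_state, reward)."""
    next_state = min(max(state + ACTIONS[action], 0), N_STATES - 1)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward

def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Learn action values Q(state, action) from random trial movements."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        while state != GOAL:
            action = rng.choice(list(ACTIONS))          # explore at random
            next_state, reward = step(state, action)
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            # Q-learning update: move the estimate toward reward + discounted future value.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
    return q

def greedy_policy(q):
    """For every non-goal state, pick the action with the highest learned value."""
    return {
        s: max(ACTIONS, key=lambda a: q[(s, a)])
        for s in range(N_STATES) if s != GOAL
    }

policy = greedy_policy(q_learning())
print(policy)
```

With these settings the learned policy should choose "right" in every cell, which is the sequence of movements that reaches the goal. No single move is rewarded except the last one; the earlier moves become valuable only because they lead toward it.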
Reinforcement learning is similar to the human learning process: humans learn from past mistakes or add something new to past actions.
Reinforcement learning has two categories:
- Positive Reinforcement learning
- Negative Reinforcement learning
1. Positive Reinforcement learning
In positive reinforcement learning, good moves are rewarded so that the machine favours them, which makes learning faster and more accurate. It improves the learning model by strengthening good actions.
2. Negative Reinforcement learning
In negative reinforcement learning, weak moves are discouraged so that the machine stops choosing them, which also makes learning faster. It improves the learning model by avoiding weak actions.
4. Semi-Supervised Machine Learning
Semi-supervised machine learning can be called a hybrid type of machine learning, because the model is trained on a dataset that contains both labelled and unlabelled examples.
It therefore sits between the supervised and unsupervised learning types rather than belonging to either. Usually, there are many more unlabelled examples than labelled ones.
The goal of semi-supervised learning is the same as that of supervised learning. Why, then, do we include unlabelled examples in the dataset? The hope is that they help the learning algorithm to produce (or find) a better model.
Semi-supervised learning is used to overcome drawbacks of both other types: supervised learning needs many labelled examples, which are costly to produce, while unsupervised learning cannot make use of labels at all.
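One simple way to use unlabelled examples is self-training, sketched here with a nearest-centroid classifier on one-dimensional data. Self-training is an assumption of this sketch rather than a method described in this article, and the data values and labels "A" and "B" are invented. The model first learns from the few labelled examples, then labels the unlabelled ones itself and retrains on everything.

```python
def centroids_from(examples):
    """Mean feature value per label for a list of (value, label) pairs."""
    sums, counts = {}, {}
    for value, label in examples:
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def nearest_label(value, centroids):
    """Predict the label whose centroid is closest to the value."""
    return min(centroids, key=lambda label: abs(value - centroids[label]))

def self_train(labelled, unlabelled, rounds=5):
    """Repeatedly pseudo-label the unlabelled values and refit the centroids."""
    centroids = centroids_from(labelled)
    for _ in range(rounds):
        pseudo_labelled = [(v, nearest_label(v, centroids)) for v in unlabelled]
        centroids = centroids_from(labelled + pseudo_labelled)
    return centroids

# Hypothetical data: two labelled examples and six unlabelled ones.
labelled = [(1.0, "A"), (9.0, "B")]
unlabelled = [2.0, 3.0, 4.0, 9.5, 10.0, 10.5]

supervised_only = centroids_from(labelled)
semi_supervised = self_train(labelled, unlabelled)

print(nearest_label(5.5, supervised_only))
print(nearest_label(5.5, semi_supervised))
```

In this toy data the two models disagree on the value 5.5, because the unlabelled examples shift the centroids toward where the data actually lies. Whether that shift gives a better model depends on the pseudo-labels being mostly correct, which is not guaranteed in general.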