Supervised Learning

Introduction

Supervised learning is a type of machine learning in which an algorithm learns from labeled training data, and makes predictions based on that data. It is called supervised learning because the process of an algorithm learning from the training dataset can be thought of as a teacher supervising the learning process^1(https://www.sciencedirect.com/science/article/pii/S187705091630111X). The teacher knows the correct answers, the algorithm iteratively makes predictions on the training data and is corrected by the teacher. The learning stops when the algorithm achieves an acceptable level of performance.

A computer screen showing a supervised learning algorithm in process.

Types of Supervised Learning

There are two main types of supervised learning problems, they are classified based on the type of output variable:

Regression

Regression is a method of modelling a target value based on independent predictors. This method is mostly used for forecasting and finding out cause and effect relationship between variables^2(https://www.sciencedirect.com/science/article/pii/S221083271400026X). For example, prediction of house prices, stock price prediction, height-weight prediction and so on.

Classification

Classification is a process of categorizing a given set of data into classes. The problem of identifying to which of a set of categories (sub-populations) a new observation belongs, on the basis of a training set of data containing observations (or instances) whose category membership is known^3(https://www.sciencedirect.com/science/article/pii/S187705091630111X). Examples of classification problems are: speech recognition, handwriting recognition, biometric identification, document classification etc.

Supervised Learning Algorithms

There are a number of supervised learning algorithms, and they each have their strengths and weaknesses. The choice of algorithm depends largely on the data set at hand, its size, complexity, and the nature of the problem to be solved.

Linear Regression

Linear regression is one of the simplest and most commonly used supervised learning algorithms. It is used to predict a quantitative response Y from a single predictor variable X. It assumes that there is approximately a linear relationship between X and Y^4(https://www.sciencedirect.com/science/article/pii/S187705091630111X).

Logistic Regression

Logistic regression is a classification algorithm, used when the response variable is categorical in nature. It is a predictive analysis algorithm and based on the concept of probability^5(https://www.sciencedirect.com/science/article/pii/S187705091630111X).

Decision Trees

Decision trees are a type of supervised learning algorithm that is mostly used for classification problems. It works for both categorical and continuous input and output variables. In this technique, we split the population into two or more homogeneous sets based on the most significant attributes^6(https://www.sciencedirect.com/science/article/pii/S187705091630111X).

Random Forest

Random forest is a type of supervised machine learning algorithm based on ensemble learning. Ensemble learning is a type of learning where you join different types of algorithms or same algorithm multiple times to form a more powerful prediction model^7(https://www.sciencedirect.com/science/article/pii/S187705091630111X).

Support Vector Machines

Support Vector Machines (SVMs) are a set of supervised learning methods used for classification, regression and outliers detection. The advantages of support vector machines are effective in high dimensional spaces and uses a subset of training points in the decision function (called support vectors), so it is also memory efficient^8(https://www.sciencedirect.com/science/article/pii/S187705091630111X).

A diagram showing a support vector machine.

Applications of Supervised Learning

Supervised learning has a wide array of applications, including in the fields of medical imaging, speech recognition, and credit scoring.

Medical Imaging

In the field of medical imaging, supervised learning algorithms are often used to identify and classify abnormalities in images. This can help doctors to diagnose diseases and plan treatments^9(https://www.sciencedirect.com/science/article/pii/S187705091630111X).

Speech Recognition

Speech recognition technology uses supervised learning to convert spoken language into written text. This technology is used in a variety of applications, including voice-controlled assistants, transcription services, and customer service bots^10(https://www.sciencedirect.com/science/article/pii/S187705091630111X).

Credit Scoring

In the financial industry, supervised learning is often used for credit scoring. Algorithms are trained on historical data to predict whether or not a customer will default on a loan. This helps banks to make informed decisions about who to lend to^11(https://www.sciencedirect.com/science/article/pii/S187705091630111X).

References

1. Supervised Learning 2. Regression 3. Classification 4. Linear Regression 5. Logistic Regression 6. Decision Trees 7. Random Forest 8. Support Vector Machines 9. Medical Imaging 10. Speech Recognition 11. Credit Scoring