Unsupervised Learning

Introduction

Unsupervised learning is a type of machine learning in which an algorithm is trained on data that is neither classified nor labeled and is left to act on that data without guidance. The task is to group unsorted information according to similarities, patterns, and differences, without any labeled examples to learn from.


Types of Unsupervised Learning

Unsupervised learning algorithms are commonly grouped into two main categories: clustering and association.

Clustering

Clustering is a method of unsupervised learning and a common technique for statistical data analysis used in many fields. Clustering algorithms process your data and find natural clusters (groups) if they exist. With many algorithms you can also set how many clusters should be identified, which lets you adjust the granularity of the groups.

Association

Association rules allow you to establish associations amongst data objects inside large databases. This unsupervised technique is about discovering interesting relationships hidden in large data sets.
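As an illustration of what discovering such a relationship can look like, the following minimal sketch scores rules of the form "A implies B" by support (how often A and B occur together) and confidence (how often B occurs when A does). The function names, the thresholds, and the toy shopping baskets are assumptions made for this example; practical systems typically use a dedicated algorithm such as Apriori rather than this brute-force pair scan.

```python
from itertools import combinations


def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent) from the transaction list."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    both = support(transactions, set(antecedent) | set(consequent))
    return both / base


def pair_rules(transactions, min_support=0.5, min_confidence=0.7):
    """Return (lhs, rhs, support, confidence) for single-item rules.

    Only pairs of items are examined; this is a brute-force scan,
    not an optimized frequent-itemset miner.
    """
    items = sorted({item for t in transactions for item in t})
    rules = []
    for a, b in combinations(items, 2):
        s = support(transactions, {a, b})
        if s < min_support:
            continue  # the pair is too rare to be interesting
        for lhs, rhs in ((a, b), (b, a)):
            c = confidence(transactions, {lhs}, {rhs})
            if c >= min_confidence:
                rules.append((lhs, rhs, s, c))
    return rules


# Hypothetical shopping baskets, invented for illustration.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"butter", "milk"},
    {"bread", "butter", "jam"},
]

for lhs, rhs, s, c in pair_rules(baskets):
    print(f"{lhs} -> {rhs}: support={s:.2f}, confidence={c:.2f}")
```

With these toy baskets, only bread and butter occur together often enough to pass both thresholds, so the rule is reported in both directions.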

Techniques in Unsupervised Learning

Various techniques are used in unsupervised learning. Popular ones include k-means clustering, hierarchical clustering, anomaly detection, neural networks, and principal component analysis.

K-means Clustering

K-means clustering is an unsupervised algorithm used when you have unlabeled data (i.e., data without defined categories or groups). Its goal is to find groups in the data, with the number of groups set in advance by the variable K.
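One common way to carry this out is Lloyd's algorithm, which alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its assigned points. The sketch below is a minimal pure-Python version under stated assumptions: the function names, the seeded random initialization, and the sample points are choices made for this example, not part of any particular library.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, k, max_iterations=100, seed=0):
    """Group points into k clusters; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Assumption: start from k distinct data points chosen at random.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iterations):
        # Assignment step: label each point with its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = mean_point(members)
    return centroids, labels


# Hypothetical 2-D data with two visibly separate groups.
data = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.0), (8.0, 9.5)]
centroids, labels = k_means(data, k=2)
print(labels)
print(centroids)
```

Which cluster receives label 0 and which receives label 1 depends on the initialization, so only the grouping itself is meaningful. The result can also depend on the starting centroids when the groups are less cleanly separated than in this toy data.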

Hierarchical Clustering

Hierarchical clustering, also known as hierarchical cluster analysis, is an algorithm that groups similar objects into groups called clusters. The endpoint is a set of clusters in which each cluster is distinct from every other cluster and the objects within each cluster are broadly similar to one another.
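A bottom-up (agglomerative) variant can be sketched as follows: every object starts in its own cluster, and the two closest clusters are merged repeatedly until the desired number remains. The use of single linkage (distance between the closest pair of members), the function names, and the sample points are assumptions made for this illustration; other linkage rules are equally valid.

```python
def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5


def single_linkage(cluster_a, cluster_b, points):
    """Distance between the closest pair of members of two clusters."""
    return min(
        euclidean(points[i], points[j]) for i in cluster_a for j in cluster_b
    )


def agglomerative(points, n_clusters):
    """Merge the closest clusters until n_clusters remain.

    Returns (clusters, merges): clusters holds lists of point indices,
    and merges records each merge as (left, right, distance).
    """
    clusters = [[i] for i in range(len(points))]
    merges = []
    while len(clusters) > n_clusters:
        best = None
        # Exhaustive search for the closest pair of clusters.
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = single_linkage(clusters[a], clusters[b], points)
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        merged = clusters[a] + clusters[b]
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)]
        clusters.append(merged)
    return clusters, merges


# Hypothetical 2-D data with two visibly separate groups.
data = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.0), (8.0, 9.5)]
clusters, merges = agglomerative(data, n_clusters=2)
print(sorted(sorted(c) for c in clusters))
```

The recorded merges, read in order, describe the hierarchy that gives the method its name. This exhaustive search is slow on large data sets and is meant only to show the idea.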

Anomaly Detection

Anomaly detection is the identification of rare items, events, or observations that raise suspicion by differing significantly from the majority of the data. Typically the anomalous items point to an issue such as bank fraud, a structural defect, a medical problem, or an error in a text.
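One simple way to flag values that differ significantly from the majority is a modified z-score based on the median and the median absolute deviation (MAD), which is less distorted by the outliers themselves than a mean-based score. The sketch below assumes one-dimensional numeric data; the function name, the conventional 3.5 cutoff, and the transaction amounts are choices made for this example.

```python
from statistics import median


def mad_outliers(values, threshold=3.5):
    """Return indices of values whose modified z-score exceeds threshold.

    The modified z-score is 0.6745 * |x - median| / MAD, where MAD is
    the median of the absolute deviations from the median.
    """
    center = median(values)
    deviations = [abs(v - center) for v in values]
    mad = median(deviations)
    if mad == 0:
        # At least half the values equal the median, so the score
        # cannot be scaled; this sketch reports nothing in that case.
        return []
    return [
        i for i, dev in enumerate(deviations)
        if 0.6745 * dev / mad > threshold
    ]


# Hypothetical transaction amounts; one is far from the rest.
amounts = [12.5, 11.0, 13.2, 12.1, 250.0, 11.8, 12.9]
print(mad_outliers(amounts))
```

Here only the amount of 250.0 is flagged. Real fraud or defect detection usually involves many features at once, for which this one-dimensional score is not sufficient.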

Neural Networks

Neural networks are a means of doing machine learning in which a computer learns to perform a task by analyzing training examples. Usually the task involves transforming one set of vectors into another.

Principal Component Analysis

Principal Component Analysis (PCA) is a statistical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components.
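To make the procedure concrete, the following sketch centers the data, builds the sample covariance matrix, and finds the first principal component (the direction of greatest variance) by power iteration. This is a simplification under stated assumptions: it recovers only the first component, uses a fixed starting vector, and the function names and sample data are invented for illustration. A full PCA would compute every eigenvector of the covariance matrix, typically with a linear algebra library.

```python
def center(rows):
    """Subtract each column's mean so every variable has mean zero."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance_matrix(centered):
    """Sample covariance matrix of already-centered rows."""
    n = len(centered)
    dims = len(centered[0])
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(dims)]
        for i in range(dims)
    ]


def first_principal_component(cov, iterations=200):
    """Approximate the dominant eigenvector of cov by power iteration.

    Assumption: the all-ones starting vector is not orthogonal to the
    dominant eigenvector; otherwise the iteration would not find it.
    """
    vec = [1.0] * len(cov)
    for _ in range(iterations):
        product = [sum(c * v for c, v in zip(row, vec)) for row in cov]
        norm = sum(x * x for x in product) ** 0.5
        if norm == 0:
            break
        vec = [x / norm for x in product]
    return vec


def project(centered, component):
    """Coordinate of each centered row along the given component."""
    return [sum(x * w for x, w in zip(row, component)) for row in centered]


# Hypothetical observations of two correlated variables (y is about 2x).
observations = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2), (5.0, 9.8)]
centered = center(observations)
component = first_principal_component(covariance_matrix(centered))
print(component)
print(project(centered, component))
```

Because the second variable is roughly twice the first, the component found points approximately along the direction (1, 2), and the projected scores capture most of the variation in a single number per observation.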

Applications of Unsupervised Learning

Unsupervised learning has numerous applications in the fields of data analytics, pattern recognition, machine learning, artificial intelligence, and computer vision. Common applications include social network analysis, market basket analysis, customer segmentation, and outlier detection.

Conclusion

Unsupervised learning is an important concept in machine learning that gives machines a way to learn from unlabeled data. It has numerous applications in various fields and is a key technique for many advanced machine learning models.