Factor Analysis

From Canonica AI

Introduction

Factor analysis is a statistical method that describes the variability among observed, correlated variables in terms of a potentially smaller number of unobserved variables called factors. In other words, the information contained in several original variables can be distilled into a smaller set of composite dimensions, each representing a common factor. The observed variables are modeled as linear combinations of the latent factors, plus "error" terms.

History

Factor analysis was developed in the early 20th century and is closely related to principal component analysis and canonical correlation. Spearman (1904) originally developed it to study human intelligence, and Thurstone (1931) later extended it to multiple factors.

An old statistics book opened on a page about factor analysis.

Mathematical Formulation

Factor analysis starts with a data set of n observations on p variables. The goal is to find the loadings and factors for which the following model fits the data as closely as possible:

X = ΛF + ε

where X is a p × 1 random vector of the observed variables (taken here to be centered, so that each variable has mean zero), Λ is a p × m matrix of factor loadings, F is an m × 1 random vector of the common factors, and ε is a p × 1 random vector of the unique factors, or errors.
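One way to get a feel for the model is to simulate from it. The sketch below draws observations from X = ΛF + ε and compares the sample covariance matrix with ΛΛᵀ plus a diagonal matrix of unique variances, which is what the model implies when the factors are uncorrelated with unit variance and the errors are uncorrelated. The loading values, the error standard deviations, the normal distributions, and all function names are illustrative assumptions, not part of the original formulation.

```python
import random


def simulate(loadings, unique_sd, n, seed=0):
    """Draw n observations x = Lambda f + eps.

    Assumes (for illustration only) standard-normal common factors and
    independent normal errors with the given standard deviations.
    """
    rng = random.Random(seed)
    p, m = len(loadings), len(loadings[0])
    rows = []
    for _ in range(n):
        f = [rng.gauss(0.0, 1.0) for _ in range(m)]             # common factors F
        eps = [rng.gauss(0.0, unique_sd[i]) for i in range(p)]  # unique factors
        rows.append([
            sum(loadings[i][k] * f[k] for k in range(m)) + eps[i]
            for i in range(p)
        ])
    return rows


def sample_cov(rows):
    """Unbiased sample covariance matrix of a list of observations."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [
        [sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
         for j in range(p)]
        for i in range(p)
    ]


def implied_cov(loadings, unique_sd):
    """Model-implied covariance: Lambda Lambda^T + diag(unique variances)."""
    p, m = len(loadings), len(loadings[0])
    return [
        [sum(loadings[i][k] * loadings[j][k] for k in range(m))
         + (unique_sd[i] ** 2 if i == j else 0.0)
         for j in range(p)]
        for i in range(p)
    ]


# Hypothetical example: p = 4 observed variables, m = 2 factors.
LOADINGS = [[0.9, 0.0], [0.8, 0.1], [0.1, 0.7], [0.0, 0.6]]
UNIQUE_SD = [0.4, 0.5, 0.6, 0.7]

data = simulate(LOADINGS, UNIQUE_SD, n=20000)
S = sample_cov(data)
SIGMA = implied_cov(LOADINGS, UNIQUE_SD)
max_gap = max(abs(S[i][j] - SIGMA[i][j]) for i in range(4) for j in range(4))
print(f"largest |sample - implied| covariance gap: {max_gap:.3f}")
```

With a large sample the gap should be small. The gap reflects sampling noise only, because the data were generated from the model itself.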

Assumptions

Factor analysis makes several assumptions about the data and the underlying model. These include:

1. There are fewer factors than original variables (m < p).
2. The factors are orthogonal, meaning they are uncorrelated with one another; by convention, each is scaled to have mean zero and variance one.
3. The errors have mean zero and are uncorrelated with each other and with the factors.
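Taken together, these assumptions fix the covariance structure that the model X = ΛF + ε can reproduce. The short derivation below uses two symbols that do not appear elsewhere in this article and are introduced here as notation: Σ for the covariance matrix of X, and Ψ for the diagonal covariance matrix of the errors.

```latex
\begin{aligned}
\Sigma = \operatorname{Cov}(X)
  &= \operatorname{Cov}(\Lambda F + \varepsilon) \\
  &= \Lambda \operatorname{Cov}(F)\, \Lambda^{\top} + \operatorname{Cov}(\varepsilon)
     && \text{(errors uncorrelated with factors)} \\
  &= \Lambda \Lambda^{\top} + \Psi
     && \text{(} \operatorname{Cov}(F) = I_m,\ \Psi \text{ diagonal)}
\end{aligned}
\qquad\Longrightarrow\qquad
\operatorname{Var}(X_i) = \sum_{k=1}^{m} \lambda_{ik}^{2} + \psi_i
```

The first term of each variance, the sum of squared loadings, is usually called the communality of variable i. The second term, ψ_i, is its unique variance.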

Methods

There are several methods for extracting the factors, including the principal component method, principal axis factoring, and maximum likelihood estimation.
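As a rough illustration of one of these methods, the sketch below implements a simplified, single-factor version of iterated principal axis factoring. It replaces the diagonal of the correlation matrix with communality estimates, extracts the leading eigenpair, and updates the communalities from the resulting loadings. Several things here are assumptions made for the example: the correlation matrix is constructed from hypothetical loadings rather than real data, power iteration stands in for a full eigendecomposition, the starting communalities use the largest off-diagonal correlation in each row, and the iteration count is fixed rather than based on a convergence test.

```python
import math


def leading_eigenpair(a, iters=200):
    """Largest eigenvalue and unit eigenvector of a symmetric matrix,
    found by plain power iteration."""
    p = len(a)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        w = [sum(a[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    av = [sum(a[i][j] * v[j] for j in range(p)) for i in range(p)]
    value = sum(v[i] * av[i] for i in range(p))  # Rayleigh quotient
    return value, v


def principal_axis_one_factor(corr, iters=100):
    """Iterated principal axis factoring, restricted to one factor."""
    p = len(corr)
    # Starting communalities: largest absolute off-diagonal correlation per row.
    h = [max(abs(corr[i][j]) for j in range(p) if j != i) for i in range(p)]
    loadings = [0.0] * p
    for _ in range(iters):
        # Reduced correlation matrix: communalities replace the unit diagonal.
        reduced = [[h[i] if i == j else corr[i][j] for j in range(p)]
                   for i in range(p)]
        value, vec = leading_eigenpair(reduced)
        loadings = [math.sqrt(value) * x for x in vec]
        h = [l * l for l in loadings]  # updated communalities
    return loadings


# Hypothetical one-factor correlation matrix built from assumed loadings.
TRUE_LOADINGS = [0.8, 0.7, 0.6, 0.5]
CORR = [[1.0 if i == j else TRUE_LOADINGS[i] * TRUE_LOADINGS[j]
         for j in range(4)] for i in range(4)]

estimated = principal_axis_one_factor(CORR)
print([round(l, 3) for l in estimated])
```

Because the toy correlation matrix follows a one-factor model exactly, the estimates are expected to settle close to the loadings used to build it. With real data, sampling error and model misfit mean the iteration would only approximate the underlying structure.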

Applications

Factor analysis is widely used in psychology, social sciences, marketing, product management, operations research, and finance. It helps to understand the structure of a set of variables and to reduce a large number of variables to a manageable set of factors.

Limitations

Despite its wide usage, factor analysis has several limitations. These include:

1. Its results are sensitive to violations of the assumptions about the underlying model.
2. It may not provide a unique solution: many sets of factors can satisfy the model equally well, for example any orthogonal rotation of a given solution.
3. The factors can be difficult to interpret.
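The non-uniqueness in point 2 can be checked numerically. In the sketch below, a hypothetical 4 × 2 loading matrix is multiplied by a rotation matrix. The rotated loadings differ from the originals, yet the product of the loading matrix with its own transpose is unchanged, and this product is the part of the covariance explained by the common factors. The data therefore cannot distinguish between the two solutions. The loading values, the 30-degree angle, and the helper names are assumptions chosen for illustration.

```python
import math


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def rotation(theta):
    """2 x 2 orthogonal rotation matrix for an angle in radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


# Hypothetical loadings for 4 observed variables on 2 factors.
LOADINGS = [[0.9, 0.0], [0.8, 0.1], [0.1, 0.7], [0.0, 0.6]]
ROTATED = matmul(LOADINGS, rotation(math.radians(30)))

# Common-factor part of the covariance, before and after rotation.
common = matmul(LOADINGS, transpose(LOADINGS))
common_rot = matmul(ROTATED, transpose(ROTATED))

max_diff = max(abs(common[i][j] - common_rot[i][j])
               for i in range(4) for j in range(4))
print("rotated loadings:", [[round(x, 3) for x in row] for row in ROTATED])
print("largest change in Lambda Lambda^T:", max_diff)
```

In practice this freedom is usually resolved by choosing a rotation that makes the loadings easier to interpret, which is a modeling decision rather than something determined by the data.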

See Also

Principal Component Analysis
Canonical Correlation
Cluster Analysis
Discriminant Analysis