Least squares regression aims to minimize the sum of the squared

differences between observed and predicted values. For classification, the

idea is to adapt this approach to predict class labels, typically by encoding

classes numerically (e.g., -1 and 1 for binary classification).

➢ Process

• Encoding Class Labels: For binary classification, the class labels (e.g.,

negative and positive) are encoded as -1 and +1. For multi-class

classification, labels are often encoded using a one-vs-all strategy.

• Model Fitting: The regression model fits a linear function to these

encoded labels by minimizing the sum of squared errors. This involves

solving for the weights 𝑤 in the linear equation:

𝑦 = 𝑋𝑤 + 𝜖

where 𝑋 is the feature matrix, 𝑦 is the vector of encoded labels, and 𝜖 is

the error term.

• Prediction: The fitted linear model is used to predict the class label for

new data points. For binary classification, the sign of the predicted

value determines the class (positive if 𝑦 > 0, negative if 𝑦 < 0). For multi-

class classification, the class with the highest predicted value is chosen.
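The three steps above (encode labels as ±1, fit 𝑤 by least squares, predict with the sign of 𝑋𝑤) can be sketched as follows. This is a minimal illustration on synthetic data; the cluster locations and the NumPy-based solver are assumptions, not part of the notes.

```python
import numpy as np

# Hypothetical toy data: two well-separated Gaussian clusters.
rng = np.random.default_rng(0)
X_pos = rng.normal(loc=2.0, scale=0.5, size=(20, 2))
X_neg = rng.normal(loc=-2.0, scale=0.5, size=(20, 2))
X = np.vstack([X_pos, X_neg])

# Step 1 - Encoding: positive class -> +1, negative class -> -1.
y = np.concatenate([np.ones(20), -np.ones(20)])

# Append a bias column so the decision boundary need not pass through the origin.
Xb = np.hstack([X, np.ones((X.shape[0], 1))])

# Step 2 - Model fitting: w minimizes the sum of squared errors ||Xb w - y||^2.
w, *_ = np.linalg.lstsq(Xb, y, rcond=None)

# Step 3 - Prediction: the sign of the linear output gives the class.
def predict(X_new):
    Xb_new = np.hstack([X_new, np.ones((X_new.shape[0], 1))])
    return np.sign(Xb_new @ w)

print(predict(np.array([[2.0, 2.0], [-2.0, -2.0]])))  # -> [ 1. -1.]
```

For multi-class problems under the one-vs-all strategy mentioned above, one such 𝑤 would be fitted per class and the class with the largest linear output chosen.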

➢ Advantages

• Simplicity: Least squares regression is straightforward to implement

and understand.

• Efficiency: Computationally efficient, especially for small to medium-

sized datasets.

• Interpretability: The linear model’s coefficients can be easily

interpreted to understand the relationship between features and the

target variable.

➢ Limitations

• Not Optimized for Classification: Least squares regression is designed

for continuous outcomes and may not handle classification boundaries

effectively.

• Assumptions: Assumes a linear decision boundary between classes, which may not

hold in practice.

• Sensitivity to Outliers: Can be sensitive to outliers, which may distort

the classification boundary.
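The outlier sensitivity can be demonstrated with a small 1-D sketch (the specific data values are illustrative assumptions): a single extreme but correctly labelled point drags the fitted line enough to flip the classification of a nearby point.

```python
import numpy as np

def fit_predict(X, y, x_query):
    """Fit a 1-D least squares model with a bias term; return the sign at x_query."""
    Xb = np.column_stack([X, np.ones_like(X)])
    w, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return np.sign(np.array([x_query, 1.0]) @ w)

# Clean data: the fitted boundary sits at x = 0.
X = np.array([-2.0, -1.0, 1.0, 2.0])
y = np.array([-1.0, -1.0, 1.0, 1.0])
print(fit_predict(X, y, 0.5))          # -> 1.0 (correctly positive)

# Add one extreme, correctly labelled positive point at x = 30.
X_out = np.append(X, 30.0)
y_out = np.append(y, 1.0)
print(fit_predict(X_out, y_out, 0.5))  # -> -1.0 (boundary dragged past 0.5)
```

Because the squared-error loss keeps penalizing points that are far from the line even when they are on the correct side, extreme points dominate the fit and distort the boundary.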
