Linear Regression: From Mathematical Principles to Code Implementation
Linear regression is the most fundamental machine learning algorithm and a building block for understanding more complex models. This article starts from the mathematical principles and implements linear regression step by step.
Mathematical Principles
Linear regression assumes a linear relationship between the target variable and the features:
$$\hat{y} = w_1x_1 + w_2x_2 + \cdots + w_nx_n + b = \mathbf{w}^T\mathbf{x} + b$$
The goal is to minimize the mean squared error:
$$L(\mathbf{w}, b) = \frac{1}{2n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2$$
The Normal Equation
Linear regression admits a closed-form solution. With $\mathbf{X}$ denoting the design matrix augmented with a leading column of ones (so the bias is absorbed into $\mathbf{w}$), it is:
$$\mathbf{w}^* = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}$$
```python
import numpy as np

class LinearRegressionNormal:
    """Linear regression solved via the normal equation."""

    def fit(self, X, y):
        # Prepend a column of ones so the bias is absorbed into theta
        X_b = np.c_[np.ones((X.shape[0], 1)), X]
        self.theta = np.linalg.inv(X_b.T @ X_b) @ X_b.T @ y
        return self

    def predict(self, X):
        X_b = np.c_[np.ones((X.shape[0], 1)), X]
        return X_b @ self.theta

# Synthetic data: y = 4 + 3x + noise
np.random.seed(42)
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)

model = LinearRegressionNormal()
model.fit(X, y)
# theta has shape (2, 1) because y is a column vector
print(f"Intercept: {model.theta[0, 0]:.4f}, Slope: {model.theta[1, 0]:.4f}")
```
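Inverting $\mathbf{X}^T\mathbf{X}$ explicitly can fail or lose precision when features are collinear. As a minimal alternative sketch (assuming the same `X` and `y` as above), `np.linalg.lstsq` solves the same least-squares problem without forming an explicit inverse:

```python
# Numerically safer alternative to inverting X^T X
X_b = np.c_[np.ones((X.shape[0], 1)), X]
theta, residuals, rank, sv = np.linalg.lstsq(X_b, y, rcond=None)
print(f"Intercept: {theta[0, 0]:.4f}, Slope: {theta[1, 0]:.4f}")
```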
Gradient Descent Implementation
For large datasets, the loss is minimized iteratively instead. Differentiating the loss gives the gradients used in the update rule:
$$\frac{\partial L}{\partial \mathbf{w}} = \frac{1}{n}\mathbf{X}^T(\hat{\mathbf{y}} - \mathbf{y}), \qquad \frac{\partial L}{\partial b} = \frac{1}{n}\sum_{i=1}^{n}(\hat{y}_i - y_i)$$
```python
class LinearRegressionGD:
    """Linear regression trained with batch gradient descent."""

    def __init__(self, lr=0.01, n_iters=1000):
        self.lr = lr
        self.n_iters = n_iters
        self.weights = None
        self.bias = None
        self.losses = []

    def fit(self, X, y):
        n_samples, n_features = X.shape
        y = np.ravel(y)  # flatten so that error has shape (n_samples,)
        self.weights = np.zeros(n_features)
        self.bias = 0.0

        for _ in range(self.n_iters):
            y_pred = np.dot(X, self.weights) + self.bias
            error = y_pred - y

            # Gradients of the MSE loss
            dw = (1 / n_samples) * np.dot(X.T, error)
            db = (1 / n_samples) * np.sum(error)

            self.weights -= self.lr * dw
            self.bias -= self.lr * db

            # Record the loss for convergence monitoring
            loss = np.mean(error ** 2) / 2
            self.losses.append(loss)

        return self

    def predict(self, X):
        return np.dot(X, self.weights) + self.bias
```
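As a quick sanity check, the sketch below fits the class on the synthetic data from the normal-equation section (assuming `X` and `y` are still in scope; the learning rate is bumped to 0.1 so 1000 iterations suffice on this scale). The learned parameters should approach the generating values $b = 4$ and $w = 3$, and the recorded losses should decrease:

```python
# Fit on the same synthetic data; fit() flattens the (100, 1) target internally
gd = LinearRegressionGD(lr=0.1, n_iters=1000).fit(X, y)
print(f"bias: {gd.bias:.4f}, weight: {gd.weights[0]:.4f}")
print(f"first loss: {gd.losses[0]:.4f}, last loss: {gd.losses[-1]:.4f}")
```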
Polynomial Regression
Linear regression can be extended to polynomial regression to fit nonlinear relationships: the features are expanded into polynomial terms, and an ordinary linear model is fitted on them:
```python
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import Pipeline

poly_reg = Pipeline([
    # include_bias=False: the model below already learns its own bias term
    ('poly', PolynomialFeatures(degree=2, include_bias=False)),
    ('linear', LinearRegressionGD(lr=0.01, n_iters=1000))
])
```
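As a usage sketch (the quadratic data below is made up for illustration), the pipeline is fitted and queried like a plain estimator:

```python
# Synthetic nonlinear data: y = 0.5 x^2 + x + 2 + noise
np.random.seed(0)
X_nl = 6 * np.random.rand(200, 1) - 3
y_nl = 0.5 * X_nl[:, 0] ** 2 + X_nl[:, 0] + 2 + np.random.randn(200)

poly_reg.fit(X_nl, y_nl)
print(poly_reg.predict(np.array([[0.0], [1.0], [2.0]])))
```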
Regularized Linear Regression
Ridge Regression (L2 Regularization)
$$L_{\text{Ridge}} = \frac{1}{2n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2 + \frac{\lambda}{2n}\|\mathbf{w}\|_2^2$$
```python
class RidgeRegression:
    """Linear regression with L2 regularization, trained by gradient descent."""

    def __init__(self, alpha=1.0, lr=0.01, n_iters=1000):
        self.alpha = alpha
        self.lr = lr
        self.n_iters = n_iters

    def fit(self, X, y):
        n_samples, n_features = X.shape
        y = np.ravel(y)
        self.weights = np.zeros(n_features)
        self.bias = 0.0

        for _ in range(self.n_iters):
            y_pred = np.dot(X, self.weights) + self.bias
            error = y_pred - y

            # The L2 penalty adds alpha * w to the gradient; the bias is not regularized
            dw = (1 / n_samples) * (np.dot(X.T, error) + self.alpha * self.weights)
            db = (1 / n_samples) * np.sum(error)

            self.weights -= self.lr * dw
            self.bias -= self.lr * db

        return self

    def predict(self, X):
        return np.dot(X, self.weights) + self.bias
```
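To illustrate the shrinkage effect, here is a small sketch on made-up data: as `alpha` grows, the learned weights are pulled toward zero:

```python
# Larger alpha -> smaller weights (shrinkage)
np.random.seed(1)
X_r = np.random.randn(200, 5)
y_r = X_r @ np.array([3.0, -2.0, 0.5, 1.0, -1.5]) + np.random.randn(200)

for alpha in (0.0, 10.0, 100.0):
    w = RidgeRegression(alpha=alpha, lr=0.05, n_iters=2000).fit(X_r, y_r).weights
    print(f"alpha={alpha:>5}: {np.round(w, 3)}")
```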
Lasso Regression (L1 Regularization)
L1 regularization produces sparse solutions and is therefore useful for feature selection:

$$L_{\text{Lasso}} = \frac{1}{2n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2 + \lambda\|\mathbf{w}\|_1$$

Since the L1 term is not differentiable at zero, the implementation below uses its subgradient $\lambda\,\mathrm{sign}(\mathbf{w})$:
```python
class LassoRegression:
    """Linear regression with L1 regularization, trained by subgradient descent."""

    def __init__(self, alpha=1.0, lr=0.01, n_iters=1000):
        self.alpha = alpha
        self.lr = lr
        self.n_iters = n_iters

    def fit(self, X, y):
        n_samples, n_features = X.shape
        y = np.ravel(y)
        self.weights = np.zeros(n_features)
        self.bias = 0.0

        for _ in range(self.n_iters):
            y_pred = np.dot(X, self.weights) + self.bias
            error = y_pred - y

            # Subgradient of the L1 penalty: alpha * sign(w); the bias is not regularized
            dw = (1 / n_samples) * np.dot(X.T, error) + self.alpha * np.sign(self.weights)
            db = (1 / n_samples) * np.sum(error)

            self.weights -= self.lr * dw
            self.bias -= self.lr * db

        return self

    def predict(self, X):
        return np.dot(X, self.weights) + self.bias
```
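To see the sparsity claim in action, consider a sketch with made-up data in which only the first two of five features carry signal. Plain subgradient descent does not produce exact zeros (that requires coordinate descent or proximal updates, as in scikit-learn's `Lasso`), but the irrelevant weights end up oscillating near zero:

```python
# Only features 0 and 1 carry signal; features 2-4 are pure noise
np.random.seed(2)
X_s = np.random.randn(300, 5)
y_s = X_s @ np.array([2.0, -3.0, 0.0, 0.0, 0.0]) + 0.1 * np.random.randn(300)

lasso = LassoRegression(alpha=0.5, lr=0.05, n_iters=2000).fit(X_s, y_s)
print(np.round(lasso.weights, 3))  # the last three weights should sit near zero
```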
Model Evaluation
```python
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = LinearRegressionGD(lr=0.01, n_iters=1000)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

print(f"MSE: {mean_squared_error(y_test, y_pred):.4f}")
print(f"R²: {r2_score(y_test, y_pred):.4f}")
```
Summary
Linear regression is simple, yet it embodies the core concepts of machine learning: loss functions, optimization methods, regularization, and model evaluation. The normal equation yields an exact solution, while gradient descent scales to large datasets. Ridge and Lasso regularization address overfitting and feature selection respectively, making them indispensable tools in practice.