Linear Regression: From Mathematical Principles to Code Implementation


Linear regression is the most fundamental machine learning algorithm and the cornerstone for understanding more complex models. This article starts from the mathematical principles and builds up to a working implementation step by step.

Mathematical Principles

Linear regression assumes a linear relationship between the target variable and the features:

$$\hat{y} = w_1x_1 + w_2x_2 + \cdots + w_nx_n + b = \mathbf{w}^T\mathbf{x} + b$$

The objective is to minimize the mean squared error:

$$L(\mathbf{w}, b) = \frac{1}{2n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2$$
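
To make this concrete, here is a minimal numeric sketch (the toy data and parameter values are invented for illustration) that evaluates the loss for a single-feature model:

```python
import numpy as np

# Toy data, roughly y = 2x + 1 plus noise (values chosen for illustration)
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.1, 2.9, 5.2, 6.8])

w, b = 2.0, 1.0                        # candidate parameters
y_hat = w * x + b                      # model predictions
loss = np.mean((y - y_hat) ** 2) / 2   # (1/2n) * sum of squared residuals
print(f"L(w, b) = {loss:.4f}")         # 0.0125
```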

The Normal Equation

For linear regression, a closed-form solution exists:

$$\mathbf{w}^* = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}$$

```python
import numpy as np

class LinearRegressionNormal:
    """Linear regression solved with the normal equation."""

    def fit(self, X, y):
        # Add the bias column
        X_b = np.c_[np.ones((X.shape[0], 1)), X]
        # Normal equation (np.linalg.pinv is a more stable choice
        # when X_b.T @ X_b is close to singular)
        self.theta = np.linalg.inv(X_b.T @ X_b) @ X_b.T @ y
        return self

    def predict(self, X):
        X_b = np.c_[np.ones((X.shape[0], 1)), X]
        return X_b @ self.theta

# Usage example
np.random.seed(42)
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)

model = LinearRegressionNormal()
model.fit(X, y)
# theta has shape (2, 1), so .item() extracts the scalars for formatting
print(f"Intercept: {model.theta[0].item():.4f}, slope: {model.theta[1].item():.4f}")
```

Gradient Descent Implementation

```python
class LinearRegressionGD:
    """Linear regression trained with batch gradient descent."""

    def __init__(self, lr=0.01, n_iters=1000):
        self.lr = lr
        self.n_iters = n_iters
        self.weights = None
        self.bias = None
        self.losses = []

    def fit(self, X, y):
        y = np.asarray(y).ravel()  # flatten (n, 1) targets to avoid broadcasting bugs
        n_samples, n_features = X.shape
        self.weights = np.zeros(n_features)
        self.bias = 0.0

        for _ in range(self.n_iters):
            y_pred = np.dot(X, self.weights) + self.bias
            error = y_pred - y

            # Gradients of the (1/2n) squared-error loss
            dw = (1 / n_samples) * np.dot(X.T, error)
            db = (1 / n_samples) * np.sum(error)

            self.weights -= self.lr * dw
            self.bias -= self.lr * db

            loss = np.mean(error ** 2) / 2
            self.losses.append(loss)

        return self

    def predict(self, X):
        return np.dot(X, self.weights) + self.bias
```
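
As a quick sanity check, here is a minimal usage sketch reusing the synthetic X and y from the normal-equation example (the learning rate 0.1 is an arbitrary choice for this data):

```python
model_gd = LinearRegressionGD(lr=0.1, n_iters=1000)
model_gd.fit(X, y)  # fit() flattens y internally, so the (100, 1) target is fine
print(f"Bias: {model_gd.bias:.4f}, weight: {model_gd.weights[0]:.4f}")
print(f"Loss: {model_gd.losses[0]:.4f} -> {model_gd.losses[-1]:.4f}")
```

The recovered parameters should approach the normal-equation solution as the loss curve flattens.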

Polynomial Regression

Linear regression can be extended to polynomial regression to fit nonlinear relationships:

```python
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import Pipeline

poly_reg = Pipeline([
    # include_bias=False: the model below already learns its own bias term
    ('poly', PolynomialFeatures(degree=2, include_bias=False)),
    ('linear', LinearRegressionGD(lr=0.01, n_iters=1000))
])
```
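
A usage sketch on hypothetical quadratic data (this assumes a scikit-learn version where a duck-typed final estimator exposing only fit and predict is accepted by Pipeline):

```python
# Hypothetical nonlinear data: y = 0.5x^2 + x + 2 + noise
np.random.seed(0)
X_nl = 6 * np.random.rand(200, 1) - 3
y_nl = 0.5 * X_nl[:, 0] ** 2 + X_nl[:, 0] + 2 + np.random.randn(200)

poly_reg.fit(X_nl, y_nl)
print(poly_reg.predict(X_nl[:3]))
```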

Regularized Linear Regression

Ridge Regression (L2 Regularization)

$$L_{Ridge} = \frac{1}{2n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2 + \frac{\lambda}{2n}\|\mathbf{w}\|_2^2$$

```python
class RidgeRegression:
    """Linear regression with L2 regularization, trained with gradient descent."""

    def __init__(self, alpha=1.0, lr=0.01, n_iters=1000):
        self.alpha = alpha  # regularization strength (lambda above)
        self.lr = lr
        self.n_iters = n_iters

    def fit(self, X, y):
        y = np.asarray(y).ravel()
        n_samples, n_features = X.shape
        self.weights = np.zeros(n_features)
        self.bias = 0.0

        for _ in range(self.n_iters):
            y_pred = np.dot(X, self.weights) + self.bias
            error = y_pred - y

            # The L2 penalty contributes (alpha / n) * w; the bias is not regularized
            dw = (1 / n_samples) * (np.dot(X.T, error) + self.alpha * self.weights)
            db = (1 / n_samples) * np.sum(error)

            self.weights -= self.lr * dw
            self.bias -= self.lr * db
        return self

    def predict(self, X):
        return np.dot(X, self.weights) + self.bias
```
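
A quick shrinkage check with a few hypothetical alpha values; the learned weight should shrink toward zero as alpha grows:

```python
for alpha in (0.0, 1.0, 100.0):
    ridge = RidgeRegression(alpha=alpha, lr=0.1, n_iters=1000).fit(X, y)
    print(f"alpha={alpha:6.1f}: weight={ridge.weights[0]:.4f}, bias={ridge.bias:.4f}")
```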

Lasso Regression (L1 Regularization)

L1 regularization produces sparse solutions, which makes it useful for feature selection. The corresponding loss adds an L1 penalty:

$$L_{Lasso} = \frac{1}{2n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2 + \lambda\|\mathbf{w}\|_1$$

```python
class LassoRegression:
    """Linear regression with L1 regularization (subgradient descent)."""

    def __init__(self, alpha=1.0, lr=0.01, n_iters=1000):
        self.alpha = alpha
        self.lr = lr
        self.n_iters = n_iters

    def fit(self, X, y):
        y = np.asarray(y).ravel()
        n_samples, n_features = X.shape
        self.weights = np.zeros(n_features)
        self.bias = 0.0

        for _ in range(self.n_iters):
            y_pred = np.dot(X, self.weights) + self.bias
            error = y_pred - y

            # Subgradient of the L1 penalty: alpha * sign(w) (np.sign(0) == 0)
            dw = (1 / n_samples) * np.dot(X.T, error) + self.alpha * np.sign(self.weights)
            db = (1 / n_samples) * np.sum(error)

            self.weights -= self.lr * dw
            self.bias -= self.lr * db
        return self

    def predict(self, X):
        return np.dot(X, self.weights) + self.bias
```
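
To see the sparsity in action, here is a sketch on hypothetical data where only the first of five features carries signal:

```python
# Only feature 0 is informative; the other four are pure noise
rng = np.random.default_rng(1)
X_sp = rng.normal(size=(200, 5))
y_sp = 3 * X_sp[:, 0] + rng.normal(scale=0.1, size=200)

lasso = LassoRegression(alpha=0.5, lr=0.1, n_iters=2000).fit(X_sp, y_sp)
print(np.round(lasso.weights, 3))  # noise-feature weights end up near zero
```

Note that plain subgradient steps only drive irrelevant weights near zero; exact zeros typically require soft-thresholding (proximal) or coordinate-descent updates, which is what sklearn's Lasso uses.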

Model Evaluation

```python
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = LinearRegressionGD(lr=0.01, n_iters=1000)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

print(f"MSE: {mean_squared_error(y_test, y_pred):.4f}")
print(f"R²: {r2_score(y_test, y_pred):.4f}")
```

Summary

Although linear regression is simple, it touches the core concepts of machine learning: loss functions, optimization methods, regularization, and model evaluation. The normal equation provides an exact solution, while gradient descent scales to large datasets. Ridge and Lasso regularization address overfitting and feature selection respectively, making them indispensable tools in practice.
