A Complete Guide to Loss Functions: From MSE to CrossEntropy
A loss function measures the discrepancy between a model's predictions and the true labels, and it sits at the heart of model optimization. Choosing an appropriate loss function is critical to training performance.
Regression Loss Functions
Mean Squared Error (MSE)
MSE is the most commonly used regression loss function:
$$L_{MSE} = \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2$$
```python
import numpy as np

def mse_loss(y_true, y_pred):
    return np.mean((y_true - y_pred) ** 2)

def mse_gradient(y_true, y_pred):
    return 2 * (y_pred - y_true) / len(y_true)

y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.1, 1.9, 3.2, 3.8])
print(f"MSE Loss: {mse_loss(y_true, y_pred):.4f}")
```
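The analytic gradient above can be sanity-checked against a finite-difference approximation, a standard trick when deriving gradients by hand (the functions are repeated so the snippet runs standalone):

```python
import numpy as np

def mse_loss(y_true, y_pred):
    return np.mean((y_true - y_pred) ** 2)

def mse_gradient(y_true, y_pred):
    return 2 * (y_pred - y_true) / len(y_true)

y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.1, 1.9, 3.2, 3.8])

# Perturb each prediction by h and compare the loss change to the gradient
h = 1e-6
numeric = np.zeros_like(y_pred)
for i in range(len(y_pred)):
    bumped = y_pred.copy()
    bumped[i] += h
    numeric[i] = (mse_loss(y_true, bumped) - mse_loss(y_true, y_pred)) / h

print(np.allclose(numeric, mse_gradient(y_true, y_pred), atol=1e-4))  # True
```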
Mean Absolute Error (MAE)
$$L_{MAE} = \frac{1}{n}\sum_{i=1}^{n}|y_i - \hat{y}_i|$$
```python
def mae_loss(y_true, y_pred):
    return np.mean(np.abs(y_true - y_pred))
```
MAE is more robust to outliers, but it is not differentiable at zero.
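A small numerical comparison makes the robustness difference concrete: one wildly wrong prediction inflates MSE quadratically but MAE only linearly (the losses are repeated so the snippet runs standalone):

```python
import numpy as np

def mse_loss(y_true, y_pred):
    return np.mean((y_true - y_pred) ** 2)

def mae_loss(y_true, y_pred):
    return np.mean(np.abs(y_true - y_pred))

y_true = np.array([1.0, 2.0, 3.0, 4.0])
clean = np.array([1.1, 2.1, 2.9, 3.9])
outlier = np.array([1.1, 2.1, 2.9, 10.0])  # last prediction is wildly off

# MSE is dominated by the single outlier; MAE degrades far more gracefully
print(f"MSE clean: {mse_loss(y_true, clean):.4f}, with outlier: {mse_loss(y_true, outlier):.4f}")
print(f"MAE clean: {mae_loss(y_true, clean):.4f}, with outlier: {mae_loss(y_true, outlier):.4f}")
```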
Huber Loss
Huber Loss combines the advantages of MSE and MAE:
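For reference, the standard piecewise definition with threshold $\delta$, consistent with the implementation below:

$$L_{\delta}(y, \hat{y}) = \begin{cases} \frac{1}{2}(y - \hat{y})^2 & \text{if } |y - \hat{y}| \le \delta \\ \delta|y - \hat{y}| - \frac{1}{2}\delta^2 & \text{otherwise} \end{cases}$$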
```python
def huber_loss(y_true, y_pred, delta=1.0):
    error = y_true - y_pred
    abs_error = np.abs(error)
    quadratic = np.minimum(abs_error, delta)
    linear = abs_error - quadratic
    return 0.5 * quadratic ** 2 + delta * linear
```
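A quick check of the two regimes with delta=1.0 (the function is repeated so the snippet runs standalone); note that this formulation returns per-element losses, which can be averaged with np.mean if a scalar is needed:

```python
import numpy as np

def huber_loss(y_true, y_pred, delta=1.0):
    error = y_true - y_pred
    abs_error = np.abs(error)
    quadratic = np.minimum(abs_error, delta)
    linear = abs_error - quadratic
    return 0.5 * quadratic ** 2 + delta * linear

# Small error (|e| <= delta): quadratic like MSE -> 0.5 * 0.5^2 = 0.125
small = huber_loss(np.array([0.0]), np.array([0.5]))
# Large error (|e| > delta): linear like MAE -> 1.0 * 5 - 0.5 = 4.5
large = huber_loss(np.array([0.0]), np.array([5.0]))
print(small[0], large[0])  # 0.125 4.5
```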
Classification Loss Functions
Binary Cross-Entropy
$$L_{BCE} = -\frac{1}{n}\sum_{i=1}^{n}[y_i\log(\hat{y}_i) + (1-y_i)\log(1-\hat{y}_i)]$$
```python
def binary_cross_entropy(y_true, y_pred, eps=1e-15):
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
```
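A usage sketch: confident correct predictions drive the loss toward zero, while predicting 0.5 everywhere yields exactly ln 2 (the function is repeated so the snippet runs standalone):

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-15):
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

y_true = np.array([1.0, 0.0, 1.0, 0.0])
confident = np.array([0.9, 0.1, 0.8, 0.2])  # close to the labels -> low loss
uncertain = np.array([0.5, 0.5, 0.5, 0.5])  # maximally unsure -> loss = ln 2

print(f"confident: {binary_cross_entropy(y_true, confident):.4f}")
print(f"uncertain: {binary_cross_entropy(y_true, uncertain):.4f}")  # ≈ 0.6931
```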
Categorical Cross-Entropy
```python
def categorical_cross_entropy(y_true, y_pred, eps=1e-15):
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(np.sum(y_true * np.log(y_pred), axis=1))
```
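Usage with one-hot labels and probability rows (e.g. softmax outputs), where only the log-probability of the true class in each row contributes:

```python
import numpy as np

def categorical_cross_entropy(y_true, y_pred, eps=1e-15):
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(np.sum(y_true * np.log(y_pred), axis=1))

# One-hot labels for 3 samples over 3 classes
y_true = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0],
                   [0.0, 0.0, 1.0]])
# Probability rows, each summing to 1
y_pred = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.8, 0.1],
                   [0.2, 0.2, 0.6]])

loss = categorical_cross_entropy(y_true, y_pred)
print(f"CCE Loss: {loss:.4f}")  # -(ln 0.7 + ln 0.8 + ln 0.6) / 3 ≈ 0.3635
```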
Focal Loss
Focal Loss addresses class imbalance and is widely used in object detection:
```python
def focal_loss(y_true, y_pred, gamma=2.0, alpha=0.25, eps=1e-15):
    y_pred = np.clip(y_pred, eps, 1 - eps)
    bce = -y_true * np.log(y_pred) - (1 - y_true) * np.log(1 - y_pred)
    p_t = y_true * y_pred + (1 - y_true) * (1 - y_pred)
    alpha_t = y_true * alpha + (1 - y_true) * (1 - alpha)
    return np.mean(alpha_t * (1 - p_t) ** gamma * bce)
```
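The effect of the (1 - p_t)^gamma modulating factor can be seen by comparing a well-classified and a misclassified positive example (the function is repeated so the snippet runs standalone):

```python
import numpy as np

def focal_loss(y_true, y_pred, gamma=2.0, alpha=0.25, eps=1e-15):
    y_pred = np.clip(y_pred, eps, 1 - eps)
    bce = -y_true * np.log(y_pred) - (1 - y_true) * np.log(1 - y_pred)
    p_t = y_true * y_pred + (1 - y_true) * (1 - y_pred)
    alpha_t = y_true * alpha + (1 - y_true) * (1 - alpha)
    return np.mean(alpha_t * (1 - p_t) ** gamma * bce)

y_true = np.array([1.0])
easy = np.array([0.95])  # well-classified positive
hard = np.array([0.30])  # misclassified positive

# (1 - p_t)^gamma shrinks the easy example's loss by orders of magnitude,
# focusing training on the hard examples
print(f"easy: {focal_loss(y_true, easy):.6f}")
print(f"hard: {focal_loss(y_true, hard):.6f}")
```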
Loss Functions in PyTorch
```python
import torch
import torch.nn as nn

# Regression losses
mse = nn.MSELoss()
mae = nn.L1Loss()
huber = nn.SmoothL1Loss()

# Classification losses
bce = nn.BCELoss()
bce_logits = nn.BCEWithLogitsLoss()
ce = nn.CrossEntropyLoss()

logits = torch.randn(32, 10)
labels = torch.randint(0, 10, (32,))
loss = ce(logits, labels)
print(f"CrossEntropy Loss: {loss.item():.4f}")
```
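One distinction worth making explicit: nn.BCELoss expects probabilities, while nn.BCEWithLogitsLoss takes raw logits and fuses the sigmoid into the loss, which is numerically safer for large-magnitude logits. A quick sanity check that the two agree:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
logits = torch.randn(8)
targets = torch.randint(0, 2, (8,)).float()

# sigmoid + BCELoss vs. the fused, numerically stabler BCEWithLogitsLoss
loss_a = nn.BCELoss()(torch.sigmoid(logits), targets)
loss_b = nn.BCEWithLogitsLoss()(logits, targets)
print(loss_a.item(), loss_b.item())  # agree up to floating-point error
```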
Custom Loss Functions
In some scenarios a custom loss function is needed. For example, a weighted classification loss:
```python
class WeightedCrossEntropyLoss(nn.Module):
    def __init__(self, weight=None):
        super().__init__()
        if weight is not None:
            self.weight = torch.tensor(weight, dtype=torch.float32)
        else:
            self.weight = None

    def forward(self, logits, targets):
        log_probs = torch.log_softmax(logits, dim=1)
        nll_loss = -log_probs.gather(1, targets.unsqueeze(1)).squeeze(1)
        if self.weight is not None:
            self.weight = self.weight.to(logits.device)
            nll_loss = nll_loss * self.weight[targets]
        return nll_loss.mean()
```
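A consistency check against PyTorch's built-in weighted loss. Note a subtlety: this class averages the weighted per-sample losses over the batch, while nn.CrossEntropyLoss(weight=...) with the default mean reduction normalizes by the sum of the selected weights, so the comparison below uses reduction='sum' (the class is repeated so the snippet runs standalone):

```python
import torch
import torch.nn as nn

class WeightedCrossEntropyLoss(nn.Module):
    def __init__(self, weight=None):
        super().__init__()
        if weight is not None:
            self.weight = torch.tensor(weight, dtype=torch.float32)
        else:
            self.weight = None

    def forward(self, logits, targets):
        log_probs = torch.log_softmax(logits, dim=1)
        nll_loss = -log_probs.gather(1, targets.unsqueeze(1)).squeeze(1)
        if self.weight is not None:
            self.weight = self.weight.to(logits.device)
            nll_loss = nll_loss * self.weight[targets]
        return nll_loss.mean()

torch.manual_seed(0)
logits = torch.randn(4, 3)
targets = torch.tensor([0, 2, 1, 2])

custom = WeightedCrossEntropyLoss(weight=[1.0, 2.0, 4.0])(logits, targets)
# reduction='sum' sidesteps the differing normalizations
builtin = nn.CrossEntropyLoss(weight=torch.tensor([1.0, 2.0, 4.0]),
                              reduction='sum')(logits, targets)
print(custom.item() * len(targets), builtin.item())  # equal up to float error
```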
损失函数选择指南
| 任务类型 |
推荐损失函数 |
适用场景 |
| 回归 |
MSE |
一般回归任务 |
| 回归 |
MAE |
存在异常值 |
| 回归 |
Huber |
兼顾MSE和MAE |
| 二分类 |
BCE |
二分类任务 |
| 多分类 |
CrossEntropy |
多分类任务 |
| 目标检测 |
Focal Loss |
类别不平衡 |
| 图像分割 |
Dice Loss |
样本不平衡 |
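Dice Loss appears in the table but was not implemented above; here is a minimal sketch of one common soft-Dice formulation for binary segmentation (exact formulations vary across libraries, and the smoothing term is an assumption of this sketch):

```python
import torch

def dice_loss(pred, target, smooth=1.0):
    # Soft Dice: 1 - 2|X∩Y| / (|X| + |Y|), with a smoothing term for stability
    pred = pred.reshape(-1)
    target = target.reshape(-1)
    intersection = (pred * target).sum()
    return 1 - (2 * intersection + smooth) / (pred.sum() + target.sum() + smooth)

pred = torch.sigmoid(torch.randn(1, 1, 8, 8))       # predicted probabilities
target = torch.randint(0, 2, (1, 1, 8, 8)).float()  # binary ground-truth mask
print(f"Dice Loss: {dice_loss(pred, target).item():.4f}")
```

A perfect prediction drives the loss to zero, which makes it a direct optimization target for the overlap between mask and prediction.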
Regularization and the Loss Function
In practical training, a regularization term is often added to the loss:
```python
def l2_regularization(model, lambda_l2=0.001):
    l2_reg = torch.tensor(0.0)
    for param in model.parameters():
        l2_reg += torch.norm(param, p=2)
    return lambda_l2 * l2_reg

total_loss = criterion(output, target) + l2_regularization(model)
```
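Note that the function above sums unsquared L2 norms; the more common squared-L2 penalty is usually applied through the optimizer's weight_decay argument instead, which folds the penalty into the update step:

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 1)
# weight_decay applies the squared-L2 penalty's gradient inside the update,
# so no explicit term needs to be added to the loss
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, weight_decay=0.001)
```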
Summary
The loss function is the bridge between model predictions and the optimization objective. Regression tasks commonly use MSE or Huber Loss, while classification tasks rely on cross-entropy losses. Under class imbalance, Focal Loss and Dice Loss are more effective. Understanding the characteristics and applicable scenarios of each loss function is key to designing an efficient training pipeline.