A Complete Guide to Loss Functions: From MSE to CrossEntropy



A loss function measures the discrepancy between a model's predictions and the ground-truth labels, and it sits at the core of model optimization. Choosing an appropriate loss function is critical to training quality.

Regression Loss Functions

Mean Squared Error (MSE)

MSE is the most commonly used regression loss:

$$L_{MSE} = \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2$$

```python
import numpy as np

def mse_loss(y_true, y_pred):
    return np.mean((y_true - y_pred) ** 2)

def mse_gradient(y_true, y_pred):
    # Gradient of MSE with respect to y_pred
    return 2 * (y_pred - y_true) / len(y_true)

# Example
y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.1, 1.9, 3.2, 3.8])
print(f"MSE Loss: {mse_loss(y_true, y_pred):.4f}")
```
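As a quick sanity check on the analytic gradient, it can be compared against a central finite-difference approximation (the definitions are repeated here so the snippet runs standalone):

```python
import numpy as np

def mse_loss(y_true, y_pred):
    return np.mean((y_true - y_pred) ** 2)

def mse_gradient(y_true, y_pred):
    return 2 * (y_pred - y_true) / len(y_true)

y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.1, 1.9, 3.2, 3.8])

# Perturb each prediction by +/- eps and measure the slope of the loss
eps = 1e-6
numeric = np.array([
    (mse_loss(y_true, y_pred + eps * np.eye(4)[i])
     - mse_loss(y_true, y_pred - eps * np.eye(4)[i])) / (2 * eps)
    for i in range(4)
])
analytic = mse_gradient(y_true, y_pred)
print(np.max(np.abs(numeric - analytic)))  # close to 0
```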

Mean Absolute Error (MAE)

$$L_{MAE} = \frac{1}{n}\sum_{i=1}^{n}|y_i - \hat{y}_i|$$

```python
def mae_loss(y_true, y_pred):
    return np.mean(np.abs(y_true - y_pred))
```

MAE is more robust to outliers than MSE, but it is not differentiable at zero.
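To make the robustness point concrete, here is a small comparison (hypothetical numbers) in which a single outlier dominates MSE because the error is squared, while MAE grows only linearly:

```python
import numpy as np

y_true = np.array([1.0, 2.0, 3.0, 100.0])  # last point is an outlier
y_pred = np.array([1.0, 2.0, 3.0, 3.0])    # model fits everything but the outlier

mse = np.mean((y_true - y_pred) ** 2)   # 97^2 / 4 = 2352.25
mae = np.mean(np.abs(y_true - y_pred))  # 97 / 4   = 24.25
print(mse, mae)
```

A model trained under MSE would bend heavily toward the outlier; under MAE it would not.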

Huber Loss

Huber Loss combines the advantages of MSE and MAE: it is quadratic for small errors and linear for large ones:

```python
def huber_loss(y_true, y_pred, delta=1.0):
    error = y_true - y_pred
    abs_error = np.abs(error)
    quadratic = np.minimum(abs_error, delta)  # quadratic part, capped at delta
    linear = abs_error - quadratic            # linear part beyond delta
    # Returns the per-element loss; apply np.mean(...) for a scalar
    return 0.5 * quadratic ** 2 + delta * linear
```
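The two regimes can be checked directly (the `huber_loss` definition above is repeated so the snippet runs on its own):

```python
import numpy as np

def huber_loss(y_true, y_pred, delta=1.0):
    error = y_true - y_pred
    abs_error = np.abs(error)
    quadratic = np.minimum(abs_error, delta)
    linear = abs_error - quadratic
    return 0.5 * quadratic ** 2 + delta * linear

# |error| <= delta: behaves like MSE, 0.5 * e^2 = 0.125
small = huber_loss(np.array([0.0]), np.array([0.5]))[0]
# |error| > delta: behaves like MAE, |e| - 0.5 * delta = 9.5
large = huber_loss(np.array([0.0]), np.array([10.0]))[0]
print(small, large)
```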

Classification Loss Functions

Binary Cross-Entropy

$$L_{BCE} = -\frac{1}{n}\sum_{i=1}^{n}[y_i\log(\hat{y}_i) + (1-y_i)\log(1-\hat{y}_i)]$$

```python
def binary_cross_entropy(y_true, y_pred, eps=1e-15):
    y_pred = np.clip(y_pred, eps, 1 - eps)  # avoid log(0)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
```
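The `eps` clipping matters in practice: a fully confident, fully wrong prediction would otherwise evaluate `log(0)` and produce an infinite loss. A small demonstration (definition repeated so the snippet is self-contained):

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-15):
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

y_true = np.array([1.0, 0.0])
y_pred = np.array([0.0, 1.0])  # confidently wrong on both samples
loss = binary_cross_entropy(y_true, y_pred)
print(np.isfinite(loss))  # True: clipping keeps the loss large but finite
```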

Categorical Cross-Entropy

```python
def categorical_cross_entropy(y_true, y_pred, eps=1e-15):
    # y_true: one-hot labels, y_pred: per-class probabilities, both of shape (n, C)
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(np.sum(y_true * np.log(y_pred), axis=1))
```

Focal Loss

Focal Loss addresses class imbalance and is widely used in object detection:

```python
def focal_loss(y_true, y_pred, gamma=2.0, alpha=0.25, eps=1e-15):
    y_pred = np.clip(y_pred, eps, 1 - eps)
    bce = -y_true * np.log(y_pred) - (1 - y_true) * np.log(1 - y_pred)
    p_t = y_true * y_pred + (1 - y_true) * (1 - y_pred)   # probability of the true class
    alpha_t = y_true * alpha + (1 - y_true) * (1 - alpha)  # class-balancing weight
    return np.mean(alpha_t * (1 - p_t) ** gamma * bce)
```
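The key mechanism is the `(1 - p_t) ** gamma` factor, which down-weights well-classified examples so training focuses on hard ones. Comparing an easy and a hard positive (definition repeated so the snippet runs standalone):

```python
import numpy as np

def focal_loss(y_true, y_pred, gamma=2.0, alpha=0.25, eps=1e-15):
    y_pred = np.clip(y_pred, eps, 1 - eps)
    bce = -y_true * np.log(y_pred) - (1 - y_true) * np.log(1 - y_pred)
    p_t = y_true * y_pred + (1 - y_true) * (1 - y_pred)
    alpha_t = y_true * alpha + (1 - y_true) * (1 - alpha)
    return np.mean(alpha_t * (1 - p_t) ** gamma * bce)

pos = np.array([1.0])
easy_loss = focal_loss(pos, np.array([0.9]))  # well-classified: heavily down-weighted
hard_loss = focal_loss(pos, np.array([0.1]))  # misclassified: keeps nearly full weight
print(easy_loss, hard_loss)
```

With plain BCE the two losses differ by a factor of about 22; the focal factor widens this to several orders of magnitude.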

Loss Functions in PyTorch

```python
import torch
import torch.nn as nn

# Regression losses
mse = nn.MSELoss()
mae = nn.L1Loss()
huber = nn.SmoothL1Loss()

# Classification losses
bce = nn.BCELoss()
bce_logits = nn.BCEWithLogitsLoss()  # Sigmoid + BCE in one op
ce = nn.CrossEntropyLoss()           # Softmax + cross-entropy in one op

# Example
logits = torch.randn(32, 10)          # batch size 32, 10 classes
labels = torch.randint(0, 10, (32,))  # ground-truth labels
loss = ce(logits, labels)
print(f"CrossEntropy Loss: {loss.item():.4f}")
```
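The "Softmax + cross-entropy in one op" claim can be verified numerically: `nn.CrossEntropyLoss` is exactly `log_softmax` followed by `nn.NLLLoss`:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
logits = torch.randn(32, 10)
labels = torch.randint(0, 10, (32,))

ce = nn.CrossEntropyLoss()(logits, labels)
# Equivalent decomposition: LogSoftmax followed by the negative log-likelihood loss
nll = nn.NLLLoss()(torch.log_softmax(logits, dim=1), labels)
print(torch.allclose(ce, nll))  # True
```

This is also why `CrossEntropyLoss` expects raw logits, not probabilities already passed through a softmax.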

Custom Loss Functions

In some scenarios, a custom loss function is needed. For example, a class-weighted classification loss:

```python
class WeightedCrossEntropyLoss(nn.Module):
    def __init__(self, weight=None):
        super().__init__()
        if weight is not None:
            self.weight = torch.tensor(weight, dtype=torch.float32)
        else:
            self.weight = None

    def forward(self, logits, targets):
        log_probs = torch.log_softmax(logits, dim=1)
        # Negative log-likelihood of the target class for each sample
        nll_loss = -log_probs.gather(1, targets.unsqueeze(1)).squeeze(1)

        if self.weight is not None:
            self.weight = self.weight.to(logits.device)
            nll_loss = nll_loss * self.weight[targets]

        return nll_loss.mean()
```
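As a sanity check, in the unweighted case the class above should match `nn.CrossEntropyLoss` exactly (the class is repeated here so the snippet runs standalone). Note that when class weights are supplied, the built-in normalizes by the sum of sample weights rather than the batch size, so weighted values differ by that normalization factor:

```python
import torch
import torch.nn as nn

class WeightedCrossEntropyLoss(nn.Module):
    def __init__(self, weight=None):
        super().__init__()
        self.weight = None if weight is None else torch.tensor(weight, dtype=torch.float32)

    def forward(self, logits, targets):
        log_probs = torch.log_softmax(logits, dim=1)
        nll_loss = -log_probs.gather(1, targets.unsqueeze(1)).squeeze(1)
        if self.weight is not None:
            nll_loss = nll_loss * self.weight[targets]
        return nll_loss.mean()

torch.manual_seed(0)
logits = torch.randn(8, 3)
targets = torch.randint(0, 3, (8,))

custom = WeightedCrossEntropyLoss()(logits, targets)
builtin = nn.CrossEntropyLoss()(logits, targets)
print(torch.allclose(custom, builtin))  # True
```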

Loss Function Selection Guide

| Task type | Recommended loss | Typical scenario |
| --- | --- | --- |
| Regression | MSE | general regression tasks |
| Regression | MAE | data with outliers |
| Regression | Huber | compromise between MSE and MAE |
| Binary classification | BCE | binary classification tasks |
| Multi-class classification | CrossEntropy | multi-class tasks |
| Object detection | Focal Loss | class imbalance |
| Image segmentation | Dice Loss | sample imbalance |
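Dice Loss appears in the table but is not defined above. One common formulation for binary segmentation looks like the following (the `smooth` term avoids division by zero; exact variants differ across libraries):

```python
import numpy as np

def dice_loss(y_true, y_pred, smooth=1.0):
    # Dice coefficient = 2|A ∩ B| / (|A| + |B|); the loss is 1 - Dice
    intersection = np.sum(y_true * y_pred)
    return 1 - (2 * intersection + smooth) / (np.sum(y_true) + np.sum(y_pred) + smooth)

y_true = np.array([1.0, 1.0, 0.0, 0.0])  # ground-truth mask
y_pred = np.array([0.9, 0.8, 0.1, 0.2])  # predicted probabilities
print(dice_loss(y_true, y_pred))
```

Because it is computed on the overlap between predicted and true masks, it is insensitive to the large number of easy background pixels, which is why it suits imbalanced segmentation.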

Regularization and the Loss Function

In practice, a regularization term is usually added to the loss:

```python
def l2_regularization(model, lambda_l2=0.001):
    l2_reg = torch.tensor(0.0)
    for param in model.parameters():
        # Accumulate the L2 norm of each parameter tensor
        l2_reg = l2_reg + torch.norm(param, p=2)
    return lambda_l2 * l2_reg

# Total loss = task loss + regularization term
total_loss = criterion(output, target) + l2_regularization(model)
```
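Note that the helper above penalizes the norm itself; the more common formulation (the one that optimizer `weight_decay` implements) penalizes the squared norm, whose gradient per parameter is simply 2λw. A minimal sketch of that variant:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(3, 1, bias=False)

lambda_l2 = 0.001
# Penalize the squared L2 norm, as in standard weight decay
penalty = lambda_l2 * sum((p ** 2).sum() for p in model.parameters())
penalty.backward()

# Gradient of lambda * ||w||^2 is 2 * lambda * w
w = model.weight
print(torch.allclose(w.grad, 2 * lambda_l2 * w.detach()))  # True
```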

Summary

The loss function is the bridge between model predictions and the optimization objective. Regression tasks typically use MSE or Huber Loss, while classification tasks use cross-entropy losses. Under class imbalance, Focal Loss and Dice Loss are more effective. Understanding the properties and applicable scenarios of each loss function is key to designing an effective training pipeline.
