A Complete Guide to Loss Functions: From MSE to CrossEntropy
A loss function measures the discrepancy between a model's predictions and the true labels, and it sits at the heart of model optimization. Choosing an appropriate loss function is critical to training performance.
Regression Loss Functions
Mean Squared Error (MSE)
MSE is the most commonly used regression loss function:
$$L_{MSE} = \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2$$
```python
import numpy as np

def mse_loss(y_true, y_pred):
    return np.mean((y_true - y_pred) ** 2)

def mse_gradient(y_true, y_pred):
    return 2 * (y_pred - y_true) / len(y_true)

y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.1, 1.9, 3.2, 3.8])
print(f"MSE Loss: {mse_loss(y_true, y_pred):.4f}")
```
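The analytic gradient above can be sanity-checked against a finite-difference approximation, a standard trick when deriving gradients by hand (the functions are repeated so the snippet runs standalone):

```python
import numpy as np

def mse_loss(y_true, y_pred):
    return np.mean((y_true - y_pred) ** 2)

def mse_gradient(y_true, y_pred):
    return 2 * (y_pred - y_true) / len(y_true)

y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.1, 1.9, 3.2, 3.8])

# Perturb each prediction by h and compare the loss change to the gradient
h = 1e-6
numeric = np.zeros_like(y_pred)
for i in range(len(y_pred)):
    bumped = y_pred.copy()
    bumped[i] += h
    numeric[i] = (mse_loss(y_true, bumped) - mse_loss(y_true, y_pred)) / h

print(np.allclose(numeric, mse_gradient(y_true, y_pred), atol=1e-4))  # True
```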
Mean Absolute Error (MAE)
$$L_{MAE} = \frac{1}{n}\sum_{i=1}^{n}|y_i - \hat{y}_i|$$
```python
def mae_loss(y_true, y_pred):
    return np.mean(np.abs(y_true - y_pred))
```
MAE is more robust to outliers, but it is not differentiable at zero.
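A small numerical comparison makes the robustness difference concrete: one wildly wrong prediction inflates MSE quadratically but MAE only linearly (the losses are repeated so the snippet runs standalone):

```python
import numpy as np

def mse_loss(y_true, y_pred):
    return np.mean((y_true - y_pred) ** 2)

def mae_loss(y_true, y_pred):
    return np.mean(np.abs(y_true - y_pred))

y_true = np.array([1.0, 2.0, 3.0, 4.0])
clean = np.array([1.1, 2.1, 2.9, 3.9])
outlier = np.array([1.1, 2.1, 2.9, 10.0])  # last prediction is wildly off

# MSE is dominated by the single outlier; MAE degrades far more gracefully
print(f"MSE clean: {mse_loss(y_true, clean):.4f}, with outlier: {mse_loss(y_true, outlier):.4f}")
print(f"MAE clean: {mae_loss(y_true, clean):.4f}, with outlier: {mae_loss(y_true, outlier):.4f}")
```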
Huber Loss
Huber Loss combines the advantages of MSE and MAE:
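For reference, the standard piecewise definition with threshold $\delta$, consistent with the implementation below:

$$L_{\delta}(y, \hat{y}) = \begin{cases} \frac{1}{2}(y - \hat{y})^2 & \text{if } |y - \hat{y}| \le \delta \\ \delta|y - \hat{y}| - \frac{1}{2}\delta^2 & \text{otherwise} \end{cases}$$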
```python
def huber_loss(y_true, y_pred, delta=1.0):
    error = y_true - y_pred
    abs_error = np.abs(error)
    quadratic = np.minimum(abs_error, delta)
    linear = abs_error - quadratic
    return 0.5 * quadratic ** 2 + delta * linear
```
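A quick check of the two regimes with delta=1.0 (the function is repeated so the snippet runs standalone); note that this formulation returns per-element losses, which can be averaged with np.mean if a scalar is needed:

```python
import numpy as np

def huber_loss(y_true, y_pred, delta=1.0):
    error = y_true - y_pred
    abs_error = np.abs(error)
    quadratic = np.minimum(abs_error, delta)
    linear = abs_error - quadratic
    return 0.5 * quadratic ** 2 + delta * linear

# Small error (|e| <= delta): quadratic like MSE -> 0.5 * 0.5^2 = 0.125
small = huber_loss(np.array([0.0]), np.array([0.5]))
# Large error (|e| > delta): linear like MAE -> 1.0 * 5 - 0.5 = 4.5
large = huber_loss(np.array([0.0]), np.array([5.0]))
print(small[0], large[0])  # 0.125 4.5
```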
Classification Loss Functions
Binary Cross-Entropy
$$L_{BCE} = -\frac{1}{n}\sum_{i=1}^{n}[y_i\log(\hat{y}_i) + (1-y_i)\log(1-\hat{y}_i)]$$
```python
def binary_cross_entropy(y_true, y_pred, eps=1e-15):
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
```
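A usage sketch: confident correct predictions drive the loss toward zero, while predicting 0.5 everywhere yields exactly ln 2 (the function is repeated so the snippet runs standalone):

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-15):
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

y_true = np.array([1.0, 0.0, 1.0, 0.0])
confident = np.array([0.9, 0.1, 0.8, 0.2])  # close to the labels -> low loss
uncertain = np.array([0.5, 0.5, 0.5, 0.5])  # maximally unsure -> loss = ln 2

print(f"confident: {binary_cross_entropy(y_true, confident):.4f}")
print(f"uncertain: {binary_cross_entropy(y_true, uncertain):.4f}")  # ≈ 0.6931
```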
Categorical Cross-Entropy
```python
def categorical_cross_entropy(y_true, y_pred, eps=1e-15):
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(np.sum(y_true * np.log(y_pred), axis=1))
```
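Usage with one-hot labels and probability rows (e.g. softmax outputs), where only the log-probability of the true class in each row contributes:

```python
import numpy as np

def categorical_cross_entropy(y_true, y_pred, eps=1e-15):
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(np.sum(y_true * np.log(y_pred), axis=1))

# One-hot labels for 3 samples over 3 classes
y_true = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0],
                   [0.0, 0.0, 1.0]])
# Probability rows, each summing to 1
y_pred = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.8, 0.1],
                   [0.2, 0.2, 0.6]])

loss = categorical_cross_entropy(y_true, y_pred)
print(f"CCE Loss: {loss:.4f}")  # -(ln 0.7 + ln 0.8 + ln 0.6) / 3 ≈ 0.3635
```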
Focal Loss
Focal Loss addresses class imbalance and is widely used in object detection:
```python
def focal_loss(y_true, y_pred, gamma=2.0, alpha=0.25, eps=1e-15):
    y_pred = np.clip(y_pred, eps, 1 - eps)
    bce = -y_true * np.log(y_pred) - (1 - y_true) * np.log(1 - y_pred)
    p_t = y_true * y_pred + (1 - y_true) * (1 - y_pred)
    alpha_t = y_true * alpha + (1 - y_true) * (1 - alpha)
    return np.mean(alpha_t * (1 - p_t) ** gamma * bce)
```
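The effect of the (1 - p_t)^gamma modulating factor can be seen by comparing a well-classified and a misclassified positive example (the function is repeated so the snippet runs standalone):

```python
import numpy as np

def focal_loss(y_true, y_pred, gamma=2.0, alpha=0.25, eps=1e-15):
    y_pred = np.clip(y_pred, eps, 1 - eps)
    bce = -y_true * np.log(y_pred) - (1 - y_true) * np.log(1 - y_pred)
    p_t = y_true * y_pred + (1 - y_true) * (1 - y_pred)
    alpha_t = y_true * alpha + (1 - y_true) * (1 - alpha)
    return np.mean(alpha_t * (1 - p_t) ** gamma * bce)

y_true = np.array([1.0])
easy = np.array([0.95])  # well-classified positive
hard = np.array([0.30])  # misclassified positive

# (1 - p_t)^gamma shrinks the easy example's loss by orders of magnitude,
# focusing training on the hard examples
print(f"easy: {focal_loss(y_true, easy):.6f}")
print(f"hard: {focal_loss(y_true, hard):.6f}")
```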
Loss Functions in PyTorch
```python
import torch
import torch.nn as nn

# Regression losses
mse = nn.MSELoss()
mae = nn.L1Loss()
huber = nn.SmoothL1Loss()

# Classification losses
bce = nn.BCELoss()
bce_logits = nn.BCEWithLogitsLoss()
ce = nn.CrossEntropyLoss()

logits = torch.randn(32, 10)
labels = torch.randint(0, 10, (32,))
loss = ce(logits, labels)
print(f"CrossEntropy Loss: {loss.item():.4f}")
```
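One distinction worth making explicit: nn.BCELoss expects probabilities, while nn.BCEWithLogitsLoss takes raw logits and fuses the sigmoid into the loss, which is numerically safer for large-magnitude logits. A quick sanity check that the two agree:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
logits = torch.randn(8)
targets = torch.randint(0, 2, (8,)).float()

# sigmoid + BCELoss vs. the fused, numerically stabler BCEWithLogitsLoss
loss_a = nn.BCELoss()(torch.sigmoid(logits), targets)
loss_b = nn.BCEWithLogitsLoss()(logits, targets)
print(loss_a.item(), loss_b.item())  # agree up to floating-point error
```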
Custom Loss Functions
In some scenarios a custom loss function is needed. For example, a weighted classification loss:
```python
class WeightedCrossEntropyLoss(nn.Module):
    def __init__(self, weight=None):
        super().__init__()
        if weight is not None:
            self.weight = torch.tensor(weight, dtype=torch.float32)
        else:
            self.weight = None

    def forward(self, logits, targets):
        log_probs = torch.log_softmax(logits, dim=1)
        nll_loss = -log_probs.gather(1, targets.unsqueeze(1)).squeeze(1)
        if self.weight is not None:
            self.weight = self.weight.to(logits.device)
            nll_loss = nll_loss * self.weight[targets]
        return nll_loss.mean()
```
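A consistency check against PyTorch's built-in weighted loss. Note a subtlety: this class averages the weighted per-sample losses over the batch, while nn.CrossEntropyLoss(weight=...) with the default mean reduction normalizes by the sum of the selected weights, so the comparison below uses reduction='sum' (the class is repeated so the snippet runs standalone):

```python
import torch
import torch.nn as nn

class WeightedCrossEntropyLoss(nn.Module):
    def __init__(self, weight=None):
        super().__init__()
        if weight is not None:
            self.weight = torch.tensor(weight, dtype=torch.float32)
        else:
            self.weight = None

    def forward(self, logits, targets):
        log_probs = torch.log_softmax(logits, dim=1)
        nll_loss = -log_probs.gather(1, targets.unsqueeze(1)).squeeze(1)
        if self.weight is not None:
            self.weight = self.weight.to(logits.device)
            nll_loss = nll_loss * self.weight[targets]
        return nll_loss.mean()

torch.manual_seed(0)
logits = torch.randn(4, 3)
targets = torch.tensor([0, 2, 1, 2])

custom = WeightedCrossEntropyLoss(weight=[1.0, 2.0, 4.0])(logits, targets)
# reduction='sum' sidesteps the differing normalizations
builtin = nn.CrossEntropyLoss(weight=torch.tensor([1.0, 2.0, 4.0]),
                              reduction='sum')(logits, targets)
print(custom.item() * len(targets), builtin.item())  # equal up to float error
```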
损失函数选择指南
| 任务类型 |
推荐损失函数 |
适用场景 |
| 回归 |
MSE |
一般回归任务 |
| 回归 |
MAE |
存在异常值 |
| 回归 |
Huber |
兼顾MSE和MAE |
| 二分类 |
BCE |
二分类任务 |
| 多分类 |
CrossEntropy |
多分类任务 |
| 目标检测 |
Focal Loss |
类别不平衡 |
| 图像分割 |
Dice Loss |
样本不平衡 |
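Dice Loss appears in the table but was not implemented above; here is a minimal sketch of one common soft-Dice formulation for binary segmentation (exact formulations vary across libraries, and the smoothing term is an assumption of this sketch):

```python
import torch

def dice_loss(pred, target, smooth=1.0):
    # Soft Dice: 1 - 2|X∩Y| / (|X| + |Y|), with a smoothing term for stability
    pred = pred.reshape(-1)
    target = target.reshape(-1)
    intersection = (pred * target).sum()
    return 1 - (2 * intersection + smooth) / (pred.sum() + target.sum() + smooth)

pred = torch.sigmoid(torch.randn(1, 1, 8, 8))       # predicted probabilities
target = torch.randint(0, 2, (1, 1, 8, 8)).float()  # binary ground-truth mask
print(f"Dice Loss: {dice_loss(pred, target).item():.4f}")
```

A perfect prediction drives the loss to zero, which makes it a direct optimization target for the overlap between mask and prediction.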
Regularization and the Loss Function
In practical training, a regularization term is often added to the loss:
```python
def l2_regularization(model, lambda_l2=0.001):
    l2_reg = torch.tensor(0.0)
    for param in model.parameters():
        l2_reg += torch.norm(param, p=2)
    return lambda_l2 * l2_reg

total_loss = criterion(output, target) + l2_regularization(model)
```
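Note that the function above sums unsquared L2 norms; the more common squared-L2 penalty is usually applied through the optimizer's weight_decay argument instead, which folds the penalty into the update step:

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 1)
# weight_decay applies the squared-L2 penalty's gradient inside the update,
# so no explicit term needs to be added to the loss
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, weight_decay=0.001)
```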
Summary
The loss function is the bridge between model predictions and the optimization objective. Regression tasks commonly use MSE or Huber Loss, while classification tasks rely on cross-entropy losses. Under class imbalance, Focal Loss and Dice Loss are more effective. Understanding the characteristics and applicable scenarios of each loss function is key to designing an efficient training pipeline.