A Hands-On Guide to Fine-Tuning the Pretrained BERT Model

BERT Overview

BERT (Bidirectional Encoder Representations from Transformers) is a landmark model in NLP: a bidirectional Transformer encoder pretrained on large unlabeled corpora and then fine-tuned for downstream tasks.

Model Architecture

graph TB
    A[Input] --> B[Token Embedding]
    A --> C[Segment Embedding]
    A --> D[Position Embedding]
    B --> E[Element-wise Sum]
    C --> E
    D --> E
    E --> F[BERT Encoder Layers ×12]
    F --> G[Output]
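
The three embedding vectors in the diagram are added element-wise before entering the 12 encoder layers. A minimal sketch of that step, peeking into the Hugging Face BertModel internals (bert-base-chinese is assumed here, matching the rest of the guide):

import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained('bert-base-chinese')
bert = BertModel.from_pretrained('bert-base-chinese')

enc = tokenizer("你好", return_tensors='pt')
emb = bert.embeddings  # holds word, position, and token-type (segment) embedding tables

positions = torch.arange(enc['input_ids'].size(1)).unsqueeze(0)
summed = (emb.word_embeddings(enc['input_ids'])
          + emb.position_embeddings(positions)
          + emb.token_type_embeddings(enc['token_type_ids']))
hidden = emb.LayerNorm(summed)  # LayerNorm (and dropout) follow the sum
print(hidden.shape)             # torch.Size([1, seq_len, 768]) feeds the encoder stack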

Installation and Loading

Install the transformers and torch packages (e.g. via pip), then load the pretrained checkpoint:

from transformers import BertTokenizer, BertForSequenceClassification
import torch

# Load the Chinese WordPiece tokenizer and BERT with a 2-class classification head
tokenizer = BertTokenizer.from_pretrained('bert-base-chinese')
model = BertForSequenceClassification.from_pretrained('bert-base-chinese', num_labels=2)
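
Note that BertForSequenceClassification attaches a freshly initialized classification head on top of the pretrained encoder (the library prints a warning about this), which is why fine-tuning is required before predictions are meaningful. A quick sanity check of what was loaded:

# The encoder weights are pretrained; the classifier head starts from random init.
num_params = sum(p.numel() for p in model.parameters())
print(f"total parameters: ~{num_params / 1e6:.0f}M")  # roughly 102M for bert-base-chinese
print(model.classifier)                               # Linear(in_features=768, out_features=2)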

Text Preprocessing

def encode_text(text):
    # Tokenize, pad/truncate to 128 tokens, and return PyTorch tensors
    encoding = tokenizer(
        text,
        max_length=128,
        padding='max_length',
        truncation=True,
        return_tensors='pt'
    )
    return encoding['input_ids'], encoding['attention_mask']

input_ids, attention_mask = encode_text("这是一个正面评论")  # "This is a positive review"
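
Because of padding='max_length', both returned tensors have shape (1, 128); a quick check also shows the character-level WordPiece tokens produced by the Chinese tokenizer:

print(input_ids.shape, attention_mask.shape)  # torch.Size([1, 128]) torch.Size([1, 128])
print(tokenizer.convert_ids_to_tokens(input_ids[0])[:6])
# ['[CLS]', '这', '是', '一', '个', '正']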

Fine-Tuning

from torch.utils.data import DataLoader

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
for epoch in range(3):
    for batch in dataloader:  # dataloader built in the sketch after this block
        optimizer.zero_grad()
        input_ids = batch['input_ids']
        attention_mask = batch['attention_mask']
        labels = batch['labels']

        # Passing labels makes the model return the cross-entropy loss directly
        outputs = model(input_ids, attention_mask=attention_mask, labels=labels)
        loss = outputs.loss
        loss.backward()
        optimizer.step()
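
The loop above assumes a dataloader that yields dictionaries with input_ids, attention_mask, and labels keys. A minimal sketch of building one from a toy in-memory dataset (the example texts and labels are hypothetical placeholders):

import torch
from torch.utils.data import DataLoader, Dataset

class ReviewDataset(Dataset):
    """Wraps raw texts/labels and tokenizes each sample on access."""
    def __init__(self, texts, labels):
        self.texts, self.labels = texts, labels

    def __len__(self):
        return len(self.texts)

    def __getitem__(self, idx):
        enc = tokenizer(self.texts[idx], max_length=128, padding='max_length',
                        truncation=True, return_tensors='pt')
        return {
            'input_ids': enc['input_ids'].squeeze(0),
            'attention_mask': enc['attention_mask'].squeeze(0),
            'labels': torch.tensor(self.labels[idx]),
        }

# Hypothetical toy data: 1 = positive, 0 = negative
texts = ["这是一个正面评论", "质量太差了"]
labels = [1, 0]
dataloader = DataLoader(ReviewDataset(texts, labels), batch_size=2, shuffle=True)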

Inference

def predict(text):
    input_ids, attention_mask = encode_text(text)
    model.eval()
    with torch.no_grad():
        outputs = model(input_ids, attention_mask=attention_mask)
    # Convert logits to class probabilities
    probs = torch.softmax(outputs.logits, dim=1)
    return probs.numpy()
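
The function returns a (1, 2) probability array; taking the argmax gives the predicted class index, following the 0 = negative / 1 = positive convention of the toy data sketched earlier:

probs = predict("服务很好")  # "The service is great"
pred = int(probs.argmax(axis=1)[0])
print(probs, "positive" if pred == 1 else "negative")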

Summary

BERT established the pretrain-then-fine-tune paradigm and is a foundation of modern NLP.

graph LR
    A[Pretraining] --> B[Large-scale Corpus]
    B --> C[BERT Weights]
    C --> D[Fine-tuning]
    D --> E[Specific Task]
    E --> F[Downstream Application]