
Add Sentence-BERT paper walkthrough #858

Open · wants to merge 1 commit into master

Conversation

liucongg (Contributor)

Add Sentence-BERT paper walkthrough

@w5688414 (Contributor) commented Feb 20, 2022

1. Sentence Transformer is already implemented in PaddlePaddle. Add the model code, annotate it with comments, and provide some analysis:
   https://github.com/PaddlePaddle/PaddleNLP/tree/develop/examples/text_matching/sentence_transformers

Please refer to:
https://paddlepedia.readthedocs.io/en/latest/tutorials/computer_vision/classification/AlexNet.html


When fine-tuning the BERT model, three objective functions are set up for training on different tasks, as follows:

### 2.1 Classification Objective Function

> **Review comment:** Headings should not carry parenthesized English terms; put the English in the body text instead.


![](../../images/natural_language_processing/Sentence-Bert/sentence_bert_1.png)
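The classification objective concatenates the two sentence embeddings $u$ and $v$ with their element-wise difference $|u-v|$ and passes the result through a trainable weight matrix $W_t$ followed by softmax. A minimal numpy sketch (the embedding dimension, label count, and random weights are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sentence embeddings u, v (e.g. mean-pooled BERT outputs); dim 4 for illustration.
u = rng.normal(size=4)
v = rng.normal(size=4)

# Classification objective: concatenate (u, v, |u - v|) ...
features = np.concatenate([u, v, np.abs(u - v)])      # shape (3 * dim,)

n_labels = 3                                          # e.g. entailment / contradiction / neutral
W_t = rng.normal(size=(n_labels, features.shape[0]))  # stand-in for the trained weight matrix

# ... then apply W_t and a softmax over the labels.
logits = W_t @ features
probs = np.exp(logits - logits.max())
probs /= probs.sum()

print(probs.shape)  # (3,)
```

During training, this head is optimized with cross-entropy; at inference time only the sentence embeddings are reused.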

### 2.2 Regression Objective Function

> **Review comment:** Same as above.


![](../../images/natural_language_processing/Sentence-Bert/sentence_bert_2.png)

### 2.3 Triplet Objective Function

> **Review comment:** Same as above.

## 1 Introduction
The [BERT model](https://paddlepedia.readthedocs.io/en/latest/tutorials/pretrain_model/bert.html#) has shown dominant performance across NLP tasks, and semantic textual similarity is no exception. However, BERT requires both sentences to be fed into the model together so their tokens can interact, which incurs a huge computational cost. This makes it unsuitable for semantic similarity search, and also for unsupervised tasks such as clustering.

For example, to find the most similar pair among 10,000 sentences, we would need 10000 × 9999 / 2 pairwise comparisons, which takes roughly 65 hours.
> **Review comment:** What is this calculation? What do the 9999 and the 2 stand for?
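The figure comes from counting unordered sentence pairs: each of the $n$ sentences has $n-1$ possible partners, and dividing by 2 removes the double-counted $(i, j)$ / $(j, i)$ pairs, giving $\binom{n}{2} = n(n-1)/2$. A quick arithmetic check:

```python
# Unordered pairs among n sentences: C(n, 2) = n * (n - 1) / 2.
# 9999 is the number of partners per sentence; the 2 removes double counting.
n = 10_000
pairs = n * (n - 1) // 2
print(pairs)  # 49995000
```

At about 65 hours total, that is the cost of running one BERT cross-encoder forward pass per pair.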


Under this objective, the framework is modified so that the model takes three sentences instead of two. Given an anchor sentence $a$, a positive sentence $p$, and a negative sentence $n$, the model is optimized so that the distance from $a$ to $p$ is smaller than the distance from $a$ to $n$, i.e. by minimizing the objective function $o$:

$$o = \max(||s_{a} - s_{p}|| - ||s_{a} - s_{n}|| + \varepsilon, 0)$$
> **Review comment:** What is an anchor sentence? Explain it.

where $s_{a}$, $s_{p}$ and $s_{n}$ denote the embeddings of sentences $a$, $p$ and $n$ respectively, $||\cdot||$ denotes a distance metric, and $\varepsilon$ denotes the margin. In the paper, the distance metric is Euclidean distance and the margin is 1.
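The triplet objective above can be sketched directly in numpy, using Euclidean distance and margin $\varepsilon = 1$ as in the paper (the toy 2-D embeddings are illustrative):

```python
import numpy as np

def triplet_loss(s_a, s_p, s_n, eps=1.0):
    # o = max(||s_a - s_p|| - ||s_a - s_n|| + eps, 0)
    # Euclidean distance; eps is the margin (1 in the paper).
    d_pos = np.linalg.norm(s_a - s_p)   # anchor-to-positive distance
    d_neg = np.linalg.norm(s_a - s_n)   # anchor-to-negative distance
    return max(d_pos - d_neg + eps, 0.0)

s_a = np.array([0.0, 0.0])   # anchor
s_p = np.array([0.1, 0.0])   # positive: close to the anchor
s_n = np.array([3.0, 0.0])   # negative: far from the anchor

print(triplet_loss(s_a, s_p, s_n))  # 0.0 — negative already more than eps farther than positive
```

The loss hits zero once the negative is at least $\varepsilon$ farther from the anchor than the positive, so training stops pushing pairs that already satisfy the margin.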

### 2.4 Training Details
During training, the batch size is 16, the learning rate is 2e-5, the Adam optimizer is used, and the default pooling strategy is mean pooling.
> **Review comment:** Adam or AdamW? Current BERT models should all be using the AdamW optimizer.
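The default mean-pooling strategy averages the token embeddings of the final BERT layer, counting only non-padding tokens. A minimal numpy sketch (the toy token embeddings and mask are illustrative):

```python
import numpy as np

def mean_pooling(token_embeddings, attention_mask):
    # Average token embeddings over the sequence axis, weighting by the
    # attention mask so padding tokens (mask == 0) are excluded.
    mask = attention_mask[:, :, None].astype(float)   # (batch, seq, 1)
    summed = (token_embeddings * mask).sum(axis=1)    # (batch, dim)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)    # non-padding token counts
    return summed / counts

# One sequence of three tokens; the last token is padding.
tok = np.array([[[1.0, 2.0], [3.0, 4.0], [0.0, 0.0]]])
mask = np.array([[1, 1, 0]])

print(mean_pooling(tok, mask))  # [[2. 3.]]
```

Masking before averaging matters: naively averaging over the full sequence would let padding positions drag the sentence embedding toward zero.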
