1. Introduction

[BERT]

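BERT's original MLM objective performs strongly at the level of individual words or subwords.
However, some NLP tasks, such as question answering and coreference resolution, require reasoning about the relationships between two or more spans.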

[SpanBERT]

A span-level pre-training method:
(1) masks contiguous random spans instead of individual tokens
(2) uses a span boundary objective (SBO) to predict the entire masked span from its boundary tokens
(3) drops NSP and trains on a single segment
→ outperforms BERT on a variety of downstream tasks
→ span-level information is stored in the boundary tokens, so it is easy to access during fine-tuning
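
The span masking step and the inputs of the span boundary objective can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names `mask_spans` and `sbo_examples` are hypothetical, and the default values (15% masking budget, span lengths drawn from a geometric distribution with p = 0.2 and capped at 10) are assumed here rather than taken from the notes above. Whole-word alignment and the 80/10/10 replacement rule are left out for brevity.

```python
import random

MASK = "[MASK]"


def mask_spans(tokens, mask_ratio=0.15, p=0.2, max_len=10, seed=0):
    """Mask contiguous spans until roughly mask_ratio of the tokens are masked."""
    n = len(tokens)
    if n < 3:
        raise ValueError("need at least 3 tokens so every span has two boundaries")
    rng = random.Random(seed)
    budget = max(1, round(n * mask_ratio))
    lengths = list(range(1, max_len + 1))
    # Geometric weights truncated at max_len (assumed setting);
    # random.choices normalizes the weights itself.
    weights = [p * (1 - p) ** (k - 1) for k in lengths]

    masked, spans = set(), []
    for _ in range(1000 * budget):  # cap on attempts so the loop always ends
        if len(masked) >= budget:
            break
        length = rng.choices(lengths, weights=weights)[0]
        length = min(length, budget - len(masked), n - 2)
        # Leave one token free on each side so both boundary tokens exist.
        start = rng.randint(1, n - 1 - length)
        end = start + length - 1
        # Reject spans that overlap or touch an already masked span.
        if any(i in masked for i in range(start - 1, end + 2)):
            continue
        masked.update(range(start, end + 1))
        spans.append((start, end))

    out = [MASK if i in masked else tok for i, tok in enumerate(tokens)]
    return out, sorted(spans)


def sbo_examples(tokens, spans):
    """List what the span boundary objective sees for each masked token:
    the two boundary tokens, the position inside the span, and the target."""
    examples = []
    for start, end in spans:
        for i in range(start, end + 1):
            examples.append({
                "left_boundary": tokens[start - 1],
                "right_boundary": tokens[end + 1],
                "relative_position": i - start + 1,
                "target": tokens[i],
            })
    return examples


sentence = ("super bowl 50 was an american football game "
            "to determine the champion of the national football league").split()
masked_sentence, masked_spans = mask_spans(sentence, seed=3)
print(masked_sentence)
for example in sbo_examples(sentence, masked_spans):
    print(example)
```

In a real model the boundary entries would be the encoder's output vectors for those two tokens, and the relative position would index a learned position embedding. Plain tokens and integers stand in for them here only to show which pieces the SBO conditions on.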

Paper link

https://arxiv.org/abs/1907.10529

SpanBERT: Improving Pre-training by Representing and Predicting Spans

Explanation link

https://coding-moomin.notion.site/SpanBERT-eb9a8a0bf1984c81866cc3ff10529e4e
