NLP

[PET Paper Review] It’s Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners

코딩무민 2022. 5. 18. 16:12

1. Key Summary

  • GPT-3: 175 billion parameters
  • PET (Pattern-Exploiting Training): combines the idea of reformulating tasks as cloze questions with regular gradient-based fine-tuning (see the sketch after this list)
  • In this paper: ALBERT + iPET (PET and its iterative variant)
    • Achieves performance comparable to GPT-3 with roughly 0.1% of its parameter count (about three orders of magnitude smaller)
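
Below is a minimal sketch of PET's core idea of reformulating a task as a cloze question: a pattern wraps the input around a mask token, and a verbalizer maps each label to a word the masked LM can predict at that position. It assumes the Hugging Face transformers library with the albert-xxlarge-v2 checkpoint, and the sentiment pattern/verbalizer here are illustrative, not the exact ones used in the paper.

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_name = "albert-xxlarge-v2"   # ALBERT backbone, as in the paper
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)
model.eval()

# Pattern: wrap the input in a cloze sentence containing a single mask token.
text = "The movie was full of great performances."
prompt = f"{text} All in all, it was {tokenizer.mask_token}."
# Verbalizer: map each label to one word the LM can predict at the mask.
verbalizer = {"positive": "great", "negative": "terrible"}

inputs = tokenizer(prompt, return_tensors="pt")
mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]

with torch.no_grad():
    logits = model(**inputs).logits[0, mask_pos]   # vocabulary scores at the mask

# Score each label by the logit of its verbalizer word (first sub-token as an approximation).
scores = {
    label: logits[0, tokenizer.convert_tokens_to_ids(tokenizer.tokenize(word))[0]].item()
    for label, word in verbalizer.items()
}
print(scores, "->", max(scores, key=scores.get))
```

In PET proper, these verbalizer logits are then turned into a distribution over labels and fine-tuned on a handful of labeled examples (and iPET repeats this with ensembles labeling more data); the snippet above only shows the cloze-style inference step.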

2. Paper Link

https://arxiv.org/abs/2009.07118

 


3. Paper Explanation Link

https://coding-moomin.notion.site/It-s-Not-Just-Size-That-Matters-Small-Language-Models-Are-Also-Few-Shot-Learners-d04adf432ce74105bb374821c96a1508

 


 
