[PET Paper Review] It’s Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners
코딩무민
2022. 5. 18. 16:12
1. Key Summary
- GPT-3: 175 billion parameters
- PET (Pattern-Exploiting Training): combines reformulating tasks as cloze questions with regular gradient-based finetuning (see the first sketch after this list)
- Model used in this paper: ALBERT with PET and its iterative variant iPET (see the second sketch below)
- Achieves performance comparable to GPT-3 with three orders of magnitude fewer parameters
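A quick illustration of the cloze reformulation idea may help. The sketch below, assuming the Hugging Face transformers library, scores a sentiment label by reading a masked LM's logits at the mask position; the pattern ("All in all, it was [MASK].") and verbalizer (great/terrible) are illustrative choices, not the paper's exact SuperGLUE setup.

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

# The paper builds on ALBERT; any masked LM works for this sketch.
model_name = "albert-xxlarge-v2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

# Pattern: rewrite the input as a cloze question whose answer is one masked token.
text = "The movie was a complete waste of time."
prompt = f"{text} All in all, it was {tokenizer.mask_token}."

# Verbalizer: map each class label to a single vocabulary token.
verbalizer = {"positive": "great", "negative": "terrible"}

inputs = tokenizer(prompt, return_tensors="pt")
mask_pos = (inputs.input_ids[0] == tokenizer.mask_token_id).nonzero().item()

with torch.no_grad():
    logits = model(**inputs).logits[0, mask_pos]

# Score each label via the masked-LM logit of its verbalizer token
# (assumes each verbalizer word is a single token).
for label, word in verbalizer.items():
    token_id = tokenizer.encode(word, add_special_tokens=False)[0]
    print(label, logits[token_id].item())
```

In full PET the model is additionally finetuned on the few labeled examples with a cross-entropy loss over these verbalizer logits; the sketch shows only the inference side.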
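The "iterative" part of iPET trains successive generations of models, where each generation pseudo-labels unlabeled data to enlarge the next generation's training set. The loop below is a schematic sketch: train_pet_model and pet_predict are hypothetical stubs (not the authors' code), and the training-set growth schedule is simplified.

```python
import random

# Hypothetical stubs marking where real masked-LM finetuning and
# prediction would happen; they carry no actual model logic.
def train_pet_model(labeled_examples, pattern_id):
    """Finetune one masked-LM classifier on one cloze pattern (stub)."""
    return {"pattern": pattern_id, "n_train": len(labeled_examples)}

def pet_predict(model, example):
    """Return a (label, confidence) prediction for one example (stub)."""
    return random.choice(["positive", "negative"]), random.random()

def ipet(labeled, unlabeled, patterns, generations=3, growth_factor=5):
    """Schematic iPET loop: each generation pseudo-labels more data."""
    train_set = list(labeled)
    models = []
    for gen in range(generations):
        # Train an ensemble with one model per cloze pattern.
        models = [train_pet_model(train_set, p) for p in patterns]
        # Pseudo-label the unlabeled pool, keep the most confident
        # predictions, and grow the training set for the next generation.
        scored = []
        for x in unlabeled:
            label, conf = max((pet_predict(m, x) for m in models),
                              key=lambda lc: lc[1])
            scored.append((conf, x, label))
        scored.sort(key=lambda t: t[0], reverse=True)
        n_new = growth_factor * (gen + 1) * len(labeled)
        train_set = list(labeled) + [(x, lbl) for _, x, lbl in scored[:n_new]]
    return models
```

The final generation's ensemble is then distilled into a single classifier using its soft labels, as in regular PET.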
2. Paper Link
https://arxiv.org/abs/2009.07118
3. Paper Explanation Link