NLP 19

[Dialog Response Selection] Do Response Selection Models Really Know What’s Next? Utterance Manipulation Strategies For Multi-turn Response Selection

1. Key Summary: The task is to find the optimal response given the user/system utterance history. Pre-trained language models perform well across many NLP tasks, and response selection is typically cast as a dialog–response binary classification task; however, this formulation ignores the sequential nature of multi-turn dialog. The paper argues that the response selection objective alone is insufficient and proposes several auxiliary strategies, e.g. insertion, deletion, and search, which help maintain dialog coherence: utterance manipulat..
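
A minimal sketch of the baseline formulation described above, assuming a BERT cross-encoder that classifies a (dialog history, candidate) pair; the checkpoint name and input layout are illustrative, not the paper's exact setup:

    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    # Illustrative checkpoint; a real system fine-tunes this head on response selection data.
    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

    def score_response(dialog_history, candidate):
        # Concatenate the multi-turn history into one context segment,
        # then pair it with the candidate response as segment B.
        context = " [SEP] ".join(dialog_history)
        inputs = tokenizer(context, candidate, truncation=True, return_tensors="pt")
        with torch.no_grad():
            logits = model(**inputs).logits
        # Probability that the candidate is the true next utterance.
        return torch.softmax(logits, dim=-1)[0, 1].item()

    history = ["Hi, I'd like to book a table.", "Sure, for how many people?"]
    print(score_response(history, "Two people, please."))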

NLP 2022.05.24

[Dialog Response Selection] An Effective Domain Adaptive Post-Training Method for BERT in Response Selection

1. Key Summary: Targets multi-turn response selection in a retrieval-based dialog system. BERT is post-trained on a domain-specific corpus, reaching SOTA on two response selection benchmarks. 2. Paper Link: https://arxiv.org/abs/1908.04812 An Effective Domain Adaptive Post-Training Method for BERT in Response Selection We focus on multi-turn response selection in a retrieval-based dialog system. In this paper, we utilize the powerful p..
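
A rough sketch of the general post-training idea, assuming plain masked language modeling on a domain corpus with the Hugging Face Trainer; the corpus file name is a placeholder and the paper's own post-training objective and data pipeline may differ:

    from datasets import load_dataset
    from transformers import (AutoTokenizer, AutoModelForMaskedLM,
                              DataCollatorForLanguageModeling, Trainer, TrainingArguments)

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

    # "domain_corpus.txt" stands in for the target-domain dialog corpus.
    dataset = load_dataset("text", data_files={"train": "domain_corpus.txt"})["train"]
    dataset = dataset.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=128),
                          batched=True, remove_columns=["text"])

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="bert-post-trained", num_train_epochs=1,
                               per_device_train_batch_size=16),
        train_dataset=dataset,
        data_collator=DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15),
    )
    trainer.train()  # post-train first, then fine-tune the adapted weights for response selection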

NLP 2022.05.24

[SBERT Paper Review] Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks

1. Key Summary: BERT's problem: for STS tasks both sentences must be fed through the network together, which incurs massive computational overhead, so BERT is ill-suited for semantic search and unsupervised tasks. Sentence-BERT (SBERT) adapts the pre-trained BERT network with siamese and triplet network structures, cutting the time to find the most similar pair from 65 hours to about 5 seconds while maintaining accuracy, and outperforming SOTA sentence embedding methods. 2. Paper Link: https://arxiv.org/abs/1908.10084 Sentence-BERT: Sentence ..
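
A minimal sketch of the bi-encoder idea behind SBERT, assuming mean pooling over token embeddings and cosine similarity; the checkpoint is illustrative, whereas the paper fine-tunes BERT with siamese/triplet objectives:

    import torch
    from transformers import AutoTokenizer, AutoModel

    tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
    model = AutoModel.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")

    def embed(sentences):
        # Each sentence is encoded independently, so candidates can be pre-computed.
        inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
        with torch.no_grad():
            hidden = model(**inputs).last_hidden_state          # (batch, seq, dim)
        mask = inputs["attention_mask"].unsqueeze(-1)            # ignore padding in the mean
        return (hidden * mask).sum(1) / mask.sum(1)              # mean pooling

    a, b = embed(["A man is playing a guitar.", "Someone plays an instrument."])
    print(torch.cosine_similarity(a, b, dim=0).item())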

NLP 2022.05.24

[KBP] Zero-shot Slot Filling with DPR and RAG

1. Key Summary: Slot filling evaluates the ability to automatically extract a KG from a document collection: given $[ENTITY, SLOT, ?]$, the $?$ must be filled from relevant passages. Recent work tackles this end-to-end with retrieval-based LMs. RAG performs well without an information extraction pipeline, but still trails real-world IE systems on the KILT benchmark. To build a better slot filler, the paper explores various strategies for adapting RAG's retriever and generator → $KGI_0$ : T-REx, zsRE..
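
A rough sketch of querying a retrieve-then-generate slot filler, assuming the off-the-shelf RAG checkpoint and a simple "entity [SEP] slot" query string; $KGI_0$ trains its own DPR retriever and generator, so this only illustrates the shape of the interface:

    from transformers import RagTokenizer, RagRetriever, RagSequenceForGeneration

    tokenizer = RagTokenizer.from_pretrained("facebook/rag-sequence-nq")
    retriever = RagRetriever.from_pretrained("facebook/rag-sequence-nq", index_name="exact",
                                             use_dummy_dataset=True)
    model = RagSequenceForGeneration.from_pretrained("facebook/rag-sequence-nq",
                                                     retriever=retriever)

    # The slot-filling query [ENTITY, SLOT, ?] rendered as a single text query.
    query = "Albert Einstein [SEP] place of birth"
    inputs = tokenizer(query, return_tensors="pt")
    generated = model.generate(input_ids=inputs["input_ids"])
    print(tokenizer.batch_decode(generated, skip_special_tokens=True))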

NLP 2022.05.24

[Numerical Reasoning] Have You Seen That Number? Investigating Extrapolation in Question Answering Models

1. Key Summary: Existing NR models interpolate their learned numerical reasoning capabilities, but do not perform well on numbers unseen in the training set. Key finding: models fail to extrapolate to unseen numbers. Remedy: adding digit-by-digit number representations to the model input mitigates the lack of extrapolation, showing that numbers should be treated differently from ordinary text: the E-digit number form. 2. Paper Link: https://aclanthology.org/2021.emnlp-main.563/ Have You Seen That Number? Investigating Extrapolat..
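
A small sketch of the digit-by-digit input idea: every number is rewritten as a space-separated digit sequence so unseen numbers do not collapse into unfamiliar subword tokens (the paper's exact E-digit form may differ in detail):

    import re

    def digit_by_digit(text: str) -> str:
        # Replace every number with its digits separated by spaces,
        # e.g. "opened in 1987" -> "opened in 1 9 8 7".
        return re.sub(r"\d+", lambda m: " ".join(m.group()), text)

    print(digit_by_digit("The stadium holds 20476 people and opened in 1987."))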

NLP 2022.05.24

[PromptBERT Paper Review] PromptBERT: Improving BERT Sentence Embeddings with Prompts

1. Key Summary: The original BERT performs poorly on sentence semantic similarity; the cause is static token embedding biases and the ineffective BERT layers, not high cosine similarity among the sentence embeddings. The proposed prompt-based sentence embedding method reduces token embedding biases and makes the original BERT layers more effective by reformulating the sentence embedding task as a fill-in-the-blanks problem; two prompt represe..
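
A minimal sketch of the prompt-based embedding idea, assuming a template in the spirit of PromptBERT where the [MASK] position's hidden state serves as the sentence representation; checkpoint and template wording are simplified:

    import torch
    from transformers import AutoTokenizer, AutoModel

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModel.from_pretrained("bert-base-uncased")

    def prompt_embed(sentence):
        # Wrap the sentence in a fill-in-the-blank template.
        text = f'This sentence : "{sentence}" means {tokenizer.mask_token} .'
        inputs = tokenizer(text, return_tensors="pt")
        with torch.no_grad():
            hidden = model(**inputs).last_hidden_state[0]
        mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero()[0, 0]
        return hidden[mask_pos]  # the [MASK] hidden state is the sentence embedding

    a = prompt_embed("A man is playing a guitar.")
    b = prompt_embed("Someone plays an instrument.")
    print(torch.cosine_similarity(a, b, dim=0).item())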

NLP 2022.05.18

[PET Paper Review] It’s Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners

1. Key Summary: GPT-3: 175 billion parameters. PET (Pattern-Exploiting Training) combines the idea of reformulating tasks as cloze questions with regular gradient-based finetuning. In this model: ALBERT + iPET (PET and its iterative variant), achieving performance comparable to GPT-3 with only about 0.1% of its parameter count. 2. Paper Link: https://arxiv.org/abs/2009.07118 It's Not Just Size That Matters: Small Language Models Are Also Few-Shot Learner..
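
A small sketch of a PET-style pattern–verbalizer pair, assuming an illustrative sentiment pattern and a BERT MLM instead of the ALBERT model used in the paper; PET additionally fine-tunes the MLM on the few labeled examples, which is omitted here:

    import torch
    from transformers import AutoTokenizer, AutoModelForMaskedLM

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

    # Pattern: turn the example into a cloze question; verbalizer: map labels to single tokens.
    verbalizer = {"positive": "great", "negative": "terrible"}

    def classify(review):
        text = f"{review} It was {tokenizer.mask_token} ."
        inputs = tokenizer(text, return_tensors="pt")
        mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero()[0, 0]
        with torch.no_grad():
            logits = model(**inputs).logits[0, mask_pos]
        scores = {label: logits[tokenizer.convert_tokens_to_ids(word)].item()
                  for label, word in verbalizer.items()}
        return max(scores, key=scores.get)

    print(classify("The plot was dull and far too long."))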

NLP 2022.05.18

[Prompt Learning] Prompting Contrastive Explanations for Commonsense Reasoning Tasks

1. Key Summary: PLMs achieve near-human performance, but are weak at providing human-interpretable evidence. To address this, the paper supplies PLMs with explanation prompts that contrast alternatives, e.g. "peanuts are usually salty while raisins are sweet". 2. Paper Link: https://arxiv.org/abs/2106.06823 Prompting Contrastive Explanations for Commonsense Reasoning Tasks Many commonsense reasoning NLP tasks involve choosing between one or more p..
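
A tiny sketch of how such a contrastive explanation prompt might be assembled before querying a PLM; the template wording below is illustrative, not the paper's exact prompt:

    def contrastive_prompt(question, option_a, option_b):
        # Ask the model to articulate what distinguishes the two alternatives
        # (e.g. "peanuts are usually salty while raisins are sweet") before answering.
        return (f"{question}\n"
                f"Explain the difference: {option_a} are ___ while {option_b} are ___.\n"
                f"Then answer with {option_a} or {option_b}.")

    print(contrastive_prompt("Which snack would satisfy a craving for something sweet?",
                             "peanuts", "raisins"))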

NLP 2022.05.18

[KG-BERT Paper Review] KG-BERT: BERT for Knowledge Graph Completion

1. Key Summary: Applies pre-trained LMs to KBC tasks by treating knowledge graph triples as textual sequences, and proposes a new framework, KG-BERT, to model such triples. Input: the entity and relation descriptions of a triple → the model computes a scoring function for the triple. Results: SOTA on triple classification, link prediction, and relation prediction tasks. 2. Paper Link: https://arxiv.org/abs/1909.03193 KG-BERT: BERT for Knowledge Graph Completion Knowl..
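
A minimal sketch of the KG-BERT input construction: the head, relation, and tail descriptions of a triple are packed into one text sequence and scored by a binary classification head (checkpoint and example triple are illustrative; the paper fine-tunes on KG triples with negative sampling):

    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

    def score_triple(head_text, relation_text, tail_text):
        # [CLS] head [SEP] relation [SEP] tail [SEP] as a single sequence.
        text = f"{head_text} {tokenizer.sep_token} {relation_text} {tokenizer.sep_token} {tail_text}"
        inputs = tokenizer(text, return_tensors="pt")
        with torch.no_grad():
            logits = model(**inputs).logits
        # Probability that the triple is plausible (meaningful only after fine-tuning).
        return torch.softmax(logits, dim=-1)[0, 1].item()

    print(score_triple("Steve Jobs", "founder of", "Apple Inc."))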

NLP 2022.05.18

[PaLM Paper Review] PaLM: Scaling Language Modeling with Pathways

1. Key Summary: Recent models with encoder-only or encoder-decoder architectures, such as BERT and T5, exploit MLM and span corruption objectives and perform well on NLP tasks. Limitations of these models: they require a substantial number of task-specific training examples for fine-tuning, and fitting them to each task requires model parameter updates, which adds complexity to model finetuning and deployment. The GPT-3 model: an extremely large autoregressive LM that uses few-shot predictions → decoder-only Trans..

NLP 2022.04.26