Language models (LMs) such as GPT-3 and PaLM have shown impressive abilities in a range of natural language processing (NLP) tasks. However, relying solely on their parameters to encode a wealth of world knowledge requires a prohibitively large number of parameters and hence massive compute, and they often struggle to learn long-tail knowledge. Moreover, these parametric LMs are fundamentally incapable of adapting over time, often hallucinate, and may leak private data from the training corpus. To overcome these limitations, there has been growing interest in retrieval-based LMs, which combine a non-parametric datastore (e.g., text chunks from an external corpus) with their parametric counterparts. Retrieval-based LMs can outperform LMs without retrieval by a large margin with far fewer parameters, can update their knowledge by replacing their retrieval corpora, and can provide citations that let users easily verify and evaluate predictions.
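To make the core idea concrete, here is a minimal sketch of a single retrieval-augmented generation step: embed a query, retrieve the most similar chunks from a non-parametric datastore, and condition a parametric LM on them. The `embed` function is a toy hashed bag-of-words stand-in for a real encoder, and `generate` is a placeholder for any LM API; all names here are illustrative, not from the tutorial itself.

```python
# Minimal sketch of a retrieval-augmented LM step, assuming a toy
# hashed bag-of-words embedder and a placeholder `generate` function
# standing in for any parametric LM (all names are illustrative).
import numpy as np

def embed(text: str, dim: int = 256) -> np.ndarray:
    """Toy embedding: hashed bag-of-words, L2-normalized."""
    vec = np.zeros(dim)
    for token in text.lower().split():
        vec[hash(token) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

# Non-parametric datastore: text chunks from an external corpus.
corpus = [
    "The Eiffel Tower is located in Paris, France.",
    "Retrieval-based LMs pair a datastore with a parametric model.",
    "Mount Everest is the highest mountain above sea level.",
]
corpus_embeddings = np.stack([embed(chunk) for chunk in corpus])

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query (cosine similarity)."""
    scores = corpus_embeddings @ embed(query)
    top = np.argsort(-scores)[:k]
    return [corpus[i] for i in top]

def generate(prompt: str) -> str:
    """Placeholder for the parametric LM; any generation API could go here."""
    return f"<LM completion conditioned on {len(prompt)} prompt characters>"

query = "Where is the Eiffel Tower?"
context = "\n".join(retrieve(query))
answer = generate(f"Context:\n{context}\n\nQuestion: {query}\nAnswer:")
print(answer)
```

Because the knowledge lives in `corpus` rather than in model weights, updating what the system "knows" only requires swapping or editing the datastore, and the retrieved chunks double as citations for the answer.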
In this tutorial, we aim to provide a comprehensive and coherent overview of recent advances in retrieval-based LMs. We will start with preliminaries covering the foundations of LMs and retrieval systems, and then focus on recent progress in the architectures, learning approaches, and applications of retrieval-based LMs.