RA-DIT: Retrieval-Augmented Dual Instruction Tuning

  • Lin, X. V., Chen, X., Chen, M., Shi, W., Lomeli, M., James, R., ... & Yih, S. (2023). RA-DIT: Retrieval-Augmented Dual Instruction Tuning. arXiv preprint arXiv:2310.01352. [PDF]

Key Points

  • Retrieval-augmented language models (RALMs) improve performance by accessing long-tail and up-to-date knowledge.
  • Existing approaches to RALMs either (i) make retriever-specific modifications to LM pre-training, or (ii) integrate the data store post hoc.
  • RA-DIT consists of two fine-tuning steps:
    • fine-tuning the pre-trained LM to better use retrieved information;
    • fine-tuning the retriever to return more relevant results.
  • RA-DIT improves 0-shot performance by around 8.9% and 5-shot performance by around 1.4%.
  • The retriever uses DRAGON+, a state-of-the-art dense encoder trained with a contrastive learning objective.
  • Chained objective over retrieval and generation (see the sketch below):

    $p_{LM}(y \mid x, \mathcal{C}) = \sum_{c \in \mathcal{C}} p_{LM}(y \mid c \circ x) \cdot p_R(c \mid x)$
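
A minimal NumPy sketch of this chained objective follows; the names are illustrative (not from the paper's released code), and the per-chunk probabilities are assumed to be precomputed for the top-k retrieved chunks.

```python
# Minimal sketch of the chained objective: marginalize the LM's answer
# probability over the retrieved chunks, weighted by the retriever scores.
import numpy as np

def p_lm_chained(p_lm_given_c: np.ndarray, p_r: np.ndarray) -> float:
    """p_LM(y | x, C) = sum over c of p_LM(y | c ∘ x) * p_R(c | x).

    p_lm_given_c[i]: LM probability of y with the i-th chunk prepended to x.
    p_r[i]:          retriever probability of the i-th chunk given x.
    """
    p_r = p_r / p_r.sum()  # renormalize over the top-k retrieved chunks
    return float(np.dot(p_lm_given_c, p_r))

# Example with 3 retrieved chunks:
print(p_lm_chained(np.array([0.30, 0.05, 0.10]), np.array([0.5, 0.3, 0.2])))
```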

Language model fine-tuning:

  • For each (x, y) record, RA-DIT fetches the top-5 retrieved chunks and, for each chunk c, generates a training pair (y, (c, x)) (see the sketch below).
  • $\mathcal{L}(D_L) = -\sum_i \sum_j \log p_{LM}(y_i \mid c_{ij} \circ x_i)$
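
The following sketch shows how these training pairs and the loss above could be assembled, assuming hypothetical `retrieve` and `log_p_lm` stubs for the retriever and the LM's log-likelihood (neither is the paper's actual API).

```python
# Hedged sketch of the LM fine-tuning data construction and loss.
from typing import Callable, List, Tuple

def build_pairs(records: List[Tuple[str, str]],
                retrieve: Callable[[str, int], List[str]],
                k: int = 5) -> List[Tuple[str, str]]:
    """Expand each (x, y) record into k pairs ((c ∘ x), y), one per top-k chunk."""
    pairs = []
    for x, y in records:
        for c in retrieve(x, k):             # top-k retrieved chunks for x
            pairs.append((c + "\n" + x, y))  # chunk prepended to the prompt
    return pairs

def lm_loss(pairs: List[Tuple[str, str]],
            log_p_lm: Callable[[str, str], float]) -> float:
    """L(D_L) = -sum_i sum_j log p_LM(y_i | c_ij ∘ x_i)."""
    return -sum(log_p_lm(y, prompt) for prompt, y in pairs)
```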

Retriever fine-tuning:

  • fine-tune the retriever by minimizing, over the retrieved contexts c, the KL divergence between the retriever distribution and the LM-supervised posterior (see the sketch after this list):

    $\mathcal{L}(D_R) = \mathbb{E}_{(x,y) \in D_R}\, \mathrm{KL}\big(p_R(c \mid x) \,\|\, p_{LSR}(c \mid x, y)\big)$

  • where $p_{LSR}$ is a generalized version of LM-Supervised Retrieval (LSR; Shi et al., 2023b):
    $p_{LSR}(c \mid x, y) = \dfrac{\exp\left(p_{LM}(y \mid c \circ x)/\tau\right)}{\sum_{c' \in \mathcal{C}} \exp\left(p_{LM}(y \mid c' \circ x)/\tau\right)}$
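A small sketch of this LM-supervised target and the KL loss, under the assumption that per-chunk LM probabilities and retriever scores have already been computed (the temperature value and function names are illustrative, not the paper's code):

```python
# Hedged sketch of the retriever fine-tuning signal: p_LSR is a softmax over
# p_LM(y | c ∘ x) / tau across the retrieved chunks, and the retriever
# distribution p_R is pulled toward it via KL(p_R || p_LSR).
import numpy as np

def p_lsr(lm_probs: np.ndarray, tau: float = 0.1) -> np.ndarray:
    """p_LSR(c | x, y) = softmax over chunks of p_LM(y | c ∘ x) / tau."""
    z = lm_probs / tau
    z = z - z.max()          # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def retriever_kl_loss(p_r: np.ndarray, lm_probs: np.ndarray,
                      tau: float = 0.1) -> float:
    """KL(p_R(c|x) || p_LSR(c|x,y)) for a single (x, y) example."""
    target = p_lsr(lm_probs, tau)
    p_r = p_r / p_r.sum()    # renormalize retriever scores over top-k chunks
    return float(np.sum(p_r * (np.log(p_r) - np.log(target))))

# Example with 3 retrieved chunks:
print(retriever_kl_loss(np.array([0.5, 0.3, 0.2]), np.array([0.30, 0.05, 0.10])))
```
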
Results

  • The results figure from the paper is omitted here. (Source: Author)