SenseBERT: Driving Some Sense into BERT

Yoav Levine et al., AI21 Labs

May 2020 [arXiv]

What's Unique

This paper presents a technique for predicting the supersense of a masked word, trained with weak supervision by infusing word supersenses during pretraining. It surpasses the state of the art on the WiC task of SuperGLUE.

How It Works

  • A fixed set of 45 supersenses is selected as the candidate labels, both for infusion as weak supervision and for prediction (see the sketch below).
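
    As a rough illustration (not the paper's code), the allowed-supersense set A(w) of a word can be approximated from WordNet lexicographer classes, which is where the 45 supersense labels come from; NLTK exposes these as lexname():

    ```python
    # Hypothetical helper: approximate A(w), the set of supersenses allowed for a word,
    # from WordNet lexicographer classes (the supersense inventory used by the paper).
    # Requires: pip install nltk; python -m nltk.downloader wordnet
    from nltk.corpus import wordnet as wn

    def allowed_supersenses(word):
        """Collect the supersense (lexname) of every WordNet sense of `word`."""
        return {synset.lexname() for synset in wn.synsets(word)}

    # Under weak supervision, predicting any member of A(w) counts as correct,
    # e.g. allowed_supersenses("bass") includes noun.animal, noun.food, noun.artifact, ...
    print(allowed_supersenses("bass"))
    ```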

  • The following figure lays out the model architecture. (Source: authors)

  • The following example illustrates the concept of weak supervision. (Source: authors)

  • Sense language modelling (SLM): allowed-senses prediction loss

    \begin{aligned}
    \mathcal{L}_{\mathrm{SLM}}^{\text{allowed}} &= -\log p\left(s \in A(w) \mid \text{context}\right) \\
    &= -\log \sum_{s \in A(w)} p(s \mid \text{context})
    \end{aligned}

  • Sense language modelling: regularisation term

    \mathcal{L}_{\mathrm{SLM}}^{\mathrm{reg}} = -\frac{1}{|A(w)|} \sum_{s \in A(w)} \log p(s \mid \text{context})
  • The overall SLM loss sums the two terms: \mathcal{L}_{\mathrm{SLM}} = \mathcal{L}_{\mathrm{SLM}}^{\text{allowed}} + \mathcal{L}_{\mathrm{SLM}}^{\mathrm{reg}} (see the sketch below)
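
    A minimal PyTorch sketch of both SLM terms for a single masked position; the function and tensor names (slm_loss, logits, allowed) are illustrative assumptions, not the authors' implementation:

    ```python
    import torch
    import torch.nn.functional as F

    def slm_loss(logits, allowed):
        """Illustrative L_SLM = L_SLM^allowed + L_SLM^reg for one masked position.

        logits:  tensor of shape [num_supersenses] from the supersense prediction head
        allowed: LongTensor with the indices of A(w), the supersenses allowed for the word
        """
        log_probs = F.log_softmax(logits, dim=-1)                    # log p(s | context)
        allowed_term = -torch.logsumexp(log_probs[allowed], dim=-1)  # -log sum_{s in A(w)} p(s | context)
        reg_term = -log_probs[allowed].mean()                        # -(1/|A(w)|) sum_{s in A(w)} log p(s | context)
        return allowed_term + reg_term
    ```

    Computing the allowed term with logsumexp over log-probabilities keeps the sum over p(s | context) numerically stable.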

  • The following figure shows an example of how the model predicts supersenses for masked words as well as for an unmasked sentence. (Source: authors)

Results

  • SenseBERT shows a 2.5% improvement on the WiC task.