Semantic Graphs for Generating Deep Questions

Liangming Pan, Yuxi Xie, Yansong Feng, Tat-Seng Chua, Min-Yen Kan

ACL 2020 [arXiv]

What's Unique This paper presents a technique for Deep Question Generation (DQG). It fuses document-level and semantic-graph-level representations, generates questions with an attention mechanism, and is trained with the dual objectives of content selection and question construction.

How It Works


The architecture diagram for DQG is shown below:

Source: Author

  • Problem statement Given a document D and an answer A, the objective is to generate a question Q that satisfies:

    \overline{\mathcal{Q}}=\arg \max _{\mathcal{Q}} P(\mathcal{Q} \mid \mathcal{D}, \mathcal{A})
  • Semantic Graph Construction

    • SRL based graph
    • Dependency parse tree graph
  • Semantic Enriched Document Representations

    • Document Encoding

      • D = [w_1, · · · , w_l]
      • X_D = [x_1, · · · , x_l]
      • x_i = [x_i-> ; x_i<-] (concatenation of the forward and backward encoder states per word)
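
The bidirectional document encoding can be sketched minimally as below; the forward/backward states are random stand-ins for an actual bidirectional encoder (the paper's encoder weights are not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(0)

l, d = 5, 4                    # document length, hidden size per direction
# Random stand-ins for the forward / backward encoder states of each word
fwd = rng.normal(size=(l, d))  # x_i ->
bwd = rng.normal(size=(l, d))  # x_i <-

# x_i = [x_i-> ; x_i<-]: concatenate the two directions per word
X_D = np.concatenate([fwd, bwd], axis=-1)
print(X_D.shape)  # (5, 8)
```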
    • Node Initialisation

      • G = (V, E)

      • V = {v_i}_{i=1..N_v}

      • E = {e_k}_{k=1..N_e}

      • Each node of the graph is a text span in the document, covering its neighbouring words.

      • The initial representation of each node is obtained by computing word-to-node attention.

      • The document encoding d_D (from both directions) attends over the words {w_{m_v}, · · · , w_j, · · · , w_{n_v}} in node v as follows:

      • \beta_{j}^{v}=\frac{\exp \left(\operatorname{Attn}\left(\mathbf{d}_{\mathcal{D}}, \mathbf{x}_{j}\right)\right)}{\sum_{k=m_{v}}^{n_{v}} \exp \left(\operatorname{Attn}\left(\mathbf{d}_{\mathcal{D}}, \mathbf{x}_{k}\right)\right)}

      • \mathbf{h}_{v}^{0}=\sum_{j=m_{v}}^{n_{v}} \beta_{j}^{v} \mathbf{x}_{j}
      • In essence, each node of the graph spans multiple words; attention for each of those words is computed in the context of the whole-document embedding, and the node representation is the attention-weighted sum of those word embeddings.
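
The node initialisation can be sketched as follows; the dot-product scorer for Attn, the toy span boundaries, and the mean-pooled document vector are assumptions for illustration, not the paper's exact parameterisation:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max()
    return np.exp(z) / np.exp(z).sum()

l, d = 6, 4
X = rng.normal(size=(l, d))      # contextual word embeddings x_1..x_l
d_D = X.mean(axis=0)             # stand-in for the document encoding d_D

# Hypothetical span [m_v, n_v] of words covered by node v
m_v, n_v = 2, 4
span = X[m_v:n_v + 1]

# Attn(d_D, x_j): a dot-product scorer is assumed here
scores = span @ d_D
beta = softmax(scores)           # beta_j^v over the node's words

h_v0 = beta @ span               # h_v^0 = sum_j beta_j^v x_j
print(h_v0.shape)  # (4,)
```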

    • Graph Encoding (Att-GGNN) - Attention-based Gated Graph Neural Network

      • Representations based on incoming and outgoing edges, weighted by the attention between nodes and by the type of the edge between them, are computed as follows:

      • \mathbf{h}_{\mathcal{N}_{\vdash(i)}}^{(k)}=\sum_{v_{j} \in \mathcal{N}_{\vdash(i)}} \alpha_{i j}^{(k)} \mathbf{W}^{t_{e_{i j}}} \mathbf{h}_{j}^{(k)}\\
\mathbf{h}_{\mathcal{N}_{\dashv(i)}}^{(k)}=\sum_{v_{j} \in \mathcal{N}_{\dashv(i)}} \alpha_{i j}^{(k)} \mathbf{W}^{t_{e_{j i}}} \mathbf{h}_{j}^{(k)}

      • where the attention between two nodes is computed as:

      \alpha_{i j}^{(k)}=\frac{\exp \left(\operatorname{Attn}\left(\mathbf{h}_{i}^{(k)}, \mathbf{h}_{j}^{(k)}\right)\right)}{\sum_{t \in \mathcal{N}_{(i)}} \exp \left(\operatorname{Attn}\left(\mathbf{h}_{i}^{(k)}, \mathbf{h}_{t}^{(k)}\right)\right)}
      • The hidden state after the k-th transition is:
      \mathbf{h}_{i}^{(k+1)}=\operatorname{GRU}\left(\mathbf{h}_{i}^{(k)},\left[\mathbf{h}_{\mathcal{N}_{\vdash(i)}}^{(k)} ; \mathbf{h}_{\mathcal{N}_{\dashv(i)}}^{(k)}\right]\right)
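
One Att-GGNN propagation step can be sketched like this; the toy graph, the dot-product attention, the random weights, and the minimal bias-free GRU cell are all assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max()
    return np.exp(z) / np.exp(z).sum()

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

n, d = 4, 3                                # number of nodes, hidden size
H = rng.normal(size=(n, d))                # node states h_i^(k)
edges = [(0, 1, 0), (2, 1, 1), (1, 3, 0)]  # (src, dst, edge_type)
W = rng.normal(size=(2, d, d)) * 0.1       # one W^{t_e} per edge type

def neighbours(i, direction):
    """Incoming (j -> i) or outgoing (i -> j) neighbours with edge types."""
    if direction == "in":
        return [(src, t) for (src, dst, t) in edges if dst == i]
    return [(dst, t) for (src, dst, t) in edges if src == i]

def aggregate(i, direction):
    """Attention-weighted, edge-type-weighted sum over one direction."""
    nbrs = neighbours(i, direction)
    if not nbrs:
        return np.zeros(d)
    alpha = softmax(np.array([H[i] @ H[j] for j, _ in nbrs]))  # alpha_ij^(k)
    return sum(a * (W[t] @ H[j]) for a, (j, t) in zip(alpha, nbrs))

# Minimal GRU cell: input is [h_in ; h_out] (2d-dim), hidden state is d-dim
Wg = rng.normal(size=(3, d, 2 * d)) * 0.1  # input weights for z, r, h~
Ug = rng.normal(size=(3, d, d)) * 0.1      # recurrent weights

def gru(h, x):
    z = sigmoid(Wg[0] @ x + Ug[0] @ h)       # update gate
    r = sigmoid(Wg[1] @ x + Ug[1] @ h)       # reset gate
    h_tilde = np.tanh(Wg[2] @ x + Ug[2] @ (r * h))
    return (1 - z) * h + z * h_tilde

# h_i^(k+1) = GRU(h_i^(k), [h_in(i) ; h_out(i)])
H_next = np.stack([
    gru(H[i], np.concatenate([aggregate(i, "in"), aggregate(i, "out")]))
    for i in range(n)
])
print(H_next.shape)  # (4, 3)
```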
    • Fusing graph and document representations: a matching strategy selects, for each document word, the finest-granularity node in the graph that contains that word.

  • Joint Task Question Generation

    • Semantic-enriched encoded representations serve as the attention memory for generating the output sequence.
    • The decoder hidden state is initialised with the answer embeddings.
    • At each step, the model learns to generate:
      • a context vector c_t from the semantic-enriched encoded representations;
      • a decoding state s_t;
      • a copying probability, computed from c_t, s_t, and y_{t-1}, which decides whether to generate a word from the vocabulary or copy one from the document input;
      • a coverage vector, to penalise attending over the same locations of the input document.
    • A content-selection task-specific layer is added for multi-task training.
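
The copy mechanism above follows a pointer-generator pattern; below is a minimal sketch in which random vectors stand in for the learned projections and decoder states (the exact parameterisation is an assumption):

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max()
    return np.exp(z) / np.exp(z).sum()

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

V, d = 8, 4                       # vocabulary size, hidden size
doc_ids = np.array([1, 3, 5, 3])  # document words as vocab ids

c_t = rng.normal(size=d)          # context vector
s_t = rng.normal(size=d)          # decoder state
y_prev = rng.normal(size=d)       # embedding of previous output word
attn = softmax(rng.normal(size=len(doc_ids)))  # attention over doc words

# Copying probability p_gen from c_t, s_t, y_{t-1} (random projection here)
w = rng.normal(size=3 * d) * 0.1
p_gen = sigmoid(w @ np.concatenate([c_t, s_t, y_prev]))

# Vocabulary distribution from the decoder (random stand-in)
p_vocab = softmax(rng.normal(size=V))

# Final distribution: generate from the vocabulary or copy from the document
p_final = p_gen * p_vocab
for pos, wid in enumerate(doc_ids):
    p_final[wid] += (1 - p_gen) * attn[pos]
print(round(p_final.sum(), 6))  # 1.0
```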
  • Evaluation

    • BLEU, ROUGE, METEOR scores
    • Human evaluation along three dimensions: fluency, relevance, and complexity.


    • Also, an ablation study of the design decisions.