
SCIENCE CHINA Information Sciences, Volume 64, Issue 5: 152102 (2021). https://doi.org/10.1007/s11432-019-2740-1

Syntax-guided text generation via graph neural network

  • Received: May 15, 2019
  • Accepted: December 26, 2019
  • Published: March 31, 2021

Abstract


Acknowledgment

This work was supported by the National Key Research and Development Program of China (Grant No. 2018YFC0831103), the Shanghai Municipal Science and Technology Major Project (Grant No. 2018SHZDZX01), and Zhejiang Lab.



  • Figure 1

    (Color online) Word graph of a sentence. In this paper, we consider both syntactic (black arrows) and sequential (blue arrows) dependencies in the text generation process. As a result, the sentence is modeled as a graph rather than a sequence or a tree.
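
    To make the graph structure concrete, the following is a minimal sketch of how such a word graph could be represented in Python. All class and field names (WordNode, left_children, prev/next, etc.) are our own illustrative assumptions, not the paper's implementation.

        # Illustrative word-graph node: syntactic edges (parent/children) plus
        # sequential edges (prev/next word). Names are assumptions for illustration.
        from dataclasses import dataclass, field
        from typing import List, Optional

        @dataclass
        class WordNode:
            word: Optional[str] = None            # surface word, None until sampled
            parent: Optional["WordNode"] = None   # syntactic head (dependency edge)
            left_children: List["WordNode"] = field(default_factory=list)
            right_children: List["WordNode"] = field(default_factory=list)
            prev: Optional["WordNode"] = None     # sequential edge to the previous word
            next: Optional["WordNode"] = None     # sequential edge to the next word

            def neighbors(self) -> List["WordNode"]:
                """N(v): all syntactic and sequential neighbors used in message passing."""
                nbrs = list(self.left_children) + list(self.right_children)
                nbrs += [n for n in (self.parent, self.prev, self.next) if n is not None]
                return nbrs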

  • Algorithm 1  Graph-based text generation

    Initialize operating queue: $Q \leftarrow \emptyset$;
    $Q.\texttt{append}({\rm root})$; // use special parameters for the root
    while $Q$ is not empty do
        $u \leftarrow Q.{\rm pop}()$; // pop the operating node from the queue
        for $i = 1$ to MAX_CHILDREN do
            Sample an action $a_i \sim p(a|u, G)$; // Eq. (6)
            if $a_i$ = stop then break; // stop generating children of the operating node
            $v \leftarrow$ empty node; // otherwise a new node is created
            if $a_i$ = LC then $u.\texttt{left\_children}.\texttt{append}(v)$; // left child
            if $a_i$ = RC then $u.\texttt{right\_children}.\texttt{append}(v)$; // right child
            Modify the seq-dep edges related to $v$;
            MP: $N(v) \rightarrow v$; // update the new node $v$; Eqs. (4) and (5)
            Sample a word $w_v \sim p(w|{\boldsymbol h}_v)$; // Eq. (7)
            MP: $v \rightarrow N(v) \rightarrow \cdots \rightarrow G$; // global update starting from $v$; Eqs. (4) and (5)
            $Q.\texttt{append}(v)$;
        end for
    end while
    for $v \in V$ do // final round: re-sample words for the entire sentence
        MP: $N(v) \rightarrow v$; // collect messages from its neighbors
        Pick the word with the highest probability: $w_v = {\rm argmax}_w\, p(w|{\boldsymbol h}_v)$;
    end for
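
    As a reading aid, here is a minimal runnable sketch of the control flow in Algorithm 1. The learned distributions p(a|u, G) and p(w|h_v) and the message-passing updates of Eqs. (4)-(7) are replaced by random stand-ins, so every function and field name below is an assumption for illustration, not the authors' code.

        import random
        from collections import deque

        MAX_CHILDREN = 4
        ACTIONS = ("LC", "RC", "stop")

        def sample_action(u, graph):
            """Stand-in for p(a | u, G), Eq. (6)."""
            return random.choice(ACTIONS)

        def sample_word(v):
            """Stand-in for p(w | h_v), Eq. (7)."""
            return random.choice(["the", "movie", "is", "good", "."])

        def message_passing(v):
            """Stand-in for the MP updates of Eqs. (4) and (5)."""
            pass

        def generate():
            root = {"word": "<root>", "left": [], "right": []}
            nodes = []
            queue = deque([root])                   # operating queue Q
            while queue:
                u = queue.popleft()                 # pop the operating node
                for _ in range(MAX_CHILDREN):
                    a = sample_action(u, None)      # Eq. (6)
                    if a == "stop":
                        break                       # stop generating children of u
                    v = {"word": None, "left": [], "right": []}
                    target = u["left"] if a == "LC" else u["right"]
                    target.append(v)                # attach as left or right child
                    message_passing(v)              # update the new node, Eqs. (4), (5)
                    v["word"] = sample_word(v)      # Eq. (7)
                    message_passing(v)              # simplified global update from v
                    nodes.append(v)
                    queue.append(v)
            for v in nodes:                         # final round: re-sample every word
                message_passing(v)
                v["word"] = sample_word(v)
            return root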

  • Table 1  (Color online) An illustration of our graph generation

  • Table 2  Results on synthetic datasets

    Model     Len     NLL     NLL(-)   Fail (%)   Len     NLL     NLL(-)   Fail (%)   Len     NLL     NLL(-)   Fail (%)
    Oracle    5/20    3.27    6.21     -          10/40   3.32    6.21     -          15/60   3.34    6.21     -
    LSTM      5/20    4.04    6.47     41.7       10/40   4.10    6.49     49.5       15/60   4.14    6.54     72.4
    SeqGAN    5/20    4.40†   6.60     59.4       10/40   4.52†   6.64     74.4       15/60   4.67†   6.75     79.7
    LeakGAN   5/20    4.81†   6.66     53.0       10/40   4.97†   6.63     62.1       15/60   5.10†   6.74     78.9
    Ours      5/20    3.34    6.23     0.7        10/40   3.37    6.23     0.6        15/60   3.39    6.23     0.8

  • Table 3  Samples generated on the synthetic task by different models. Our model succeeds in preserving structural integrity most of the time.

    Model     Sample
    Oracle    (TW(TW(TW(TW)(TW)(TW))(TW(TW(TW)(TW)))))
    LSTM      (TW(TW(TW))(TW)(TW(TW))(TW))_(TW(TW)_(TW(T
    SeqGAN    (TW(TW))(TW)(TW(TW))_(TW_(TW(TW)_(TW_(TW(TW)
    LeakGAN   (TW)_(TW_(TW_(TW_(TW(TW(TW(TW))(TW))(TW)(TW)
    Ours      (TW(TW)(TW)(TW(TW(TW(TW)))(TW)(TW(TW))))
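
    The Fail (%) column in Table 2 counts samples whose bracket structure is broken. A check along the following lines (our own reading of the metric, not the authors' evaluation script) suffices for the synthetic (TW...) samples:

        # Illustrative structural-integrity check for samples like "(TW(TW)(TW))":
        # a sample fails if its parentheses are unbalanced. This is an assumption
        # about how Fail (%) is computed, not the paper's actual script.
        def is_structurally_valid(sample: str) -> bool:
            depth = 0
            for ch in sample:
                if ch == "(":
                    depth += 1
                elif ch == ")":
                    depth -= 1
                    if depth < 0:        # closing bracket with no matching open bracket
                        return False
            return depth == 0            # every opened bracket must be closed

        def failure_rate(samples) -> float:
            """Fraction of generated samples that break the bracket structure."""
            return sum(not is_structurally_valid(s) for s in samples) / len(samples)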

  • Table 4  Results on the IMDB dataset. The top half reports generation quality; the bottom half measures novelty by comparing the BLEU score and edit distance of generated samples against the training set.

                                    LSTM     SeqGAN   LeakGAN   Ours
    Test/quality     ↑BLEU-2        0.652    0.683    0.809     0.876
                     ↑BLEU-3        0.405    0.418    0.554     0.643
                     ↑BLEU-4        0.304    0.315    0.358     0.415
                     ↑BLEU-5        0.202    0.221    0.252     0.286
    Train/novelty    ↓BLEU-2        0.915    0.997    0.987     0.941
                     ↓BLEU-3        0.750    0.990    0.949     0.827
                     ↓BLEU-4        0.545    0.980    0.892     0.613
                     ↓BLEU-5        0.387    0.971    0.840     0.361
                     ↑Edit dist     14.88    1.05     6.09      18.40
                     ↑ED/LEN        71.2%    5.7%     27.4%     70.2%
    Human evaluation ↑              0.494    0.535    0.644     0.552
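
    For reference, the BLEU and edit-distance numbers in Table 4 can be approximated with NLTK as sketched below; the tokenization, smoothing, and choice of reference set are our assumptions and may differ from the paper's exact setup.

        # Hedged sketch of the Table 4 metrics using NLTK (assumed setup).
        from nltk import edit_distance
        from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

        def bleu_n(references, hypothesis, n):
            """BLEU-n of one tokenized hypothesis against a list of tokenized references."""
            weights = tuple(1.0 / n for _ in range(n))
            return sentence_bleu(references, hypothesis, weights=weights,
                                 smoothing_function=SmoothingFunction().method1)

        def nearest_edit_distance(sample_tokens, train_sentences):
            """Novelty: edit distance of a sample to its closest training sentence."""
            return min(edit_distance(sample_tokens, ref) for ref in train_sentences)

        # Toy usage:
        train = [["this", "is", "a", "good", "movie", "."]]
        sample = ["this", "movie", "is", "very", "good", "."]
        print(bleu_n(train, sample, 2))               # quality-style BLEU-2
        print(nearest_edit_distance(sample, train))   # novelty-style edit distance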

  • Table 5  Samples from different models

    Method    Generated samples
    LSTM      This is a heart but after the news reporter comes together of survival and lost and do you'd really simply work.
              But that and their is a great score just as a good start to entertain because it genuine.
              I think it would leave an impression which bad is the exact same formula pretty big.
    SeqGAN    † This does not star Kurt Russell, but rather allows him what amounts to an extended cameo.
              This does good to make a movie and film relies a stupid sense of credibility for the genre or any movie not going to it.
              This also too hard at all, but is also a scene but I am not sure of anything, humor and I think you shouldn't go
    LeakGAN   † This movie is very creepy and has some good gory scenes that would be rather disturbing.
              † The story itself we may have seen a dozen times before but it doesn't much matter.
              † This was a great family film and one of my new favorites from Disney and Pixar.
    Ours      I guess I was some good elements for that attempt in the country movie.
              It's a movie ends up in this franchise and say this is a fan of his girlfriend.
              The beauty of the film, in this movie, I was a good man and that I don't know.

  • Table 6  Step-by-step examples. Note how re-sampling helps to correct mistakes (cf. the correction of "liked" in the first example).

    Step   Case 1                                        Case 2
    1      _ get                                         _ is
    2      _ I get                                       _ This is
    3      I get _ racing                                This is _ one
    4      I get racing _ because                        This is _ a one
    5      I get racing because _ I                      This is a _ good one
    6      I get racing because I _ liked                This is a good one _ ,
    7      I get racing because I liked _ .              This is a good one , _ and
    8      I get _ the racing because I liked .          This is a good one , and _ excellent
    9      I get the racing because I _ would liked .    This is a good one , and excellent _ .
    10     I get the racing because I would like .       This is a good one , and excellent _ job .
    11                                                    This is a good one , and excellent job _ of .
    12                                                    This is a good one , and excellent job of _ NUM .
    13                                                    This is a good one , and excellent job of _ the NUM .
    14                                                    This is a good film , and excellent job of the film .