SCIENCE CHINA Information Sciences, Volume 64, Issue 5: 152102 (2021). https://doi.org/10.1007/s11432-019-2740-1

## Syntax-guided text generation via graph neural network

• Accepted: Dec 26, 2019
• Published: Mar 31, 2021

### Acknowledgment

This work was supported by the National Key Research and Development Program of China (Grant No. 2018YFC0831103), the Shanghai Municipal Science and Technology Major Project (Grant No. 2018SHZDZX01), and Zhejiang Lab.

### References

[1] Sordoni A, Galley M, Auli M, et al. A neural network approach to context-sensitive generation of conversational responses. In: Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (HLT-NAACL), Denver, 2015. 196--205.

[2] Bahdanau D, Cho K H, Bengio Y. Neural machine translation by jointly learning to align and translate. In: Proceedings of the 3rd International Conference on Learning Representations, San Diego, 2015.

[3] Serban I V, Sordoni A, Bengio Y, et al. Building end-to-end dialogue systems using generative hierarchical neural network models. In: Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016. 3776--3784.

[4] Ranzato M A, Chopra S, Auli M, et al. Sequence level training with recurrent neural networks. In: Proceedings of the 4th International Conference on Learning Representations, 2016.

[5] Wiseman S, Rush A M. Sequence-to-sequence learning as beam-search optimization. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, Austin, 2016. 1296--1306.

[6] Bowman S R, Vilnis L, Vinyals O, et al. Generating sentences from a continuous space. In: Proceedings of the SIGNLL Conference on Computational Natural Language Learning, Berlin, 2016. 10--21.

[7] Hochreiter S, Schmidhuber J. Long short-term memory. Neural Computation, 1997, 9: 1735-1780.

[8] Chung J Y, Gulcehre C, Cho K H, et al. Empirical evaluation of gated recurrent neural networks on sequence modeling. In: Proceedings of the Neural Information Processing Systems Deep Learning Workshop, 2014.

[9] Henaff M, Bruna J, LeCun Y. Deep convolutional networks on graph-structured data. 2015. ArXiv preprint.

[10] Kipf T N, Welling M. Semi-supervised classification with graph convolutional networks. In: Proceedings of the 5th International Conference on Learning Representations, Toulon, 2017.

[11] Battaglia P W, Pascanu R, Lai M, et al. Interaction networks for learning about objects, relations and physics. In: Proceedings of the Thirtieth Conference on Neural Information Processing Systems, 2016. 4502--4510.

[12] Gilmer J, Schoenholz S S, Riley P F, et al. Neural message passing for quantum chemistry. In: Proceedings of the 34th International Conference on Machine Learning, Sydney, 2017. 1263--1272.

[13] Bengio Y, Louradour J, Collobert R, et al. Curriculum learning. In: Proceedings of the 26th International Conference on Machine Learning, Montreal, 2009. 41--48.

[14] Bengio S, Vinyals O, Jaitly N, et al. Scheduled sampling for sequence prediction with recurrent neural networks. In: Proceedings of the Twenty-ninth Conference on Neural Information Processing Systems, Montréal, 2015. 1171--1179.

[15] Yu L T, Zhang W N, Wang J, et al. SeqGAN: sequence generative adversarial nets with policy gradient. In: Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017. 2852--2858.

[16] Guo J X, Lu S D, Cai H, et al. Long text generation via adversarial training with leaked information. In: Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018. 5141--5148.

[17] Fedus W, Goodfellow I J, Dai A M. MaskGAN: better text generation via filling in the ______. In: Proceedings of the 6th International Conference on Learning Representations, Vancouver, 2018.

[18] Diao Q M, Qiu M H, Wu C Y, et al. Jointly modeling aspects, ratings and sentiments for movie recommendation (JMARS). In: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2014. 193--202.

[19] Papineni K, Roukos S, Ward T, et al. BLEU: a method for automatic evaluation of machine translation. In: Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, 2002. 311--318.

[20] Wang K, Wan X J. SentiGAN: generating sentimental texts via mixture adversarial networks. In: Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI-18), 2018. 4446--4452.

[21] Vinyals O, Kaiser L, Koo T, et al. Grammar as a foreign language. In: Proceedings of the Twenty-ninth Conference on Neural Information Processing Systems, 2015. 2773--2781.

[22] Tai K S, Socher R, Manning C D. Improved semantic representations from tree-structured long short-term memory networks. In: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, Beijing, 2015. 1556--1566.

[23] Dyer C, Kuncoro A, Ballesteros M, et al. Recurrent neural network grammars. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, San Diego, 2016. 199--209.

[24] Alvarez-Melis D, Jaakkola T S. Tree-structured decoding with doubly-recurrent neural networks. In: Proceedings of the 5th International Conference on Learning Representations, Toulon, 2017.

[25] Zhou G B, Luo P, Cao R Y, et al. Tree-structured neural machine for linguistics-aware sentence generation. In: Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018. 5722--5729.

[26] Scarselli F, Gori M, Tsoi A C, et al. The graph neural network model. IEEE Trans Neural Netw, 2009, 20: 61-80.

[27] Wu S Z, Zhang D D, Yang N, et al. Sequence-to-dependency neural machine translation. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 2017. 698--707.

[28] Li Y J, Tarlow D, Brockschmidt M, et al. Gated graph sequence neural networks. In: Proceedings of the 4th International Conference on Learning Representations, San Juan, 2016.

• Figure 1 (Color online) Word graph of a sentence. In this paper, we consider both syntactic (black arrows) and sequential (blue arrows) dependencies in the text generation process. As a result, the sentence is modeled as a graph instead of a sequence or a tree.
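Figure 1 itself is not reproduced here. As a rough, self-contained illustration of the data structure the caption describes, the sketch below (ours, not the paper's code) builds a word graph with the two edge types for a toy sentence; the dependency heads are hard-coded placeholders rather than parser output.

```python
# Minimal sketch (not from the paper): a word graph with two edge types,
# syntactic dependency edges and sequential (adjacent-word) edges.
from collections import defaultdict

def build_word_graph(words, dep_heads):
    """words: list of tokens; dep_heads[i] = index of the syntactic head of word i (-1 for root)."""
    edges = defaultdict(list)  # node index -> list of (neighbor index, edge type)
    # Syntactic edges (black arrows in Figure 1): head <-> dependent.
    for i, head in enumerate(dep_heads):
        if head >= 0:
            edges[head].append((i, "syn"))
            edges[i].append((head, "syn"))
    # Sequential edges (blue arrows in Figure 1): adjacent words in surface order.
    for i in range(len(words) - 1):
        edges[i].append((i + 1, "seq"))
        edges[i + 1].append((i, "seq"))
    return edges

# Toy example with hand-written dependency heads (for illustration only).
words = ["I", "like", "this", "movie"]
dep_heads = [1, -1, 3, 1]  # "like" is the root; "I" and "movie" attach to it; "this" to "movie"
print(dict(build_word_graph(words, dep_heads)))
```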

• Algorithm 1 Graph-based text generation

Initialize operating queue: $Q~\leftarrow~\emptyset$;

$Q.\texttt{append}({\rm~root})$; //use special parameters for root

while $Q$ is not empty do

$u~\leftarrow~Q.{\rm~pop}()$; //pop the node from the queue

for $i=1$ to MAX_CHILDREN do

Sample an action $a_i~\sim~p(a|u,G)$; //Eq. (6)

if $a_i$ = stop then break; //stop the generation of operating node

$v~\leftarrow$ empty node; //if not break, there is a new node

if $a_i$ = LC then $u.\texttt{left\_children}.\texttt{append}(v)~$; //left child

if $a_i$ = RC then $u.\texttt{right\_children}.\texttt{append}(v)~$; //right child

Modify the seq-dep edges related to $v$;

MP: $N(v)~\rightarrow~v$; //update the new node $v$; Eqs. (4) and (5)

Sample a word $w_v~\sim~p(w|{\boldsymbol~h}_v)$; //Eq. (7)

MP: $v~\rightarrow~N(v)~\rightarrow~\cdots~\rightarrow~G~$; //global update from $v$; Eqs. (4) and (5)

$Q.\texttt{append}(v)$;

end for

end while

for $v~\in~V$ do //final round: re-sample words for the entire sentence

MP: $N(v)~\rightarrow~v$; //collect messages from its neighbors

Pick the word with highest probability $w_v~=~{\rm~argmax}_w~p(w|{\boldsymbol~h}_v)~$;

end for
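The listing above is pseudocode. The sketch below reproduces only its control flow in Python; the learned distributions $p(a|u,G)$ and $p(w|{\boldsymbol h}_v)$ and the message-passing updates of Eqs. (4)–(7) are replaced by random stubs, and the node cap, toy vocabulary, and omitted seq-dep edge bookkeeping are our simplifications, not the paper's implementation.

```python
# Minimal executable sketch of Algorithm 1's control flow (not the paper's model):
# the learned distributions and message-passing (MP) steps are replaced by random stubs.
import random

MAX_CHILDREN = 3
MAX_NODES = 12                       # cap for this toy run so the random stub always terminates
ACTIONS = ["LC", "RC", "stop"]
VOCAB = ["I", "like", "this", "movie", "."]   # placeholder vocabulary

class Node:
    def __init__(self, word=None):
        self.word = word
        self.left_children, self.right_children = [], []

def sample_action(u, graph):         # stand-in for p(a | u, G), Eq. (6)
    return random.choice(ACTIONS)

def sample_word(v):                  # stand-in for p(w | h_v), Eq. (7)
    return random.choice(VOCAB)

def message_passing(v, graph):       # stand-in for the MP updates, Eqs. (4) and (5)
    pass

def generate():
    random.seed(0)
    root = Node()                    # special root node (carries no word here)
    graph = [root]
    queue = [root]                   # operating queue Q
    while queue:
        u = queue.pop(0)             # pop the operating node
        for _ in range(MAX_CHILDREN):
            if len(graph) >= MAX_NODES:
                break
            a = sample_action(u, graph)
            if a == "stop":          # stop expanding the operating node
                break
            v = Node()               # a new, still-empty node
            (u.left_children if a == "LC" else u.right_children).append(v)
            graph.append(v)          # seq-dep edge bookkeeping is omitted in this sketch
            message_passing(v, graph)    # N(v) -> v: update the new node
            v.word = sample_word(v)
            message_passing(v, graph)    # v -> N(v) -> ... -> G: global update
            queue.append(v)
    for v in graph[1:]:              # final round: re-sample words for the whole sentence
        message_passing(v, graph)
        v.word = sample_word(v)      # argmax in the paper; random stub here
    # Surface-order linearization of the graph is also omitted; nodes are listed in creation order.
    return [v.word for v in graph[1:]]

print(generate())
```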

• Table 1

Table 1 (Color online) An illustration of our graph generation$^{\rm~a)}$

• Table 2

Table 2 Results on synthetic datasets$^{\rm~a)}$

| Model | Len | NLL | NLL(-) | Fail (%) | Len | NLL | NLL(-) | Fail (%) | Len | NLL | NLL(-) | Fail (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Oracle | 5/20 | 3.27 | 6.21 | – | 10/40 | 3.32 | 6.21 | – | 15/60 | 3.34 | 6.21 | – |
| LSTM | 5/20 | 4.04 | 6.47 | 41.7 | 10/40 | 4.10 | 6.49 | 49.5 | 15/60 | 4.14 | 6.54 | 72.4 |
| SeqGAN | 5/20 | 4.40$^\dagger$ | 6.60 | 59.4 | 10/40 | 4.52$^\dagger$ | 6.64 | 74.4 | 15/60 | 4.67$^\dagger$ | 6.75 | 79.7 |
| LeakGAN | 5/20 | 4.81$^\dagger$ | 6.66 | 53.0 | 10/40 | 4.97$^\dagger$ | 6.63 | 62.1 | 15/60 | 5.10$^\dagger$ | 6.74 | 78.9 |
| Ours | 5/20 | **3.34** | 6.23 | 0.7 | 10/40 | **3.37** | 6.23 | 0.6 | 15/60 | **3.39** | 6.23 | 0.8 |


• Table 3

Table 3 Samples generated on the synthetic task by different models$^{\rm~a)}$. Our model succeeds in preserving structural integrity most of the time.

| Model | Sample |
|---|---|
| Oracle | `(TW(TW(TW(TW)(TW)(TW))(TW(TW(TW)(TW)))))` |
| LSTM | `(TW(TW(TW))(TW)(TW(TW))(TW))_(TW(TW)_(TW(T` |
| SeqGAN | `(TW(TW))(TW)(TW(TW))_(TW_(TW(TW)_(TW_(TW(TW)` |
| LeakGAN | `(TW)_(TW_(TW_(TW_(TW(TW(TW(TW))(TW))(TW)(TW)` |
| Ours | `(TW(TW)(TW)(TW(TW(TW(TW)))(TW)(TW(TW))))` |

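Reading Tables 2 and 3 together, the Fail (%) column appears to count samples whose bracket structure is broken, as in the truncated LSTM/SeqGAN/LeakGAN samples above. Under that assumption (ours; the excerpt does not state the exact criterion), such a check could look like the sketch below.

```python
# Hedged sketch: count samples with unbalanced "(" / ")" structure, which is how we
# read the Fail (%) column of Table 2; the paper's exact criterion may differ.
def is_balanced(sample: str) -> bool:
    depth = 0
    for ch in sample:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:          # a closing bracket with no open partner
                return False
    return depth == 0              # every opened bracket must be closed

samples = [
    "(TW(TW)(TW)(TW(TW(TW(TW)))(TW)(TW(TW))))",    # structurally intact (cf. "Ours" in Table 3)
    "(TW(TW(TW))(TW)(TW(TW))(TW))_(TW(TW)_(TW(T",  # truncated / unbalanced (cf. "LSTM")
]
fail_rate = 100.0 * sum(not is_balanced(s) for s in samples) / len(samples)
print(f"Fail (%) = {fail_rate:.1f}")
```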

• Table 4

Table 4 Results on the IMDB dataset$^{\rm~a)}$. The top half reports generation quality. The bottom half measures novelty by comparing the BLEU score and edit distance of generated samples against the training set.

| | LSTM | SeqGAN | LeakGAN | Ours |
|---|---|---|---|---|
| **Test/quality** | | | | |
| $\uparrow$ BLEU-2 | 0.652 | 0.683 | 0.809 | 0.876 |
| $\uparrow$ BLEU-3 | 0.405 | 0.418 | 0.554 | 0.643 |
| $\uparrow$ BLEU-4 | 0.304 | 0.315 | 0.358 | 0.415 |
| $\uparrow$ BLEU-5 | 0.202 | 0.221 | 0.252 | 0.286 |
| **Train/novelty** | | | | |
| $\downarrow$ BLEU-2 | 0.915 | 0.997 | 0.987 | 0.941 |
| $\downarrow$ BLEU-3 | 0.750 | 0.990 | 0.949 | 0.827 |
| $\downarrow$ BLEU-4 | 0.545 | 0.980 | 0.892 | 0.613 |
| $\downarrow$ BLEU-5 | 0.387 | 0.971 | 0.840 | 0.361 |
| $\uparrow$ Edit dist | 14.88 | 1.05 | 6.09 | 18.40 |
| $\uparrow$ ED/LEN | 71.2% | 5.7% | 27.4% | 70.2% |
| $\uparrow$ Human evaluation | 0.494 | 0.535 | 0.644 | 0.552 |

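The top half of Table 4 scores generated text against the test set (higher BLEU means better quality), while the bottom half scores it against the training set (lower BLEU and larger edit distance mean more novelty). A rough sketch of that style of measurement with NLTK follows; the paper's exact tokenization, smoothing, and reference-set handling are not specified in this excerpt, so the helper and toy data below are illustrative assumptions.

```python
# Hedged sketch of a Table 4 style evaluation: BLEU of generated samples against a
# reference corpus (test set for quality, training set for novelty). Toy data only.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

def corpus_avg_bleu(hypotheses, references, n=2):
    """Average BLEU-n of each generated sentence against the whole reference corpus."""
    weights = tuple([1.0 / n] * n)
    smooth = SmoothingFunction().method1   # smoothing choice is ours, not the paper's
    scores = [
        sentence_bleu(references, hyp, weights=weights, smoothing_function=smooth)
        for hyp in hypotheses
    ]
    return sum(scores) / len(scores)

generated = [["this", "movie", "is", "great"], ["i", "liked", "the", "film"]]
test_set  = [["this", "film", "is", "great"], ["the", "movie", "was", "fine"]]
train_set = [["this", "movie", "is", "great"]]   # identical sentence -> low novelty

print("quality (BLEU-2 vs. test):", corpus_avg_bleu(generated, test_set))
print("novelty (BLEU-2 vs. train, lower is better):", corpus_avg_bleu(generated, train_set))
```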

• Table 5

Table 5 Samples from different models$^{\rm~a)}$

| Method | Generated samples |
|---|---|
| LSTM | This is a heart but after the news reporter comes together of survival and lost and do you'd really simply work. |
| | But that and their is a great score just as a good start to entertain because it genuine. |
| | I think it would leave an impression which bad is the exact same formula pretty big. |
| SeqGAN | $\dagger$ This does not star Kurt Russell, but rather allows him what amounts to an extended cameo. |
| | This does good to make a movie and film relies a stupid sense of credibility for the genre or any movie not going to it. |
| | This also too hard at all, but is also a scene but I am not sure of anything, humor and I think you shouldn't go |
| LeakGAN | $\dagger$ This movie is very creepy and has some good gory scenes that would be rather disturbing. |
| | $\dagger$ The story itself we may have seen a dozen times before but it doesn't much matter. |
| | $\dagger$ This was a great family film and one of my new favorites from Disney and Pixar. |
| Ours | I guess I was some good elements for that attempt in the country movie. |
| | It 's a movie ends up in this franchise and say this is a fan of his girlfriend. |
| | The beauty of the film, in this movie, I was a good man and that I don't know. |


• Table 6

Table 6 Step-by-step examples$^{\rm~a)}$. Note how re-sampling helps to correct mistakes (cf. the correction of "liked" in the first example).

| Step | Case 1 | Case 2 |
|---|---|---|
| 1 | _ get | _ is |
| 2 | _ I get | _ This is |
| 3 | I get _ racing | This is _ one |
| 4 | I get racing _ because | This is _ a one |
| 5 | I get racing because _ I | This is a _ good one |
| 6 | I get racing because I _ liked | This is a good one _ , |
| 7 | I get racing because I liked _ . | This is a good one , _ and |
| 8 | I get _ the racing because I liked . | This is a good one , and _ excellent |
| 9 | I get the racing because I _ would liked . | This is a good one , and excellent _ . |
| 10 | I get the racing because I would like . | This is a good one , and excellent _ job . |
| 11 | | This is a good one , and excellent job _ of . |
| 12 | | This is a good one , and excellent job of _ NUM . |
| 13 | | This is a good one , and excellent job of _ the NUM . |
| 14 | | This is a good film , and excellent job of the film . |

