This work was supported by the National Key Research and Development Program of China (Grant No. 2018YFC0831103), the Shanghai Municipal Science and Technology Major Project (Grant No. 2018SHZDZX01), and Zhejiang Lab.
Figure 1  (Color online) Word graph of a sentence. In this paper, we consider both syntactic (black arrows) and sequential (blue arrows) dependencies in the text generation process. As a result, the sentence is modeled as a graph instead of a sequence or a tree.
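As a concrete picture of the structure in Figure 1, a word graph can be stored as a set of word nodes plus two typed edge lists, one for syntactic and one for sequential dependencies. The following is a minimal Python sketch with illustrative names of our own; it is not the paper's implementation.

from dataclasses import dataclass, field
from typing import Dict, List, Tuple

@dataclass
class WordGraph:
    # Nodes are words; edges are typed dependencies between word positions.
    words: Dict[int, str] = field(default_factory=dict)              # node id -> word
    syn_edges: List[Tuple[int, int]] = field(default_factory=list)   # syntactic: head -> dependent
    seq_edges: List[Tuple[int, int]] = field(default_factory=list)   # sequential: left -> right neighbor

    def neighbors(self, node_id: int) -> List[int]:
        # N(v): every node connected to node_id by either edge type.
        nbrs = set()
        for u, v in self.syn_edges + self.seq_edges:
            if u == node_id:
                nbrs.add(v)
            elif v == node_id:
                nbrs.add(u)
        return sorted(nbrs)

# Toy example: "the cat sat" with an illustrative parse and left-to-right links.
g = WordGraph()
g.words.update({0: "the", 1: "cat", 2: "sat"})
g.syn_edges += [(2, 1), (1, 0)]   # sat -> cat, cat -> the
g.seq_edges += [(0, 1), (1, 2)]   # the -> cat -> sat
print(g.neighbors(1))             # [0, 2]

Keeping the two edge types in separate lists lets message passing treat syntactic and sequential neighbors differently, which is the distinction the figure draws with black versus blue arrows.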
Initialize operating queue: $Q \leftarrow \emptyset$;
$Q.\texttt{append}(\mathrm{root})$; // use special parameters for the root
while $Q$ is not empty do
    $u \leftarrow Q.\mathrm{pop}()$; // pop the next node from the queue
    repeat
        Sample an action $a_i \sim p(a \mid u, G)$;
        if $a_i$ is the stop action then break;
        $v \leftarrow$ empty node; // otherwise a new node is attached to $u$
        Attach $v$ to $u$; modify the seq-dep edges related to $v$;
        MP: $N(v) \rightarrow v$; // update the new node $v$ from its neighbors
        Sample a word $w_v \sim p(w \mid \boldsymbol{h}_v)$;
        MP: $v \rightarrow N(v) \rightarrow \cdots \rightarrow G$; // global update starting from $v$
        $Q.\texttt{append}(v)$;

For greedy decoding, the word of a new node is instead chosen deterministically:
    MP: $N(v) \rightarrow v$; // collect messages from its neighbors
    Pick the word with the highest probability: $w_v = \mathrm{argmax}_w\, p(w \mid \boldsymbol{h}_v)$.
(Color online) An illustration of our graph generation.
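The listing above can be read as a breadth-first loop over an operating queue. Below is a minimal, runnable control-flow sketch of that loop; the learned components (the action policy $p(a \mid u, G)$, the word model $p(w \mid \boldsymbol{h}_v)$, and the message-passing updates) are replaced by random placeholders, and all names are our own, purely illustrative.

import random
from collections import deque

# Hypothetical stand-ins for the learned components: in the real model these are
# neural networks (action policy, word distribution, message passing); here they
# are random placeholders so that the control flow runs end to end.
VOCAB = ["the", "movie", "is", "good", "."]

def sample_action(u, graph):
    # p(a | u, G): decide whether node u receives another child or stops.
    return "stop" if random.random() < 0.6 else "add-child"

def sample_word(v, graph):
    # p(w | h_v): draw a word for the new node v.
    return random.choice(VOCAB)

def message_passing(graph, center):
    # Placeholder for the N(v) -> v and global updates; a no-op in this sketch.
    pass

def generate():
    graph = {"words": {0: "<root>"}, "children": {0: []}}
    queue = deque([0])                    # operating queue Q, initialised with the root
    next_id = 1
    while queue:
        u = queue.popleft()               # pop the next node from the queue
        while True:
            if sample_action(u, graph) == "stop":
                break                     # u takes no further children
            v, next_id = next_id, next_id + 1
            graph["children"][u].append(v)    # attach the new node v to u
            graph["children"][v] = []
            message_passing(graph, v)         # update v from its neighbours
            graph["words"][v] = sample_word(v, graph)
            message_passing(graph, v)         # global update starting from v
            queue.append(v)               # v may later receive children of its own
    return graph

print(generate()["words"])

Greedy decoding would replace sample_word with an argmax over the word distribution, exactly as in the last two lines of the listing above.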
Model | Len | NLL | NLL(-) | Fail (%) | Len | NLL | NLL(-) | Fail (%) | Len | NLL | NLL(-) | Fail (%)
Oracle | $5/20$ | $3.27$ | 6.21 | $-$ | $10/40$ | $3.32$ | 6.21 | $-$ | $15/60$ | $3.34$ | 6.21 | $-$
LSTM | $5/20$ | $4.04$ | 6.47 | $41.7$ | $10/40$ | $4.10$ | 6.49 | $49.5$ | $15/60$ | $4.14$ | 6.54 | $72.4$
SeqGAN | $5/20$ | $4.40^\dagger$ | 6.60 | $59.4$ | $10/40$ | $4.52^\dagger$ | 6.64 | $74.4$ | $15/60$ | $4.67^\dagger$ | 6.75 | $79.7$
LeakGAN | $5/20$ | $4.81^\dagger$ | 6.66 | $53.0$ | $10/40$ | $4.97^\dagger$ | 6.63 | $62.1$ | $15/60$ | $5.10^\dagger$ | 6.74 | $78.9$
Ours | $5/20$ | $\mathbf{3.34}$ | 6.23 | $0.7$ | $10/40$ | $\mathbf{3.37}$ | 6.23 | $0.6$ | $15/60$ | $\mathbf{3.39}$ | 6.23 | $0.8$
Model | Sample
Oracle | (TW(TW(TW(TW)(TW)(TW))(TW(TW(TW)(TW)))))
LSTM | (TW(TW(TW))(TW)(TW(TW))(TW))_(TW(TW)_(TW(T
SeqGAN | (TW(TW))(TW)(TW(TW))_(TW_(TW(TW)_(TW_(TW(TW)
LeakGAN | (TW)_(TW_(TW_(TW_(TW(TW(TW(TW))(TW))(TW)(TW)
Ours | (TW(TW)(TW)(TW(TW(TW(TW)))(TW)(TW(TW))))
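If, as the samples above suggest, the Fail (%) column of the synthetic-data table counts generated strings whose brackets do not balance, that check is a single left-to-right scan. The snippet below is our illustrative sketch, not the paper's evaluation code.

def is_balanced(sample: str) -> bool:
    # True if every '(' in the sample is closed by a matching ')'.
    depth = 0
    for ch in sample:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:          # a ')' appeared before its '('
                return False
    return depth == 0

samples = ["(TW(TW)(TW))", "(TW(TW))(TW)(TW(TW"]
fail_rate = 100 * sum(not is_balanced(s) for s in samples) / len(samples)
print(f"Fail: {fail_rate:.1f}%")   # 50.0% for this toy pair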
LSTM | SeqGAN | LeakGAN | Ours
Test/quality | BLEU-2 ($\uparrow$) | 0.652 | 0.683 | 0.809
 | BLEU-3 ($\uparrow$) | 0.405 | 0.418 | 0.554
 | BLEU-4 ($\uparrow$) | 0.304 | 0.315 | 0.358
 | BLEU-5 ($\uparrow$) | 0.202 | 0.221 | 0.252
Train/novelty | BLEU-2 ($\downarrow$) | 0.997 | 0.987 | 0.941
 | BLEU-3 ($\downarrow$) | 0.990 | 0.949 | 0.827
 | BLEU-4 ($\downarrow$) | 0.980 | 0.892 | 0.613
 | BLEU-5 ($\downarrow$) | 0.387 | 0.971 | 0.840
 | Edit dist ($\uparrow$) | 14.88 | 1.05 | 6.09
 | ED/LEN ($\uparrow$) | 5.7% | 27.4% | 70.2%
Human evaluation ($\uparrow$) | 0.494 | 0.535 | 0.552
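In the table above, Test/quality BLEU scores each generated sentence against the held-out test set (higher is better), while Train/novelty BLEU scores it against the training set (lower means less verbatim copying). A minimal sketch of this protocol using NLTK's sentence-level BLEU is given below; the library choice and the smoothing method are our assumptions and may differ from the paper's exact setup.

from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

def avg_bleu(generated, references, n=2):
    # Average BLEU-n of each generated sentence against a pool of reference sentences.
    weights = tuple([1.0 / n] * n)
    smooth = SmoothingFunction().method1
    refs = [r.split() for r in references]
    scores = [sentence_bleu(refs, g.split(), weights=weights, smoothing_function=smooth)
              for g in generated]
    return sum(scores) / len(scores)

generated = ["this is a good movie ."]
test_set  = ["this is a great movie .", "the film is good ."]
train_set = ["this is a good movie ."]   # a verbatim copy gives train BLEU close to 1
print("quality (vs. test):", avg_bleu(generated, test_set))
print("novelty (vs. train):", avg_bleu(generated, train_set))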
Method | Generated samples
LSTM | This is a heart but after the news reporter comes together of survival and lost and do you'd really simply work.
 | But that and their is a great score just as a good start to entertain because it genuine.
 | I think it would leave an impression which bad is the exact same formula pretty big .
SeqGAN | $\dagger$ This does not star Kurt Russell, but rather allows him what amounts to an extended cameo.
 | This does good to make a movie and film relies a stupid sense of credibility for the genre or any movie not going to it.
 | This also too hard at all, but is also a scene but I am not sure of anything, humor and I think you shouldn't go
LeakGAN | $\dagger$ This movie is very creepy and has some good gory scenes that would be rather disturbing .
 | $\dagger$ The story itself we may have seen a dozen times before but it doesn't much matter .
 | $\dagger$ This was a great family film and one of my new favorites from Disney and Pixar .
Ours | I guess I was some good elements for that attempt in the country movie .
 | It 's a movie ends up in this franchise and say this is a fan of his girlfriend.
 | The beauty of the film, in this movie, I was a good man and that I don't know.
Step | Case 1 | Case 2 |
1 | _ get | _ is |
2 | _ I get | _ This is |
3 | I get _ racing | This is _ one |
4 | I get racing _ because | This is _ a one |
5 | I get racing because _ I | This is a _ good one |
6 | I get racing because I _ liked | This is a good one _ , |
7 | I get racing because I liked _ . | This is a good one , _ and |
8 | I get _ the racing because I liked . | This is a good one , and _ excellent |
9 | I get the racing because I _ would liked . | This is a good one , and excellent _ . |
10 | | This is a good one , and excellent _ job .
11 | | This is a good one , and excellent job _ of .
12 | | This is a good one , and excellent job of _ NUM .
13 | | This is a good one , and excellent job of _ the NUM .
14 | |
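In the two cases above, the underscore marks the slot where the word generated at that step is inserted, so the sentence grows by insertion at syntax-driven positions rather than strictly left to right. The toy snippet below replays the first five steps of Case 2; the insertion indices are read off the table and the code is purely illustrative, not the model's decoding routine.

def show(words, new_pos):
    # Render the partial sentence with '_' in front of the word added at this step.
    out = []
    for i, w in enumerate(words):
        if i == new_pos:
            out.append("_")
        out.append(w)
    return " ".join(out)

sentence = []
# (word, insertion index) pairs read off the first five steps of Case 2.
steps = [("is", 0), ("This", 0), ("one", 2), ("a", 2), ("good", 3)]
for word, pos in steps:
    sentence.insert(pos, word)
    print(show(sentence, pos))
# Prints: "_ is", "_ This is", "This is _ one", "This is _ a one", "This is a _ good one"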