SCIENTIA SINICA Informationis, Volume 50, Issue 7: 1033-1054 (2020). https://doi.org/10.1360/SSI-2019-0272

## Text correlation calculation based on passage-level event representation

Ming LIU 1,2, Bing QIN 1,2,*

• Accepted: May 13, 2020
• Published: Jul 13, 2020

### References

[1] Sarawagi S. Information extraction. Foundations and Trends in Databases, 2007, 1: 261-377

[2] Riloff E. Automatically constructing a dictionary for information extraction tasks. In: Proceedings of the 11th National Conference on Artificial Intelligence, Washington, 1993. 811--816

[3] Kim J T, Moldovan D I. Acquisition of linguistic patterns for knowledge-based information extraction. IEEE Trans Knowl Data Eng, 1995, 7: 713-724

[4] Chai J Y. Learning and generalization in the creation of information extraction systems. Dissertation for Ph.D. Degree. Durham: Duke University, 1998

[5] Yangarber R. Scenario customization for information extraction. Dissertation for Ph.D. Degree. New York: New York University, 2001

[6] Chen Y B, Liu S L, Zhang X, et al. Automatically labeled data generation for large scale event extraction. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, Vancouver, 2017. 409--419

[7] Turchi M, Zavarella V, Tanev H. Pattern learning for event extraction using monolingual statistical machine translation. In: Proceedings of Recent Advances in Natural Language Processing, 2011. 371--377

[8] Li Q, Ji H, Huang L. Joint event extraction via structured prediction with global features. In: Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, Sofia, 2013. 73--82

[9] Chieu H L, Ng H T. A maximum entropy approach to information extraction from semi-structured and free text. In: Proceedings of the 18th National Conference on Artificial Intelligence, Edmonton, 2002. 786--791

[10] Ritter A, Etzioni O, Clark S. Open domain event extraction from Twitter. In: Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Beijing, 2012. 1104--1112

[11] Ahn D. The stages of event extraction. In: Proceedings of the Workshop on Annotating and Reasoning about Time and Events, Sydney, 2006

[12] Ji H, Grishman R. Refining event extraction through unsupervised cross-document inference. In: Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics, Columbus, 2008. 254--262

[13] Ji H. Cross-lingual predicate cluster acquisition to improve bilingual event extraction by inductive learning. In: Proceedings of the NAACL HLT Workshop on Unsupervised and Minimally Supervised Learning of Lexical Semantics, Boulder, 2009. 27--35

[14] Liao S S, Grishman R. Using document level cross-event inference to improve event extraction. In: Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, Uppsala, 2010. 789--797

[15] Hong Y, Zhang J F, Ma B, et al. Using cross-entity inference to improve event extraction. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics, Portland, 2011. 1127--1136

[16] Liu S L, Liu K, He S Z, et al. A probabilistic soft logic based approach to exploiting latent and global information in event classification. In: Proceedings of the 30th AAAI Conference on Artificial Intelligence, Phoenix, 2016. 2993--2999

[17] Chen Y B, Xu L H, Liu K, et al. Event extraction via dynamic multi-pooling convolutional neural networks. In: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics, Beijing, 2015. 167--176

[18] Nguyen T H, Grishman R. Event detection and domain adaptation with convolutional neural networks. In: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics, Beijing, 2015. 365--371

[19] Nguyen T H, Grishman R. Modeling skip-grams for event detection with convolutional neural networks. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, Austin, 2016. 886--891

[20] Liu J, Chen Y B, Liu K, et al. Event detection via gated multilingual attention mechanism. In: Proceedings of the 32nd AAAI Conference on Artificial Intelligence, New Orleans, 2018. 4865--4872

[21] Yang S, Feng D W, Qiao L B, et al. Exploring pre-trained language models for event extraction and generation. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, 2019. 5284--5294

[22] Liu X, Luo Z C, Huang H Y. Jointly multiple events extraction via attention-based graph information aggregation. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, Brussels, 2018. 1247--1256

[23] Peng H R, Song Y Q, Roth D. Event detection and co-reference with minimal supervision. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, Austin, 2016. 392--402

[24] Mihalcea R, Tarau P. TextRank: bringing order into text. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, Barcelona, 2004. 404--411

[25] Lamberti F, Sanna A, Demartini C. A relation-based PageRank algorithm for semantic web search engines. IEEE Trans Knowl Data Eng, 2009, 21: 123-136

[26] Avrachenkov K, Litvak N, Nemirovsky D. Monte Carlo methods in PageRank computation: when one iteration is sufficient. SIAM J Numer Anal, 2007, 45: 890-904

[27] Qiu L K, Zhang Y. ZORE: a syntax-based system for Chinese open relation extraction. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, Doha, 2014. 1870--1880

[28] Powers D M W. Evaluation: from precision, recall and F-measure to ROC, informedness, markedness & correlation. Journal of Machine Learning Technologies, 2011, 2(1): 37-63

[29] Pennington J, Socher R, Manning C. GloVe: global vectors for word representation. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, Doha, 2014. 1532--1543

[30] Dongsuk O, Kwon S, Kim K, et al. Word sense disambiguation based on word similarity calculation using word vector representation from a knowledge-based graph. In: Proceedings of the 27th International Conference on Computational Linguistics, Santa Fe, 2018. 2704--2714

[31] Che W X, Li Z H, Liu T. LTP: a Chinese language technology platform. In: Proceedings of the 23rd International Conference on Computational Linguistics, Beijing, 2010. 13--16

[32] Heafield K, Kayser M, Manning C D. Faster phrase-based decoding by refining feature state. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, Baltimore, 2014. 130--135

[33] Huang F, Yates A. Open-domain semantic role labeling by modeling word spans. In: Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, Uppsala, 2010. 968--978

[34] Tan C Q, Wei F R, Wang W H, et al. Multiway attention networks for modeling sentence pair. In: Proceedings of the 27th International Joint Conference on Artificial Intelligence, Stockholm, 2018. 4411--4417

[35] Gong Y C, Lu H, Zhang J. Natural language inference over interaction space. In: Proceedings of the International Conference on Learning Representations, Vancouver, 2018

[36] Kim S, Kang I, Kwak N. Semantic sentence matching with densely-connected recurrent and co-attentive information. In: Proceedings of the 33rd AAAI Conference on Artificial Intelligence, Honolulu, 2019. 6586--6593

[37] Devlin J, Chang M W, Lee K, et al. BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Minneapolis, 2019. 4171--4186

[38] Cui Y M, Che W X, Liu T, et al. Pre-training with whole word masking for Chinese BERT. arXiv preprint, 2019

[39] Zhang X X, Lapata M, Wei F R, et al. Neural latent extractive document summarization. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, Brussels, 2018. 779--784

[40] Zhou Q Y, Yang N, Wei F R, et al. Neural document summarization by jointly learning to score and select sentences. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, Melbourne, 2018. 654--663

[41] Jadhav A, Rajan V. Extractive summarization with SWAP-NET: sentences and words from alternating pointer networks. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, Melbourne, 2018. 142--151

### Figures

• Figure 1: Sentence-level event graph
• Figure 2: Sentence-level event polygon
• Figure 3: Passage-level event connection graph
• Figure 4: (Color online) Matrix of cosine similarity
• Figure 5: (Color online) An example of a passage-level event connection graph
• Figure 6: (Color online) Performance under different proportions of selected sentences
• Figure 7: (Color online) Performance under different numbers of selected words
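Figure 4 plots a matrix of pairwise cosine similarities. A minimal sketch of how such a matrix is computed, using hypothetical sentence embeddings rather than the paper's data:

```python
import math

def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def similarity_matrix(vectors):
    # Symmetric matrix of pairwise cosine similarities,
    # as visualized in Figure 4.
    return [[cosine(u, v) for v in vectors] for u in vectors]

# Toy example: three sentences with made-up 4-dimensional embeddings.
sents = [[1.0, 0.0, 2.0, 1.0],
         [0.5, 0.1, 1.8, 0.9],
         [0.0, 3.0, 0.0, 0.2]]
M = similarity_matrix(sents)
```

Here the first two vectors are nearly parallel, so `M[0][1]` is close to 1, while the third points in a different direction and scores near 0 against both.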

• Table 1   Results of different unsupervised methods a)

| Method | Data set | P (%) | R (%) | F1 |
|---|---|---|---|---|
| TF/IDF | Sogou | 84.67 | 55.43 | 0.67 |
| TF/IDF | ByteDance | 81.66 | 55.91 | 0.65 |
| TF/IDF | Manual | 50.79 | 10.66 | 0.17 |
| TF/IDF+Cosine | Sogou | 83.37 | 21.77 | 0.35 |
| TF/IDF+Cosine | ByteDance | 79.44 | 20.59 | 0.32 |
| TF/IDF+Cosine | Manual | 77.88 | 2.66 | 0.05 |
| TF+Cosine | Sogou | 84.44 | 64.33 | 0.73 |
| TF+Cosine | ByteDance | 82.17 | 65.12 | 0.73 |
| TF+Cosine | Manual | 63.43 | 38.01 | 0.42 |
| MI+Cosine | Sogou | 85.29 | 68.25 | 0.76 |
| MI+Cosine | ByteDance | 83.71 | 66.73 | 0.74 |
| MI+Cosine | Manual | 69.36 | 35.18 | 0.46 |
| WS+Cosine | Sogou | 86.45 | 66.29 | 0.75 |
| WS+Cosine | ByteDance | 86.41 | 65.11 | 0.74 |
| WS+Cosine | Manual | 70.41 | 33.97 | 0.45 |
| FE+Cosine | Sogou | 87.25 | 63.33 | 0.73 |
| FE+Cosine | ByteDance | 86.31 | 59.28 | 0.70 |
| FE+Cosine | Manual | 74.67 | 17.26 | 0.28 |
| TR+Cosine | Sogou | 84.67 | 78.52 | 0.81 |
| TR+Cosine | ByteDance | 82.79 | 77.31 | 0.80 |
| TR+Cosine | Manual | 67.13 | 46.13 | 0.54 |
| EN+Graph | Sogou | 81.22 | 72.19 | 0.76 |
| EN+Graph | ByteDance | 77.65 | 73.15 | 0.75 |
| EN+Graph | Manual | 33.15 | 42.67 | 0.37 |
| Ours | Sogou | 86.54 | 79.29 | 0.83 |
| Ours | ByteDance | 85.77 | 76.26 | 0.81 |
| Ours | Manual | 75.89 | 43.97 | 0.56 |

a) Bold indicates the maximum value in each column.
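The F1 column is the harmonic mean of precision and recall (cf. [28]), reported as a fraction while P and R are percentages. A quick sanity check against two Table 1 rows:

```python
def f1_score(p_pct, r_pct):
    # Harmonic mean of precision and recall.
    # Inputs are percentages; the result is a fraction, as in the tables.
    p, r = p_pct / 100.0, r_pct / 100.0
    return 2 * p * r / (p + r) if (p + r) else 0.0

# TF/IDF on Sogou: P = 84.67%, R = 55.43% -> F1 = 0.67
# TR+Cosine on Sogou: P = 84.67%, R = 78.52% -> F1 = 0.81
print(round(f1_score(84.67, 55.43), 2))  # 0.67
print(round(f1_score(84.67, 78.52), 2))  # 0.81
```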

• Table 2   Results of different supervised methods a)

| Method | Data set | P (%) | R (%) | F1 |
|---|---|---|---|---|
| MwAN | Sogou | 84.24 | 78.15 | 0.81 |
| MwAN | ByteDance | 84.56 | 75.93 | 0.80 |
| MwAN | Manual | 73.25 | 30.34 | 0.43 |
| DIIN | Sogou | 88.57 | 76.83 | 0.82 |
| DIIN | ByteDance | 85.36 | 77.22 | 0.82 |
| DIIN | Manual | 76.15 | 33.57 | 0.47 |
| DRCN | Sogou | 87.82 | 78.44 | 0.84 |
| DRCN | ByteDance | 84.51 | 76.78 | 0.81 |
| DRCN | Manual | 71.44 | 38.67 | 0.50 |
| Ours | Sogou | 86.54 | 79.29 | 0.83 |
| Ours | ByteDance | 85.77 | 76.26 | 0.81 |
| Ours | Manual | 75.89 | 43.97 | 0.56 |

a) Bold indicates the maximum value in each column.

• Table 3   Results of different pre-training methods a)

| Method | Data set | P (%) | R (%) | F1 |
|---|---|---|---|---|
| BERT | Sogou | 86.79 | 83.28 | 0.85 |
| BERT | ByteDance | 86.33 | 76.29 | 0.81 |
| BERT | Manual | 74.29 | 37.68 | 0.50 |
| BERT-wwm | Sogou | 93.78 | 82.88 | 0.88 |
| BERT-wwm | ByteDance | 92.13 | 78.89 | 0.85 |
| BERT-wwm | Manual | 77.29 | 39.17 | 0.52 |
| Ours | Sogou | 86.54 | 79.29 | 0.83 |
| Ours | ByteDance | 85.77 | 76.26 | 0.81 |
| Ours | Manual | 75.89 | 43.97 | 0.56 |

a) Bold indicates the maximum value in each column.

• Table 4   Results of different abstract-extraction based methods a)

| Method | Data set | P (%) | R (%) | F1 |
|---|---|---|---|---|
| Extract | Sogou | 85.13 | 78.46 | 0.82 |
| Extract | ByteDance | 83.81 | 77.59 | 0.81 |
| Extract | Manual | 74.31 | 33.26 | 0.46 |
| NEUSUM | Sogou | 83.15 | 76.11 | 0.79 |
| NEUSUM | ByteDance | 82.55 | 75.31 | 0.79 |
| NEUSUM | Manual | 75.44 | 36.48 | 0.49 |
| SWAP-NET | Sogou | 86.59 | 79.37 | 0.83 |
| SWAP-NET | ByteDance | 85.67 | 75.93 | 0.81 |
| SWAP-NET | Manual | 74.64 | 39.50 | 0.52 |
| Ours | Sogou | 86.54 | 79.19 | 0.83 |
| Ours | ByteDance | 85.77 | 76.16 | 0.81 |
| Ours | Manual | 75.89 | 43.97 | 0.56 |

a) Bold indicates the maximum value in each column.
