
SCIENTIA SINICA Informationis, Volume 48, Issue 11: 1533-1545 (2018) https://doi.org/10.1360/N112018-00157

Combining entity co-occurrence information and sentence semantic features for relation extraction

  • Received: Oct 5, 2018
  • Accepted: Oct 30, 2018
  • Published: Nov 14, 2018

Abstract


Funded by

National Key Research and Development Program of China (2018YFC0830200)


References

[1] Jurafsky D, Martin J H. Speech and Language Processing. Beijing: Publishing House of Electronics Industry, 2018.

[2] Santos C N D, Xiang B, Zhou B W. Classifying relations by ranking with convolutional neural networks. In: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics, 2015.

[3] Appelt D E, Bear J, Hobbs J R, et al. SRI International FASTUS system: MUC-4 test results and analysis. In: Proceedings of the 4th Conference on Message Understanding, 1992. 143--147.

[4] Yangarber R, Grishman R. NYU: description of the Proteus/PET system as used for MUC-7 ST. In: Proceedings of the 7th Message Understanding Conference, 1998.

[5] Zhou G D, Su J, Zhang J, et al. Exploring various knowledge in relation extraction. In: Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics, 2005.

[6] Xi B, Qian L H, Zhou G D, et al. The application of combined linguistic features in semantic relation extraction. J Chinese Inf Process, 2008, 22: 44--49.

[7] Miao Q L, Zhang S, Zhang B, et al. Extracting and visualizing semantic relationships from Chinese biomedical text. In: Proceedings of the 26th Pacific Asia Conference on Language, Information and Computation, 2012. 99--107.

[8] Zeng D J, Liu K, Lai S W, et al. Relation classification via convolutional deep neural network. In: Proceedings of the 25th International Conference on Computational Linguistics, 2014. 23--29.

[9] Wang L L, Cao Z, Melo G D, et al. Relation classification via multi-level attention CNNs. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 2016. 1298--1307.

[10] Ji G L, Liu K, He S Z, et al. Distant supervision for relation extraction with sentence-level attention and entity descriptions. In: Proceedings of the 31st AAAI Conference on Artificial Intelligence, 2017. 3060--3066.

[11] Socher R, Huval B, Manning C D, et al. Semantic compositionality through recursive matrix-vector spaces. In: Proceedings of the Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, 2012. 1201--1211.

[12] Hashimoto K, Miwa M, Tsuruoka Y, et al. Simple customization of recursive neural networks for semantic relation classification. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, 2013. 1372--1376.

[13] Miwa M, Bansal M. End-to-end relation extraction using LSTMs on sequences and tree structures. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 2016. 1105--1116.

[14] Bahdanau D, Cho K, Bengio Y. Neural machine translation by jointly learning to align and translate. 2014. arXiv preprint.

[15] Hendrickx I, Kim S N, Kozareva Z, et al. SemEval-2010 task 8: multi-way classification of semantic relations between pairs of nominals. In: Proceedings of the Workshop on Semantic Evaluations: Recent Achievements and Future Directions, 2009. 94--99.

[16] Cortes C, Vapnik V. Support-vector networks. Mach Learn, 1995, 20: 273--297.

[17] Jaynes E T. Information theory and statistical mechanics. Phys Rev, 1957, 106: 620--630.

[18] Lafferty J D, McCallum A, Pereira F C N. Conditional random fields: probabilistic models for segmenting and labeling sequence data. In: Proceedings of the 18th International Conference on Machine Learning, 2001. 282--289.

[19] Zhang Y M, Zhou J F. A trainable method for extracting Chinese entity names and their relations. In: Proceedings of the 2nd Workshop on Chinese Language Processing, Held in Conjunction with the 38th Annual Meeting of the Association for Computational Linguistics, 2000. 66--72.

[20] Suchanek F M, Ifrim G, Weikum G. Combining linguistic and statistical analysis to extract relations from web documents. In: Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2006. 712--717.

[21] Mikolov T, Sutskever I, Chen K, et al. Distributed representations of words and phrases and their compositionality. Adv Neural Inf Process Syst, 2013, 26: 3111--3119.

[22] Kambhatla N. Combining lexical, syntactic, and semantic features with maximum entropy models for extracting relations. In: Proceedings of the ACL 2004 Interactive Poster and Demonstration Sessions, 2004.

[23] Zhou P, Shi W, Tian J, et al. Attention-based bidirectional long short-term memory networks for relation classification. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 2016. 207--212.

[24] Mikolov T, Chen K, Corrado G, et al. Efficient estimation of word representations in vector space. 2013. arXiv preprint.

[25] Cho K, van Merrienboer B, Gulcehre C, et al. Learning phrase representations using RNN encoder-decoder for statistical machine translation. 2014. arXiv preprint.

[26] Lin Z H, Feng M W, Santos C N D, et al. A structured self-attentive sentence embedding. In: Proceedings of the International Conference on Learning Representations (ICLR), 2017.

[27] Kingma D P, Ba J. Adam: a method for stochastic optimization. In: Proceedings of the International Conference on Learning Representations (ICLR), 2015.

  • Figure 1

    (Color online) Overall framework of CNSSNN

  • Figure 2

    (Color online) P-R curves of different approaches. (a) SemEval; (b) CnNews

  • Table 1   Symbols and their descriptions
    $C$: the corpus
    $s$: a sentence in corpus $C$
    $\boldsymbol{S}$: the matrix representation of sentence $s$
    $w$: a word in a sentence
    $\boldsymbol{w}$: the vector representation of word $w$
    $e$: an entity
    $\boldsymbol{f}^c$: corpus-level features
    $\boldsymbol{f}^s$: sentence-level features
    $\boldsymbol{f}$: features of the entity pair after feature combination
  • Table 2   Numbers of samples of each label in the labeled dataset
    “hold”: 1031
    “study at”: 923
    “work at”: 3033
    “others”: 4053
    Total: 9040
  • Table 3   Numbers of samples of each label in the SemEval dataset
    “others”: 1864
    “cause-effect”: 1331
    “instrument-agency”: 660
    “product-producer”: 948
    “content-container”: 732
    “entity-origin”: 974
    “entity-destination”: 1137
    “component-whole”: 1253
    “member-collection”: 923
    “message-topic”: 895
    Total: 10717
  • Table 4   Hyper-parameter settings of CNSSNN
    u: 100
    layer_num: 1
    $q$: 64
    batchsize: 250
    learning_rate: 1E$-$4
    $d$: 400
    $d_p$: 1
  • Table 5   Performance comparison of different relation extraction approaches on all labels
    Model $F1$ on SemEval (%) $F1$ on CnNews (%)
    CNN 80.43 85.32
    CR-CNN 81.09 86.47
    GRU 81.52 86.83
    ATT-GRU 83.69 88.15
    CNSSNN (ours) 85.99 90.34
  • Table 6   Performance comparison of different relation extraction approaches without “other” label
    Model | SemEval: Precision (%) / Recall (%) / $F1$ (%) | CnNews: Precision (%) / Recall (%) / $F1$ (%)
    CNN | 84.00 / 79.82 / 81.76 | 86.79 / 82.87 / 84.83
    CR-CNN | 84.20 / 80.82 / 82.40 | 86.85 / 85.99 / 85.86
    GRU | 82.85 / 81.07 / 81.94 | 85.91 / 86.16 / 86.21
    ATT-GRU | 85.19 / 83.07 / 84.06 | 87.21 / 87.93 / 87.58
    CNSSNN (ours) | 87.51 / 85.66 / 86.56 | 92.83 / 86.15 / 89.72
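For reference, the evaluation quantities behind Figure 2 and Tables 5 and 6 can be sketched in a few lines of Python. This is an illustrative sketch, not the paper's code; note also that the reported $F1$ values are presumably macro-averaged over relation classes, so the harmonic mean of the averaged precision and recall in Table 6 only approximates them.

```python
def pr_curve(scores, labels):
    """Precision-recall pairs traced by sweeping a decision threshold
    over predictions ranked from highest to lowest score (the kind of
    curve plotted in Figure 2). `labels` are 1 (correct) or 0."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    total_pos = sum(labels)
    tp = 0
    curve = []
    for rank, i in enumerate(order, start=1):
        tp += labels[i]
        curve.append((tp / rank, tp / total_pos))  # (precision, recall)
    return curve

def f1(precision, recall):
    """Harmonic mean of precision and recall (both in percent)."""
    return 2 * precision * recall / (precision + recall)

# Toy ranking: two correct predictions among three, scored by confidence.
curve = pr_curve([0.9, 0.8, 0.1], [1, 0, 1])
# CNSSNN's SemEval row in Table 6: 87.51 precision, 85.66 recall.
semeval_f1 = f1(87.51, 85.66)  # close to the 86.56 reported
```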