
SCIENTIA SINICA Informationis, Volume 48, Issue 11: 1521-1532 (2018) https://doi.org/10.1360/N112018-00208

Knowledge-representation-enhanced question-answering system

  • Received: Aug 10, 2018
  • Accepted: Oct 22, 2018
  • Published: Nov 15, 2018

Abstract


Funded by

National Natural Science Foundation of China (61433015, 61572477, 61772505)

Young Elite Scientists Sponsorship Program by CAST (YESS20160177)


References

[1] Bollacker K, Evans C, Paritosh P, et al. Freebase: a collaboratively created graph database for structuring human knowledge. In: Proceedings of ACM SIGMOD International Conference on Management of Data, 2008. 1247--1250.

[2] Suchanek F M, Kasneci G, Weikum G. Yago: a core of semantic knowledge. In: Proceedings of International Conference on World Wide Web, 2007. 697--706.

[3] Miller G A. WordNet: a lexical database for English. Commun ACM, 1995, 38: 39--41.

[4] Zettlemoyer L S, Collins M. Learning context-dependent mappings from sentences to logical form. In: Proceedings of the Meeting of the Association for Computational Linguistics, 2009. 976--984.

[5] Kwiatkowski T, Choi E, Artzi Y, et al. Scaling semantic parsers with on-the-fly ontology matching. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, 2013. 1545--1556.

[6] Golub D, He X D. Character-level question answering with attention. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, 2016. 1598--1607.

[7] Bordes A, Usunier N, Chopra S, et al. Large-scale simple question answering with memory networks. arXiv preprint, 2015.

[8] Yin W P, Yu M, Xiang B, et al. Simple question answering by attentive convolutional neural network. arXiv preprint, 2016.

[9] Zhou B T, Sun C J, Lin L, et al. LSTM based question answering for large scale knowledge base. Acta Sci Natl Univ Pekinensis, 2018, 54: 286--292.

[10] Hao Y C, Zhang Y Z, Liu K, et al. An end-to-end model for question answering over knowledge base with cross-attention combining global knowledge. In: Proceedings of Meeting of the Association for Computational Linguistics, 2017. 221--231.

[11] Bordes A, Usunier N, Garcia-Duran A, et al. Translating embeddings for modeling multi-relational data. In: Proceedings of Advances in Neural Information Processing Systems, 2013. 2787--2795.

[12] Collobert R, Weston J. A unified architecture for natural language processing: deep neural networks with multitask learning. In: Proceedings of the 25th International Conference on Machine Learning, 2008. 160--167.

[13] Mikolov T, Sutskever I, Chen K, et al. Distributed representations of words and phrases and their compositionality. In: Proceedings of Advances in Neural Information Processing Systems, 2013. 3111--3119.

[14] Pennington J, Socher R, Manning C D. GloVe: global vectors for word representation. In: Proceedings of Conference on Empirical Methods in Natural Language Processing, 2014. 1532--1543.

[15] Zhang X, Zhao J B, LeCun Y. Character-level convolutional networks for text classification. In: Proceedings of International Conference on Neural Information Processing Systems, 2015. 649--657.

[16] Ling W, Luís T, Marujo L, et al. Finding function in form: compositional character models for open vocabulary word representation. arXiv preprint, 2015.

[17] Mitchell J, Lapata M. Composition in distributional models of semantics. Cognitive Sci, 2010, 34: 1388--1429.

[18] Paulus R, Socher R, Manning C D. Global belief recursive neural networks. In: Proceedings of Advances in Neural Information Processing Systems, 2014. 2888--2896.

[19] Melamud O, Goldberger J, Dagan I. Context2vec: learning generic context embedding with bidirectional LSTM. In: Proceedings of SIGNLL Conference on Computational Natural Language Learning, 2016. 51--61.

[20] Vaswani A, Shazeer N, Parmar N, et al. Attention is all you need. arXiv preprint, 2017.

[21] Bordes A, Glorot X, Weston J. A semantic matching energy function for learning with multi-relational data. Mach Learn, 2014, 94: 233--259.

[22] Bordes A, Weston J, Collobert R, et al. Learning structured embeddings of knowledge bases. In: Proceedings of AAAI Conference on Artificial Intelligence, 2011.

[23] Socher R, Chen D, Manning C D, et al. Reasoning with neural tensor networks for knowledge base completion. In: Proceedings of Advances in Neural Information Processing Systems, 2013. 926--934.

[24] Sutskever I, Salakhutdinov R, Tenenbaum J B. Modelling relational data using Bayesian clustered tensor factorization. In: Proceedings of Advances in Neural Information Processing Systems, 2009. 1821--1828.

[25] Yang B, Yih W T, He X. Learning multi-relational semantics using neural-embedding models. arXiv preprint, 2014.

[26] Trouillon T, Welbl J, Riedel S, et al. Complex embeddings for simple link prediction. In: Proceedings of International Conference on Machine Learning, 2016. 2071--2080.

[27] Wang Z, Zhang J W, Feng J L, et al. Knowledge graph embedding by translating on hyperplanes. In: Proceedings of the 28th AAAI Conference on Artificial Intelligence, 2014. 1112--1119.

[28] Fan M, Zhou Q, Chang E, et al. Transition-based knowledge graph embedding with relational mapping properties. In: Proceedings of Pacific Asia Conference on Language, Information and Computation, 2014. 328--337.

[29] Lin Y K, Liu Z Y, Sun M S, et al. Learning entity and relation embeddings for knowledge graph completion. In: Proceedings of the 29th AAAI Conference on Artificial Intelligence, 2015. 2181--2187.

[30] Lin Y K, Liu Z Y, Luan H B, et al. Modeling relation paths for representation learning of knowledge bases. arXiv preprint, 2015.

[31] Wang Q, Wang B, Guo L. Knowledge base completion using embeddings and rules. In: Proceedings of the International Joint Conference on Artificial Intelligence, 2015. 1859--1865.

[32] Rocktäschel T, Singh S, Riedel S. Injecting logical background knowledge into embeddings for relation extraction. In: Proceedings of Conference of the North American Chapter of the Association for Computational Linguistics, 2015. 1119--1129.

[33] Guo S, Wang Q, Wang L H, et al. Jointly embedding knowledge graphs and logical rules. In: Proceedings of Conference on Empirical Methods in Natural Language Processing, 2016. 192--202.

[34] Milajevs D, Kartsaklis D, Sadrzadeh M, et al. Evaluating neural word representations in tensor-based compositional settings. arXiv preprint, 2014.

[35] Milajevs D, Kartsaklis D, Sadrzadeh M, et al. Evaluating neural word representations in tensor-based compositional settings. arXiv preprint, 2014.

[36] Zettlemoyer L S, Collins M. Learning to map sentences to logical form: structured classification with probabilistic categorial grammars. arXiv preprint, 2012.

[37] Lukovnikov D, Fischer A, Lehmann J. Neural network-based question answering over knowledge graphs on word and character level. In: Proceedings of International Conference on World Wide Web, 2017. 1211--1220.

[38] Dai Z H, Li L, Xu W. CFO: conditional focused neural question answering with large-scale knowledge bases. arXiv preprint, 2016.

  • Figure 1

    (Color online) The framework of character-based word representation learning
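    As a rough illustration of the character-based word representation in Figure 1, the sketch below embeds the characters of a word, applies a convolution with max-over-time pooling, and concatenates the result with an ordinary word embedding. This is a minimal sketch in PyTorch under assumed dimensions, filter width, and combination scheme, not the paper's exact configuration.

import torch
import torch.nn as nn

class CharWordRepresentation(nn.Module):
    """Word vector = [word embedding; max-pooled CNN over the word's characters]."""
    def __init__(self, n_chars=128, n_words=10000, char_dim=32, word_dim=100, n_filters=100):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, char_dim, padding_idx=0)
        self.conv = nn.Conv1d(char_dim, n_filters, kernel_size=3, padding=1)
        self.word_emb = nn.Embedding(n_words, word_dim)

    def forward(self, word_id, char_ids):
        # char_ids: (batch, max_word_len) character indices of each word
        c = self.char_emb(char_ids).transpose(1, 2)              # (batch, char_dim, max_word_len)
        char_vec = torch.relu(self.conv(c)).max(dim=2).values    # max-over-time pooling
        return torch.cat([self.word_emb(word_id), char_vec], dim=-1)

# Example: representations for two words, each padded to 8 characters.
model = CharWordRepresentation()
rep = model(torch.tensor([5, 17]), torch.randint(1, 128, (2, 8)))
print(rep.shape)  # torch.Size([2, 200])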

  •   

    Algorithm 1 Knowledge base question answering system enhanced with knowledge representation

    Require: triple set $\mathbb{T}$; question–entity/relation pairs; sets of entity names, entity type names, and relation descriptions;

    Output: character vectors $\mathbb{C}$, word vectors $\mathbb{W}$, CNN model, BiLSTM model;

    Construct the negative triple set $\mathbb{T'}$ from the triple set $\mathbb{T}$;

    Build the vocabulary and initialize the vectors and the combined CNN and BiLSTM model;

    while not converged and the stopping condition is not reached do

    Train the knowledge representation model using the triples, textual descriptions, and related information;

    Update the character and word vectors and the combined CNN and BiLSTM model;

    Train the knowledge-base question answering system using the QA data, word vectors, and the combined model;

    Update the character and word vectors and the combined CNN and BiLSTM model;

    end while
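
    The alternation in Algorithm 1, in which the knowledge representation objective and the QA objective are trained in turn while sharing the character vectors, word vectors, and the combined CNN and BiLSTM encoder, can be sketched as follows. This is a minimal sketch only, assuming PyTorch; the encoder sizes, dummy batches, TransE-style margin loss, and cosine matching loss are illustrative placeholders rather than the authors' exact implementation.

import torch
import torch.nn as nn

class SharedEncoder(nn.Module):
    """Character CNN + word embeddings + BiLSTM, shared by the KR and QA objectives."""
    def __init__(self, n_chars=128, n_words=10000, char_dim=32, word_dim=100, hidden=100):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, char_dim)
        self.char_cnn = nn.Conv1d(char_dim, word_dim, kernel_size=3, padding=1)
        self.word_emb = nn.Embedding(n_words, word_dim)
        self.bilstm = nn.LSTM(2 * word_dim, hidden, bidirectional=True, batch_first=True)

    def forward(self, word_ids, char_ids):
        # char_ids: (batch, seq_len, chars_per_word); max pooling gives one vector per word
        b, s, c = char_ids.shape
        chars = self.char_emb(char_ids.reshape(b * s, c)).transpose(1, 2)
        char_vec = self.char_cnn(chars).max(dim=2).values.reshape(b, s, -1)
        words = torch.cat([self.word_emb(word_ids), char_vec], dim=-1)
        out, _ = self.bilstm(words)
        return out.mean(dim=1)  # fixed-size vector for a word sequence

encoder = SharedEncoder()
optimizer = torch.optim.Adam(encoder.parameters(), lr=1e-3)
margin_loss = nn.MarginRankingLoss(margin=1.0)

for epoch in range(5):  # "while not converged" in Algorithm 1
    # Step 1: knowledge representation objective over positive and negative triples.
    # Random index tensors stand in for encoded entity names / relation descriptions.
    pos = encoder(torch.randint(0, 10000, (8, 6)), torch.randint(0, 128, (8, 6, 10)))
    neg = encoder(torch.randint(0, 10000, (8, 6)), torch.randint(0, 128, (8, 6, 10)))
    anchor = torch.randn(8, 200)  # placeholder for the translated head+relation vector
    kr_loss = margin_loss(-(anchor - pos).norm(dim=1),
                          -(anchor - neg).norm(dim=1),
                          torch.ones(8))
    optimizer.zero_grad()
    kr_loss.backward()
    optimizer.step()

    # Step 2: QA objective over the knowledge base, reusing the same encoder, so the
    # character/word vectors and CNN+BiLSTM parameters are updated jointly.
    question = encoder(torch.randint(0, 10000, (8, 12)), torch.randint(0, 128, (8, 12, 10)))
    fact = encoder(torch.randint(0, 10000, (8, 4)), torch.randint(0, 128, (8, 4, 10)))
    qa_loss = (1 - nn.functional.cosine_similarity(question, fact)).mean()
    optimizer.zero_grad()
    qa_loss.backward()
    optimizer.step()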

  • Table 1   Results on the SimpleQuestions dataset
    Method                  Precision (%)   Relative improvement of ours
    Bordes et al. [7]       62.7            +15.3%
    Yin et al. [8]          68.3            +5.85%
    Dai et al. [38]         62.6            +15.5%
    Golub and He [6]        70.9            +1.97%
    Lukovnikov et al. [37]  71.2            +1.54%
    Ours                    72.3
  • Table 2   Results achieved with different modules of our model
    Number  Method                                                      Precision (%)
    1       Random (50)                                                 2.1
    2       WE + BiLSTM                                                 43.1
    3       WE + BiLSTM + CharCNN                                       62.3
    4       WE + BiLSTM + CharCNN + attention                           66.6
    5       WE + BiLSTM + CharCNN + attention + KB Structure            71.3
    6       WE + BiLSTM + CharCNN + attention + KB Structure + Jointly  72.3
  • Table 3   The impact of lost entities on accuracy
    Entity recall (%)   Overall precision (%)   Precision after filtering out non-recalled entities (%)   Improvement
    88.1                72.3                    82.1                                                      +12.3%