This work was supported by the National Key Research and Development Program of China (Grant No. 2018YFB1004202), the National Natural Science Foundation of China (Grant No. 61672078), and the State Key Laboratory of Software Development Environment of China (Grant No. SKLSDE-2018ZX-12).
Figure 1 (Color online) An example of a question on Stack Overflow.
Figure 2 (Color online) Percentage of comment activities.
Figure 3 (Color online) Active users in successive days.
Figure 4 (Color online) Overall framework of our method IEA.
Figure 5 The graphical model of TEM.
The number of answers per question | The number of questions |
0 | 3053 |
1 | 20423 |
2 | 10074 |
3 | 4066 |
4 | 1640 |
5 | 646 |
6 | 274 |
7 | 125 |
8 | 67 |
9 | 31 |
$\geqslant~10$ | 56 |
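To make the distribution above easier to read, the following minimal sketch tallies the counts from the table and reports cumulative shares. The names `answer_histogram` and `share_at_most` are illustrative, and storing the $\geqslant 10$ bucket under the key 10 is an assumption of this sketch.

```python
# Question counts per number of answers, copied from the table above.
# Key 10 stands for the ">= 10" bucket (an assumption of this sketch).
answer_histogram = {
    0: 3053, 1: 20423, 2: 10074, 3: 4066, 4: 1640, 5: 646,
    6: 274, 7: 125, 8: 67, 9: 31, 10: 56,
}

total_questions = sum(answer_histogram.values())


def share_at_most(max_answers: int) -> float:
    """Fraction of questions that received at most `max_answers` answers."""
    covered = sum(n for k, n in answer_histogram.items() if k <= max_answers)
    return covered / total_questions


if __name__ == "__main__":
    print(f"total questions: {total_questions}")
    for k in (0, 1, 2):
        print(f"<= {k} answers: {share_at_most(k):.1%}")
```

With these counts, roughly 58% of the questions have at most one answer.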
User ID | Activity | Creation time | Question ID |
3523446 | Answer | 2014-04-11 11:20 | 23011187 |
3523446 | Answer | 2014-04-13 04:31 | 23039131 |
3523446 | Answer | 2014-04-13 04:36 | 23039155 |
3523446 | Comment | 2014-04-13 05:47 | 23039155 |
3523446 | Answer | 2014-04-13 06:16 | 23039802 |
3523446 | Answer | 2014-04-24 12:57 | 23269620 |
3523446 | Answer | 2014-04-24 13:23 | 23270226 |
3523446 | Comment | 2014-04-24 13:29 | 23269620 |
3523446 | Answer | 2014-04-25 04:38 | 23284281 |
3523446 | Answer | 2014-04-25 09:47 | 23289638 |
3523446 | Answer | 2014-04-25 12:02 | 23292561 |
3523446 | Comment | 2014-04-25 12:32 | 23284281 |
3523446 | Comment | 2014-04-25 15:16 | 23284281 |
3523446 | Comment | 2014-04-25 15:25 | 23292561 |
3523446 | Comment | 2014-04-25 16:04 | 23284281 |
3523446 | Comment | 2014-04-26 02:24 | 23305872 |
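The activity log above can be summarized with a short script. The sketch below is only an illustration of how such a log might be aggregated: the names `LOG`, `summarize`, and `longest_streak` are hypothetical and do not come from the paper, and the rows are copied from the table for user 3523446.

```python
from collections import Counter
from datetime import datetime

# (activity, creation time, question ID) rows copied from the table above.
LOG = [
    ("Answer", "2014-04-11 11:20", 23011187),
    ("Answer", "2014-04-13 04:31", 23039131),
    ("Answer", "2014-04-13 04:36", 23039155),
    ("Comment", "2014-04-13 05:47", 23039155),
    ("Answer", "2014-04-13 06:16", 23039802),
    ("Answer", "2014-04-24 12:57", 23269620),
    ("Answer", "2014-04-24 13:23", 23270226),
    ("Comment", "2014-04-24 13:29", 23269620),
    ("Answer", "2014-04-25 04:38", 23284281),
    ("Answer", "2014-04-25 09:47", 23289638),
    ("Answer", "2014-04-25 12:02", 23292561),
    ("Comment", "2014-04-25 12:32", 23284281),
    ("Comment", "2014-04-25 15:16", 23284281),
    ("Comment", "2014-04-25 15:25", 23292561),
    ("Comment", "2014-04-25 16:04", 23284281),
    ("Comment", "2014-04-26 02:24", 23305872),
]


def summarize(log):
    """Return activity counts, sorted active days, and the comment share."""
    kinds = Counter(kind for kind, _, _ in log)
    days = sorted({datetime.strptime(ts, "%Y-%m-%d %H:%M").date()
                   for _, ts, _ in log})
    comment_share = kinds["Comment"] / len(log)
    return kinds, days, comment_share


def longest_streak(days):
    """Length of the longest run of consecutive calendar days in sorted `days`."""
    best, run, prev = 0, 0, None
    for day in days:
        run = run + 1 if prev is not None and (day - prev).days == 1 else 1
        best = max(best, run)
        prev = day
    return best


if __name__ == "__main__":
    kinds, days, comment_share = summarize(LOG)
    print(dict(kinds), len(days), f"{comment_share:.1%}", longest_streak(days))
```

For this user the log contains 9 answers and 7 comments spread over 5 active days, with a longest run of 3 consecutive days (2014-04-24 to 2014-04-26).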
Notation | Type | Description |
$~U~$ | Scalar | The total number of users |
$~N_{u}~$ | Scalar | The total number of questions and answers for user $~u~$ |
$~M_{u,n}~$ | Scalar | The total number of words in $~u~$'s $~n~$-th question or answer |
$~L_{u,n}~$ | Scalar | The total number of tags in $~u~$'s $~n~$-th question or answer |
$~K~$ | Scalar | The total number of topics |
$~E~$ | Scalar | The total number of expertise levels |
$~\alpha~$ | Scalar | Hyperparameter of the Dirichlet prior for the user topic distribution |
$~\beta~$ | Scalar | Hyperparameter of the Dirichlet prior for the user topical expertise distribution |
$~\eta~$ | Scalar | Hyperparameter of the Dirichlet prior for the topic-word distribution |
$~\gamma~$ | Scalar | Hyperparameter of the Dirichlet prior for the topic-tag distribution |
$~\alpha_{0}~$, $~\beta_{0}~$, $~\mu_{0}~$, $~k_{0}~$ | Scalar | Normal-Gamma parameters |
$~\theta_{u}~$ | Vector | Topic distribution for user $~u~$ |
$~\phi_{k}~$ | Vector | Word distribution for topic $~k~$ |
$~\varphi_{k}~$ | Vector | Tag distribution for topic $~k~$ |
$~\theta_{k,u}~$ | Vector | Expertise distribution for user $~u~$ under topic $~k~$ |
$~G(\mu_{e},~\Sigma_{e})~$ | Vector | Expertise-specific vote distribution |
Method | nDCG@1 | nDCG@5 | nDCG | Pearson | Kendall |
IEA | |||||
TEM | 0.6006 | 0.8131 | 0.8802 | 0.0559 | 0.0315 |
TTEA | 0.5784 | 0.8048 | 0.8719 | 0.1017 | 0.0085 |
TTEA-ACT | 0.5752 | 0.8020 | 0.8690 | 0.0580 | 0.0189 |
Comparison | nDCG@1 gain (%) | nDCG@5 gain (%) | nDCG gain (%) | Pearson gain (%) | Kendall gain (%) |
IEA vs. TEM | 10.29 $~\ast\ast\ast~$ | 2.68 $~\ast\ast~$ | 2.48 $~\ast\ast~$ | 236.20 $~\ast\ast\ast~$ | 424.18 $~\ast\ast~$ |
IEA vs. TTEA | 14.53 $~\ast~$ | 3.74 $~\ast~$ | 3.45 $~\ast~$ | 84.91 $~\ast\ast~$ | 1845.30 $~\ast\ast~$ |
IEA vs. TTEA-ACT | 15.17 $~\ast\ast~$ | 4.11 $~\ast\ast~$ | 3.79 $~\ast\ast~$ | 224.12 $~\ast\ast~$ | 772.60 $~\ast\ast~$ |
$~\ast\ast\ast$
Comparison | nDCG@1 ratio (%) | nDCG@5 ratio (%) | nDCG ratio (%) | Pearson ratio (%) | Kendall ratio (%) |
IEA vs. TEM | 89.38 | 87.19 | 87.19 | 82.51 | 88.05 |
IEA vs. TTEA | 85.63 | 83.44 | 83.44 | 79.88 | 85.13 |
IEA vs. TTEA-ACT | 86.88 | 85.31 | 85.31 | 79.30 | 86.59 |
Method | nDCG@1 | nDCG@5 | nDCG | Pearson | Kendall |
IEA | |||||
IEA-no-comment | 0.6555 | 0.8328 | 0.8998 | 0.1303 | 0.1602 |
Comparison | nDCG@1 gain (%) | nDCG@5 gain (%) | nDCG gain (%) | Pearson gain (%) | Kendall gain (%) |
IEA vs. IEA-no-comment | 1.05 | 0.26 | 0.2378 | 44.30 | 2.91 |
Comparison | nDCG@1 ratio (%) | nDCG@5 ratio (%) | nDCG ratio (%) | Pearson ratio (%) | Kendall ratio (%) |
IEA vs. IEA-no-comment | 95.31 | 94.06 | 94.06 | 93.29 | 94.46 |
Method | nDCG@1 | nDCG@5 | nDCG | Pearson | Kendall |
IEA | |||||
TEM | 0.6006 | 0.8131 | 0.8802 | 0.0559 | 0.0315 |
TA | 0.6333 | 0.8237 | 0.8908 | 0.1180 | 0.1029 |
EA | 0.6480 | 0.8297 | 0.8968 | 0.1216 | 0.1497 |
INT | 0.5204 | 0.7797 | 0.8467 | $-$0.0685 | $-$0.0908 |
EXP | 0.5586 | 0.7988 | 0.8659 | $-$0.0557 | $-$0.0122 |
ACT | 0.6250 | 0.8205 | 0.8876 | 0.0930 | 0.1063 |
Comparison | nDCG@1 gain (%) | nDCG@5 gain (%) | nDCG gain (%) | Pearson gain (%) | Kendall gain (%) |
IEA vs. TEM | 10.29 | 2.68 | 2.48 | 236.20 | 424.18 |
IEA vs. TA | 4.60 | 1.36 | 1.25 | 59.40 | 60.19 |
IEA vs. EA | 2.22 | 0.63 | 0.58 | 54.68 | 10.13 |
IEA vs. INT | 27.29 | 7.09 | 6.53 | $-$374.47 | $-$281.53 |
IEA vs. EXP | 18.58 | 4.52 | 4.17 | $-$437.70 | $-$1456.60 |
IEA vs. ACT | 5.99 | 1.75 | 1.62 | 102.19 | 55.07 |
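The gain columns can be read as relative improvements over each baseline. The sketch below assumes the definition gain = (IEA score − baseline score) / baseline score × 100, which this section does not state explicitly; the function names `relative_gain` and `implied_score` are hypothetical. Under this assumption a negative baseline flips the sign of the gain, which would explain the negative Pearson and Kendall gains in the INT and EXP rows.

```python
def relative_gain(ours: float, baseline: float) -> float:
    """Relative gain in percent: (ours - baseline) / baseline * 100.

    Assumed definition; the sign follows the sign of `baseline`.
    """
    return (ours - baseline) / baseline * 100.0


def implied_score(baseline: float, gain_percent: float) -> float:
    """Invert `relative_gain`: the score that yields `gain_percent` over `baseline`."""
    return baseline * (1.0 + gain_percent / 100.0)


if __name__ == "__main__":
    # Baseline scores and reported gains copied from the tables above.
    rows = {
        "TEM nDCG@1": (0.6006, 10.29),
        "EA nDCG@1": (0.6480, 2.22),
        "TEM Pearson": (0.0559, 236.20),
        "INT Pearson": (-0.0685, -374.47),
    }
    for name, (baseline, gain) in rows.items():
        # Estimates only, derived under the assumed gain definition.
        print(f"{name}: implied score = {implied_score(baseline, gain):.4f}")
```

If the assumed definition is right, the scores implied by different baselines for the same metric should agree up to rounding, which gives a quick consistency check on the reported gains. The implied values are estimates derived from the tables, not reported results.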
Setting | nDCG@1 | nDCG@5 | nDCG | Pearson | Kendall |
$T~=~1$ | 0.6417 | 0.8274 | 0.8944 | 0.1691 | 0.1310 |
$T~=~2$ | 0.6458 | 0.8288 | 0.8958 | 0.1394 | 0.1310 |
$T~=~3$ | 0.6432 | 0.8271 | 0.8942 | 0.1155 | 0.1111 |
$T~=~4$ | 0.6283 | 0.8218 | 0.8888 | 0.1274 | 0.1012 |
$T~=~5$ | 0.6420 | 0.8270 | 0.8941 | 0.1586 | 0.1127 |
$T~=~6$ | 0.6500 | 0.8309 | 0.8979 | 0.1843 | 0.1385 |
$T~=~7$ | 0.6464 | 0.8289 | 0.8960 | 0.1061 | 0.1277 |
$T~=~8$ | 0.6343 | 0.8247 | 0.8918 | 0.0886 | 0.1114 |
$T~=~9$ | | 0.8347 | 0.9018 | 0.1598 | 0.1583 |
$T~=~10$ | 0.6624 | ||||
$T~=~11$ | 0.6425 | 0.8271 | 0.8942 | 0.1437 | 0.1332 |
$T~=~12$ | 0.6231 | 0.8190 | 0.8861 | 0.0901 | 0.0856 |
$T~=~13$ | 0.6502 | 0.8303 | 0.8973 | 0.1309 | 0.1435 |
$T~=~14$ | 0.6246 | 0.8213 | 0.8883 | 0.1102 | 0.0869 |
$T~=~15$ | 0.6242 | 0.8208 | 0.8878 | 0.1277 | 0.1304 |
Setting | nDCG@1 | nDCG@5 | nDCG | Pearson | Kendall |
$E~=~1$ | 0.6250 | 0.8205 | 0.8876 | 0.0988 | 0.1063 |
$E~=~2$ | 0.6262 | 0.8202 | 0.8873 | 0.1061 | 0.0760 |
$E~=~3$ | 0.6257 | 0.8203 | 0.8873 | 0.1009 | 0.0950 |
$E~=~4$ | 0.6410 | 0.8266 | 0.8937 | 0.1341 | 0.1162 |
$E~=~5$ | 0.6437 | 0.8282 | 0.8953 | 0.1344 | 0.1087 |
$E~=~6$ | 0.6309 | 0.8236 | 0.8906 | 0.1146 | 0.1176 |
$E~=~7$ | 0.6187 | 0.8185 | 0.8855 | 0.0644 | 0.0784 |
$E~=~8$ | 0.6262 | 0.8218 | 0.8888 | 0.1493 | 0.1161 |
$E~=~9$ | 0.6070 | 0.8151 | 0.8821 | 0.1204 | 0.0712 |
$E~=~10$ | |||||
$E~=~11$ | 0.6300 | 0.8224 | 0.8895 | 0.0825 | 0.0898 |
$E~=~12$ | 0.6469 | 0.8301 | 0.8972 | 0.1397 | 0.1511 |
$E~=~13$ | 0.6328 | 0.8244 | 0.8915 | 0.1111 | 0.1201 |
$E~=~14$ | 0.6287 | 0.8229 | 0.8900 | 0.1090 | 0.1013 |
$E~=~15$ | 0.6377 | 0.8246 | 0.8916 | 0.1286 | 0.1210 |