Information and Software Technology 129 (2021) 106394
neural network-based classification [9]. Li et al. designed a new tag
recommendation approach TagDeepRec using attention-based Bi-LSTM
[10]. Experimental analysis showed that TagMulRec outperformed
EnTagRec [6], and TagDeepRec outperformed FastTagRec [10]. Our
experimental results show that, when recommending tags for pull requests,
our approach FNNRec achieves higher precision, recall, and F1-score
than TagDeepRec and TagMulRec.
Previous work [34] proposed a graph-based approach to assign tags
for repositories in GitHub. This work recommended tags to annotate
repositories and helped developers search repositories efficiently.
In contrast, our approach FNNRec recommends tags for
pull requests in a project.
Reviewer recommendation. There have been a number of studies
on reviewer recommendation for pull requests in GitHub [21,35–37].
Jiang et al. used support vector machines to analyze integrators’ pre-
vious decisions, and designed an approach CoreDevRec to recommend
integrators for pull requests [21]. Yu et al. built comment networks to
predict appropriate reviewers of incoming pull requests in GitHub [36,
37].
Unlike these works, we address a different problem: we design
an automatic approach that recommends tags rather than reviewers.
8. Conclusion
In this paper, we first conduct a survey on the usage of pull requests in
GitHub. Survey results show that tags are useful for developers to track,
search, or classify pull requests. However, it is difficult to choose the
right tags and to keep tags consistent. 60.61% of respondents think that a
tag recommendation tool is useful. To help developers choose tags,
we propose an approach, FNNRec. Based on titles, descriptions, file paths,
and contributors, FNNRec uses feed-forward neural networks to
compute probabilities and recommend tags. We evaluate the effectiveness
of FNNRec on 10 projects containing 68,497 pull requests and compare
it to TagDeepRec [10] and TagMulRec [8]. The experimental results show
that, on average across the 10 projects, FNNRec outperforms
TagDeepRec [10] and TagMulRec [8] by 62.985%
and 24.953% in terms of F1-score@3, respectively.
Therefore, we believe that FNNRec is useful for finding appropriate tags and
improving the tag-setting process in GitHub.
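The following minimal Python sketch illustrates how a top-3 recommendation and its F1-score@3 might be computed from per-tag probabilities. It is an illustration under stated assumptions, not the FNNRec implementation: the function names, the example tags and probabilities, and the textbook definitions of precision@k and recall@k are all assumptions made here, and the reported percentages are assumed to be relative gains over the baseline.

```python
from typing import Dict, List, Set


def recommend_top_k(tag_probabilities: Dict[str, float], k: int) -> List[str]:
    """Return the k tags with the highest predicted probability.

    Ties are broken alphabetically so the output is deterministic.
    """
    ranked = sorted(tag_probabilities.items(), key=lambda item: (-item[1], item[0]))
    return [tag for tag, _ in ranked[:k]]


def f1_at_k(recommended: List[str], actual: Set[str], k: int) -> float:
    """F1-score@k for one pull request.

    Assumed definitions (not taken from this section):
      precision@k = |top-k recommended AND actual| / k
      recall@k    = |top-k recommended AND actual| / |actual|
    """
    hits = len(set(recommended[:k]) & actual)
    if hits == 0:
        return 0.0
    precision = hits / k
    recall = hits / len(actual)
    return 2 * precision * recall / (precision + recall)


def relative_improvement(ours: float, baseline: float) -> float:
    """Relative gain of `ours` over `baseline`, in percent (assumed reading)."""
    return (ours - baseline) / baseline * 100.0


# Hypothetical probabilities, as a model's output layer might produce them.
probabilities = {"bug": 0.81, "docs": 0.12, "enhancement": 0.64, "tests": 0.33}
top3 = recommend_top_k(probabilities, k=3)
score = f1_at_k(top3, actual={"bug", "tests"}, k=3)
```

With these hypothetical inputs, two of the three recommended tags are correct, so precision@3 is 2/3, recall@3 is 1, and F1-score@3 is 0.8.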
CRediT authorship contribution statement
Jing Jiang: Conceptualization, Methodology, Writing - original
draft, Writing - review & editing. Qiudi Wu: Software, Validation,
Investigation. Jin Cao: Software, Investigation. Xin Xia: Methodology,
Writing - review & editing. Li Zhang: Conceptualization, Writing - re-
view & editing, Funding acquisition.
Declaration of Competing Interest
The authors declare that they have no known competing financial
interests or personal relationships that could have appeared to influence
the work reported in this paper.
Acknowledgment
This work is supported by the National Key Research and Develop-
ment Program of China No. 2018AAA0102304, the State Key Laboratory
of Software Development Environment under Grant No. SKLSDE-
2019ZX-05, Fundamental Research Funds for the Central Universities
under Grant No. YWF-20-BJ-J-1018 and the National Natural Science
Foundation of China under Grant No. 61732019.
References
[1] G. Gousios, A. Zaidman, M.-A. Storey, A. van Deursen, Work practices and
challenges in pull-based development: the integrator’s perspective. Proc. of ICSE,
Florence, Italy, 2015, pp. 1–11.
[2] G. Gousios, M.-A. Storey, A. Bacchelli, Work practices and challenges in pull-based
development: the contributor’s perspective. Proc. of ICSE, Austin, USA, 2016,
pp. 285–296.
[3] J. Cabot, J.L.C. Izquierdo, V. Cosentino, B. Rolandi, Exploring the use of labels to
categorize issues in open-source software projects. Proc. of SANER, 2015,
pp. 550–554.
[4] I. Steinmacher, I.S. Wiese, I. Polato, A.P. Chaves, M.A. Gerosa, M. Wessel, B.M. de
Souza, The power of bots: understanding bots in oss projects. Proc. of CSCW,
New York, USA, 2018, pp. 1–19.
[5] X. Xia, D. Lo, X. Wang, B. Zhou, Tag recommendation in software information sites.
Proc. of MSR, 2013, pp. 287–296.
[6] S. Wang, D. Lo, B. Vasilescu, A. Serebrenik, Entagrec: an enhanced tag
recommendation system for software information sites. Proc. of ICSME, 2014,
pp. 291–300.
[7] S. Wang, D. Lo, B. Vasilescu, A. Serebrenik, Entagrec++: an enhanced tag
recommendation system for software information sites, Empiric. Softw. Eng. 23
(2018) 800–832.
[8] P. Zhou, J. Liu, Z. Yang, G. Zhou, Scalable tag recommendation for software
information sites. Proc. of SANER, 2017, pp. 272–282.
[9] J. Liu, P. Zhou, Z. Yang, X. Liu, J. Grundy, Fasttagrec: fast tag recommendation for
software information sites, Automat. Softw. Eng. 25 (2018) 675–701.
[10] C. Li, L. Xu, M. Yan, J. He, Z. Zhang, Tagdeeprec: tag recommendation for software
information sites using attention-based bi-lstm. Proc. of KSEM, Athens, Greece,
2019, pp. 11–24.
[11] M.-L. Zhang, Z.-H. Zhou, Multi-label neural networks with applications to
functional genomics and text categorization, IEEE Trans. Knowl. Data Eng. 18 (10)
(2006) 1338–1351.
[12] J. Tsay, L. Dabbish, J. Herbsleb, Let’s talk about it: evaluating contributions
through discussion in github. Proc. of FSE, Hong Kong, China, 2014, pp. 144–154.
[13] S. Yu, L. Xu, Y. Zhang, J. Wu, Nbsl: a supervised classification model of pull request
in github. Proc. of ICC, Kansas City, USA, 2018, pp. 1–6.
[14] B. Vasilescu, Y. Yu, H. Wang, P. Devanbu, V. Filkov, Quality and productivity
outcomes relating to continuous integration in github. Proc. of FSE, Bergamo,
Italy, 2015.
[15] G. Gousios, M. Pinzger, A. van Deursen, An exploratory study of the pull-based
software development model. Proc. of ICSE, Hyderabad, India, 2014, pp. 345–355.
[16] J. Cohen, Weighted chi square: an extension of the kappa method, Educ. Psychol.
Meas. 32 (1) (1972) 61–74.
[17] Y.A. Llave, T. Hagiwara, T. Sakiyama, Artificial neural network model for
prediction of cold spot temperature in retort sterilization of starch-based foods,
J. Food Eng. 109 (3) (2012) 553–560.
[18] J. Anvik, L. Hiew, G.C. Murphy, Who should fix this bug? Proc. of the 28th ICSE,
Shanghai, China, 2006, pp. 361–370.
[19] D. Matter, A. Kuhn, O. Nierstrasz, Assigning bug reports using a vocabulary-based
expertise model of developers. Proc. of MSR, Vancouver, Canada, 2009,
pp. 131–140.
[20] X. Xia, D. Lo, X. Wang, B. Zhou, Accurate developer recommendation for bug
resolution. Proc. of WCRE, Koblenz, Germany, 2013, pp. 72–81.
[21] J. Jiang, J.-H. He, X.-Y. Chen, Coredevrec: automatic core member
recommendation for contribution evaluation, J. Comput. Sci. Technol. 30 (5)
(2015) 998–1016.
[22] P. Thongtanunam, C. Tantithamthavorn, R.G. Kula, N. Yoshida, H. Iida, K.-i.
Matsumoto, Who should review my code? A file location-based code-reviewer
recommendation approach for modern code review. Proc. of SANER, Montreal,
Canada, 2015, pp. 141–150.
[23] Y. Zhang, S. Wang, G. Ji, P. Phillips, Fruit classification using computer vision and
feedforward neural network, J. Food Eng. 143 (2014) 167–177.
[24] S.F. Crone, N. Kourentzes, Feature selection for time series prediction: a combined
filter and wrapper approach for neural networks, Neurocomputing 73 (10–12)
(2010) 1923–1936.
[25] M.B. Zanjani, H. Kagdi, C. Bird, Automatically recommending peer reviewers in
modern code review, IEEE Trans. Softw. Eng. 42 (6) (2016) 530–543.
[26] Z. Liu, X. Xia, C. Treude, D. Lo, S. Li, Automatic generation of pull request
descriptions. Proc. of ASE, San Diego, USA, 2019, pp. 1–13.
[27] J. Zhang, X. Wang, H. Zhang, H. Sun, K. Wang, X. Liu, A novel neural source code
representation based on abstract syntax tree. Proc. of ICSE, Montreal, Canada, 2019,
pp. 783–794.
[28] T.F. Bissyande, D. Lo, L. Jiang, L. Reveillere, J. Klein, Y.L. Traon, Got issues? who
cares about it? A large scale investigation of issue trackers from github. Proc. of
ISSRE, Washington DC, USA, 2013.
[29] J.L.C. Izquierdo, V. Cosentino, B. Rolandi, A. Bergel, J. Cabot, Gila: Github label
analyzer. Proc. of SANER, 2015, pp. 479–483.
[30] T. Wang, H. Wang, G. Yin, C.X. Ling, X. Li, P. Zou, Tag recommendation for open
source software, Front. Comput. Sci. 8 (1) (2014) 69–82.
[31] J.M. Al-Kofahi, A. Tamrawi, T.T. Nguyen, H.A. Nguyen, T.N. Nguyen, Fuzzy set
approach for automatic tagging in evolving software. Proc. of ICSM, 2010,
pp. 1–10.
[32] F.M. Belem, J.M. Almeida, M.A. Goncalves, A survey on tag recommendation
methods, J. Assoc. Inf. Sci. Technol. 68 (4) (2017) 830–844.
[33] X.-Y. Wang, X. Xia, D. Lo, Tagcombine: recommending tags to contents in software
information sites, J. Comput. Sci. Technol. 30 (5) (2015) 1017–1035.
J. Jiang et al.