The Quality of Bug Reports in Eclipse ETX'07

Published on October 26, 2007

Author: nicbet

Source: slideshare.net

Description

Talk given at the ETX 2007 Workshop (OOPSLA 2007, Montreal)

The Quality of Bug Reports in Eclipse
Nicolas Bettenburg, Sascha Just, Adrian Schröter, Cathrin Weiß, Rahul Premraj (Saarland University); Tom Zimmermann (University of Calgary)

Basis of Research: Bug Reports

[Slide shows the overlapping first pages of three research papers on bug reports. Their recoverable content is summarized here.]

Who Should Fix This Bug? John Anvik, Lyndon Hiew and Gail C. Murphy, Department of Computer Science, University of British Columbia (ICSE'06). Over a four-month period (January 1, 2005 to April 30, 2005), 3426 reports were filed for Eclipse, averaging 29 reports per day; at roughly five minutes per report, triage costs about two person-hours per day. Of these reports, 1190 (36%) were marked as invalid, a duplicate, a bug that could not be replicated, or one that will not be fixed. The paper presents a semi-automated approach that applies a machine learning algorithm to the open bug repository to learn which kinds of reports each developer resolves, and then suggests a small number of developers suitable to resolve a new report. It reaches precision levels of 57% and 64% on the Eclipse and Firefox projects respectively; results on gcc were less encouraging, hovering around 6% precision.

Detection of Duplicate Defect Reports Using Natural Language Processing. Per Runeson, Magnus Alexandersson and Oskar Nyholm, Software Engineering Research Group, Lund University (ICSE'07). The paper investigates Natural Language Processing (NLP) techniques for identifying duplicate defect reports: the words of a report are processed, and statistics on word occurrences are used to find similar reports. A prototype tool was evaluated in a case study at Sony Ericsson Mobile Communications. It identified about 40% of the marked duplicate defect reports, which the authors estimate to be about 2/3 of the duplicates that the technique can possibly find.

How Long will it Take to Fix This Bug? Cathrin Weiß, Rahul Premraj, Thomas Zimmermann and Andreas Zeller, Saarland University. The paper presents an approach that automatically predicts the fixing effort, i.e., the person-hours spent on fixing an issue. Given a new issue report, it uses the Lucene framework to search for similar, earlier reports and uses their average time as a prediction. The approach was evaluated on data from the JBoss project: given a sufficient number of issue reports, the predictions are close to the actual effort, and for issues that are bugs they are off by only one hour, beating naïve predictions by a factor of four. (Figure 1 in the paper: Predicting effort for an issue report.)

  • Basis of Research: Bug Reports. Good Reports [Slide repeats the three paper front pages from the previous slide, with the caption "Good Reports" overlaid.]
