Predicting Defects using Network Analysis on Dependency Graphs


Published on June 23, 2008

Author: tom.zimmermann

Source: slideshare.net

Description

Presented at ICSE 2008.

Predicting Defects using Network Analysis on Dependency Graphs
Thomas Zimmermann, University of Calgary, Canada
Nachiappan Nagappan, Microsoft Research, USA

Bugs are everywhere


Quality assurance is limited by time and by money.

Spend resources on the components that need them most, i.e., those most likely to fail.

Meet Jacob
• Your QA manager
• Ten years' knowledge of your project
• Aware of its history and the hot spots

But then Jacob left...

Meet Emily
• Your new QA manager (replaces Jacob)
• Not much experience with your project yet
• How can she allocate resources effectively?

Indicators of defects
• Code complexity
  - Basili et al. 1996, Subramanyam and Krishnan 2003, Binkley and Schach 1998, Ohlsson and Alberg 1996, Nagappan et al. 2006
• Code churn
  - Nagappan and Ball 2005
• Historical data
  - Khoshgoftaar et al. 1996, Graves et al. 2000, Kim et al. 2007, Ostrand et al. 2005, Mockus et al. 2005
• Code dependencies
  - Nagappan and Ball 2007, Schröter et al. 2006, Zimmermann and Nagappan 2007

Centrality

Hypotheses
Network measures on dependency graphs
- correlate with the number of post-release defects (H1)
- can predict the number of post-release defects (H2)
- can indicate critical “escrow” binaries (H3)

DATA

2252 Binaries 28.3 MLOC

Windows Server layout


Data collection
• Release point for Windows Server 2003: complexity metrics, dependencies, network measures
• Six months to collect defects: defects

Dependencies
• Directed relationship between two pieces of code (here: binaries)
• MaX dependency analysis framework
  - Caller-callee dependencies
  - Imports and exports
  - RPC, COM
  - Runtime dependencies (such as LoadLibrary)
  - Registry access
  - etc.

Centrality
• Degree centrality
  - counts the number of dependencies
• Closeness centrality
  - takes distance to all other binaries into account
  - Closeness: How close are the other binaries?
  - Reach: How many binaries can be reached (weighted)?
  - Eigenvector: similar to PageRank
• Betweenness centrality
  - counts the number of shortest paths through a binary
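
The degree and closeness ideas above can be sketched in a few lines of Python. This is an illustration only, not the authors' tooling: the binary names are hypothetical, an edge A -> B is assumed to mean "A depends on B", and the closeness formula (reachable binaries divided by the sum of their distances) is one common variant chosen here for simplicity.

```python
from collections import deque

# Hypothetical dependency graph: an edge A -> B means "A depends on B".
DEPENDENCIES = {
    "app.exe": ["ui.dll", "net.dll"],
    "ui.dll": ["core.dll"],
    "net.dll": ["core.dll"],
    "core.dll": [],
}

def out_degree(graph, node):
    """Number of binaries that `node` depends on."""
    return len(graph[node])

def in_degree(graph, node):
    """Number of binaries that depend on `node`."""
    return sum(node in targets for targets in graph.values())

def distances_from(graph, source):
    """Breadth-first shortest-path lengths from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for neighbor in graph[current]:
            if neighbor not in dist:
                dist[neighbor] = dist[current] + 1
                queue.append(neighbor)
    return dist

def out_closeness(graph, source):
    """Reachable binaries divided by the sum of their distances (assumed variant)."""
    dist = distances_from(graph, source)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0
```

For example, `in_degree(DEPENDENCIES, "core.dll")` counts the two binaries that depend directly on `core.dll`, while `out_closeness(DEPENDENCIES, "app.exe")` reflects how near its three dependencies are.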

Structural holes
[Figure: two small networks over nodes A, B, and C, labeled "No structural hole" and "No structural hole between B and C"]
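
As a rough illustration of the concept, a structural hole is usually understood as a missing tie between two neighbors of the same node. The sketch below lists such pairs for a given node; the function name and the tie lists are hypothetical and not taken from the talk.

```python
from itertools import combinations

def structural_holes(ties, ego):
    """Return pairs of ego's neighbors that have no direct tie between them."""
    tie_set = {frozenset(tie) for tie in ties}  # treat ties as undirected
    neighbors = sorted(
        {b for a, b in ties if a == ego} | {a for a, b in ties if b == ego}
    )
    return [
        (x, y)
        for x, y in combinations(neighbors, 2)
        if frozenset((x, y)) not in tie_set
    ]

# Hypothetical three-node examples.
closed_triangle = [("A", "B"), ("A", "C"), ("B", "C")]
open_triangle = [("A", "B"), ("A", "C")]
```

With these inputs, `structural_holes(closed_triangle, "A")` finds nothing, whereas `structural_holes(open_triangle, "A")` reports the unconnected pair B and C.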

Ego networks
[Figure: the ego network of a binary (EGO), shown in three variants: IN, OUT, and INOUT]
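
One plausible reading of the IN, OUT, and INOUT labels is that they select which direct dependencies are included around the ego. The sketch below follows that reading as an assumption; the graph and binary names are hypothetical, and an edge A -> B is taken to mean "A depends on B".

```python
def ego_network(graph, ego, direction="INOUT"):
    """Return the ego plus its direct neighbors for the chosen direction."""
    outgoing = set(graph.get(ego, ()))  # binaries the ego depends on
    incoming = {src for src, targets in graph.items() if ego in targets}
    if direction == "IN":
        members = incoming
    elif direction == "OUT":
        members = outgoing
    elif direction == "INOUT":
        members = incoming | outgoing
    else:
        raise ValueError(f"unknown direction: {direction!r}")
    return members | {ego}

# Hypothetical dependency graph.
GRAPH = {
    "app.exe": ["core.dll"],
    "svc.exe": ["core.dll"],
    "core.dll": ["kernel.dll"],
    "kernel.dll": [],
}
```

Measures such as the size of an ego network then reduce to simple set operations, for example `len(ego_network(GRAPH, "core.dll", "IN"))`.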

Complexity metrics
• Module metrics (for a binary B)
  - # functions in B
  - # global variables in B
• Per-function metrics (for a function f(); aggregation: Total, Max)
  - # executable lines in f()
  - # parameters in f()
  - # functions calling f()
  - # functions called by f()
  - McCabe’s cyclomatic complexity of f()
• OO metrics (for a class C; aggregation: Total, Max)
  - # methods in C
  - # subclasses of C
  - Depth of C in the inheritance tree
  - Coupling between classes
  - Cyclic coupling between classes

RESULTS

1 PATTERNS

Star pattern
[Figure: star-shaped dependency patterns; legend: with defects, no defects]

Undirected cliques
The average number of defects is higher for binaries in large cliques.
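
To make "cliques" concrete: in an undirected view of the dependency graph, a clique is a set of binaries that are all pairwise connected. The sketch below enumerates maximal cliques with the basic Bron-Kerbosch algorithm. It is a textbook technique swapped in for illustration, not necessarily how the authors found cliques, and the adjacency data is hypothetical.

```python
def maximal_cliques(adjacency):
    """Enumerate maximal cliques of an undirected graph (basic Bron-Kerbosch)."""
    cliques = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            cliques.append(sorted(current))  # nothing can extend this clique
            return
        for vertex in list(candidates):
            expand(
                current | {vertex},
                candidates & adjacency[vertex],
                excluded & adjacency[vertex],
            )
            candidates = candidates - {vertex}
            excluded = excluded | {vertex}

    expand(set(), set(adjacency), set())
    return sorted(cliques)

# Hypothetical undirected graph: a triangle a-b-c plus a pendant node d.
ADJACENCY = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
```

Grouping binaries by the size of the largest clique they belong to would then allow comparing average defect counts across clique sizes.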

2 PREDICTION

Prediction model
• Input: metrics and measures (Metrics, SNA, Metrics+SNA)
• PCA
• Regression
• Prediction: classification, ranking

Classification: Does a binary have a defect or not?

Ranking: Which binaries have the most defects?

Random splits 4×50×
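
A repeated random-split evaluation can be sketched as follows. The slide gives no details beyond the repetition counts, so the two-thirds training fraction, the default of 50 repetitions, and the function name are all assumptions made for illustration.

```python
import random

def random_splits(items, repetitions=50, train_fraction=2 / 3, seed=0):
    """Yield (train, test) partitions from repeated random shuffles.

    The train fraction and repetition count are assumed defaults,
    not values confirmed by the slides.
    """
    rng = random.Random(seed)  # seeded for reproducible splits
    cut = round(len(items) * train_fraction)
    for _ in range(repetitions):
        shuffled = list(items)
        rng.shuffle(shuffled)
        yield shuffled[:cut], shuffled[cut:]
```

Each split trains a model on the first part and evaluates it on the held-out part; averaging the results over many splits reduces the influence of any single lucky or unlucky partition.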

Classification (logistic regression)
SNA increases the recall by 0.10 (significant at p=0.01) while precision remains comparable.
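
For reference, precision and recall for a defect classifier can be computed as below. The labels in the example are made up for illustration and are unrelated to the Windows Server 2003 data.

```python
def precision_recall(actual, predicted):
    """Compute (precision, recall) from parallel lists of booleans."""
    pairs = list(zip(actual, predicted))
    true_pos = sum(1 for a, p in pairs if a and p)
    false_pos = sum(1 for a, p in pairs if not a and p)
    false_neg = sum(1 for a, p in pairs if a and not p)
    # Precision: share of predicted-defective binaries that really are defective.
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    # Recall: share of defective binaries that the model found.
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall

# Hypothetical labels: True means "has at least one defect".
ACTUAL = [True, True, True, False, False]
PREDICTED = [True, True, False, True, False]
```

A higher recall at comparable precision means the model finds more of the defective binaries without raising proportionally more false alarms.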

Ranking (linear regression)
SNA+METRICS increases the correlation by 0.10 (significant at p=0.01).
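
The slide does not say which correlation coefficient is meant. Assuming a rank correlation is used to compare predicted and actual defect rankings, Spearman's coefficient can be sketched as the Pearson correlation of average ranks:

```python
def ranks(values):
    """Assign 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result

def pearson(xs, ys):
    """Pearson correlation; assumes neither input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def spearman(predicted, actual):
    """Spearman rank correlation as the Pearson correlation of ranks."""
    return pearson(ranks(predicted), ranks(actual))
```

A value near 1 means the predicted ordering of binaries closely matches the ordering by actual defect counts.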

3 ESCROW

Escrow binaries
• Escrow binaries
  - list of critical binaries for Windows Server 2003
  - development teams select binaries for escrow based on (past) experience
• Special protocol for escrow binaries
  - involves more testing, code reviews

Predicting escrow binaries

Network measures            Recall
GlobalInClosenessFreeman    0.60
GlobalIndwReach             0.60
EgoInSize                   0.55
EgoInPairs                  0.55
EgoInBroker                 0.55
EgoInTies                   0.50
GlobalInDegree              0.50
GlobalBetweenness           0.50
...                         ...

Complexity metrics          Recall
TotalParameters             0.30
TotalComplexity             0.30
TotalLines                  0.30
TotalFanIn                  0.30
TotalFanOut                 0.30
...                         ...

Network measures predict twice as many escrow binaries as complexity metrics do.
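
The slides do not spell out how recall is computed for a single measure. One plausible procedure, assumed here purely for illustration, is to rank all binaries by the measure, take the top k, and check what fraction of the known escrow binaries appears among them. The scores and names below are hypothetical.

```python
def recall_at_top(scores, escrow, k):
    """Fraction of escrow binaries found among the k highest-scoring binaries."""
    if not escrow:
        raise ValueError("escrow set must not be empty")
    ranked = sorted(scores, key=scores.get, reverse=True)
    top = set(ranked[:k])
    return len(top & escrow) / len(escrow)

# Hypothetical scores for one network measure and a hypothetical escrow list.
SCORES = {"a.dll": 0.9, "b.dll": 0.8, "c.dll": 0.3, "d.dll": 0.1}
ESCROW = {"a.dll", "c.dll"}
```

Here `recall_at_top(SCORES, ESCROW, k=2)` recovers one of the two escrow binaries; repeating the calculation per measure would give a table of the shape shown above.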

CONCLUSION
• Classification
  - Recall for network measures is 0.10 higher than for complexity metrics.
  - The precision remains comparable.
• Ranking
  - Combining network measures with complexity metrics increases the correlation by 0.10.
• Escrow
  - Complexity metrics fail to predict escrow binaries.
  - Network measures predict 60% of escrow binaries.
