Information about Predicting Subsystem Defects using Dependency Graph Complexities

Published on November 9, 2007

Author: tom.zimmermann

Source: slideshare.net

Presented at ISSRE 2007

Predicting Subsystem Defects using Dependency Graph Complexities
Thomas Zimmermann, University of Calgary, Canada
Nachiappan Nagappan, Microsoft Research, USA

Bugs are everywhere

Quality assurance is limited by time and by money.

Resource allocation: Spend resources on the components that need them most, i.e., those most likely to fail.

Meet Jacob
• Your QA manager
• Ten years knowledge of your project
• Aware of its history and the hot spots
• Likes extreme sports

Meet Emily
• Your new QA manager (replaces Jacob)
• Not much experience with your project yet
• How can she allocate resources effectively?

Indicators of failures
• Code complexity: Basili et al. 1996; Subramanyam and Krishnan 2003; Binkley and Schach 1998; Ohlsson and Alberg 1996; Nagappan et al. 2006
• Code churn: Nagappan and Ball 2005
• Historical data: Khoshgoftaar et al. 1996; Graves et al. 2000; Kim et al. 2007; Ostrand et al. 2005; Mockus et al. 2005
• Code dependencies: Nagappan and Ball 2007

Windows Server 2003: 2254 binaries, 28.4 MLOC

What are dependencies?
Dependency = (directed) relationship between two pieces of code
MaX dependency analysis framework
◦ Caller-callee dependencies
◦ Imports and exports
◦ RPC
◦ COM
◦ Runtime dependencies (such as LoadLibrary)
◦ Registry access
◦ etc.

Windows Server layout

Complexity of subsystems
Subsystem A vs. Subsystem B: which subsystem has more defects?
Our hypothesis: the more complex one.

Observation #1: Cycles
[Figure: binaries in dependency cycles vs. binaries in no dependency cycle]
Binaries that are part of a dependency cycle have on average twice as many defects.
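
One way to decide which binaries are "part of a dependency cycle" is to check, for each node of the directed dependency graph, whether it can reach itself again. The sketch below assumes the graph is given as a plain adjacency dictionary; the function name `nodes_in_cycles` and the binary names are made up for illustration and are not taken from the MaX framework or the study.

```python
def nodes_in_cycles(graph):
    """Return the set of nodes that lie on at least one directed cycle.

    `graph` maps each node to an iterable of the nodes it depends on.
    """
    def reaches_itself(start):
        # Depth-first search from the successors of `start`.
        stack = list(graph.get(start, ()))
        seen = set()
        while stack:
            node = stack.pop()
            if node == start:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(graph.get(node, ()))
        return False

    return {node for node in graph if reaches_itself(node)}


# Hypothetical dependency graph: a -> b -> c -> a forms a cycle,
# d depends on the cycle but is not part of it, e has no dependencies.
deps = {
    "a.dll": ["b.dll"],
    "b.dll": ["c.dll"],
    "c.dll": ["a.dll"],
    "d.exe": ["a.dll"],
    "e.dll": [],
}
cyclic = nodes_in_cycles(deps)
acyclic = set(deps) - cyclic
```

With the two groups in hand, the average defect count of each group can be compared. The per-node search is quadratic in the worst case; for a graph with thousands of binaries a strongly-connected-components algorithm would be the more usual choice.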

Observation #2: Cliques
Average number of defects is higher for binaries in large cliques.
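
To relate clique size to defects, one first needs the cliques. The sketch below enumerates maximal cliques with the basic Bron-Kerbosch algorithm (no pivoting). It assumes that dependency edges are treated as undirected for this purpose; the slides do not say how cliques were defined in the study, and the function names and the example edges are hypothetical.

```python
def undirected_neighbors(edges):
    """Build an undirected neighbor map from directed (source, target) pairs."""
    nbrs = {}
    for src, dst in edges:
        if src == dst:
            continue  # self-dependencies do not contribute to cliques
        nbrs.setdefault(src, set()).add(dst)
        nbrs.setdefault(dst, set()).add(src)
    return nbrs


def maximal_cliques(nbrs):
    """Enumerate all maximal cliques (basic Bron-Kerbosch, no pivoting)."""
    cliques = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            cliques.append(current)
            return
        for v in list(candidates):
            expand(current | {v}, candidates & nbrs[v], excluded & nbrs[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(nbrs), set())
    return cliques


# Hypothetical edges: a, b, c depend on each other; d hangs off c.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
cliques = maximal_cliques(undirected_neighbors(edges))

# Size of the largest clique each binary belongs to.
largest = {}
for clique in cliques:
    for node in clique:
        largest[node] = max(largest.get(node, 0), len(clique))
```

Grouping binaries by `largest` and averaging their defect counts per group would give the kind of comparison the slide describes.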

Data collection
[Figure; the only surviving label is "Defects"]

Dependency graphs
What is the dependency graph of a subsystem?
• INTRA = internal dependencies
• OUT = outgoing dependencies
• DEP = “neighborhood” = INTRA + OUT + more
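
The INTRA and OUT graphs can be sketched as simple edge filters. The code below assumes a subsystem is given as a set of its binaries and that dependencies are directed (source, target) pairs; the function name and the example data are hypothetical. DEP is left out because the slides only describe it as INTRA + OUT "+ more" without saying what the extra part is.

```python
def subsystem_graphs(edges, members):
    """Split dependency edges into the INTRA and OUT edges of one subsystem.

    INTRA: both endpoints belong to the subsystem.
    OUT:   the source belongs to the subsystem, the target does not.
    """
    intra = [(s, d) for s, d in edges if s in members and d in members]
    out = [(s, d) for s, d in edges if s in members and d not in members]
    return intra, out


# Hypothetical system: subsystem A consists of a1 and a2.
edges = [
    ("a1", "a2"),  # internal to A
    ("a2", "b1"),  # leaves A
    ("b1", "a1"),  # enters A (neither INTRA nor OUT of A)
    ("b1", "b2"),  # unrelated to A
]
intra, out = subsystem_graphs(edges, members={"a1", "a2"})
```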

Complexity measures
• #Nodes: |V|
• #Edges: |E|
• Complexity: |E| - |V| + |P|
• Density: |E| / |V|^2
• Degree
• Multiplicity
• Eccentricity, Radius, Diameter
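
The measures above can be sketched for a simple directed graph as follows. Several choices here are assumptions, since the slides give only the formulas: |P| is taken to be the number of connected components (ignoring edge direction), degree counts incoming plus outgoing edges, and eccentricity is the longest shortest-path distance to any reachable node in the undirected view. Radius and diameter are the minimum and maximum eccentricity, which is the standard graph-theoretic definition. Multiplicity needs a multigraph and is omitted. The function name is hypothetical.

```python
from collections import deque


def complexity_measures(nodes, edges):
    """Compute graph complexity measures for a simple directed graph."""
    edge_set = set(edges)  # parallel edges collapse into one
    nodes = set(nodes) | {n for edge in edge_set for n in edge}
    if not nodes:
        raise ValueError("graph must contain at least one node")

    nbrs = {n: set() for n in nodes}   # undirected view for distances
    degree = {n: 0 for n in nodes}     # in-degree + out-degree
    for src, dst in edge_set:
        nbrs[src].add(dst)
        nbrs[dst].add(src)
        degree[src] += 1
        degree[dst] += 1

    def distances(start):
        # Breadth-first search over the undirected view.
        dist = {start: 0}
        queue = deque([start])
        while queue:
            cur = queue.popleft()
            for nxt in nbrs[cur]:
                if nxt not in dist:
                    dist[nxt] = dist[cur] + 1
                    queue.append(nxt)
        return dist

    ecc, seen, components = {}, set(), 0
    for n in nodes:
        dist = distances(n)
        ecc[n] = max(dist.values())
        if n not in seen:          # first node seen from a new component
            components += 1
            seen.update(dist)

    v, e = len(nodes), len(edge_set)
    return {
        "NODES": v,
        "EDGES": e,
        "COMPLEXITY": e - v + components,
        "DENSITY": e / (v * v),
        "DEGREE_MIN": min(degree.values()),
        "DEGREE_MAX": max(degree.values()),
        "DEGREE_AVG": sum(degree.values()) / v,
        "ECCENTRICITY_MIN": min(ecc.values()),  # radius
        "ECCENTRICITY_MAX": max(ecc.values()),  # diameter
        "ECCENTRICITY_AVG": sum(ecc.values()) / v,
    }


# Hypothetical subsystem: a cycle a -> b -> c -> a plus c -> d.
m = complexity_measures(
    ["a", "b", "c", "d"],
    [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")],
)
```

For this example the graph has 4 nodes, 4 edges and one component, so the complexity is 4 - 4 + 1 = 1 and the density is 4 / 16 = 0.25.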

Spearman correlations
[Table: dependency graphs by complexity measures; values not preserved in this transcript]

Predicting failures
Measures: NODES, EDGES, COMPLEXITY, DENSITY, DEGREE_MIN, DEGREE_MAX, DEGREE_AVG, ECCENTRICITY_MIN, ECCENTRICITY_MAX, ECCENTRICITY_AVG, MULTI_EDGES, MULTI_COMPLEXITY, MULTI_DENSITY, MULTI_DEGREE_MIN, MULTI_DEGREE_MAX, MULTI_DEGREE_AVG, MULTI_MULTIPLICITY_MIN, MULTI_MULTIPLICITY_MAX, MULTI_MULTIPLICITY_AVG, MULTI_ECCENTRICITY_MIN, MULTI_ECCENTRICITY_MAX, MULTI_ECCENTRICITY_AVG
Dependency graphs: INTRA, OUT, DEP, COMBINED

Ranking

Rank  Subsystem  Actual Rank
1     K          3
2     L          95
3     C          6
4     G          2
5     F          8
6     A          3
7     Y          12
8     O          1
9     B          18
10    M          35
...   (many more)

The two rankings are compared using the Spearman correlation.
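
The Spearman correlation of two rankings is the Pearson correlation of their ranks, with tied values sharing the average of their positions. The sketch below implements this in plain Python; the helper names are hypothetical. It is applied to the ten rows visible on the slide only, so the resulting value is an illustration and not the figure reported in the study, which covers many more subsystems.

```python
def average_ranks(values):
    """Rank values from 1..n; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation; undefined (division by zero) for constant input."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def spearman(xs, ys):
    """Spearman rank correlation of two equally long sequences."""
    return pearson(average_ranks(xs), average_ranks(ys))


# The ten rows shown on the slide: rank in the list vs. actual rank.
listed_rank = list(range(1, 11))
actual_rank = [3, 95, 6, 2, 8, 3, 12, 1, 18, 35]
rho = spearman(listed_rank, actual_rank)
```

A value of `rho` close to 1 means the two orderings largely agree, close to 0 means no monotone relationship, and close to -1 means they are reversed.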

Random splits: 4× 50×
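
A random-split evaluation repeatedly divides the subsystems into a training part and a testing part. The sketch below reads the "50×" on the slide as the number of repetitions, which is an assumption, and the two-thirds training fraction is likewise assumed; the transcript does not state either. The function name and the subsystem labels are hypothetical.

```python
import random


def random_splits(items, repetitions=50, train_fraction=2 / 3, seed=0):
    """Yield (train, test) pairs from repeated random shuffles of `items`.

    `repetitions` and `train_fraction` are assumed defaults, not values
    confirmed by the slides.
    """
    rng = random.Random(seed)  # fixed seed keeps the splits reproducible
    items = list(items)
    cut = round(len(items) * train_fraction)
    for _ in range(repetitions):
        shuffled = items[:]
        rng.shuffle(shuffled)
        yield shuffled[:cut], shuffled[cut:]


# Hypothetical subsystem labels.
subsystems = ["A", "B", "C", "F", "G", "K", "L", "M", "O"]
splits = list(random_splits(subsystems))
```

Each pair would be used to fit a model on `train` and to compare the predicted and actual rankings on `test`.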

Linear regression
A higher predicted rank corresponds to a higher observed rank.
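
As a minimal sketch of the idea, the code below fits an ordinary least-squares line with a single predictor and uses it to rank unseen subsystems by predicted defects. The deck lists many complexity measures, so the actual models presumably used several predictors; one predictor is used here only to keep the example short. All numbers and names are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical training data: one complexity measure and defect counts.
train_complexity = [1, 2, 3, 4]
train_defects = [3, 5, 7, 9]
slope, intercept = fit_line(train_complexity, train_defects)

# Hypothetical test subsystems with their complexity values.
test_complexity = {"S1": 10, "S2": 4, "S3": 7}
predicted = {name: slope * x + intercept for name, x in test_complexity.items()}

# Rank subsystems from most to fewest predicted defects.
ranking = sorted(predicted, key=predicted.get, reverse=True)
```

The resulting `ranking` is what would be compared against the actual defect ranking, for example with the Spearman correlation.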

Impact of granularity
The predictions are more reliable for coarse granularities, at the cost of locality and stability.

Future work
• Assemble the pieces of the puzzle
• Evolution of dependencies: Are churned dependencies better predictors?
• Development process: What’s the impact of, say, global development?
• Human and social factors

Conclusion
• Cycles and cliques correlate with defects.
• The complexity of the dependency structure predicts the number of defects.
• Defect predictions help to allocate resources for QA more effectively.
Slides on Slideshare.net (search for ISSRE)

Contact
Email: tz@acm.org, nachin@microsoft.com
Internet: www.softevo.org, research.microsoft.com/esm
