Detection Strategies
Motivation
The main issue in working with metrics is how to deal with
the measurement results. How can all those numbers help
us improve the quality of our software? A metric alone often
cannot answer this question, so metrics must be used
together in order to be effective. But how should we group
metrics so that they serve our purposes?
The main goal of the technique presented below is to provide
the customer with a mechanism for working with metrics on a
more abstract level, one that is conceptually much closer to
the real intentions behind using metrics.
Definition
The mechanism we defined for this purpose is called a detection strategy,
and we define it as follows:
A detection strategy is the quantifiable expression of
a rule, by which design fragments that are conformant to that
rule can be detected in the source code.
Thus, a detection strategy is a generic mechanism for
analyzing a source code model using metrics.
Remarks
- In the context of the previous definition, by "quantifiable
expression of a rule" we mean that the rule must be properly
expressible using software product metrics.
- Our main focus in this project is to use detection strategies
to express rules that help us detect design problems in projects,
i.e. to find those design fragments that are affected by a
particular design flaw. At this point we want to emphasize that
the detection strategy mechanism, and the technique as a whole, is
not limited to problem detection; it can serve other purposes (e.g.
reverse engineering, detection of some design patterns).
Elements of a Strategy
The use of metrics in detection strategies is based on the concepts of
filtering and composition. For the sake of simplicity we
express the detection rule using two simple grammar rules:
DetectionRule := MetricExpression | MetricComposition
MetricComposition := MetricExpression MetricOperator MetricExpression
In the following sections we will describe the elements of the formula
in more detail.
Metrics
Metrics are used in order to express
those internal characteristics of the programs that are involved in the
description of the rule.
Filtering Mechanisms
A filtering mechanism is a statistical means by which a
subset of the measurement results is extracted, based on the
particular focus of the measurement in the context
of the detection strategy. For example, if our goal is to detect
design problems, we use filtering mechanisms to capture those program
elements (i.e. methods, classes, subsystems) that have abnormal values
for a given metric. The quality of a detection strategy strongly depends
on the proper selection and parameterization of a filtering mechanism.
For the moment we consider the following set of filtering
mechanisms:
- Thresholds -- HigherThan; LowerThan. These filtering
mechanisms can be parameterized with a numerical value representing the threshold.
They are well known and widely used in connection with metrics, and therefore
we will not detail them any further.
- Extremities -- TopValues; BottomValues. This group
of filters is useful in contexts where, rather than indicating precise thresholds,
we would like to see the highest (or lowest) values from a given data set. The
filters above can be parameterized using absolute values (e.g. "give me the 20
entities that have the highest values") or percentile values (e.g. "give me the
10% of all measured entities having the lowest values").
- Outliers -- BoxPlots. This is a statistical means by which the
abnormal values in a data set can be detected.
Filtering mechanisms are always related to a metric as described below:
MetricExpression := Metric "," Filter
Metric :=
Filter := HigherThan | LowerThan | TopValues |
BottomValues | BoxPlots
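To make the filtering mechanisms more concrete, the following Python sketch applies each of them to the results of a single metric. It is only an illustration under stated assumptions, not the project's implementation: the function names, the sample data, the strict comparisons in the threshold filters, the rounding up of percentages, and the conventional 1.5 x IQR box-plot fences are all choices made here, since the text does not specify them.

```python
import math
from statistics import quantiles

# Hypothetical measurement results for one metric: entity name -> value.
noam = {"A": 1, "B": 2, "C": 2, "D": 3, "E": 3, "F": 4, "G": 40}


def higher_than(results, threshold):
    """Entities whose value is above the threshold (strictness is assumed)."""
    return {name for name, value in results.items() if value > threshold}


def lower_than(results, threshold):
    """Entities whose value is below the threshold (strictness is assumed)."""
    return {name for name, value in results.items() if value < threshold}


def _how_many(results, amount, percent):
    # Either an absolute count or a percentage of all entities, rounded up.
    return math.ceil(len(results) * amount / 100) if percent else int(amount)


def top_values(results, amount, percent=False):
    """The `amount` entities (or `amount` percent) with the highest values."""
    ranked = sorted(results, key=results.get, reverse=True)
    return set(ranked[:_how_many(results, amount, percent)])


def bottom_values(results, amount, percent=False):
    """The `amount` entities (or `amount` percent) with the lowest values."""
    ranked = sorted(results, key=results.get)
    return set(ranked[:_how_many(results, amount, percent)])


def box_plots(results, whisker=1.5):
    """Entities outside the fences Q1 - w*IQR and Q3 + w*IQR (assumed rule)."""
    q1, _, q3 = quantiles(results.values(), n=4)
    fence = whisker * (q3 - q1)
    return {name for name, value in results.items()
            if value < q1 - fence or value > q3 + fence}


print(sorted(higher_than(noam, 3)))               # ['F', 'G']
print(sorted(top_values(noam, 10, percent=True)))  # ['G']
print(sorted(box_plots(noam)))                     # ['G']
```

Each filter returns a set of entity names, so a MetricExpression such as "NOAM, HigherThan(3)" corresponds here to the call higher_than(noam, 3).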
Composition Operators
In a detection strategy we usually need
more than one metric and one filtering mechanism. Thus, the strategy is built
as a composition of metrics and filtering mechanisms. The operators by
which the rule is "articulated" (composed) are called composition operators.
For the moment we use three operators: and, or and butnotin:
MetricOperator := "and" | "or" | "butnotin"
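One way to read the composition operators is as set operations on the entities selected by each metric expression. The sketch below assumes exactly that: "and" as intersection, "or" as union, and "butnotin" as set difference (entities selected by the left expression but not by the right). The text does not spell out these semantics, so the mapping, the function name compose and the sample sets are assumptions made for illustration.

```python
def compose(left, operator, right):
    """Combine two sets of detected entities with a composition operator."""
    if operator == "and":
        return left & right   # detected by both expressions
    if operator == "or":
        return left | right   # detected by at least one expression
    if operator == "butnotin":
        return left - right   # detected by the left one only (assumed meaning)
    raise ValueError(f"unknown composition operator: {operator!r}")


# Hypothetical results of two metric expressions.
many_public_attributes = {"Order", "Product", "Address"}
many_accessors = {"Product", "Customer"}

print(sorted(compose(many_public_attributes, "butnotin", many_accessors)))
# prints ['Address', 'Order']
```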
An Example
Let's say we want to find the set of classes that expose their data in the interface.
We decide to use two metrics: the first counts the number of public attributes (NOPA)
and the other counts the number of accessor methods (NOAM). We decide that the classes
we want to find are those with the most public data in the project, but they should have
at least 3 public attributes or 5 accessor methods. Thus, we construct the following detection
rule:
(NOPA, HigherThan(3) and NOPA, TopValues(10%)) or
(NOAM, HigherThan(5) and NOAM, TopValues(10%))
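The rule above can be evaluated by hand on a small data set. The Python sketch below does so for ten hypothetical classes; the class names and metric values are invented for illustration, and the helper functions are assumptions rather than the project's API. HigherThan is read here as a strict comparison, which is also an assumption: the "at least" wording in the text could equally suggest a non-strict one.

```python
import math

# Hypothetical measurements for ten classes (invented values).
nopa = {"Account": 0, "Customer": 1, "Order": 2, "OrderLine": 0, "Invoice": 4,
        "Product": 12, "Cart": 3, "Payment": 1, "Address": 5, "Catalog": 0}
noam = {"Account": 6, "Customer": 15, "Order": 4, "OrderLine": 2, "Invoice": 3,
        "Product": 7, "Cart": 1, "Payment": 0, "Address": 5, "Catalog": 2}


def higher_than(results, threshold):
    """Entities whose value is strictly above the threshold (assumed strict)."""
    return {name for name, value in results.items() if value > threshold}


def top_values_percent(results, percent):
    """The given percentage of entities with the highest values, rounded up."""
    count = math.ceil(len(results) * percent / 100)
    ranked = sorted(results, key=results.get, reverse=True)
    return set(ranked[:count])


# (NOPA, HigherThan(3) and NOPA, TopValues(10%)) or
# (NOAM, HigherThan(5) and NOAM, TopValues(10%))
suspects = (
    (higher_than(nopa, 3) & top_values_percent(nopa, 10))
    | (higher_than(noam, 5) & top_values_percent(noam, 10))
)

print(sorted(suspects))  # ['Customer', 'Product']
```

With ten classes, the top 10% is a single class per metric: Product for NOPA and Customer for NOAM. Both also pass their threshold, so both are reported.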